stroiman opened 1 week ago
In order to make a decision we'll need to see a reasonably complete description of the new API. That suggests that this should start out as a third-party package, perhaps based on `x/net/html`. Once that is working and useful, we can see whether it makes sense to incorporate it into the x/net repo.
Sounds reasonable. I did consider forking at some point, but that would mean adjusting it entirely to my own needs.
But I think there's a really good case for separation: the current `x/net/html` can deal with the issues of malformed HTML, a non-trivial problem on its own, and keep that logic completely separate from the rules handling specific element insertion steps.
Proposal Details
The following is not something I need right now, but my use case made me think that there is a valid reason for the code calling `html.Parse` to be able to influence the parsing process.

I'm playing around with a crazy idea: to create a headless browser written in Go (I have a POC that proves I can write the DOM in Go and expose the objects to JavaScript; I have a v8 engine integrated).
I have used `x/net/html` to parse HTML, but there are some rules for constructing the DOM in the browser that the library doesn't support (and I don't think it should implement those rules, but rather let the caller deal with them). For example, when a `<script>` element is connected, the script is executed, which means the script doesn't see the entire "HTML source" (or the DOM representation of the entire source, to be exact). Scripts aren't the only type of element with these kinds of rules either.

I can work around that right now, but it makes me want to ask the question whether support should be added to `x/net/html` that allows the caller to install hooks in the process. Events like `inserted`, `connected`, and `removed` seem like candidates, as per the DOM tree specs.
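To make the idea concrete, here is a rough sketch of what caller-installed hooks could look like; none of this exists in `x/net/html` today, and the names (`ParserHooks`, `parseWithHooks`) are purely hypothetical:

```go
package main

// Hypothetical sketch only: none of this exists in x/net/html today. It
// just illustrates the shape of the hooks discussed above.

import (
	"strings"

	"golang.org/x/net/html"
)

// ParserHooks is a hypothetical set of callbacks the caller could install.
// The names mirror the events mentioned above.
type ParserHooks struct {
	Inserted  func(n *html.Node) // node was inserted into its parent
	Connected func(n *html.Node) // node became connected to the document
	Removed   func(n *html.Node) // node was removed again
}

// parseWithHooks is an invented entry point, standing in for whatever a
// real hook-aware API would look like; here it just delegates to html.Parse.
func parseWithHooks(r *strings.Reader, hooks ParserHooks) (*html.Node, error) {
	return html.Parse(r)
}

func main() {
	hooks := ParserHooks{
		Connected: func(n *html.Node) {
			if n.Type == html.ElementNode && n.Data == "script" {
				// A browser would execute the script here, i.e. before
				// the rest of the source has been parsed.
			}
		},
	}
	doc, err := parseWithHooks(strings.NewReader("<body><script></script></body>"), hooks)
	_, _ = doc, err
}
```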
My current approach
The path I have decided to take forward is to first use the `x/net/html` package to construct a node tree, and then iterate over that tree to construct my own tree. So basically I'll perform two passes of DOM processing, and I can implement the necessary rules in the second pass (see the sketch below).

I will most likely use the `html.Node` instance as a "backing field" for DOM data, as it gives benefits such as `element.outerHTML`/`innerHTML` serialization.

Even if I could hook into the parsing process, I would still need to create wrapper types for various reasons. E.g., I need to expose a different interface to JavaScript, and I need a binding layer to JavaScript.
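A rough sketch of that two-pass approach, assuming a hypothetical `Element` wrapper type that keeps the `html.Node` as its backing field (the names `Element`, `FromReader`, and `build` are invented for illustration):

```go
package dom

import (
	"io"

	"golang.org/x/net/html"
)

// Element is a hypothetical wrapper; the html.Node acts as the backing
// field, so serializing outerHTML/innerHTML can be delegated to html.Render.
type Element struct {
	node     *html.Node
	children []*Element
}

// FromReader runs both passes: parse with x/net/html, then build wrappers.
func FromReader(r io.Reader) (*Element, error) {
	doc, err := html.Parse(r)
	if err != nil {
		return nil, err
	}
	return build(doc), nil
}

// build is the second pass: it walks the tree produced by html.Parse and
// constructs the wrapper tree, applying browser-specific rules on the way.
func build(n *html.Node) *Element {
	el := &Element{node: n}
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		el.children = append(el.children, build(c))
	}
	if n.Type == html.ElementNode && n.Data == "script" {
		// In the prototype a script would run here, but by now the whole
		// source has already been parsed, which is the limitation above.
	}
	return el
}
```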
What benefits would the hooks give
If the package supported hooks you could generate the "correct DOM" in one pass.
In my case, the wrapper objects could then just be created lazily.
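A sketch of what that lazy creation could look like, reusing the hypothetical `Element` wrapper from the sketch above (the `wrapperCache` type is invented for illustration):

```go
package dom

import "golang.org/x/net/html"

// wrapperCache creates a wrapper the first time a given html.Node is
// touched, e.g. when JavaScript first accesses it. Children would be
// wrapped lazily in the same way.
type wrapperCache struct {
	wrappers map[*html.Node]*Element
}

func (c *wrapperCache) wrap(n *html.Node) *Element {
	if c.wrappers == nil {
		c.wrappers = make(map[*html.Node]*Element)
	}
	if el, ok := c.wrappers[n]; ok {
		return el
	}
	el := &Element{node: n}
	c.wrappers[n] = el
	return el
}
```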
Alternate solution
The suggested hooks are just one possible solution to the problem; another potential solution could be to let the caller pass in some kind of "Node Factory", but that does seem more complex.
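Purely as an illustration of the shape such a factory could take (nothing like this exists in `x/net/html`; the interface and method names are invented):

```go
package dom

// Node is whatever node abstraction the caller works with; the parser
// would only need a minimal set of tree operations.
type Node interface {
	AppendChild(child Node)
}

// NodeFactory is a hypothetical interface a caller could pass to the
// parser so that it creates the caller's own node types directly,
// instead of plain *html.Node values.
type NodeFactory interface {
	// CreateElement would be called when the parser inserts an element
	// with the given tag name.
	CreateElement(tag string) Node
	// CreateText would be called for character data.
	CreateText(data string) Node
}
```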
A bit of background about the idea
The idea was basically to be able to test HTTP apps written in Go where client-side script plays an important role. The standard library provides everything necessary if the test only needs to inspect the HTTP response, e.g. response headers or body. But if you want to verify the behavior of the JavaScript, you need something more. An example could be using HTMX and Go, a tech combo that seems to be gaining some popularity. Here, the behavior of the app depends on setting the right attributes on specific HTML elements and on being sure that the returned HTTP responses play well with those tags.
Using a real browser in headless mode has a huge overhead, which tends to discourage TDD, which otherwise provides a fast feedback loop.
Being written in Go, it can easily bypass the TCP layer entirely and connect directly to the root `net/http.Handler`. This makes it easy to let tests run in parallel, with each test using its own HTTP handler, which could have different dependencies mocked or stubbed for different test cases.

The project as it stands (but this is an extremely early prototype) is at https://github.com/stroiman/go-dom