dominiccooney closed this issue 3 years ago.
cc @jakearchibald @whatwg/html-parser
One big question is when this API exposes the tree it is operating on.
I'd like this API to support progressive rendering, so I guess my preference is "as soon as possible".
const streamingFragment = document.createStreamingFragment();
const response = await fetch(url);
response.body
.pipeThrough(new TextDecoderStream())
.pipeTo(streamingFragment.writable);
document.body.append(streamingFragment);
I'd like the above to progressively render. The parsing would follow the "in template" insertion mode, although we may want options to handle other cases, like SVG.
One minor question is what to do with errors
What kinds of errors?
There are a few libraries that use tagged template literals to build HTML, I think their code would be simpler if they knew what state the parser was in at a given point. This might be an opportunity.
Eg:
const fragment = whatever`
<p>${someContent}</p>
<img src=${someImgSrc}>
`;
These libraries allow someContent to be text, an element, or a promise for text/an element. someImgSrc would be text in this case, but could be a function if it were assigned to an event listener. Right now these libraries insert a UID, then crawl the created elements for those UIDs so they can perform the interpolation.
I wonder if something like streamingFragment
could provide enough details to avoid the UID hack.
const streamingFragment = document.createStreamingFragment();
const writer = streamingFragment.writable.getWriter();
await writer.write('<p>');
let parserState = await streamingFragment.getParserState();
parserState.currentNode; // paragraph
await writer.write('</p><img src=');
parserState = await streamingFragment.getParserState();
…I guess this last bit is more complicated, but ideally it should know it's in the "before attribute value" state for "src" within tag "img". Ideally there should be a way to get the resulting attribute & element as a promise.
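Something like this, perhaps (all of these property names are made up):

parserState.tokenState;        // "before-attribute-value"
parserState.currentTag;        // "img"
parserState.currentAttrName;   // "src"
// promises that settle once the attribute / element actually exist in the tree:
parserState.attrNode.then(attr => { /* fill in or replace the src value */ });
parserState.currentElement.then(img => { /* the <img> element itself */ });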
+@justinfagnani @webreflection
@dominiccooney HTML can have conformance errors, but there are recovery mechanisms for all of them and user agents don't bail out on errors. So any input can be consumed by the HTML parser without a problem.
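(For example, the recovery is already observable with innerHTML today:)

const div = document.createElement('div');
div.innerHTML = '<p>one<p>two';  // non-conforming: unclosed <p>
div.innerHTML;                   // "<p>one</p><p>two</p>", implied end tags inserted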
I like @jakearchibald's API. However, I wonder if we need to support a full-document streaming parser and what the API would look like for it. Also, with the streaming fragment approach, will it be possible to perform consecutive writes to the fragment (e.g. pipe one response to the fragment and afterwards another one)? If so, how will it behave: overwrite the content of the fragment, or insert at the end of the fragment?
@jakearchibald
I think their code would be simpler if they knew what state the parser was in at a given point.
What do you mean by state here? Parser insertion mode, tokeniser state or something else?
@inikulin
I wonder if we need to support full document streaming parser
Hmm yeah. I'm not sure what the best pattern is to use for that.
will it be possible to perform consecutive writes to the fragment (e.g. pipe one response to the fragment and afterwards another one)? If so, how will it behave: overwrite the content of the fragment, or insert at the end of the fragment?
Yeah, you can do this with streams, either with individual writes, or by piping with { preventClose: true }. This will follow the same rules as if you mess with elements' content during the initial page load.
As in, if the parser eats:
<p>Hello
…then you:
document.querySelector('p').append(', how are you today?');
…you get:
<p>Hello, how are you today?
…if the parser then receives " everyone", I believe you get:
<p>Hello everyone, how are you today?
…as the parser has a pointer to the first text node of the paragraph.
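With the proposed API, that scenario would look something like this (createStreamingFragment and friends are still hypothetical):

const fragment = document.createStreamingFragment();
const writer = fragment.writable.getWriter();
document.body.append(fragment);
await writer.write('<p>Hello');
document.querySelector('p').append(', how are you today?');
await writer.write(' everyone');
// <p>Hello everyone, how are you today?
await writer.close();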
@jakearchibald There is a problem with this approach. Suppose we have two streams: one writes <div>Hey and the other one writes ya. Usually, when the parser encounters the end of the stream it finalises the AST, so the result of feeding the first stream to the parser will be <div>Hey</div> (the parser emits the implied end tag here). So, when the second stream writes ya, you'll get <div>Hey</div>ya as a result, which is pretty much the same as creating a second fragment and appending it to the first one. On the other hand, we could have an API that explicitly tells the parser to treat the second stream as a continuation of the first one.
Thanks @jakearchibald for thinking of us.
I can speak for my 6+ months on the template literals vs DOM pattern, so that maybe you can have as much info as possible about implementations/proposals/APIs etc.
I'll try to split this post in topics.
I am not using just a UID, I'm using a comment that contains some UID.
// dumb example
function tag(statics, ...interpolations) {
  const out = [statics[0]];
  for (let i = 1; i < statics.length; i++)
    out.push('<!-- MY UID -->', statics[i]);
  return out.join('');
}
tag`<p>a ${'b'} c</p>`;
This lets the HTML parser split the text content into chunks for me; then I verify that if the nodeType of one of the <p>'s childNodes[x] is Node.COMMENT_NODE and its textContent is my UID, I'm fine.
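Roughly, the check after parsing looks like this (container being whatever the joined HTML was parsed into):

const p = container.querySelector('p');
for (const node of p.childNodes) {
  // the parser already split the text into chunks around the comments
  if (node.nodeType === Node.COMMENT_NODE && node.textContent === ' MY UID ') {
    // this childNodes[x] is an interpolation point
  }
}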
The reason I'm using comments, besides letting the browser do the splitting job for me, is that browsers that don't support HTMLTemplateElement natively will discard partial table, col, or option layouts, but they won't discard comments.
var brokenWorkAround = document.createElement('div');
brokenWorkAround.innerHTML = '<td>goodbye TD</td>';
brokenWorkAround.childNodes; // [#text]
brokenWorkAround.outerHTML;
// <div>goodbye TD</div>
You can read about this issue in the webcomponents template polyfill issues: https://github.com/webcomponents/template/issues
To summarise, if every browser natively supported the template element, and the fact that it doesn't ignore any kind of node, the only thing parsers like mine would need is a way to know when the HTML engine encounters a "special node", in my case represented by a comment with special content.
Right now we all need to traverse the whole tree after creating it, in search of special placeholders.
This is fast enough as a one-off operation, and thankfully template literals are unique so it's easy to perform the traversal only once, but it wouldn't scale to huge documents, especially now that I've learned that for browsers, due to legacy, simply checking nodeType is a performance nightmare!
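A minimal sketch of that one-off traversal using a TreeWalker restricted to comments (root is the freshly created tree, UID the placeholder text):

// visit only comment nodes, letting the engine do the filtering
const walker = document.createTreeWalker(root, NodeFilter.SHOW_COMMENT);
let comment;
while ((comment = walker.nextNode())) {
  if (comment.textContent === UID) {
    // found a placeholder position
  }
}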
Now that I've explained the basics for the content, let's talk about attributes.
If you inject a comment as attribute and there are no quotes around, the layout is destroyed.
<nope nopity=<!-- nope -->>nayh</nope>
So, for attributes, having a similar mechanism to define a unique entity/value to be notified about would be ace! Right now the content is injected sanitised upfront. It works darn well, but it's not ideal as a solution.
If you put a placeholder in attributes you have the following possible issues:
- style: it works as long as the content does not contain colons (even if it's invalid); _some: uid; works, shena-nigans wouldn't.
- <img src=uid> would throw an error about the resource without even bothering the network (which has a smarter layer). This is Firefox.
- <rect x=uid y=uid />: before you set the right values it will show an error that x or y were not valid.
HTML is very forgiving in many parts; attributes are quite the opposite in various scenarios.
To summarise, if some mechanism could tell the browser that any attribute with such special content should be ignored, all these problems would disappear.
As much as I'd love to have help from the platform itself regarding the template literals pattern, I'm afraid it won't land in production until all browsers out there support it (or there is a reliable polyfill for it).
That means that exposing the internal HTML parser through a new API can surely benefit future projects, but it's unlikely to land in all browsers for 5+ years.
This last point is just my consideration about the effort/results ratio.
Thanks again for helping out regardless.
@inikulin
There is a problem with this approach
I don't think it's a problem. If you use {preventClose: true}
, it doesn't encounter "end of stream". So:
await textStream1.pipeTo(streamingFragment.writable, { preventClose: true });
await textStream2.pipeTo(streamingFragment.writable);
The streaming fragment would consume the streams as if there were a single stream concatenated.
await textStream3.pipeTo(streamingFragment.writable);
The above would fail, as the writable has now closed.
P.S. Just in case my wishes come true ... what both me and (most likely) Justin would love to have natively exposed is a document.queryRawContent(UID) that would return, in linear order, attributes with such a value, or comment nodes with such a value.
<html lang=UID>
<body> Hello <!--UID-->! <p class=UID></p></body>
The JS counterpart would be:
const result = document.queryRawContent(UID);
// result, in linear order:
// [
//   the html lang attribute,
//   the comment at body.childNodes[1],
//   the p class attribute
// ]
Now that, in core, would make my parser a no-brainer (besides the issue with comments and attributes, but upfront RegExps are very good at that and blazing fast).
[edit] Even while streaming it would work; actually it'd be even better, since it's one pass for the browser.
Also, since I know that for many people code is better than a thousand words, this is the TL;DR version of what hyperHTML does.
function tag(statics, ...interpolations) {
  // re-parse only when the template (the frozen statics array) changes
  if (this.statics !== statics) {
    this.statics = statics;
    this.updates = parse.call(this, statics, '<!--WUT-->');
  }
  this.updates(interpolations);
}

function parse(statics, lookFor) {
  const updates = [];
  this.innerHTML = statics.join(lookFor);
  traverse(this, updates, lookFor);
  const update = (value, i) => updates[i](value);
  return interpolations => interpolations.forEach(update);
}

function traverse(node, updates, lookFor) {
  switch (node.nodeType) {
    case Node.ELEMENT_NODE:
      // collect attributes whose value is the placeholder
      updates.forEach.call(node.attributes, attr => {
        if (attr.value === lookFor)
          updates.push(v => attr.value = v);
      });
      updates.forEach.call(node.childNodes,
        node => traverse(node, updates, lookFor));
      break;
    case Node.COMMENT_NODE:
      // swap placeholder comments for empty text nodes to update later
      if (`<!--${node.textContent}-->` === lookFor) {
        const text = node.ownerDocument.createTextNode('');
        node.parentNode.replaceChild(text, node);
        updates.push(value => text.textContent = value);
      }
  }
}
const body = tag.bind(document.body);
setInterval(() => {
  body`
    <div class="${'my-class'}">
      <p> It's ${(new Date).toLocaleTimeString()} </p>
    </div>`;
}, 1000);
The slow path is the traverse function; the not-so-cool part is the innerHTML injection (into a regular node, a template, or whatever it is) without having the ability to intercept, while the string is being parsed, all the placeholders/attributes and act on them accordingly.
OK, I'll let you discuss the rest now :smile:
@WebReflection
I think the UID scanner you're talking about might not be necessary. Consider:
const fragment = whatever`
<p>${someContent}</p>
<img src=${someImgSrc}>
`;
Where whatever
could do something like this:
async function whatever(strings, ...values) {
  const streamingFragment = document.createStreamingFragment();
  const writer = streamingFragment.writable.getWriter();
  for (const str of strings) {
    // str is:
    // <p>
    // </p> <img src=
    // >
    // (with extra whitespace of course)
    await writer.write(str);
    let parserState = streamingFragment.getParserState();
    if (parserState.tokenState == 'data') {
      // This is the case for <p>, and >
      await writer.write('<!-- -->');
      parserState.currentTarget.lastChild; // this is the comment you just created.
      // Swap it out for the interpolated value
    }
    else if (parserState.tokenState.includes('attr-value')) {
      // await the creation of this attr node
      parserState.attrNode.then(attr => {
        // Add the interpolated value, or remove it and add an event listener instead etc etc.
      });
    }
  }
}
Yes, that might work. As long as these scenarios are allowed:
const fragment = whatever`
<ul>${...}</ul>
${...}
<p data-a=${....} onclick=${....}>also ${...} and</p>
<img a=${...} b=${...} src=${someImgSrc}>
<table><tr>${...}</tr></table>
`;
which looks like it'd be the case.
@WebReflection Interpolation should be allowed anywhere.
whatever`
<${'img'} src="hi">
`;
In the above case tokenState
would be "tag-open"
or similar. At this point you could either throw a helpful error, or just pass the interpolated value through.
@jakearchibald Do you expect tokenState
to be one of tokeniser states defined in https://html.spec.whatwg.org/multipage/parsing.html#tokenization? If so, I'm afraid we can't do that, they are part of parser intrinsics and are subject to change. Moreover, some of them can be meaningless for a user.
@inikulin yeah, that's what I was hoping to expose, or something equivalent. Why can't we expose it?
@jakearchibald
what about the following ?
whatever`
<${'button'} ${'disabled'}>
`;
I actually wouldn't mind having that possible, because boolean attributes want boolean values, so ${obj.disabled ? 'disabled' : ''} doesn't look like a great option to me, but I'd be curious to know if "attribute-name" would be exposed too.
Anyway, having my example covered would be already awesome.
@WebReflection The tokeniser calls that the "Before attribute name state", so if we could expose that, it'd be possible.
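That could slot into the earlier whatever() sketch as another branch, something like (the state name and interpolatedValue are placeholders):

else if (parserState.tokenState == 'before-attr-name') {
  // e.g. <${'button'} ${'disabled'}>: add the interpolated attribute once the element exists
  parserState.currentElement.then(el => el.setAttribute(interpolatedValue, ''));
}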
Not sure whether this is just extra noise or something valuable, but if it can simplify anything, viperHTML uses a similar mechanism to parse once on the Node.js side.
The parser is the pretty awesome htmlparser2.
Probably inspiring as an API? I use the comment trick there too, but since there is a .write mechanism, I believe it could be possible to make it incremental.
@jakearchibald These states are part of the intrinsic parser mechanism and are subject to change; we've even removed/introduced a few recently just to fix a conformance-error-related bug in the parser. So, exposing them to end users would require us to freeze the current list of states, which would significantly complicate further development of the parser spec. Moreover, I believe some of them would be quite confusing for end users, e.g. the "Comment less-than sign bang dash dash" state.
@inikulin would a subset be reasonable? As an example, data and attr-value would already cover 100% of hyperHTML's use cases for me, and I believe those two will never change in the history of HTML ... right?
I'm keen on exposing some parser state to help libraries, but I'm happy for us to add it later rather than block streaming parsing on it.
@WebReflection Yes, that could be a solution. But I have some use cases in mind that could be confusing for the end user. Consider <div data-foo="bar". We'll emit the attr-value state in that case; however, this markup will not produce an attribute in the AST (it will not even produce a tag, since unclosed tags at the end of the input stream are dropped).
@inikulin if someone writes broken HTML, I don't expect anything other than throwing errors and breaking everything right away (when using a new parser API).
Template literals are static; there's no way one of them would suddenly start failing the parser ... it either works or fails forever, since these are also frozen arrays.
Accordingly, I understand this API is not necessarily for template literals only, but if the streamer goes bananas due to wrong output, it's the developer's fault.
Today it's the developer's fault regardless, but she'll never notice due to the silent failure.
if someone writes broken HTML, I don't expect anything other than throwing errors and breaking everything right away.
You will be surprised looking at the real world markup around the web. Also, there is no such thing as "broken markup" anymore. There is non-conforming markup, but a modern HTML parser can swallow anything. So, to conclude, you suggest bailing out with an error in case of invalid markup in this new streaming API?
You will be surprised looking at the real world markup around the web.
you missed the edit: when using a new parser API
So, to conclude, you suggest bailing out with an error in case of invalid markup in this new streaming API?
If the alternative is to not have it, yes please.
I'm tired of missed opportunities due to lazy developers that need to be coddled by standards for their mistakes.
If the alternative is to not have it, yes please.
I'm tired of missed opportunities due to lazy developers that need to be coddled by standards for their mistakes.
I'm not keen on this approach to be honest; it brings us back to the times of XHTML. One of the advantages of HTML5 was its flexibility regarding parse errors and, hence, document authoring.
this API's goal is different, and developers want to know if they wrote a broken template.
Not knowing hurts them, and since there is no HTML highlighting by default inside strings, it's also a safety belt for them.
So throw, like any failed asynchronous operation would throw, and let them decide whether to fall back to innerHTML or fix that template literal instead, once and forever.
To be more explicit, nobody on earth would write the following or, if they do by accident, nobody wants that to succeed.
template`<div data-foo="bar"`;
so why is that a concern?
In JavaScript, something similar would be a SyntaxError and it would break everything.
@inikulin
Consider <div data-foo="bar". We'll emit the attr-value state in that case; however, this markup will not produce an attribute in the AST (it will not even produce a tag, since unclosed tags at the end of the input stream are dropped).
FWIW this would be fine in an API like my example above. The promise that returns the currently-in-progress element/attribute would reject in this case, but the stream would still write successfully.
I agree that a radically different parsing style would be bad. I'd prefer it to be closer to the regular document parser than innerHTML
.
I agree that a radically different parsing style would be bad. I'd prefer it to be closer to the regular document parser than innerHTML.
@jakearchibald They are pretty much the same, with the exception that for innerHTML the parser adjusts its state according to the context element before parsing.
@inikulin innerHTML
behaves differently regarding script elements. I hope we could avoid those differences with this API.
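(That difference is observable today:)

const div = document.createElement('div');
div.innerHTML = '<script>console.log("never runs")<\/script>';
document.body.append(div); // the script stays inert: innerHTML-parsed scripts are marked "already started"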
@WebReflection
this API's goal is different, and developers want to know if they wrote a broken template. To be more explicit, nobody on earth would write the following, and nobody wants that to succeed.
This would be somewhat true if templates were the only use case for this API. What if I want to fetch some arbitrary content provided by a 3rd party? E.g. user-supplied comments or something else?
What if I want to fetch some arbitrary content provided by 3rd party? E.g. user-supplied comments or something else?
What about it? You'll never have partial content, just whole content. Or are you thinking about runtime evaluation of some user content that puts some ${value} inside the comment?
In the latter case, I don't see a realistic scenario. In the "just parse-stream it all" case I don't see any issue; you'll never have a token in the first place.
Anyway, if it's about missed notifications due to silent failures and internal adjustments, I'm also OK with that. It'll heavily punish developers that don't test their templates, and I'm fine with that too.
@WebReflection To be clear, we are not talking about partial content only. There are many other cases where you can get non-conforming markup.
@inikulin I honestly see your argument as being like fetch(randomThing).then(b => b.text()).then(eval), which I fail to see as an ever-desired use case.
But like I've said, I wouldn't care if the silent failure/adjustment happens. I'm fine with the parser never breaking; it'll be somebody else's problem, as long as the parser can exist, exposing what it can, when it can, which covers 99% of the desired use cases for me.
Is this possible? Or is this a won't fix/won't implement?
This is the bit I'm not sure I understand from your answers. I read potential limits, but not proposed alternatives/solutions.
To be clear, we're not interested in introducing a new, third parser (besides the HTML and XML ones) that only accepts conforming content.
XML already accepts conforming content, and I believe this parser would need to be compatible with SVG too.
However, like I've said, it works for me either way.
TL;DR: can this parser expose data and attr-value tokens/states whenever these are valid?
If so, great, that solves everything.
All other cases are (IMO) irrelevant, but not having this because of a possible lack of tokens in broken layouts would be a hugely missed opportunity for the Web.
I hope I've also made my point of view clear.
Here are some requirements which I think sum up what's been discussed so far:
- Context-aware parsing (like createContextualFragment does it).

@jakearchibald BTW, regarding script execution: maybe we can make it optional? For example, if I parse HTML from some untrusted source it would be nice to be able to prevent script execution for the parsed fragment.
@inikulin I fear that may be false security. Although innerHTML doesn't download/execute script elements, it doesn't block attributes that are later executed (eg onclick attributes).
Seems safer to defer to existing methods that control script download & execution, like CSP and sandbox.
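For example, with innerHTML today:

const div = document.createElement('div');
div.innerHTML = '<img src="x" onerror="alert(1)">';
document.body.append(div); // no <script> involved, but onerror runs once the image fails to load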
@jakearchibald Thinking about it a bit more, I wonder how the fragment approach is supposed to work, considering that when you append a fragment into a node its children are adopted by the new parent node: https://dom.spec.whatwg.org/#concept-node-insert. So if we insert the fragment while content is still being piped into it, how should we behave? Make the parent node the receiver of all subsequent HTML content? In that case we'll need machinery to pipe HTML content into an element. In that regard, it would make more sense to implement a streaming parser API for elements and document fragments without introducing a new node type (something like element.writable and fragment.writable).
@inikulin In terms of adopting, how does https://jakearchibald.com/2016/fun-hacks-faster-content/#using-iframes-and-documentwrite-to-improve-performance work?
I don't like element.writable
as it doesn't really fit with how writables can only be written to once. That's how I ended up with a special streaming fragment. It may be the same node type as a regular fragment though.
Hmm, it's a bit confusing that the fragment becomes some kind of proxy entity to pipe HTML into an element, considering that new nodes will not appear in the fragment. But maybe it's just my perception...
They'll appear in the fragment until the fragment is appended to an element.
It's no stranger than https://jakearchibald.com/2016/fun-hacks-faster-content/#using-iframes-and-documentwrite-to-improve-performance, but I guess that's pretty strange.
I very much share the goal of being able to streaming-parse HTML without blocking the main thread.
This goal is pretty connected to some of the goals I had in the DOMChangeList proposal (specifically the DOMTreeConstruction part of that proposal). Here's a sketch of how we could enhance that proposal to support these goals:
- A stream of DOMTreeConstruction operations. Since DOMTreeConstruction is already a binary format, a stream of the binary blob might be sufficient.
- An insertTreeBefore that would take a stream of DOMTreeConstruction and append it into the DOM.
DOMTreeConstruction is already intended to provide a low-level API that can be used in a worker and transferred from a worker to the UI thread (without having to deal with the thorny questions of making a transferrable DOM available in workers). That makes it a nice fit for async parsing and possibly even streaming parsing.
This thread is really a missing piece of the other proposal: DOMChangeList
provides a way to go from operations to actual DOM, but it doesn't provide a compliant way of going from HTML to operations. If we added a way of going from HTML to operations, we could break up the entire processing pipeline and do arbitrary parts of the process in workers (anything up to putting the operations in the real DOM).
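To make that concrete, here's a rough sketch of what such a pipeline could look like, assuming a hypothetical parseHTMLToOperations() transform plus the DOMTreeConstruction / insertTreeBefore shapes from the DOMChangeList proposal (none of these APIs exist; the names are illustrative only):

// worker.js (sketch): turn HTML into a stream of DOMTreeConstruction operations
const response = await fetch('/partial.html');
const ops = response.body
  .pipeThrough(new TextDecoderStream())
  .pipeThrough(parseHTMLToOperations()); // ReadableStream of DOMTreeConstruction chunks
postMessage(ops, [ops]); // transfer the stream to the UI thread

// main thread (sketch): apply the operations as they arrive
worker.onmessage = ({ data: ops }) => {
  document.body.insertTreeBefore(ops, null); // the streaming insertTreeBefore sketched above
};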
As an unrelated aside, I would find it very helpful to have an API that provided a stream of tokenizer events that could be intercepted on the way to the parser. That would allow Glimmer to implement the {{
extension in user-space ({{
text isn't legal in all of the places where you would want it to be meaningful, and has different meaning in text positions vs. attributes). Today, we are forced to require a bundler for HTML, but I would love to be able to use more of the browser's infrastructure instead.
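Something along these lines, assuming a hypothetical HTMLTokenizerStream transform that emits tokenizer events (entirely made up for illustration):

const tokens = htmlByteStream
  .pipeThrough(new TextDecoderStream())
  .pipeThrough(new HTMLTokenizerStream()); // made-up name

const rewritten = tokens.pipeThrough(new TransformStream({
  transform(token, controller) {
    // rewriteCurlies (user-space) gives {{ }} meaning per token type: text vs attribute value
    controller.enqueue(rewriteCurlies(token));
  }
}));
// ...and hand `rewritten` on to whatever consumes tokens (also hypothetical today)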
@domenic said:
To be clear, we're not interested in introducing a new, third parser (besides the HTML and XML ones) that only accepts conforming content.
Doesn't the existing HTML parser spec specifically describe a mode that aborts with an exception on the first error?
For non-streaming content, it would probably be sufficient just to expose whether an error had occurred at all (and then userspace could throw away the tree). For streaming content, it might also be sufficient (userspace could "roll back" by deleting any nodes that were already inserted?)
Wow, long thread is long. I had a busy morning, so I'll try to hit two points I caught just now.
Async API: this would make it difficult to use this API in many scenarios. Right now, when you create and attach an element, you may expect that the element has rendered synchronously. With an async parser API, if the element has to parse its template to render, that breaks. In essence, using an async parser API would be similar to using <template> today, but with an asyncAppend instead of append. Lots of code would get more complex as element state itself becomes async and we don't have a standard way of waiting for an element to be "ready".
Of course, if we had top-level await, we could hide that async API behind module initialization.
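For example (a sketch, reusing the hypothetical streaming-fragment API from earlier in this thread):

// template-module.js: top-level await hides the async parse from importers
const fragment = document.createStreamingFragment();
const writer = fragment.writable.getWriter();
await writer.write('<div class="fancy"><p>Hello</p></div>');
await writer.close();
export const template = fragment; // importers see a fully parsed fragment, synchronously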
Being able to get parser state while parsing fragments would be awesome, but in order to avoid inserting sentinels altogether, we'd need a few more features. For example, for the text on either side of an interpolation point, like abc then def in <div>abc${...}def</div>, we'd need a way to get a reference to abc and def and not collapse them into a single node.
But stepping back, the real API I want is to be able to create a tree of DOM and easily get references to particular nodes cheaply. https://github.com/whatwg/html/issues/2254 (at least the "Template Parts" idea in there) would solve my use-case completely.
Another thing that would help is a variation on the TreeWalker API that didn't return a node from nextNode()
so that I could navigate to a node without creating wrappers for all preceding nodes.
@justinfagnani I think @jakearchibald already solved your i and ii points; https://github.com/whatwg/html/issues/2993#issuecomment-326552132
You can write a comment and retrieve it right away as your placeholder, so that you'd have abc, then your content, and later on whatever it is, including def, and eventually another data state where you can add another comment:
`<div>a ${'b'} c ${'but also d'} e</div>`
TL;DR HTML should provide an API for parsing. Why? "Textual" HTML is a widely used syntax. HTML parsing is complex enough to want to use the browser's parser, plus browsers can do implementation tricks with how they create elements, etc.
Unfortunately the way the HTML parser is exposed in the web platform is a hodge-podge. Streaming parsing is only available for main document loads; other things rely on strings which put pressure on memory. innerHTML is synchronous and could cause jank for large documents (although I would like to see data on this because it is pretty fast.)
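For reference, the closest workaround today is the iframe + document.write() trick from @jakearchibald's article linked earlier in the thread; a rough, simplified sketch (same-origin content assumed):

// stream HTML through a hidden iframe's parser, while the target element
// already lives in the main document
const iframe = document.createElement('iframe');
iframe.style.display = 'none';
document.body.append(iframe);
iframe.contentDocument.write('<streaming-element>');
const target = iframe.contentDocument.querySelector('streaming-element');
document.querySelector('#content').append(target); // parser keeps writing into it

const response = await fetch('/partial.html');
const reader = response.body.pipeThrough(new TextDecoderStream()).getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  iframe.contentDocument.write(value); // parsed nodes land inside <streaming-element>
}
iframe.contentDocument.write('</streaming-element>');
iframe.contentDocument.close();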
Here are some strawman requirements:
Commentary:
One big question is when this API exposes the tree it is operating on. Main document parsing does expose the tree and handles mutations to it pretty happily; innerHTML parsing does not until the nodes are adopted into the target node's document (which may start running custom element stuff.)
One minor question is what to do with errors.
Being asynchronous has implications for documents and/or custom elements. If you allow creating stuff in the main document, then you have to run the custom element constructors sometime, so to make it not jank you probably can't run them together. This is probably a feature worth addressing.
See also:
Issue 2827