tc39 / proposal-binary-ast

Binary AST proposal for ECMAScript

Consider specifying how browsers tell servers they accept AST files #5

Open aickin opened 7 years ago

aickin commented 7 years ago

Thanks for such a cool proposal!

Continuing on from this thread, I wanted to suggest that there should be some sort of structured way for browsers to tell servers whether or not the browser understands Binary AST files, and, if supported, which version(s) are understood.

I think the most likely candidate for implementation is the HTTP Accept header, although it gets a bit complicated when combined with features like server push or inlined scripts.

If browsers don't send an Accept header or something like it, servers will have to use user-agent sniffing to figure out whether to send Binary AST or JavaScript. In my experience with sending different versions of JS to browsers, this is a pretty cumbersome and error-prone solution.

Thanks again for this!

Qix- commented 7 years ago

Why would Accept headers affect server push? Just curious. Accept can express preferences with quality (q) values:

Accept: application/x-javascript-ast;q=0.9, application/javascript;q=0.8, text/javascript;q=0.5

The above would favor ASTs over application/javascript over text/javascript.
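That q-value ordering can be resolved mechanically on the server. A minimal sketch in Node.js, assuming the hypothetical application/x-javascript-ast type from above (preferredType is an illustrative helper, not a real API):

```javascript
// Sketch: pick the best script MIME type the server can produce from an
// Accept header with q-values. "application/x-javascript-ast" is the
// hypothetical type suggested above, not a registered MIME type.
function preferredType(acceptHeader, supported) {
  const prefs = acceptHeader.split(',').map((part) => {
    const [type, ...params] = part.trim().split(';');
    const qParam = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
    return { type: type.trim(), q: qParam ? parseFloat(qParam.slice(2)) : 1.0 };
  });
  // Highest q among the types the server can actually produce wins.
  prefs.sort((a, b) => b.q - a.q);
  const match = prefs.find((p) => supported.includes(p.type));
  return match ? match.type : null;
}
```

With the header above, a server that has both variants would pick the AST type, while one with only plain JS would fall back to application/javascript.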

aickin commented 7 years ago

@Qix- Great question; I haven't thought about it deeply, but I'll sketch out my rough ideas.

So, there are a couple of ways to use Accept. The simplest way would be to just add a new MIME type (let's use your suggestion application/x-javascript-ast) on all HTTP requests for scripts. A regular use of binary AST files would then look like this:

  1. User types in www.sitewithastscripts.com and hits return.
  2. Browser requests the HTML page with Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8.
  3. Server returns a page with <script src="/some_script"></script> in it.
  4. Browser requests /some_script with Accept: application/x-javascript-ast,*/*.
  5. Server sees that the browser supports AST files and sends an AST version of some_script. If it hadn't seen that Accept header, it would have sent a JS version of some_script.

This totally works! But there are a few downsides:

  1. It requires varying the content based on the Accept header, which may be difficult to support on all CDNs.
  2. If you want to server push the script, the server would need to do so during step 3 before it receives the Accept header for the script, so the server wouldn't know whether or not the browser supports AST files.
  3. If you want to inline the script in a data URL, you have the same problem as server push; the server hasn't seen the script Accept header yet.

So basically, to cover these use cases, I think the Accept header would need to specify AST file support in the original HTML request. However, that seems a little unusual, and there may be good reasons why it's not done right now. That's about the limit of my knowledge.

Qix- commented 7 years ago

@aickin Definitely, that illustrates the problem really nicely.

How do servers handle this with pre-compressed assets? E.g. a server that stores both a brotli and a gzip compressed asset and serves it based on the Accept-Encoding header? Doing brotli compression on the fly is slow and taxing, so many servers would like to pre-compress those assets since disk space is cheap, right?

Also, correct me if I'm wrong, but an agent doesn't have to support an encoding (compression) on all requests, and you can't be certain it does for all subsequent requests (e.g. the agent might support brotli on the initial HTML but not on images). I get that this is a rare case, but it's still standards-compliant (again, correct me if I'm wrong).

So that's a case where server push would have to predict what the browser is going to accept and just correct itself if it doesn't, right? In which case we could apply the same logic with ASTs and the Accept header - assume the browser does support ASTs (or doesn't, it's up to the implementation/configuration of the server) and push those first, and correct itself if the agent doesn't support it. Otherwise, the agent just disregards the push according to HTTP/2 IIRC.

This approach means that AST + Server Push only gets more optimized as adoption increases, which to me seems acceptable.
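The betting strategy above can be made concrete with a toy model, assuming the server is configured with a single bet (everything here is illustrative, not a real push API):

```javascript
// Toy model of speculative push: the server bets on AST support when it
// pushes, and pays one correction round trip when the bet misses (the
// client discards the unwanted pushed stream per HTTP/2 and re-requests).
function speculativePush(serverBetsOnAst, clientSupportsAst) {
  const pushed = serverBetsOnAst ? 'ast' : 'js';
  const usable = pushed === 'js' || clientSupportsAst;
  return { pushed, extraRoundTrip: !usable };
}
```

As adoption grows, the extraRoundTrip case gets rarer, which is the "only gets more optimized" point above.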

I also wonder how the folks over at WebASM are handling this, since I'm sure a broad use case is shipping compiled code with a JavaScript fallback - and if this proposal catches on, I would expect people to build a wide matrix of compiled assets (i.e. [webasm, js-ast, js] x [uncompressed, gzip, brotli] x [unminified, minified]).

aickin commented 7 years ago

Great point that Accept-Encoding has a similar set of issues!

an agent doesn't have to support encoding (compression) on all requests, and you can't be certain they do for all subsequent requests (e.g. the agent might support brotli on the initial HTML, but not on images)... it's still standards-compliant (again, correct me if I'm wrong).

I think that's correct, but I believe that in practice, unlike with Accept, all major browsers currently send the same Accept-Encoding header on all requests. So when you server push a compressed sub-resource, you push down the Accept-Encoding request header that you received on the first request. If the browser decides for some reason to change to a different Accept-Encoding for the sub-resource (which, again, I don't think any browsers do), then it will just ignore the server push and make a new request with its new Accept-Encoding header.
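The pre-compressed-asset pattern described above looks roughly like this on the server side (a sketch; the file extensions and names are illustrative):

```javascript
// Sketch: map an Accept-Encoding header to a pre-compressed variant on
// disk (script.js.br / script.js.gz / script.js), preferring brotli.
function chooseEncoding(acceptEncoding) {
  const accepted = (acceptEncoding || '')
    .split(',')
    .map((token) => token.trim().split(';')[0]);
  if (accepted.includes('br')) return { ext: '.br', encoding: 'br' };
  if (accepted.includes('gzip')) return { ext: '.gz', encoding: 'gzip' };
  return { ext: '', encoding: 'identity' };
}
```

A server push of the compressed sub-resource would simply reuse the Accept-Encoding received on the initial HTML request, as described above.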

In which case we could apply the same logic with ASTs and the Accept header - assume the browser does support ASTs (or doesn't, it's up to the implementation/configuration of the server) and push those first, and correct itself if the agent doesn't support it. Otherwise, the agent just disregards the push according to HTTP/2 IIRC.

Yep, you could use this same strategy for MIME type negotiation, but while it would be functionally correct, I think the perf profile for it would be pretty bad in practice. When the browser does support ASTs, you get a 90% speedup in parsing, but when it doesn't, you approximately double the amount of JavaScript sent over the wire and add a round trip. In practice, that'd be a hard tradeoff to accept, and I would suspect most engineers would fall back to user agent sniffing to avoid it.

It's also worth noting I don't think any of this helps with use case 3, when you want to inline script directly into the HTML.

I also wonder how the folks over at WebASM are handling this, since I'm sure a broad use-case is writing some compiled code with a javascript fallback

Interesting question, and I think you're right that choosing between WASM and a fallback asm.js implementation is probably relatively common for WASM sites. I'm pretty sure that WASM is always loaded from JavaScript on the client, so you'd have to do a runtime feature detect to decide whether to load WASM or JS. As a result, WASM can't be efficiently server pushed or inlined without UA sniffing, I think. I'll keep my eyes peeled to see if anyone is solving this at the HTTP level.
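The runtime feature detect mentioned here is usually a one-liner; a sketch (the script URLs are hypothetical):

```javascript
// Client-side feature detection: load the WASM build if the engine
// exposes WebAssembly, otherwise fall back to the JS (asm.js) build.
// The URLs are hypothetical.
function scriptToLoad(globalObj) {
  const hasWasm =
    typeof globalObj.WebAssembly === 'object' &&
    typeof globalObj.WebAssembly.instantiate === 'function';
  return hasWasm ? '/app.wasm.js' : '/app.asm.js';
}
```

Because this decision happens in client-side JS, the server never learns the answer from a request header, which is exactly why push and inlining can't use it.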

vigneshshanmugam commented 7 years ago

I am also interested in how a different JS AST format would give hints for preloading scripts. There are already two potential solutions discussed for preloading modules in https://github.com/whatwg/fetch/issues/486; I'm not sure how the WASM/AST proposals would fit into preloads.
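For reference, a preload hint can also be expressed as an HTTP Link header; a sketch of how an AST preload might look, assuming the hypothetical application/x-javascript-ast type (there is no registered destination for AST files today, so this is speculative):

```javascript
// Build a Link preload header for a script. The type attribute lets a
// client skip preloading a resource type it doesn't understand.
// The MIME type used here is hypothetical.
function preloadHeader(path, type) {
  return `<${path}>; rel=preload; as=script; type="${type}"`;
}
```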

kannanvijayan-zz commented 7 years ago

@aickin The specifics of the delivery mechanism still need to be banged out, so thanks for the question.

There are a few different cases to cover, and addressing them in the standard incrementally over time may be a path to follow. The context of script parsing can be one of:
