whatwg / html

HTML Standard
https://html.spec.whatwg.org/multipage/

Provide a mechanism to trigger the fetch of compression dictionaries #10162

Open pmeenan opened 8 months ago

pmeenan commented 8 months ago

What problem are you trying to solve?

Compression dictionary transport provides a mechanism for using fetched resources as compression dictionaries for the content-encoding of HTTP responses. The IETF draft for the HTTP-level negotiation and compression is here.

For static resources (e.g. scripts), using a previous version of the resource that is in cache as a dictionary for a newer version can happen naturally as the scripts/stylesheets/whatever are fetched and used.

For a separate, stand-alone dictionary (e.g. one for compressing HTML), there needs to be a mechanism to trigger the fetch of the dictionary, preferably one that has no side effects in browsers that don't support compression dictionaries and that lets the browser fetch the dictionary at an idle time.

What solutions exist today?

Anything that causes a resource to be fetched can be used to trigger the dictionary load. This includes preload and prefetch as well as an explicit fetch through JavaScript.

How would you solve it?

In the origin trial currently running for the feature in Chrome, and in the explainer, we are using <link rel=dictionary href=...> for this purpose. This way, browsers that don't understand it just ignore the link tag and don't trigger the fetch.
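To illustrate the "ignored when unsupported" behavior, here is a minimal sketch of a toy client that only fetches `<link>` targets whose `rel` it recognizes. This is not how any browser is implemented; `LinkCollector`, `hrefs_to_fetch`, and the `/site.css` and `/dict.dat` paths are hypothetical names chosen for the example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Toy model: collect hrefs of <link> elements with a supported rel token."""

    def __init__(self, supported_rels):
        super().__init__()
        self.supported_rels = {rel.lower() for rel in supported_rels}
        self.to_fetch = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel is a space-separated, case-insensitive token list.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if href and any(token in self.supported_rels for token in rel_tokens):
            self.to_fetch.append(href)


def hrefs_to_fetch(markup, supported_rels):
    """Return the link hrefs a client supporting `supported_rels` would fetch."""
    parser = LinkCollector(supported_rels)
    parser.feed(markup)
    parser.close()
    return parser.to_fetch


# Hypothetical page: one stylesheet plus one stand-alone dictionary.
PAGE = (
    '<link rel="stylesheet" href="/site.css">'
    '<link rel="dictionary" href="/dict.dat">'
)

legacy_client = hrefs_to_fetch(PAGE, {"stylesheet"})
dictionary_client = hrefs_to_fetch(PAGE, {"stylesheet", "dictionary"})
```

In this sketch the legacy client never sees the dictionary URL as something to fetch, while the dictionary-aware client picks it up alongside the stylesheet.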

Anything else?

No response

annevk commented 8 months ago

I think rel=dictionary is not specific enough. If I came across it, I would expect it to provide a glossary for the page I'm looking at. rel=compression-dictionary might be reasonable though.

pmeenan commented 8 months ago

Sounds good to me. I wasn't entirely comfortable with dictionary in a document scope referring to compression. The only downside to compression-dictionary is the length of the string in HTTP headers but with modern protocols you'd only pay that price once for a given connection.

annevk commented 8 months ago

There might well be a shorter token that's also reasonable, although looking at the draft specification nothing jumps out. Maybe @mnot has a good idea.

mnot commented 8 months ago

comp-dict maybe?

martinthomson commented 7 months ago

This is tied in with the Use-As-Dictionary field and the matching functionality that is associated with that. I'd like to see this proven out for that feature before committing changes to link relation types that might be hard to unwind.

pmeenan commented 7 months ago

@martinthomson which part of the feature are you looking to see proven out and what might that look like?

The need to side-load a stand-alone dictionary (independent of the matching mechanism) goes back to the SDCH (and recent ZSDCH) use case of compressing dynamic resources. We already have data from the first round of Chrome's dictionary field trial that the use case works as expected for that dynamic case.

The open questions would come down to:

  1. Should a network response be able to specify the requests the dictionary should be available for (Use-As-Dictionary matching) or should there be some other way to specify it?
  2. How should the download and configuration for a stand-alone dictionary be triggered?

For the "how" side of things, being able to trigger it from both headers and markup has some deployment value, though restricting it to a header-based mechanism would be OK (being available only in markup would be a problem). It behaves a lot like a prefetch/preload: it seeds the response in the cache but does not actively process the response as part of the document.
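As a rough sketch of the two trigger forms, the helpers below build a `Link` header field value (in the general Web Linking syntax) and the equivalent markup. The function names and the `/dict.dat` path are hypothetical, and the default `rel` value is just the name floated earlier in this thread, not a settled token.

```python
from html import escape


def link_header_value(url, rel="compression-dictionary"):
    """Build a Link header field value: <url>; rel="..."."""
    return f'<{url}>; rel="{rel}"'


def link_element(url, rel="compression-dictionary"):
    """Build the equivalent <link> element, escaping attribute values."""
    return f'<link rel="{escape(rel, quote=True)}" href="{escape(url, quote=True)}">'


# Hypothetical dictionary URL, delivered either way.
header_form = link_header_value("/dict.dat")
markup_form = link_element("/dict.dat")
```

The header form works for any response, including non-HTML ones, while the markup form only helps once the document is being parsed.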

A link relation felt like it had a good compatibility story, in that it would be ignored by browsers that didn't support the feature. The main risk would be in burning a relation type name that would be better used for something else in the future (like dictionary).

martinthomson commented 7 months ago

Having been involved with server push and a number of other features that were good on paper, I'm extremely cautious about claims of advantage for anything in this approximate area. Well-documented experiments with robust experimental design are what I'm looking for, including evidence that the operational costs are manageable and that the performance degradation resulting from operational neglect is tolerable. That sort of thing.

All of that being said, why are you not defining a link relation in the IETF documents? The HTML-specific processing here seems pretty limited, but the general applicability of the mechanism potentially extends beyond the web.

annevk commented 7 months ago

It definitely needs some HTML integration to ensure fetching is set up properly.

pmeenan commented 6 months ago

PR with the IETF-side changes is up for review/discussion: https://github.com/httpwg/http-extensions/pull/2783

Currently using compression-dictionary for the link relation even though it's a bit on the wordy side. In the case of HTTP, multiple responses referencing the same dictionary will compress away with HPACK/QPACK so it's not quite as bad as it may seem.
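To make the "compress away" point concrete, here is a deliberately simplified toy model of a header table shared across one connection. It is not the real HPACK or QPACK wire format: it ignores Huffman coding, static tables, and table eviction, and the byte costs (3 bytes of framing for a literal, 1 byte for an indexed repeat) are assumptions chosen only to show the shape of the saving. `ToyHeaderTable` and the `/dict.dat` path are hypothetical.

```python
class ToyHeaderTable:
    """Toy model of per-connection header indexing (not real HPACK/QPACK)."""

    LITERAL_OVERHEAD = 3  # assumed framing bytes for a literal field
    INDEXED_COST = 1      # assumed bytes for referencing an already-seen field

    def __init__(self):
        self._seen = set()

    def encoded_size(self, name, value):
        """Return the modeled size in bytes of sending this field now."""
        field = (name.lower(), value)
        if field in self._seen:
            return self.INDEXED_COST
        self._seen.add(field)
        return self.LITERAL_OVERHEAD + len(name) + len(value)


table = ToyHeaderTable()
link_value = '</dict.dat>; rel="compression-dictionary"'

# The first response pays for the full string; later ones only pay for an index.
first_response = table.encoded_size("link", link_value)
second_response = table.encoded_size("link", link_value)
```

Under these assumptions the length of the relation name matters only for the first response on a connection, which matches the argument above.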