WICG / compression-dictionary-transport

No support for hash-based versioning #10

Closed YoniFeng closed 1 year ago

YoniFeng commented 1 year ago

This somewhat falls under "open question 2" in the explainer, but I thought it's worth opening an issue to discuss this specific aspect.

The problem

It is very common for asset paths to include a hash of the asset's content.

The benefit: assets with no changes between versions n and n+1 keep the same hash, and load from the HTTP cache.
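
To make the scheme concrete, here is a minimal sketch of hash-based asset naming. The function name `hashed_asset_name`, the choice of SHA-256, and the 8-character digest length are assumptions for illustration only; real bundlers pick their own hash function and naming layout.

```python
import hashlib


def hashed_asset_name(stem: str, ext: str, content: bytes, digest_len: int = 8) -> str:
    """Build a file name that embeds a short hash of the asset's content.

    Hypothetical helper: the hash function and name layout are assumptions,
    not taken from any specific build tool.
    """
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return f"{stem}.{digest}{ext}"


# Unchanged content keeps the same name across releases (HTTP cache hit);
# any change to the content produces a new name (full fetch today).
release_n = hashed_asset_name("myscript", ".js", b"console.log('v1');")
release_n_plus_1 = hashed_asset_name("myscript", ".js", b"console.log('v1');")
changed = hashed_asset_name("myscript", ".js", b"console.log('v2');")
```

Here `release_n` and `release_n_plus_1` are identical, while `changed` differs, which is exactly the property the HTTP cache benefits from.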

The proposed path/scoping rules mentioned in the explainer do not support this type of versioning. I think this is bad:

  1. If only myscript.js/{version}-esque scoping rules are supported, then there's a mutually exclusive choice between cache-friendly hash-based versioning, and support for delta dictionaries. The tradeoff will be:
    • If you stick to hash-based versioning, you get "peak performance"[1] (load from disk) for cached assets and pay the full (compressed) price for assets that changed. This is the boat we're all in today because it's the only option.
    • If you move to number-based versioning, you'll always have to fetch from the network but it'll be minimal deltas.

Of course, there is no clear-cut "one is better than the other". Factors such as code-splitting granularity, deployment cadence, and user demographic/perf distribution come to mind.

  2. On a meta-level (i.e. from y'all browser folks' point of view), it's possible that the aggregate performance impact will be net negative. Being an opt-in feature, where devs choose delta dictionaries over hash-based versioning, doesn't guarantee a net benefit in the long run. (They might A/B test today, but circumstances change after half a year.)

I think Chromium might not be able to accurately measure the net impact even with an open A/B test origin trial, due to selection bias among those who opt in to the trial. For a non-trial, there are many other factors that could affect performance over time. Looking at improved CWV for a short "before/after" window might tell a lie in the long run, and we'd never know.

Solution thoughts

I'm only here to complain..

Adding a wildcard anywhere in the path is a problem for the proposed scoping/pathing rules. Is it ~better if it's only allowed for the slug (last segment)? Maybe the slug can be a prefix by definition? i.e. /myscript.js implicitly matches both /myscript.js.hash1 and /myscript.js.hash2.

[1] - https://simonhearne.com/2020/network-faster-than-cache/

pmeenan commented 1 year ago

The current plan is for the scope to be a prefix so that it will include any appended hashes or versions. The case where it gets complicated is where the version/build/hash is in the middle of the path.

For example, cdn.mysite.com/assets/myscript_HASH.js could set a scope of /assets/myscript, since the hash is at the same directory depth, and any hash or extension after that would be covered.
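
The prefix rule described in this comment can be sketched as a plain string-prefix check. This is only an illustration of the idea as stated in the thread, not the proposal's actual matching algorithm; the function name `in_dictionary_scope` and the example hash values are hypothetical.

```python
def in_dictionary_scope(scope_prefix: str, request_path: str) -> bool:
    """Return True if the request path falls under the dictionary's scope.

    Assumption: scope matching is a simple prefix match on the URL path,
    as described in the comment above.
    """
    return request_path.startswith(scope_prefix)


# Hash appended within the last path segment: covered by the prefix.
appended_a = in_dictionary_scope("/assets/myscript", "/assets/myscript_abc123.js")
appended_b = in_dictionary_scope("/myscript.js", "/myscript.js.hash1")

# Hash in the middle of the path: the same prefix no longer matches,
# which is the complicated case mentioned above.
mid_path = in_dictionary_scope("/assets/myscript", "/assets/abc123/myscript.js")

# A broader prefix would match, but it also covers unrelated assets.
broad = in_dictionary_scope("/assets/", "/assets/abc123/otherscript.js")
```

With a prefix scope, both appended-hash layouts stay in scope across releases, while a mid-path hash forces a choice between no match and an overly broad scope.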

YoniFeng commented 1 year ago

Great. I think it's worth mentioning in the explainer, or adding as an example. (I can draft something up if you'd like.)