WICG / compression-dictionary-transport


Provide mechanism for A/B testing #54

Open pmeenan opened 4 months ago

pmeenan commented 4 months ago

One of the things that came up during Chrome's origin trial is that A/B testing the effectiveness of compression dictionaries is difficult (and will become more difficult when it is no longer an origin trial).

There are 2 points in the serving flow where dictionary decisions need to be made (sketched after the list):

  1. On the original request when the use-as-dictionary response header is sent to mark a response as an available dictionary.
  2. On a subsequent request when the client advertises available-dictionary and the server decides if it is going to serve a dictionary-compressed response.
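For reference, a rough sketch of those two decision points, assuming the draft's `Use-As-Dictionary` / `Available-Dictionary` header names and the `dcb` dictionary-compressed Brotli coding; the URLs and match pattern are purely illustrative:

```python
# Decision point 1: the original response opts itself in as a dictionary
# for future requests that match the given pattern.
def initial_response_headers() -> dict:
    return {
        "Content-Type": "application/javascript",
        # match pattern is illustrative
        "Use-As-Dictionary": 'match="/static/app*.js"',
    }

# Decision point 2: a later request advertises a stored dictionary and the
# server decides whether to serve a dictionary-compressed response.
def subsequent_response_headers(request_headers: dict) -> dict:
    if "Available-Dictionary" in request_headers:
        # Serve a delta against the advertised dictionary
        # ("dcb" = dictionary-compressed Brotli in the current draft).
        return {"Content-Encoding": "dcb"}
    # Otherwise fall back to an ordinary content encoding.
    return {"Content-Encoding": "br"}
```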

In the case of the origin trial, there is a third gate: the setting of the origin trial token that enables the feature (without which the use-as-dictionary response header is ignored). Outside of the origin trial there is no page-level gate for enabling it, and in both cases, once enabled, there is no way to turn it off for individual users.

For the dynamic use case, where the server is running application logic anyway and the response is not coming from a cache, it is possible to use a cookie or some other mechanism to decide whether dictionaries should be used, both on the initial request and on subsequent requests, where the available-dictionary request header can simply be ignored.
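A minimal sketch of what that cookie-based gating could look like; the cookie name, bucket value, and paths are illustrative, not anything defined by the spec:

```python
# Hypothetical experiment bucket stored in a cookie by the application.
def in_dictionary_experiment(cookies: dict) -> bool:
    return cookies.get("dict_ab") == "treatment"

def handle_request(path: str, request_headers: dict, cookies: dict) -> dict:
    response_headers = {}
    if in_dictionary_experiment(cookies):
        if "Available-Dictionary" in request_headers:
            # Treatment group: honor the advertised dictionary.
            response_headers["Content-Encoding"] = "dcb"
        elif path == "/static/app.v1.js":
            # Treatment group: register this response as a dictionary.
            response_headers["Use-As-Dictionary"] = 'match="/static/app*.js"'
    # Control group: emit neither header and ignore Available-Dictionary,
    # so responses are served with ordinary content encodings.
    return response_headers
```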

In the static file use case, where resources are served from an edge cache and the cache keys the resources by URL, accept-encoding and available-dictionary, there is no granular way to control user populations. All clients for a resource will get the use-as-dictionary response header, and all clients that advertise a given dictionary will get the dictionary-compressed response. The page does have SOME level of control, but that would require using different URLs for the resources served to the different populations.
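A sketch of why the edge-cache case offers no per-user lever, assuming the cache key described above (URL plus the varied request headers); the key layout and example paths are illustrative:

```python
# Every client requesting the same URL with the same Accept-Encoding and
# Available-Dictionary headers maps to the same cache entry, so there is no
# place to split users into experiment populations.
def edge_cache_key(url: str, request_headers: dict) -> tuple:
    return (
        url,
        request_headers.get("Accept-Encoding", ""),
        request_headers.get("Available-Dictionary", ""),
    )

# The only remaining lever is for the page to embed different URLs for
# different populations (hypothetical paths):
#   treatment bucket -> <script src="/static/exp-a/app.js">
#   control bucket   -> <script src="/static/exp-b/app.js">
# which splits the cache and requires controlling the embedding page.
```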

Counter-points

While it would be useful for sites to have granular control over the feature so they can measure its effectiveness during roll-out, that level of control is not usually exposed for transport-level features.

  1. Other content encodings, including Brotli and Zstandard, had the same restrictions as they were rolled out.
  2. As mentioned above, it is difficult but not impossible to test by using different URLs for different populations (though this is more difficult if you don't control the page where the URLs are embedded).
  3. Allowing for a global enable/disable capability would potentially expose 1 bit of fingerprinting data across privacy boundaries.
  4. This is only about A/B testing; at a global level there are already controls that allow the feature to be disabled in case of a catastrophic problem (either browser flags that the browser manufacturer can use to disable it, or the server ignoring the available-dictionary request headers).