WICG / compression-dictionary-transport

Other
92 stars 8 forks source link

sec-fetch-dest for dictionary fetch #15

Closed horo-t closed 1 year ago

horo-t commented 1 year ago

In the current explainer, when the browser detects a link element <link rel=bikeshed-dictionary as=document href="/product/dictionary_v1.dat">, it fetches the dictionary with sec-fetch-dest: document header.

However, when the server receives the request, it may be confused whether this is an actual document request for navigation or a dictionary fetch request. Therefore, I want to recommend to introduce an appropriate sec-fetch-dest value to indicate that the request is for a dictionary fetch.

Two possible ideas are:

  1. sec-fetch-dest: dictionary and sec-fetch-dict-dest: document
  2. sec-fetch-dest: dictionary-for-document

In Chromium implementation, the "document" destination type is used to detect the main resource request. Therefore, introducing a new destination type is also convenient for Chromium developers.

yoavweiss commented 1 year ago

I believe the point of defining an as attribute is to be able to declare ahead of time which type of resource the dictionary would be applied to. That enables us to restrict the dictionary's application to the intended destination.

pmeenan commented 1 year ago

The goal is to make it fit seamlessly into the normal flow of "existing resources of the same destination can be used as dictionaries for future requests for the same destination" and just provide a way to seed resources to specific destinations for use in that case.

It should be functionally equivalent to fetch("/product/dictionary_v1.dat", {as:"document"}) or <link rel=preload as=document href=/product/dictionary_v1.dat> and just serve as a way to trigger a fetch that will be discarded and only be used to populate the cache.

Technically the dictionary contents will end up as part of future documents as well.

For Chromium developers, how are fetch() and preload() links for as=document treated? I'm guessing there may be some custom logic and rules that is different from other resource types but presumably they are also not treated as "the main document".

horo-t commented 1 year ago

As far as I know, the Fetch API doesn't support setting the destination of the Request object. Chromium is always setting empty destination in FetchManager::Fetch(). Also unfortunately Chromium doesn't support as=document for <link rel=preload>.

Chromium has several logical checks to identify if the destination is document or not, which is used to determine if the request is for a main resource. Using the document destination for the dictionary fetch could result in unintended bugs. Therefore, even if we do not send sec-fetch-dest: dictionary, it would be beneficial to introduce the dictionary destination internally in Chromium.

I believe the same issue could occur on the server side. For example, script's dictionary could potentially contain an executable script unexpectedly. In such cases, the server should not serve the dictionary if the request is actually coming from the script element. However, using sec-fetch-dest: script for dictionary fetch makes it impossible to detect this scenario.

yoavweiss commented 1 year ago

Chromium has several logical checks to identify if the destination is document or not, which is used to determine if the request is for a main resource. Using the document destination for the dictionary fetch could result in unintended bugs.

That's a good point.

I believe the same issue could occur on the server side. For example, script's dictionary could potentially contain an executable script unexpectedly. In such cases, the server should not serve the dictionary if the request is actually coming from the script element. However, using sec-fetch-dest: script for dictionary fetch makes it impossible to detect this scenario.

I'm not sure I understand that threat. Can you elaborate on it?

Regarding the destination match requirement, it's something I thought could limit the potential for wrongfully applying dictionaries, but I don't necessarily have a well-thought-out threat model that it protects against. Maybe we can remove this requirement, or potentially have dictionary destinations be applicable to all other destinations, if we want to keep it.

horo-t commented 1 year ago

I'm not sure I understand that threat. Can you elaborate on it?

For example, if the source scripts of the dictionary contain a while loop, the generated dictionary could be while(true) {}. If an attacker manages to inject <script src='dictionary.dat'>, the server will receive the script resource request and return the dictionary. The page will become inoperable due to the while loop. Using Content-Security-Policy: script-src 'self' cannot prevent this type of attack. Setting correct content-type can prevent this. So the concern about this vulnerability may be excessive.

Regarding the destination match requirement, it's something I thought could limit the potential for wrongfully applying dictionaries, but I don't necessarily have a well-thought-out threat model that it protects against. Maybe we can remove this requirement, or potentially have dictionary destinations be applicable to all other destinations, if we want to keep it.

Yes, I agree with you. I don't have any idea how the destination matching can prevent any potential threat models. +1 for removing the destination matching.

pmeenan commented 1 year ago

I'm on board with removing the requirement for browser-enforced destination matching.

It might still be useful to keep for the filtering/matching of dictionaries in case someone wants to have aa JS-specific and CSS-specific external dictionary for modules served from a given path.

For the static resource case, the implicit destination matching still makes sense, at least as an implicit default.

For the dynamic case (link fetch) it could default to * but allow for a specific destination in the use-as-dictionary options.

horo-t commented 1 year ago

It might still be useful to keep for the filtering/matching of dictionaries in case someone wants to have aa JS-specific and CSS-specific external dictionary for modules served from a given path.

I think that supporting multiple dictionaries may not be necessary. Instead, the site administrator can serve a single dictionary that combines both the JS-specific and CSS-specific dictionaries by concatenating them.

Also , there are various types of destinations for script files ("audioworklet", "paintworklet", "script", "serviceworker", "sharedworker", "worker") and for HTML files ("document", "frame", "iframe"). Managing multiple dictionaries for each destination can become a complicated task for site administrators.

I like the concept of having one dictionary for one path, because it is straightforward and uncomplicated.

pmeenan commented 1 year ago

SGTM.

Simpler is better than unnecessarily complicating things for a hypothetical edge case.

On Tue, Mar 14, 2023 at 7:13 PM Tsuyoshi Horo @.***> wrote:

It might still be useful to keep for the filtering/matching of dictionaries in case someone wants to have aa JS-specific and CSS-specific external dictionary for modules served from a given path.

I think that supporting multiple dictionaries may not be necessary. Instead, the site administrator can serve a single dictionary that combines both the JS-specific and CSS-specific dictionaries by concatenating them.

Also , there are various types of destinations for script files ("audioworklet", "paintworklet", "script", "serviceworker", "sharedworker", "worker") and for HTML files ("document", "frame", "iframe"). Managing multiple dictionaries for each destination can become a complicated task for site administrators.

I like the concept of having one dictionary for one path, because it is straightforward and uncomplicated.

— Reply to this email directly, view it on GitHub https://github.com/yoavweiss/compression-dictionary-transport/issues/15#issuecomment-1468991464, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADMOBOPEQ33DENIWU5VM2LW4D3SVANCNFSM6AAAAAAVYXED6A . You are receiving this because you commented.Message ID: @.*** com>