WICG / compression-dictionary-transport

Exposing storage usage for dictionaries #35

Open horo-t opened 1 year ago

horo-t commented 1 year ago

I'm wondering whether we should expose the storage usage for the dictionaries.

Currently, the Storage API provides a way to get the storage usage.

For example, in Chromium,

JSON.stringify(await navigator.storage.estimate(), null, 2);

returns

{
  "quota": 296630877388,
  "usage": 75823910,
  "usageDetails": {
    "caches": 72813056,
    "indexedDB": 2877379,
    "serviceWorkerRegistrations": 133475
  }
}

Note: usageDetails has shipped in Chromium, but it is still under spec discussion.

I have two questions:

  1. Is it OK for dictionaries to count toward usage?
  2. Is it OK to introduce a dictionaries entry in usageDetails?

All dictionary resources should be readable from the page, so I don't think there is any risk in exposing them. But I'd love to hear other opinions.
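To make the two questions concrete, here is a minimal sketch of how a page might read such a breakdown. `summarizeUsage` is a hypothetical helper, and the `dictionaries` key and its 1 MiB size are assumptions for illustration only; no browser is known to report that key.

```javascript
// Hypothetical helper: summarizes an object shaped like the result of
// navigator.storage.estimate(). Not part of any standard API.
function summarizeUsage(estimate) {
  // usageDetails is non-standard, so tolerate its absence.
  const details = estimate.usageDetails ?? {};
  // Largest consumers first.
  const breakdown = Object.entries(details).sort(([, a], [, b]) => b - a);
  const attributed = breakdown.reduce((sum, [, bytes]) => sum + bytes, 0);
  return {
    usage: estimate.usage,
    quota: estimate.quota,
    percentUsed: estimate.quota > 0 ? (estimate.usage / estimate.quota) * 100 : 0,
    breakdown,
    // Bytes counted in usage but not listed in usageDetails.
    unattributed: estimate.usage - attributed,
  };
}

// The estimate quoted above.
const sample = {
  quota: 296630877388,
  usage: 75823910,
  usageDetails: {
    caches: 72813056,
    indexedDB: 2877379,
    serviceWorkerRegistrations: 133475,
  },
};

// ASSUMPTION: what the estimate could look like if both questions were
// answered "yes" and the origin held 1 MiB of dictionaries.
const dictionaryBytes = 1024 * 1024;
const withDictionaries = {
  ...sample,
  usage: sample.usage + dictionaryBytes,
  usageDetails: { ...sample.usageDetails, dictionaries: dictionaryBytes },
};

console.log(summarizeUsage(sample));
console.log(summarizeUsage(withDictionaries));
```

In a page, the same helper could be fed `await navigator.storage.estimate()` directly. If only question 1 were answered "yes", the dictionary bytes would show up in `unattributed` rather than in the breakdown.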

yoavweiss commented 1 year ago

"All dictionary resources should be readable from the page, so I don't think there is any risk of exposing them"

Agree.

I'd love to hear opinions from folks more familiar with the Storage API than myself though. ^^ @annevk @domenic @wanderview @miketaylr

annevk commented 1 year ago

Storage is keyed (top-level-site, origin), but this is not, right? Or maybe that's not defined in sufficient detail, but the partitioning text at least suggests it has a wider reach. (And especially if it's tied to HTTP caches, which have an even wider reach in non-Chromium browsers at the moment, though maybe that should change. At least, I think Chromium shipped that.)

yoavweiss commented 1 year ago

The explainer does mention that dictionaries will be triple keyed.

annevk commented 1 year ago

I know, but it's keyed by embedder site, not embedder origin. (And in most implementations the HTTP cache is not keyed on embedder anything at the moment.)

yoavweiss commented 1 year ago

I believe the intention here is to key this in the same way that the browser's cache is keyed, so "nested context site" should be replaced by "nested context origin" in the privacy section. @pmeenan, can you confirm?

I agree that HTTP caches outside the browser may differ, but I'm not sure how they relate to the Storage API.

annevk commented 1 year ago

"Cache" normally refers to the HTTP cache, so that's quite a confusing set of sentences. If you mean that the dictionary is meant to be stored like other "storage", and thus would be subject to the "storage key", I think I understand, and then this would make sense. It would have to be defined as a storage endpoint in that case; see the Storage Standard.

horo-t commented 1 year ago

Sorry for the confusion. Chromium uses usageDetails.caches to report the usage of the Cache Storage API.

I've already updated the Chromium implementation to use nested context origin. I think that dictionaries should be stored in the same way as other "storage". In the Chromium implementation, the dictionaries should be evicted using storage::QuotaManagerImpl's LRU-ordered eviction logic.
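For readers unfamiliar with the term, LRU-ordered eviction can be sketched generically as below. This is not Chromium's storage::QuotaManagerImpl code; the record shape (`key`, `bytes`, `lastUsed`) and the function name are assumptions made up for illustration.

```javascript
// Generic LRU-ordered eviction sketch. Hypothetical record shape:
// { key: string, bytes: number, lastUsed: number (timestamp) }.
function evictLeastRecentlyUsed(records, maxBytes) {
  // Copy and sort so the least recently used record comes first.
  const oldestFirst = [...records].sort((a, b) => a.lastUsed - b.lastUsed);
  let totalBytes = oldestFirst.reduce((sum, r) => sum + r.bytes, 0);
  const evicted = [];
  // Drop the oldest records until the total fits within the budget.
  while (totalBytes > maxBytes && oldestFirst.length > 0) {
    const victim = oldestFirst.shift();
    totalBytes -= victim.bytes;
    evicted.push(victim);
  }
  return { kept: oldestFirst, evicted, totalBytes };
}

// Made-up dictionary records for illustration.
const result = evictLeastRecentlyUsed(
  [
    { key: "dict-a", bytes: 100, lastUsed: 1 },
    { key: "dict-b", bytes: 200, lastUsed: 3 },
    { key: "dict-c", bytes: 50, lastUsed: 2 },
  ],
  250
);
console.log(result.evicted.map((r) => r.key), result.totalBytes);
```

A real quota manager would evict across all storage types for an origin or bucket rather than per dictionary, so this only illustrates the ordering.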

pmeenan commented 1 year ago

As far as I know, the storage API manages application-controlled storage (e.g. the cache API) which is independent of the browser-managed HTTP cache. It has quotas, guarantees and is directly managed by the apps.

The dictionary data is effectively metadata on top of the browser-managed HTTP cache and shouldn't be counted towards the app quotas or be directly managed by the app. This holds regardless of whether the browser ends up implementing it as duplicate items to keep it simple.

I don't think the HTTP cache is exposed through the storage API. Assuming it is not, the dictionary storage should also not be exposed (and should not count towards the quotas). Looks like I need to add some more details in the browser-specific section to make it clearer that they are HTTP cache and not application storage.

annevk commented 1 year ago

FWIW, I think if there is some way toward consolidation that would be nice, since the HTTP cache and storage should be cleared at the same time and such. But there's a somewhat complex set of questions around the keying and the evolution of that keying. Probably best tackled orthogonally.

horo-t commented 1 year ago

OK, I agree that if we don't provide web apps with an API for managing dictionaries, we don't need to expose their usage to web apps.