whatwg / storage

Storage Standard
https://storage.spec.whatwg.org/

Rethinking storage proxy map #96

Open annevk opened 4 years ago

annevk commented 4 years ago

One thing I noticed while working on https://github.com/whatwg/html/pull/5560 is that we don't have a nicely formalized way to deal with bottle/proxy map operations failing. I think in principle all of them can fail, for a variety of reasons.

asutherland commented 4 years ago

The LocalStorage case being dealt with in https://github.com/whatwg/html/pull/5560 isn't synchronously dealing with the authoritative map; it's dealing with a replicated copy of the map, but that's largely hand-waved away via the "multiprocess" disclaimer. Perhaps the hand-waving should be reduced, which would help clear up the error handling[1]?

I think the inescapable implementation reality is that there are always going to be at least 3 event loops involved for any storage endpoint and it could be worth specifying this:

  1. The event loop hosting the authoritative storage bottle map for the endpoint for the given bucket. (Which may be different than the event loop for buckets on the same shelf or on different shelves, etc.)
  2. One or more event loops processing I/O operations for the storage bottle map. (Put another way: for performance reasons, implementations will not and cannot be required to block storage API decisions on disk I/O.)
  3. The event loop for the agent where the API calls are happening.
  4. (There might also be separate event loops for the authoritative storage bucket map and higher levels, but those don't matter for bottle map errors unless they are fatal.)

Although there will always be policy checks that can happen in the agent event loop that are synchronous, the reality is that most unexpected failures will happen in the I/O event loops and these will then want to notify the authoritative storage bottle map.
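To make that shape concrete, here's a minimal sketch of the three loops as plain task queues, with an I/O failure being reported back to the authoritative bottle map rather than to the agent. `EventLoop`, `postTask`, and the helper functions are illustrative inventions, not spec or implementation terms:

```ts
type Task = () => void;

class EventLoop {
  private queue: Task[] = [];
  postTask(task: Task) { this.queue.push(task); }
  // Drain whatever is queued; a real loop runs continuously.
  run() { while (this.queue.length) this.queue.shift()!(); }
}

const agentLoop = new EventLoop();     // (3) where the API calls happen
const bottleMapLoop = new EventLoop(); // (1) authoritative storage bottle map
const ioLoop = new EventLoop();        // (2) disk I/O for the bottle map

// A localStorage-style write crossing all three loops.
function setItem(key: string, value: string) {
  agentLoop.postTask(() => {
    // Synchronous agent-local policy checks (e.g. quota) would happen here.
    bottleMapLoop.postTask(() => {
      // Update the authoritative map, then persist without blocking on disk.
      ioLoop.postTask(() => {
        const ok = pretendWriteToDisk(key, value);
        // Unexpected I/O failures notify the authoritative bottle map,
        // not the agent directly.
        if (!ok) bottleMapLoop.postTask(() => handleIoFailure(key));
      });
    });
  });
}

function pretendWriteToDisk(key: string, value: string): boolean { return true; }
function handleIoFailure(key: string) { /* decide: does this break the bottle? */ }
```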

Especially given the interest in the Storage Corruption Reporting use case (see the explainer issue in this repo), this async processing would make sense, as any corruption handlers would want to be involved in the middle of the process.

One might create the following mechanisms:

For all storage endpoints, the question whenever an error occurs on the I/O loop, or when ingesting data provided by the I/O loop, is: does this break the bottle? For the "indexedDB", "caches", and "serviceWorkerRegistrations" endpoints there are already in-band API means of relaying I/O failures (fire an UnknownError or more specific error, reject the promise, reject the promise, respectively), so there's no need to break the bottle. For "localStorage" and "sessionStorage" there's no good in-band way to signal the problem, but any transient inability to persist changes to disk can be mitigated by buffering; when the transient inability becomes permanent, the bottle can be said to be broken.
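A rough sketch of that decision, assuming hypothetical `relayInBand`/`breakBottle` helpers (the endpoint names are the spec's; everything else here is invented):

```ts
type Endpoint =
  | "indexedDB"
  | "caches"
  | "serviceWorkerRegistrations"
  | "localStorage"
  | "sessionStorage";

const bufferedWrites: Array<() => boolean> = []; // pending localStorage changes

function onIoError(endpoint: Endpoint, error: Error) {
  switch (endpoint) {
    case "indexedDB":
      relayInBand("fire an UnknownError or more specific error", error);
      break;
    case "caches":
    case "serviceWorkerRegistrations":
      relayInBand("reject the promise", error);
      break;
    case "localStorage":
    case "sessionStorage": {
      // No in-band channel: keep buffering while the failure looks transient,
      // and only break the bottle once the inability to persist is permanent.
      if (!retryBufferedWrites()) breakBottle(endpoint);
      break;
    }
  }
}

function relayInBand(how: string, error: Error) { /* surface via the API */ }
function retryBufferedWrites(): boolean { return bufferedWrites.every(w => w()); }
function breakBottle(endpoint: Endpoint) { /* subsequent operations fail */ }
```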

1: From a spec perspective (ignoring optimizations), Firefox's LocalStorage NextGen (LSNG) overhaul can be said to synchronously queue a task to make a snapshot of the authoritative bottle map on the authoritative bottle map's event loop the first time the LocalStorage API is used in a given task on the agent event loop. The snapshot is retained until the task and its micro-task checkpoint complete, at which point any changes made are sent to the authoritative bottle map in a task where they are applied. This maintains run-to-completion consistency (but does not provide magical global consistency). There are other possible implementations, like "snapshot at first use and broadcast changes", which could also be posed in terms of the event loops/task sources.
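The scheme in the footnote could be modeled roughly like this; all names are made up, and `queueMicrotask` only approximates "after the task and its micro-task checkpoint complete":

```ts
let snapshot: Map<string, string> | null = null;
const localChanges = new Map<string, string>();

function getItem(key: string): string | null {
  ensureSnapshot();
  // Reads in this task see one coherent snapshot plus this task's own writes.
  return localChanges.get(key) ?? snapshot!.get(key) ?? null;
}

function setItem(key: string, value: string) {
  ensureSnapshot();
  localChanges.set(key, value);
}

function ensureSnapshot() {
  if (snapshot) return;
  // First LocalStorage use in this task: synchronously obtain a snapshot from
  // the authoritative bottle map's event loop (hand-waving the blocking IPC).
  snapshot = syncSnapshotFromAuthoritativeLoop();
  // Once the task is done, ship the accumulated changes back to be applied.
  queueMicrotask(() => {
    postChangesToAuthoritativeLoop(new Map(localChanges));
    snapshot = null;
    localChanges.clear();
  });
}

// Stubs standing in for cross-event-loop communication.
function syncSnapshotFromAuthoritativeLoop(): Map<string, string> { return new Map(); }
function postChangesToAuthoritativeLoop(changes: Map<string, string>): void {}
```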

annevk commented 4 years ago

There's also "does this fit in the bottle?" I suppose, which does happen to fail synchronously for localStorage and sessionStorage (though as specified only for a single method), but presumably based on a thread-local understanding of the status quo.

asutherland commented 4 years ago

Yeah, I was lumping the LocalStorage/SessionStorage quota checks into agent-local policy decisions, along with structured serialization refusing to serialize things (for other storage endpoints). For LocalStorage/SessionStorage the quota check needs to happen synchronously (and structured serialization is not involved for them).

Impl-specific notes: For Firefox's LSNG, the agent can be said to hold a quota pre-authorization, like the holds used for credit/debit cards. If a call needs more space than was pre-authorized, a task is synchronously dispatched from the agent event loop to the authoritative bottle map's event loop in order to secure the additional quota.
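A toy model of that pre-authorization idea, assuming invented accounting and helper names (not LSNG internals):

```ts
let preauthorizedBytes = 0; // space already granted to this agent: the "hold"

function ensureQuota(bytesNeeded: number): boolean {
  if (bytesNeeded <= preauthorizedBytes) return true; // covered by the hold
  // Not covered: synchronously dispatch to the authoritative bottle map's
  // event loop to raise the pre-authorization, like raising a card hold.
  preauthorizedBytes += syncRequestMoreQuota(bytesNeeded - preauthorizedBytes);
  return bytesNeeded <= preauthorizedBytes;
}

function setItem(key: string, value: string) {
  const cost = key.length + value.length; // simplistic size accounting
  if (!ensureQuota(cost)) {
    throw new DOMException("The quota has been exceeded.", "QuotaExceededError");
  }
  preauthorizedBytes -= cost; // consume part of the hold
  // ... proceed with the write ...
}

// Stub: pretend the authoritative loop grants every request in full.
function syncRequestMoreQuota(bytes: number): number { return bytes; }
```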