w3c / IndexedDB

Indexed Database API
https://w3c.github.io/IndexedDB/

Define Indexed DB as a storage endpoint, use hooks #334

Open inexorabletash opened 4 years ago

inexorabletash commented 4 years ago

WORK IN PROGRESS - not ready to merge

For https://github.com/whatwg/storage/issues/90

Bikeshed hasn't picked up the new terms from Storage yet.

There should be no behavior changes here.



annevk commented 4 years ago

Looking at this made me notice the "connection queue", which should probably use the storage key as well rather than the origin? Is this a primitive that should move to the Storage Standard?

Also, part of the idea was that you'd no longer need "If origin is an opaque origin" as the Storage Standard would take care of that (and return failure as appropriate).

inexorabletash commented 4 years ago

> Looking at this made me notice the "connection queue", which should probably use the storage key as well rather than the origin? Is this a primitive that should move to the Storage Standard?

Agreed it should be decoupled from origin (that's a good mental filter to use in general, thanks). Is it a generic enough concept to move to Storage? Other thoughts are to monkey-patch it onto the storage bottle, or make it part of the bottle's contents (e.g. the bottle could have a "connection queue" key and a "databases" key, where the latter's value is a map of the actual databases).

> Also, part of the idea was that you'd no longer need "If origin is an opaque origin" as the Storage Standard would take care of that (and return failure as appropriate).

Agreed, I'll roll that in. Since that error is synchronous in IDB (well, in 2/3 cases), I'll have to rework the algorithms a bit but I think it's fine.

inexorabletash commented 4 years ago

Hmmm, actually there is a connection queue per name. So the bottle's map values can be a pair of (queue, database)....? Still thinking this through.
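
The per-name pairing floated above can be pictured with a minimal TypeScript sketch. The names `BottleEntry`, `Bottle`, `entryFor`, and `enqueue` are hypothetical and come from neither the IndexedDB nor the Storage spec, and the database is reduced to a name and a version purely for illustration.

```typescript
// Hypothetical model of a storage bottle's map, keyed by database name.
// Each value pairs a connection queue with an optional database.
type OpenOrDeleteRequest = { kind: "open" | "delete"; name: string };

interface Database {
  name: string;
  version: number;
}

interface BottleEntry {
  queue: OpenOrDeleteRequest[]; // pending open/delete requests, in arrival order
  database: Database | null;    // null until the database has been created
}

type Bottle = Map<string, BottleEntry>;

// Return the entry for `name`, creating an empty one on first use.
function entryFor(bottle: Bottle, name: string): BottleEntry {
  let entry = bottle.get(name);
  if (entry === undefined) {
    entry = { queue: [], database: null };
    bottle.set(name, entry);
  }
  return entry;
}

// Append a request to the queue that belongs to its database name.
function enqueue(bottle: Bottle, request: OpenOrDeleteRequest): void {
  entryFor(bottle, request.name).queue.push(request);
}
```

With this shape, requests for different names never share a queue, and a queue can exist before its database does.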

inexorabletash commented 4 years ago

And also note that we have no idea what should happen to pending open/delete requests if storage was swapped out. Are they associated with the previous storage? (so the queue is part of storage itself) Or do they apply to whatever the current storage is when they run? Brain hurts...

asutherland commented 4 years ago

> And also note that we have no idea what should happen to pending open/delete requests if storage was swapped out. Are they associated with the previous storage? (so the queue is part of storage itself) Or do they apply to whatever the current storage is when they run? Brain hurts...

I think it makes sense for them to be associated with the previous storage.

Step 2 of the proposed replace algorithm at https://github.com/whatwg/storage/issues/18#issuecomment-614751336 is a task that runs on the given agent. It makes sense that the execution of this task would constitute the start of the new storage epoch, if you will. Requests made in the before times would be irrelevant.
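
As a rough illustration of the "epoch" idea, here is a minimal TypeScript sketch in which every request remembers the epoch it was made in. `ModelAgent`, `replaceStorage`, and `runPending` are hypothetical names, not spec terms. The sketch simply drops requests from earlier epochs; whether they should instead fail with an error is not settled in this thread.

```typescript
// Hypothetical model: a request is tied to the storage epoch it was made in.
interface PendingRequest {
  name: string;
  epoch: number;
}

class ModelAgent {
  private epoch = 0;
  private pending: PendingRequest[] = [];

  // Record a request against the storage that is current right now.
  request(name: string): PendingRequest {
    const r: PendingRequest = { name, epoch: this.epoch };
    this.pending.push(r);
    return r;
  }

  // Stands in for the task that marks the start of a new storage epoch.
  replaceStorage(): void {
    this.epoch += 1;
  }

  // Drain the pending list and return the names of the requests that
  // belong to the current epoch; older requests are treated as irrelevant.
  runPending(): string[] {
    const current = this.pending.filter((r) => r.epoch === this.epoch);
    this.pending = [];
    return current.map((r) => r.name);
  }
}
```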

inexorabletash commented 4 years ago

Cool. A few thoughts on how to structure the bottle map...

Pros and cons for each. Preferences? Other ideas?

asutherland commented 4 years ago

> * _name_ → (_queue_, _database_)

I like this one because:

Blobs / Files

A related question is how IndexedDB-minted Blobs and Files will handle the replace operation and whether this impacts the map. Gecko definitely invalidates IndexedDB-minted Blobs and Files when Clear-Site-Data and privacy data-clearing operations occur. From other discussions in the past I have the impression this is also the case in Blink.

The File API spec doesn't really get into this in its section on deserialization or in the get stream algorithm. Both imply a simplified model where no effort is made to de-duplicate Blob contents or store them to disk, but they do leave implementations broad latitude to throw errors when get stream is invoked, to compensate for the underlying realities.

It seems like we might want to formalize the realities of Blobs/Files now, since Clear-Site-Data turns what was previously a user-initiated edge case into something content can trigger explicitly. Multiple storage buckets would presumably also want to be able to dispose of the underlying blobs and their quota usage in a deterministic fashion.

Doing this might involve a hook where get stream could end up needing to involve some part of the storage hierarchy, in which case it's possible the map might need to store additional data to support this.

mkruisselbrink commented 4 years ago

> Blobs / Files

FWIW, I'm working on clarifying this part of the File API spec. My current thinking is to let others (i.e. IndexedDB and other APIs that produce blobs) define a get stream hook, and then to define all the operations on blobs in terms of that. Unfortunately I haven't had as much time to work on that as I would have liked, but it is among the higher-priority spec items I'm working on, also to better define how things work for the Native File System API.

So yes, in that model it would be totally up to IndexedDB to define when/how these blobs get invalidated.
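
A minimal TypeScript sketch of that hook model, under stated assumptions: `ModelBlob`, `ModelIdbStorage`, `mintBlob`, and `clear` are hypothetical names, the "stream" is reduced to a byte array, and a generic `Error` stands in for whatever exception the specs end up requiring.

```typescript
// Hypothetical model: whoever creates a blob supplies its "get stream" hook,
// and every read of the blob goes through that hook.
type GetStreamHook = () => Uint8Array;

class ModelBlob {
  constructor(private readonly getStreamHook: GetStreamHook) {}

  // All blob operations are defined in terms of the hook.
  bytes(): Uint8Array {
    return this.getStreamHook();
  }
}

class ModelIdbStorage {
  private cleared = false;

  // Mint a blob whose hook checks that the backing storage still exists.
  mintBlob(data: Uint8Array): ModelBlob {
    return new ModelBlob(() => {
      if (this.cleared) {
        throw new Error("backing storage was cleared");
      }
      return data;
    });
  }

  // Stands in for Clear-Site-Data or any other data-clearing operation.
  clear(): void {
    this.cleared = true;
  }
}
```

In this shape the blob itself knows nothing about storage; invalidation is decided entirely by the hook that the minting API installed.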

inexorabletash commented 2 years ago

Partial update. I needed a name for the (queue,database) struct that exists in the map. I literally called it pumpkin here as a placeholder name because I wasn't feeling inspired. So bikeshed away!

This drops the need for most of the imports from Storage, although these are retained:

Most of the remaining references to "origin" end up being fairly illustrative rather than normative definitions. We could probably scrub most of them, e.g. "if the origin’s storage is cleared" → "if the storage is cleared".

inexorabletash commented 1 year ago

See also: https://github.com/whatwg/storage/issues/153