w3c / IndexedDB

Indexed Database API
https://w3c.github.io/IndexedDB/

Projection Support #366

Open stefanhaustein opened 2 years ago

stefanhaustein commented 2 years ago

We have some tables that contain columns that can be relatively large, but these large columns aren't actually needed in many of our queries. Of course, we could split the tables, but this isn't desirable due to the additional management overhead and the deviation from the expected / "natural" schema.

API Proposal

request = store.getAllProjection(keyPath, query [, count])
request = index.getAllProjection(keyPath, query [, count])

Retrieves only the data selected by keyPath (a key path as defined in section 2.5 of the IndexedDB spec) for the records matching the given key or key range in query (up to count if given).
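To make the intended result shape concrete, here is a rough in-memory sketch of what such a projection would return. It is not an IndexedDB implementation: evaluateKeyPath and projectRecords are hypothetical helper names, the sample records are made up, and the key path evaluation is a simplified version of the spec algorithm (own properties and "length" on strings and arrays only, with no Blob/File special cases). It also assumes a missing path yields undefined, which the proposal does not specify.

```javascript
// Simplified sketch of key path evaluation (hypothetical helper, not the
// full spec algorithm): handles "", dotted paths, and arrays of paths.
function evaluateKeyPath(value, keyPath) {
  if (Array.isArray(keyPath)) {
    // An array key path yields one result per entry.
    return keyPath.map((path) => evaluateKeyPath(value, path));
  }
  if (keyPath === "") return value; // the empty key path selects the value itself
  let current = value;
  for (const identifier of keyPath.split(".")) {
    const isLength =
      identifier === "length" &&
      (typeof current === "string" || Array.isArray(current));
    if (isLength) {
      current = current.length;
      continue;
    }
    if (
      current === null ||
      typeof current !== "object" ||
      !Object.prototype.hasOwnProperty.call(current, identifier)
    ) {
      return undefined; // assumption: a missing path projects to undefined
    }
    current = current[identifier];
  }
  return current;
}

// Hypothetical stand-in for getAllProjection over records that already
// matched the query: apply the key path to at most `count` records.
function projectRecords(records, keyPath, count) {
  const limited = count === undefined ? records : records.slice(0, count);
  return limited.map((record) => evaluateKeyPath(record, keyPath));
}

// Made-up sample data: "body" plays the role of the large column.
const records = [
  { id: 1, title: "a", meta: { size: 10 }, body: "large payload 1" },
  { id: 2, title: "b", meta: { size: 20 }, body: "large payload 2" },
];

const titles = projectRecords(records, "title");
const pairs = projectRecords(records, ["id", "meta.size"], 1);
console.log(titles, pairs);
```

In this sketch the large body field never appears in the result, which is the point of the proposal; the performance benefit would only materialize if an implementation could avoid loading or deserializing the full value in the first place.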

This should enable significant performance gains at least in the case where keyPath can be satisfied from an index.

As far as I understand, the only ways to obtain the index "content" today are to use cursors or to fetch the full rows.

asutherland commented 2 years ago

I conceptually like this as a user of IDB, but with my (Firefox) implementer hat on, it's not clear that this could be specified or implemented in a way that would provide meaningful performance increases in a post-Spectre world. At least not without requiring massive increases in implementation complexity for browsers.

Right now I think it's preferable that more advanced use-cases like this be implemented in content, possibly using WASM, on top of existing/future block-storage APIs. The smaller API and implementation surface is better suited for cross-browser support in potentially new engines, while also being more likely to have consistent performance characteristics across browsers.

Structured Serialization Representation Limitations

The IDB data model currently stores values as largely opaque structured serialization blobs. Key paths are evaluated in the IDB-using global at add/update time, before the operation goes async and the value is sent to whatever the underlying storage implementation is (frequently via IPC). The only case where IDB notionally evaluates key paths anywhere else is during index creation, where existing values need to have their key paths extracted. That fundamentally requires the implementation to either spin up something that looks like a content global to load every value and extract key paths, or reuse the active upgrade transaction's global/etc.