techfort / LokiJS

javascript embeddable / in-memory database
http:/techfort.github.io/LokiJS
MIT License
6.73k stars 482 forks source link

IncrementalIDBAdapter - Fix database corruption & Performance improvements #869

Closed radex closed 3 years ago

radex commented 3 years ago

IncrementalIDBAdapter has had a flaw. Saving only changed chunks (instead of dumping the whole in-memory database into IDB) can easily lead to database corruption if there's more than one writer (multiple tabs of the same web app).

I did consider this possibility when developing #808, but thinking about how this is meant to be used with an app that synchronizes (sync should make the in-memory state of multiple tabs exactly the same, avoiding inconsistency, and therefore corruption), I naively concluded that this won't be a problem. In hindsight, that's obviously wrong, and caused all sorts of issues for Nozbe Teams users. I was quite easily able to find multiple reproducible paths for corrupting the loki database. Most corruptions are subtle (e.g. binary index being slightly out of order) and don't manifest themselves so easily, but sometimes you can get a broken app unable to launch, as there is an order of events where two writers will save a record with the same id in multiple chunks (leading to a UniqueIndex throwing an error). #856 probably made this worse, as collection metadata save can be skipped.

The fix is quite straightforward and I could not come up with a way to break it:

       // To fix that, we save all
       // metadata chunks (loki + collections) with a unique ID on each save and remember it. Before
       // the subsequent save, we read loki from IDB to check if its version ID changed. If not, we're
       // guaranteed that persisted DB is consistent with our diff. Otherwise, we fall back to the slow
       // path and overwrite *all* database chunks with our version. Both reading and writing must
       // happen in the same IDB transaction for this to work.
       // TODO: We can optimize the slow path by fetching collection metadata chunks and comparing their
       // version IDs with those last seen by us. Since any change in collection data requires a metadata
       // chunk save, we're guaranteed that if the IDs match, we don't need to overwrite chukns of this collection

So the worst case scenario gets roughly the same performance as LokiIndexedAdapter, but most cases (one tab/no contention) have the same performance as before, though with slightly higher latency (we must fetch the loki chunk at each save, but it adds almost nothing to the time where the main thread is blocked, just a delay waiting for the concurrent IndexedDB process to give us an answer).

radex commented 3 years ago

@techfort This PR should be ready to review if you've got time :)

thomaschampagne commented 3 years ago

@radex Thanks for this work. @techfort Do you plan to perform a npm release with this improvement?

(I use this adapter in my project: https://champagnethomas.medium.com/elevate-desktop-app-2021-update-9155fd74bb85)

Thanks !

radex commented 3 years ago

@thomaschampagne Nice, cool to see that my LokiJS improvements aren't used only by WatermelonDB :) You can use @nozbe/lokijs NPM prereleases or make your own fork

thomaschampagne commented 3 years ago

Oh nice !!! I put that in my package.json right now !

techfort commented 3 years ago

1.5.12 published, thanks for the hard work guys

On Mon, 19 Apr 2021 at 21:24, Thomas Champagne @.***> wrote:

Oh nice !!! I put that in my package.json right now !

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/techfort/LokiJS/pull/869#issuecomment-822761013, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA7QK7OBYCWJXIVXWGLV4LDTJSGQNANCNFSM4UKJHV7Q .