automerge / automerge-classic

A JSON-like data structure (a CRDT) that can be modified concurrently by different users, and merged again automatically.
http://automerge.org/
MIT License
14.76k stars 467 forks source link

New, faster backend #436

Closed ept closed 3 years ago

ept commented 3 years ago

I got it working! Here are the new backend data structures that I've been working on for a long time. They more or less implement the approach I wrote up a few years ago.

The primary goal of this work is to make Automerge.load much faster and much more memory-efficient; secondary goal is to also make all other common operations (Automerge.change, Automerge.applyChanges, Automerge.save) significantly faster. The data formats and APIs remain unchanged. The dependency on immutable.js is gone.

Initial results are promising: on the automerge-perf benchmark this PR makes Automerge.load around 60x faster on my machine, and it uses 30x less memory than before. There are still ways of further optimising the implementation, but this should give us a good start. Moreover, I expect the new data structures will translate well into Rust (since they involve a lot of byte array manipulation), so we can expect further speedups from porting this work to the Rust/wasm backend.

The new implementation passes the existing Automerge test suite, and I've also added a bunch of new tests, which give us a reasonable degree of confidence that it's correct. However, it's a lot of new code (including some very subtle, fiddly code), so we should expect that there will be some new bugs. For that reason I'd appreciate if you, the community members, could kick the tires and check whether the new code works for you.

One thing you can do to check whether everything is okay with a document doc is to run the following from time:

Automerge.getAllChanges(Automerge.load(Automerge.save(doc)))

Think of that as a kind of "fsck" analogue. If it doesn't throw an exception, it means all of the internal consistency checks passed. If it does throw an exception, please try to construct an example that reproduces it and file a bug report. The getAllChanges call can be slow on a document with a large number of changes, but you shouldn't need to call it in normal operation — this is just for finding bugs while the code is new.

Thanks all for your patience while I was working on this, and let me know if this works for you.

ept commented 3 years ago

To make it easier to try the new backend, I've merged it into main and released version 1.0.1-preview.5 to npm. We can use separate PRs to address any issues that arise.