dmonad / crdt-benchmarks

A collection of CRDT benchmarks

Update benchmarks in light of recent perf improvements in automerge 2.x (automerge-wasm 0.9.0) #21

Closed by kid-icarus 7 months ago

kid-icarus commented 7 months ago

Is your feature request related to a problem? Please describe.

The current benchmarks in this repo use automerge-wasm 0.1.0, and there have been several releases since then along with several blog posts on performance improvements.

Since this benchmark is linked from the main Yjs repository, it'd be nice to update it so as not to set false expectations in light of the current state of Automerge.

dmonad commented 7 months ago

That's fair. I didn't upgrade Automerge for over a year.

I upgraded to @automerge/automerge@v2.1.10, which uses @automerge/automerge-wasm. The new results are now reflected in the readme.

I understand that benchmarks are not always exactly reproducible. But the time and parseTime results are way off from the numbers in the blog post.

Maybe performance has degraded considerably between 2.0.0 and 2.1.10. Is this something you could check, @pvh & @ept?

https://github.com/dmonad/crdt-benchmarks/compare/12b4e90db7fe9af5b4544225b62320c3edb5fffb..e657427b5e9cfc9f1748aaef2c4c6c7ad7a68582

**New results**

| N = 6000 | yjs | ywasm | automerge |
| :--- | ---: | ---: | ---: |
| Version | 13.6.11 | 0.9.3 | 2.1.10 |
| Bundle size | 80413 bytes | 799327 bytes | 1737571 bytes |
| Bundle size (gzipped) | 23571 bytes | 232727 bytes | 604118 bytes |
| [B4] Apply real-world editing dataset (time) | 1803 ms | 43943 ms | 174269 ms |
| [B4] Apply real-world editing dataset (encodeTime) | 12 ms | 4 ms | 425 ms |
| [B4] Apply real-world editing dataset (docSize) | 159929 bytes | 159929 bytes | 129116 bytes |
| [B4] Apply real-world editing dataset (parseTime) | 38 ms | 17 ms | 3819 ms |
| [B4] Apply real-world editing dataset (memUsed) | 3.5 MB | 856 B | 1.3 MB |

**Old results**

| N = 6000 | yjs | ywasm | automerge-wasm |
| :--- | ---: | ---: | ---: |
| Version | 13.5.12 | 0.9.3 | 0.1.3 |
| Bundle size | 80413 bytes | 799327 bytes | 880281 bytes |
| Bundle size (gzipped) | 23571 bytes | 232727 bytes | 312499 bytes |
| [B1.1] Append N characters (time) | 179 ms | 169 ms | 164 ms |
| [B4] Apply real-world editing dataset (time) | 2142 ms | 52650 ms | 2074 ms |
| [B4] Apply real-world editing dataset (encodeTime) | 22 ms | 3 ms | 205 ms |
| [B4] Apply real-world editing dataset (docSize) | 159929 bytes | 159929 bytes | 129098 bytes |
| [B4] Apply real-world editing dataset (parseTime) | 46 ms | 16 ms | 1108 ms |
| [B4] Apply real-world editing dataset (memUsed) | 2.4 MB | 176 B | 0 B |

dmonad commented 7 months ago

Thanks to @alexjg's PR #22, the time result improved significantly. However, these numbers are still far from the results mentioned in the blog post.

**Results for `automerge.next`**

| N = 6000 | yjs | ywasm | automerge |
| :--- | ---: | ---: | ---: |
| Version | 13.6.11 | 0.9.3 | 2.1.10 |
| Bundle size | 80413 bytes | 799327 bytes | 1737571 bytes |
| Bundle size (gzipped) | 23571 bytes | 232727 bytes | 604118 bytes |
| [B4] Apply real-world editing dataset (time) | 1803 ms | 43943 ms | 13853 ms |
| [B4] Apply real-world editing dataset (encodeTime) | 12 ms | 4 ms | 379 ms |
| [B4] Apply real-world editing dataset (docSize) | 159929 bytes | 159929 bytes | 129116 bytes |
| [B4] Apply real-world editing dataset (parseTime) | 38 ms | 17 ms | 3410 ms |
| [B4] Apply real-world editing dataset (memUsed) | 3.5 MB | 856 B | 0 B |

alexjg commented 7 months ago

I think this is to do with running each edit in an individual `automerge.change` callback rather than all at once. If I run the edit trace all in one change callback, I get ~500 ms for time.

I actually have a suspicion that this is a bug in automerge as we should not be committing changes at the end of the change callback unless we need to do something like encode the changes to send them somewhere. Let me have a dig.
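The two usage patterns can be sketched with a stand-in document class (`SketchDoc` below is hypothetical — a plain-string splice that counts commits; the real benchmark calls `automerge.change` on an Automerge document). The point is only that the per-edit pattern commits one transaction per keystroke, while the batched pattern commits once:

```javascript
// Hypothetical stand-in for a CRDT document: each change() call commits
// exactly one transaction, mirroring automerge.change's behavior.
class SketchDoc {
  constructor() { this.text = ''; this.commits = 0; }
  change(fn) { fn(this); this.commits += 1; } // one commit per callback
  splice(pos, del, ins) {
    this.text = this.text.slice(0, pos) + ins + this.text.slice(pos + del);
  }
}

// Tiny illustrative edit trace: [position, deleteCount, insertText]
const trace = [[0, 0, 'h'], [1, 0, 'i'], [1, 1, 'ey']];

// Pattern used by the B4 benchmark: one change (one commit) per edit.
const perEdit = new SketchDoc();
for (const [pos, del, ins] of trace) {
  perEdit.change(d => d.splice(pos, del, ins));
}

// Pattern that ran in ~500 ms: the whole trace in a single change callback.
const batched = new SketchDoc();
batched.change(d => {
  for (const [pos, del, ins] of trace) d.splice(pos, del, ins);
});

console.log(perEdit.text, perEdit.commits); // "hey" 3
console.log(batched.text, batched.commits); // "hey" 1
```

Both strategies converge on the same text; they differ only in how many transactions (and thus how much per-commit bookkeeping) the document performs.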

dmonad commented 7 months ago

That's it, @alexjg !

When I wrap all changes inside a single `automerge.change` call, it only takes 1066 ms, which is close enough.

These benchmarks are supposed to simulate users performing various tasks. The B4 benchmark is supposed to emit an individual update event for every single change (Yjs does the same). Performing 160k individual edits in a single transaction is far from how a CRDT would be used in practice; in that scenario, the user could just as well perform the insertion as a single insert operation. So, I will keep the current benchmarking approach as is. But I will make a note in the readme.

The last thing that puzzles me is parseTime. It is now at 3.4s, but it should be closer to 438ms.

alexjg commented 7 months ago

> But I will make a note in the readme.

Appreciated!

I suspect the parseTime part is due to the fact that we have a fast path for loading a document from scratch, but the benchmark loads into a document that already has the initialDoc contents, so the fast path isn't taken.

dmonad commented 7 months ago

Alright, getting closer!

I now measure parseTime by calling automerge.load(). parseTime is now at 2.1s.
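The revised measurement can be sketched with a small timing harness (the `encode`/`decode` stand-ins below are hypothetical placeholders; the benchmark actually calls `automerge.save(doc)` to produce the bytes and `automerge.load(bytes)` to parse them from scratch):

```javascript
// Hypothetical stand-ins: the real benchmark uses automerge.save / automerge.load.
const encode = doc => JSON.stringify(doc);
const decode = bytes => JSON.parse(bytes);

// Time a single operation and report it in milliseconds.
function measure(label, fn) {
  const start = process.hrtime.bigint();
  const result = fn();
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(`${label}: ${ms.toFixed(1)} ms`);
  return result;
}

const doc = { text: 'hello world '.repeat(1000) };

// encodeTime: serialize the final document (bytes.length ~ docSize).
const bytes = measure('encodeTime', () => encode(doc));
// parseTime: load the snapshot into a fresh document, so the
// from-scratch fast path described above can be taken.
const loaded = measure('parseTime', () => decode(bytes));
```

Loading into a fresh document (rather than one pre-populated with initialDoc) is what lets the from-scratch fast path apply.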

I'm going to close this ticket. But let me know if you find more things to fix!