Buffering writes for performance uplift

I'm not sure where to put this - its a bit of a stretch to make this an issue. But it might be something you want to add to ropey.

I've been experimenting with pre-write concatenation of edits to see how much of a difference it makes to performance. The use case here is replaying editing histories in a CRDT I'm working on, and outputting the resulting string. (I'm storing the editing history anyway, so if I can do this quickly I can skip also storing the resulting document).

To improve performance I'm essentially wrapping jumprope in a little wrapper structure which greedily concatenates all incoming changes when they're adjacent and the same type. So instead of insert 'a' then insert 'b' the changes become "insert 'ab'*. Obviously this only works when the changes are the same type, and adjacent. But the performance figures when using real world editing histories is pretty unreal. I'm seeing a 3-10x performance improvement from this change.

The automerge-perf dataset shows the biggest improvement, because each edit in that dataset is a single keystroke. This concatenation approach brings the number of "real edits" to the structure down from 260k to 10k, at the expense of a bit of bookkeeping. In aggregate performance improves by 9.5x. I'm now replaying this 260k edit history in 670 microseconds (!), or 390M keystrokes per second. (Up from 40M/sec, which is already obnoxiously fast.)

Dataset	Edit Direct (ms)	Edit Buffered (ms)	Performance multiplier
automerge-perf	6.36	0.67	9.55x
seph-blog1	3.61	0.97	3.72x

Anyway, this approach is only useful with specific use cases, and it won't help at all with random editing traces.

Its also entirely rope-agnostic. Ropey would see a similar performance improvement with these data sets.

So yeah, I'm not sure if you care about this, but its interesting!

Buffered rope wrapper here

cessen / ropey

Buffering writes for performance uplift #55