cessen / ropey

A utf8 text rope for manipulating and editing large texts.
MIT License
1.04k stars 46 forks source link

Buffering writes for performance uplift #55

Closed josephg closed 2 years ago

josephg commented 2 years ago

I'm not sure where to put this - its a bit of a stretch to make this an issue. But it might be something you want to add to ropey.

I've been experimenting with pre-write concatenation of edits to see how much of a difference it makes to performance. The use case here is replaying editing histories in a CRDT I'm working on, and outputting the resulting string. (I'm storing the editing history anyway, so if I can do this quickly I can skip also storing the resulting document).

To improve performance I'm essentially wrapping jumprope in a little wrapper structure which greedily concatenates all incoming changes when they're adjacent and the same type. So instead of insert 'a' then insert 'b' the changes become "insert 'ab'*. Obviously this only works when the changes are the same type, and adjacent. But the performance figures when using real world editing histories is pretty unreal. I'm seeing a 3-10x performance improvement from this change.

The automerge-perf dataset shows the biggest improvement, because each edit in that dataset is a single keystroke. This concatenation approach brings the number of "real edits" to the structure down from 260k to 10k, at the expense of a bit of bookkeeping. In aggregate performance improves by 9.5x. I'm now replaying this 260k edit history in 670 microseconds (!), or 390M keystrokes per second. (Up from 40M/sec, which is already obnoxiously fast.)

Dataset Edit Direct (ms) Edit Buffered (ms) Performance multiplier
automerge-perf 6.36 0.67 9.55x
seph-blog1 3.61 0.97 3.72x

Anyway, this approach is only useful with specific use cases, and it won't help at all with random editing traces.

Its also entirely rope-agnostic. Ropey would see a similar performance improvement with these data sets.

So yeah, I'm not sure if you care about this, but its interesting!

Buffered rope wrapper here

cessen commented 2 years ago

Ah! That's cool. Yeah, this is definitely something that would be done outside of Ropey, by whatever client code is calling for the edits. I think Helix, for example, already does this where appropriate.