I'm not sure where to put this - it's a bit of a stretch to make this an issue. But it might be something you want to add to ropey.
I've been experimenting with pre-write concatenation of edits to see how much of a difference it makes to performance. The use case here is replaying editing histories in a CRDT I'm working on, and outputting the resulting string. (I'm storing the editing history anyway, so if I can do this quickly I can skip also storing the resulting document).
To improve performance I'm essentially wrapping jumprope in a little wrapper structure which greedily concatenates incoming changes when they're adjacent and the same type. So instead of *insert 'a'* then *insert 'b'*, the changes become *insert 'ab'*. Obviously this only works when the changes are the same type and adjacent. But the performance figures when using real-world editing histories are pretty unreal. I'm seeing a 3-10x performance improvement from this change.
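For illustration, here's a rough sketch of what that buffering layer can look like, written against ropey's `Rope` since that's what this issue is about. The struct, method names, and merge rules here are illustrative assumptions, not the actual jumprope wrapper - it only merges forward-typing inserts and same-position deletes, and flushes whenever an edit can't be merged:

```rust
use ropey::Rope;

/// A pending edit that hasn't been applied to the rope yet.
enum Pending {
    /// Insert `text` at char index `pos` (`chars` caches the char count of `text`).
    Insert { pos: usize, text: String, chars: usize },
    /// Delete the char range `pos..pos + len`.
    Delete { pos: usize, len: usize },
}

/// Greedily coalesces adjacent, same-type edits and only touches the
/// rope when an incoming edit can't be merged with the pending one.
struct BufferedRope {
    rope: Rope,
    pending: Option<Pending>,
}

impl BufferedRope {
    fn new() -> Self {
        Self { rope: Rope::new(), pending: None }
    }

    fn insert(&mut self, pos: usize, text: &str) {
        if let Some(Pending::Insert { pos: p, text: buf, chars }) = &mut self.pending {
            // The new insert starts exactly where the buffered one ends
            // (typing forwards), so just append to the buffer.
            if pos == *p + *chars {
                buf.push_str(text);
                *chars += text.chars().count();
                return;
            }
        }
        self.flush();
        self.pending = Some(Pending::Insert {
            pos,
            text: text.to_string(),
            chars: text.chars().count(),
        });
    }

    fn delete(&mut self, pos: usize, len: usize) {
        if let Some(Pending::Delete { pos: p, len: l }) = &mut self.pending {
            // Repeated forward-deletes at the same position extend the
            // buffered range. (Backspace runs, where the new range ends
            // at the buffered position, could be merged the same way.)
            if pos == *p {
                *l += len;
                return;
            }
        }
        self.flush();
        self.pending = Some(Pending::Delete { pos, len });
    }

    /// Write the buffered edit (if any) through to the rope.
    fn flush(&mut self) {
        match self.pending.take() {
            Some(Pending::Insert { pos, text, .. }) => self.rope.insert(pos, &text),
            Some(Pending::Delete { pos, len }) => self.rope.remove(pos..pos + len),
            None => {}
        }
    }

    /// Flush any remaining edit and return the finished document.
    fn into_rope(mut self) -> Rope {
        self.flush();
        self.rope
    }
}
```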
The automerge-perf dataset shows the biggest improvement, because each edit in that dataset is a single keystroke. This concatenation approach brings the number of "real edits" hitting the structure down from 260k to 10k, at the expense of a bit of bookkeeping. In aggregate, performance improves by 9.5x. I'm now replaying this 260k-edit history in 670 microseconds (!), or 390M keystrokes per second. (Up from 40M/sec, which is already obnoxiously fast.)
| Dataset | Edit direct (ms) | Edit buffered (ms) | Performance multiplier |
|---|---|---|---|
| automerge-perf | 6.36 | 0.67 | 9.55x |
| seph-blog1 | 3.61 | 0.97 | 3.72x |
Anyway, this approach is only useful for specific use cases, and it won't help at all with random editing traces.
It's also entirely rope-agnostic. Ropey would see a similar performance improvement with these data sets.
So yeah, I'm not sure if you care about this, but it's interesting!
Ah! That's cool. Yeah, this is definitely something that would be done outside of Ropey, by whatever client code is calling for the edits. I think Helix, for example, already does this where appropriate.
Buffered rope wrapper here