icsharpcode / AvalonEdit

The WPF-based text editor component used in SharpDevelop
http://avalonedit.net/
MIT License
1.85k stars 469 forks source link

New Text Buffer Proposal #381

Closed hummy123 closed 1 year ago

hummy123 commented 1 year ago

Hi there.

I was developing an alternative text buffer for my own purposes (a Piece Table rather than a Rope) and it seemed to improve on AvalonEdit's in a few scenarios and I would be happy to plug it into the interfaces here and add the functionality (string search/indexOf) this buffer has that mine doesn't at the moment if the maintainers would accept this change.

Here is a link to a reproducible benchmark comparing the Rope that AvalonEdit uses and the Piece Table I created.

Here are the benefits I see from switching:

Here are some other things worth considering in favour of my Piece Table:

Here are the cons I see from adapting the Piece Table:

Let me know your thoughts and I'll fork the repository and make the necessary changes if the proposal sounds good to you.

Kind regards, Humza

dgrunwald commented 1 year ago

I haven't checked the benchmark in detail, just based on your comment:

AvalonEdit's "rope" doesn't deep clone -- it only marks nodes as shared, and ends up cloning only those nodes that are modified. In SharpDevelop (which AvalonEdit was originally written for) we used frequent snapshots for code analysis on background threads -- in fact this is why I switched AvalonEdit to this "rope" to begin with; originally it used a simple gap text buffer. The mutability was meant as an optimization when there's multiple edits without a Clone() call in between. (consider especially the case: selecting the whole document, then pressed Tab to insert indentation at the start of each line).

Substring+GetCharAt are pretty frequently used by AvalonEdit, the highlighting engine, the rendering etc.

I don't like the extra complexity of adding F#, requiring multiple DLLs, etc. I would not accept a pull request doing this.

hummy123 commented 1 year ago

Oh, I see. Thanks for correcting my misconceptions about the cloning.

That's a very reasonable explanation and I'm happy to close this issue as it doesn't seem a good fit.

goswinr commented 1 year ago

@hummy123 I am using F# too. I'm actually writing F# using AvalonEdit. I might want to test your Piece Table in my own fork of AvalonEdit. Do you have the source available? The above link is broken.

@dgrunwald I am using CreateSnapshot() on every Document.Changed event to update semantic highlighting via the F# compiler service. When keystrokes come very fast in larger documents, sometimes even the latest created Snapshot seems to miss the latest few characters. Is this a known limitation? Or am I doing something wrong? Could it be that this happens because there might be concurrent calls to CreateSnapshot()? Edit: was my bug.

hummy123 commented 1 year ago

@goswinr Hi there. I actually completed the Piece Tree (the data structure you get when putting a Piece Table's nodes in a balanced binary tree like VS Code and AbiWord do), but found that the structure has very poor substring performance in comparison to Ropes.

The reason is clear when you consider that a Piece Tree can reconstruct the original string through an in-order traversal. Say that you want a substring range from one character to the left of the root node to one character after the root node. You basically have 2 log(n) queries to perform and it can get especially costly over time as pieces fragment. I don't think the advantages over Ropes (a nice serialisation format and increased memory sharing of strings) is worth it.

I ended up coding, earlier this year, a very fast immutable Rope implementation in OCaml (which is nearly syntactically identical to F#). It's missing a few trivial functions that AvalonEdit's Rope has, but I would be happy to port it over to F# once the OCaml implementation has them. (A bit busy at the moment because of work.)

hummy123 commented 12 months ago

@goswinr Here you go. (Licensed under 0BSD, which is equivalent to a public domain license requiring no attribution.) It's mostly a copy-and-paste of the OCaml code I already have.

https://github.com/hummy123/brolib-fs

It implements most of the functions from CharRope.cs, except for the IO ones (like writing to a TextReader). The functions that are meant to be public contain comments above them.

I don't have an implementation of FillNode though, because that's specific to the rope structure there.

I do have (in OCaml, not F#) the ability to retrieve lines by line number in the Rope, but that increases the implementation complexity quite a bit (and insertion/deletion time also increase a small amount). I didn't implement it here though since it looks like AvalonEdit has a separate DocumentLine class for line queries.

For benchmarking the OCaml implementation of the Rope, I generated (files containing an aray of very large tuples)[https://github.com/hummy123/brolib/tree/main/bin] (sveltecomponent.ml, sephblog.ml, rustcode.ml and automerge.ml) which are sourced from here. (These files are valid F# files too, if you change the extension from .ml to .fs .)

I ran through the traces by folding over the array like this, and executing the f_ins and f_del functions when the current tuple signals to insert or delete a character. So, if you want to compare the performance with the rope in AvalonEdit, you can do the same.

The edit traces only test the speed of insertions and deletions, though. It might be a good idea to run substring on the rope you get after running the edit traces, to see how the performance compares. (I think substring queries tend to dominate most of the time in editor applications, as opposed to modifications like insertion and deletion.)

I think that's everything that needed saying. Let me know if you run into any troubles or difficulties with the Rope or if you would like an explanation of how it works so you can extend it yourself. (My email is at the license file you'll see on clicking the very first link in this post.)

goswinr commented 12 months ago

Thank you @hummy123 brolib looks nice and elegant! So far the Rope in Avalonedit is actually fine for me. My performance problems actually come from rendering. And that seems to be inherent to WPF.