CodeEditApp / CodeEditSourceEditor

A code editor view written in Swift powered by tree-sitter.
https://codeeditapp.github.io/CodeEditSourceEditor/documentation/codeeditsourceeditor
MIT License

✨ Editing Performance #132

Closed thecoolwinter closed 6 hours ago

thecoolwinter commented 1 year ago

Editing a large file can take several seconds per character. Right now CodeEditTextView uses the default NSTextStorage as its text store, which stores text contiguously in memory. This means that for each insert, every character after the insert location must be shifted one spot over in memory to make room for the inserted character.

For the most part this isn't a problem for smaller files. But when opening a file like sqlite.c (~151,000 LOC), the performance impact becomes very noticeable. I've attached a screen recording of typing a few characters into this file; it's easy to see how long each one takes.

https://user-images.githubusercontent.com/35942988/214759044-5ae45760-fe25-46ee-b81c-206a01c5aff8.mov

The way to fix this issue is to implement a custom NSTextStorage object using a data structure more suited to a text editor. By default, NSTextStorage uses an NSMutableAttributedString to store both text and attributes. These implementations are very general and don't perform well with edits.

Therefore the problem is twofold: text storage and attribute storage.

A few data structures are commonly used in text editors for storing text, among them the gap buffer and the piece table.

In my opinion, a piece table would be the best option. A gap buffer may be a solid solution, but it doesn't offer the same append-only performance a piece table can. With a piece table, every edit is appended to the text buffer as an immutable chunk of text, which completely removes the need to copy existing text when inserting characters.

The only downside of a piece table shows up when a lot of edits accumulate, as the VS Code team found.
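
To make the idea concrete, here is a minimal piece table sketch in Swift. It is illustrative only and not code from CodeEditSourceEditor: the `PieceTable` type and its members are hypothetical names, it indexes by `Character` rather than the UTF-16 units NSTextStorage works in, and it supports insertion only.

```swift
/// A minimal piece table sketch (hypothetical type, not part of CodeEditSourceEditor).
struct PieceTable {
    private enum Source { case original, added }

    private struct Piece {
        var source: Source
        var start: Int   // offset into the source buffer
        var length: Int  // number of characters covered
    }

    private let original: [Character]    // never modified after init
    private var added: [Character] = []  // append-only buffer of inserted text
    private var pieces: [Piece] = []     // document order

    init(_ text: String) {
        original = Array(text)
        if !original.isEmpty {
            pieces = [Piece(source: .original, start: 0, length: original.count)]
        }
    }

    /// Inserts `text` at character offset `offset`.
    /// Offsets past the end append; offsets at or below zero prepend.
    mutating func insert(_ text: String, at offset: Int) {
        let chars = Array(text)
        guard !chars.isEmpty else { return }
        let newPiece = Piece(source: .added, start: added.count, length: chars.count)
        added.append(contentsOf: chars)  // existing text is never moved

        var position = 0
        for (index, piece) in pieces.enumerated() {
            let end = position + piece.length
            if offset <= position {
                pieces.insert(newPiece, at: index)
                return
            }
            if offset < end {
                // Split the piece that contains the offset around the new piece.
                let leftLength = offset - position
                let left = Piece(source: piece.source, start: piece.start, length: leftLength)
                let right = Piece(source: piece.source,
                                  start: piece.start + leftLength,
                                  length: piece.length - leftLength)
                pieces.replaceSubrange(index...index, with: [left, newPiece, right])
                return
            }
            position = end
        }
        pieces.append(newPiece)
    }

    /// Rebuilds the full document by concatenating the pieces in order.
    var string: String {
        var result = ""
        for piece in pieces {
            let buffer = piece.source == .original ? original : added
            result.append(contentsOf: buffer[piece.start..<(piece.start + piece.length)])
        }
        return result
    }
}

var document = PieceTable("hello world")
document.insert(", big", at: 5)
print(document.string)  // prints "hello, big world"
```

Note that this sketch finds the insertion point by walking the piece list linearly, so each insert costs time proportional to the number of pieces even though no text is copied.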

The other problem is attribute storage. Attributes consist of a range (UInt-UInt) and a dictionary of attributes (NSDictionary). A solution for attribute storage may be to implement a tree that stores ranges as its nodes and allows O(log n) lookup for a given index. However, a tree structure will have a large space requirement, and ranges may be very small (or even a single index) when we're talking about syntax highlighting, leading to many nodes in the tree.
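
As a rough illustration of the O(log n) lookup, the sketch below swaps the tree for a sorted array searched with binary search. That keeps lookup logarithmic and the memory layout compact, but inserting a new range costs O(n), which a balanced tree would avoid. `HighlightRun` and `highlightRun(containing:in:)` are hypothetical names, not existing API.

```swift
/// One highlighted range plus the capture name it was tagged with (hypothetical type).
struct HighlightRun {
    var range: Range<Int>
    var capture: String
}

/// Returns the run containing `index`, or nil if the index is not highlighted.
/// Assumes `runs` is sorted by lower bound and that runs do not overlap.
func highlightRun(containing index: Int, in runs: [HighlightRun]) -> HighlightRun? {
    var low = 0
    var high = runs.count - 1
    while low <= high {
        let mid = low + (high - low) / 2
        let candidate = runs[mid]
        if index < candidate.range.lowerBound {
            high = mid - 1          // look in the left half
        } else if index >= candidate.range.upperBound {
            low = mid + 1           // look in the right half
        } else {
            return candidate        // lowerBound <= index < upperBound
        }
    }
    return nil
}

let runs = [
    HighlightRun(range: 0..<3, capture: "keyword"),
    HighlightRun(range: 4..<9, capture: "function"),
    HighlightRun(range: 9..<10, capture: "punctuation"),
]
print(highlightRun(containing: 5, in: runs)?.capture ?? "none")  // prints "function"
```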

Another option would be to somehow store a small amount of metadata about each range alongside the string itself. We already have a ThemeAttributesProviding protocol that can be used to convert a capture type into attributes. We could store the capture type (log2(21 available captures) ≈ 4.4, so 5 bits are needed once rounded up) using 1 extra byte for every highlighted range, and convert the capture into attributes when asked for attribute data.
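
The bit count for distinguishing n captures is the base-2 logarithm of n rounded up; for 21 captures that gives 5, since 2^4 = 16 < 21 <= 32 = 2^5. A small sketch of that arithmetic and of a one-byte capture tag follows. `CaptureKind` and `bitsNeeded(forCaseCount:)` are hypothetical names, and the enum lists only a few example cases rather than the project's real capture set.

```swift
/// Number of bits needed to distinguish `count` values, i.e. ceil(log2(count)).
func bitsNeeded(forCaseCount count: Int) -> Int {
    precondition(count >= 1, "need at least one case")
    // The largest value to encode is count - 1; count its significant bits.
    return Int.bitWidth - (count - 1).leadingZeroBitCount
}

/// A one-byte capture tag (hypothetical subset of captures, for illustration only).
enum CaptureKind: UInt8 {
    case keyword, string, comment, number, function
}

print(bitsNeeded(forCaseCount: 21))          // prints 5
print(MemoryLayout<CaptureKind>.size)        // prints 1
```

Since 5 bits fit comfortably in a byte, one `UInt8` per highlighted range leaves room for up to 256 capture types before the encoding would need to change.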

Any thoughts or discussion on implementation and performance is welcome!

thecoolwinter commented 1 year ago

Also, a custom NSTextStorage implementation would allow us to do things like stream files from disk instead of loading them atomically into memory first.

lukepistrol commented 1 year ago

This may be total garbage (I haven't looked into it yet) but couldn't we just reuse/extend the tree structure already provided by tree-sitter? Or would you rather keep them separate since we might use additional highlighters/parsers in the future?

thecoolwinter commented 1 year ago

I'd like to keep them separate for that reason and for any files that may not be supported by tree-sitter.

neil-and-void commented 1 year ago

I'm not sure where to put this, but I have performance issues just opening and trying to view a large file (~10k lines of code).

https://user-images.githubusercontent.com/46465568/216724304-56408a6f-db54-4c48-92fb-6a6ff47c4b6e.mov

thecoolwinter commented 1 year ago

@neilZon with #138 merged, opening and scrolling (slowly) should be much faster. There are some issues on the CodeEdit side with opening the file, but it will no longer hang and explode in memory.

thecoolwinter commented 1 year ago

Upon further investigation, it looks like the culprit is more likely tree-sitter. In some benchmark tests, tree-sitter can take hundreds of milliseconds to edit very large documents, while NSTextStorage with attributes takes only a few milliseconds, so this may be related to #143.

thecoolwinter commented 1 year ago

Here's an example printing the time (in ms) between editing a document and finishing highlighting the edit.

https://user-images.githubusercontent.com/35942988/222305486-3ad4807e-e8f2-4ba2-a190-ae5827df4f66.mov

matthijseikelenboom commented 1 year ago

Do we still load the full content of the file into the (scroll)view? Or do we load only the content that is visible plus a (scrolling) margin? I don't know if that has anything to do with it, but I can imagine so.

thecoolwinter commented 1 year ago

@matthijseikelenboom We load the whole document but all edits + highlights are restricted to the visible area.

serg-vinnie commented 1 year ago

this implementation might help you https://github.com/GitHawkApp/StyledTextKit, it's iOS-only, but anyway

thecoolwinter commented 1 year ago

> this implementation might help you https://github.com/GitHawkApp/StyledTextKit, it's iOS-only, but anyway

@serg-vinnie That might be useful, but since the slow part of the problem isn't rendering or creating the attributed text I'm not sure how much it'll help.

austincondiff commented 8 months ago

@thecoolwinter is this still an issue since merging the new text view?