gchp / iota

A terminal-based text editor written in Rust
MIT License
1.63k stars, 81 forks

fix some issues with drawing a view in standard mode #103

Closed: ghost closed this 9 years ago

ghost commented 9 years ago

Hi,

This fixes some issues where a view is incorrectly drawn while editing in standard mode:

There are still some problems that I notice when editing Unicode characters. It looks like the GapBuffer<u8> is being treated as an array of codepoints rather than an array of bytes. Of course, this doesn't work for codepoints greater than 255. Is the gap buffer meant to store codepoints? If so, would it be a good idea to change it to a GapBuffer<char>?
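To illustrate the mismatch, here is a minimal sketch of the two ways of reading the same bytes. The function names are made up for this example and are not taken from iota's code.

```rust
/// Reads the bytes the problematic way: every byte becomes one codepoint.
/// This only round-trips for ASCII text.
fn bytes_as_codepoints(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| b as char).collect()
}

/// Reads the same bytes as UTF-8, returning None if they are not valid UTF-8.
fn bytes_as_utf8(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}
```

For the two bytes of "é" (0xC3, 0xA9), the first function yields two separate characters, while the second yields the single character that was typed.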

I've also edited the Makefile to work when cargo isn't located in /usr/local/bin.

crespyl commented 9 years ago

I'm not @gchp, but thanks for this!

Regarding the u8 vs char thing: yeah, it's sort of a problem inherent to dealing with Unicode. The char type is (IIRC) always large enough to hold any valid codepoint, and thus always takes up 4 bytes of space. In the common case of editing mainly ASCII text, that ends up being a lot of wasted space, hence the use of UTF-8. The standard Rust String works the same way, using a Vec<u8> and adding a bit of processing overhead to ensure Unicode correctness.
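A rough way to see the space difference, as a sketch only (the helper names are hypothetical and just compare the two storage strategies):

```rust
use std::mem::size_of;

/// Bytes needed to store `text` as UTF-8, which is what `String` does.
fn utf8_size(text: &str) -> usize {
    text.len()
}

/// Bytes needed to store `text` with one `char` per codepoint,
/// as a `GapBuffer<char>` or `Vec<char>` would.
fn char_buffer_size(text: &str) -> usize {
    text.chars().count() * size_of::<char>()
}
```

For plain ASCII such as "hello", the UTF-8 form needs 5 bytes and the char form needs 20.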

The problem with UTF-8 is that buffer/byte indices no longer map 1-1 onto character/grapheme indices, which makes finding the byte offset of the n-th character in the buffer a non-trivial operation. Iota's in a kind of awkward spot right now, where some parts of the code want to index the buffer by byte offsets and other parts want to index by character offsets. I think this is what you were seeing.
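As a sketch of what that lookup involves on a plain &str (the function name is made up, and this counts codepoints, not graphemes):

```rust
/// Returns the byte offset at which the `n`-th codepoint (0-based) starts,
/// or None if `text` has `n` or fewer codepoints.
/// This walks the text from the start, so it is linear in `n`, not constant.
fn byte_offset_of_char(text: &str, n: usize) -> Option<usize> {
    text.char_indices().nth(n).map(|(offset, _)| offset)
}
```

In "aéb", the character at index 2 starts at byte 3, because "é" takes two bytes.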

Ideally, I think we probably want to stick with a UTF-8 byte buffer, and have the data structure provide clearer accessors for byte/codepoint/grapheme level indexing. There's been some discussion of ropes and finger trees, which are both fairly appealing.

ghost commented 9 years ago

@crespyl Thanks for explaining! I'm interested in people's thoughts about what kind of data structure will be used.

In the "gitter" thing in the readme, someone mentioned that using a persistent data structure would simplify undo/redo operations. Unfortunately, persistent ropes are hard to implement (I think) because the number of characters underneath each node is typically memoized in the node, meaning that when a leaf is updated, each node between it and the root must be mutated, resulting in a lot of copying. I think it could be interesting to implement a piece table in Rust like this. I find it difficult to write efficient data structures in Rust because of its lack of C-like pointers outside of unsafe blocks. I probably just need to practice more. :P
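To make the piece table idea concrete, here is a very rough sketch in safe Rust. All names are hypothetical, it only supports insertion, and positions are byte offsets that the caller is assumed to keep on char boundaries.

```rust
/// Which backing buffer a piece points into.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Source {
    Original,
    Added,
}

/// A run of bytes inside one of the two buffers.
#[derive(Clone, Copy, Debug)]
struct Piece {
    source: Source,
    start: usize,
    len: usize,
}

struct PieceTable {
    original: String,   // never modified after construction
    added: String,      // append-only
    pieces: Vec<Piece>, // the document is the concatenation of these runs
}

impl PieceTable {
    fn new(text: &str) -> Self {
        let mut pieces = Vec::new();
        if !text.is_empty() {
            pieces.push(Piece { source: Source::Original, start: 0, len: text.len() });
        }
        PieceTable { original: text.to_string(), added: String::new(), pieces }
    }

    /// Inserts `text` at byte offset `pos`. Panics if `pos` is past the end.
    fn insert(&mut self, pos: usize, text: &str) {
        if text.is_empty() {
            return;
        }
        let new_piece = Piece { source: Source::Added, start: self.added.len(), len: text.len() };
        self.added.push_str(text);

        let mut offset = 0; // document offset where the current piece starts
        for i in 0..self.pieces.len() {
            let piece = self.pieces[i];
            if pos <= offset + piece.len {
                // Split the piece around `pos` and put the new piece in between.
                let left_len = pos - offset;
                let left = Piece { len: left_len, ..piece };
                let right = Piece {
                    start: piece.start + left_len,
                    len: piece.len - left_len,
                    ..piece
                };
                let mut replacement = vec![left, new_piece, right];
                replacement.retain(|p| p.len > 0); // drop empty halves
                self.pieces.splice(i..=i, replacement);
                return;
            }
            offset += piece.len;
        }
        assert_eq!(pos, offset, "insert position out of bounds");
        self.pieces.push(new_piece);
    }

    /// Rebuilds the full document text by walking the pieces in order.
    fn text(&self) -> String {
        let mut out = String::new();
        for piece in &self.pieces {
            let buffer = match piece.source {
                Source::Original => &self.original,
                Source::Added => &self.added,
            };
            out.push_str(&buffer[piece.start..piece.start + piece.len]);
        }
        out
    }
}
```

Since neither buffer is ever overwritten, an older copy of `pieces` still describes a valid older version of the text, which is the property that might help with undo/redo. A real implementation would also need deletion and a faster way to find the piece at a given offset.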

Another thing to consider is having the data structure store the data line-by-line, or otherwise somehow being able to look up text by line number in constant time. This would probably be very efficient while doing vim-like operations such as "delete entire line" or "go to first non-whitespace character in previous line".
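One possible shape for the line lookup, as a sketch only: keep a table of the byte offsets where lines start, next to the text. The names are made up, the index would have to be updated on every edit (not shown), and only "\n" line endings are handled.

```rust
/// Byte offsets at which each line of a text starts.
struct LineIndex {
    line_starts: Vec<usize>,
}

impl LineIndex {
    /// Scans the text once and records where every line begins.
    fn new(text: &str) -> Self {
        let mut line_starts = vec![0];
        for (offset, byte) in text.bytes().enumerate() {
            if byte == b'\n' {
                line_starts.push(offset + 1);
            }
        }
        LineIndex { line_starts }
    }

    /// Returns line `n` (0-based) without its trailing newline.
    /// Finding the line's bounds is two constant-time table lookups.
    fn line<'a>(&self, text: &'a str, n: usize) -> Option<&'a str> {
        let start = *self.line_starts.get(n)?;
        let end = match self.line_starts.get(n + 1) {
            Some(&next) => next - 1, // exclude the '\n' before the next line
            None => text.len(),
        };
        Some(&text[start..end])
    }
}

/// Byte offset, within `line`, of its first non-whitespace character.
fn first_non_whitespace(line: &str) -> Option<usize> {
    line.char_indices()
        .find(|&(_, c)| !c.is_whitespace())
        .map(|(offset, _)| offset)
}
```

With this, "go to first non-whitespace character in previous line" becomes a lookup of line n - 1 followed by a short scan of that one line.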

gchp commented 9 years ago

Awesome, thank you for this!

gchp commented 9 years ago

Ah yeah, the UTF-8 stuff is ongoing. @crespyl pretty much covered where we're at right now! Investigating possible solutions. Honestly, it's a first for me, so it's taking me a while to get it figured out, but we will get there! All comments/suggestions welcome in the meantime, though.

Thanks again for this!