cessen / ropey

A utf8 text rope for manipulating and editing large texts.
MIT License
1.04k stars 46 forks source link

Add support for rope.byte(n) #14

Closed gmorenz closed 6 years ago

gmorenz commented 6 years ago

It's possible I just missed something in the API, but I couldn't find a method that did this. The implementation and tests are copied pretty directly from the implementation of rope.char(n).

cessen commented 6 years ago

Thanks for the PR! It's great to have additional contributors. :-)

Can I ask what the use-case for this is? I can understand the "completionist" desire to add this, given that there's already Rope::char() and Rope::line(). But I also generally prefer to only add methods with clear use-cases, and I suspect that most byte-oriented processing of text would work directly with entire chunks of text at a time, or would use the byte iterator. Fetching individual bytes at a time would be pretty slow.

But I don't mean to be discouraging, the code looks great, and I appreciate your efforts! Just trying to figure out if it makes sense to add it or not.

(Also, we'll want to add it for RopeSlice as well if we decide to add it.)

gmorenz commented 6 years ago

I'm porting my editor from using a Vec<u8> backing store with byte offsets to using ropey. My motivation for implementing Rope::byte is just to allow iterative translation from my current byte/Vec oriented architecture to a rope/char oriented architecture.

Not merging this PR won't be a blocker for me in the slightest, because I can just port using a local fork and the project is still very small so I'll be able to get rid of all the Rope::byte calls quickly. I doubt I will be the only person who ever wants to be able to access things on a byte level though, and I don't see any other good way to access bytes.

I certainly understand the motivation behind keeping the library small though, so if you feel like this feature won't pull it's weight I can definitely respect that.

If you do feel like it's worth adding I'm happy to implement it on RopeSlice as well.

PS. I don't think it's particularly important, and I'm not planning on using it like this short term. But since you bring up performance, it occurs to me that this is probably the fastest way to check if there is an ascii character in a particular location. E.g. a \ before an already known ".

cessen commented 6 years ago

That seems pretty reasonable. I hadn't thought of progressive porting!

Okay, let's go ahead with this then. Go ahead and implement for RopeSlice as well, and then I'll merge.

As for performance, I'm sure there are niche use-cases for that kind of thing, where the extra performance over e.g. Rope::char() would be important. But those are the kinds of cases where I would expect people to use the lower level APIs (e.g. the chunk fetching APIs) to implement their own niche functionality on top of Ropey.

Put another way: I don't expect Ropey's higher level APIs to handle every possible use-case, just (ideally) the common-ish ones. And then niche use-cases can roll their own solutions on top. That's how I'm doing things in my own editor, for example.

gmorenz commented 6 years ago

Ok, added implementation for RopeSlice. Squashed commits for git history as well.

And ya, that attitude towards optimization tricks seems reasonable to me.

cessen commented 6 years ago

Looks great! Thanks for taking the time to do this!

Also, although in a PR is perhaps a slightly strange place to ask, how has your experience with Ropey been so far? I'm actively looking for feedback before going 1.0 (probably in a few months), so any feedback you have on the APIs, the documentation, etc. would be much appreciated!

gmorenz commented 6 years ago

I've barely got started with Ropey really, I'll have better feedback in a few days and update this then.

When glancing over it I was pretty impressed with the documentation and code quality. Hence why I decided to give it a shot. So far what I've seen actually using it has just re-enforced that it's pretty great (good job!).

API wise the only 'rough edge's I've hit so far have been this PR, and that I would expect what's currently called Rope::remove to be called something like Rope::remove_slice or Rope::remove_range, and Rope::remove to remove a single character (ran into this implementing delete/backspace). Not a terrible issue at all. That said I haven't explored much of the API yet because I've just done that partial port to Ropey using byte all over.

I'll add an update to this in a few days once I've explored more of the API.

cessen commented 6 years ago

Great! Looking forward to your feedback!

As for Rope::remove(), I tried to keep its naming parallel to Rope::insert(). The way I think of it is: insert() can insert multiple characters and remove() can remove multiple characters.

I could add a corresponding remove_char() to go along with insert_char(), but it would just be a wrapper for remove(n..(n+1)), so it's not clear to me how useful it would be in practice. (As an aside: insert_char() exists because there's a decoding step from char -> &str that's a little obtuse and annoying to do in-line, and another user of Ropey requested it.)

cessen commented 6 years ago

@gmorenz Any additional feedback yet? I'm thinking of making a 0.9 release soon, with some small API changes that are in master, but would like to roll in any relevant feedback as well if possible. :-)

gmorenz commented 6 years ago

@cessen Nothing constructive really, things basically work well and do what I would expect :+1: . I'd also like to call out that I haven't been working that much on my project that uses this so I still don't have a super complete image of the library in my head.

It does seem like that I'm likely to end up with a number of small reasonably generic helper functions (e.g. calculating start of line in character position, number of characters we are offset from start of line, etc). I think that some day as the library matures it would makes sense to include these things in the core library, but for now as the API is evolving I think it makes a lot of sense to leave them up to users to implement.

cessen commented 6 years ago

Okay, thanks much! Feel free to drop a line any time if you have feedback in the future. :-)

calculating start of line in character position

Does Rope::line_to_char() not do what you want? Sounds like exactly what you're describing, but maybe I'm misunderstanding. And char offset from the start of the line would just be ~char_idx - rope.line_to_char(char_idx)~. (edit: oops) char_idx - rope.line_to_char(rope.char_to_line(char_idx))

gmorenz commented 6 years ago

Feel free to drop a line any time if you have feedback in the future. :-)

Will do - and I want to say that I really appreciate how active you are in seeking feedback on this.

Does Rope::line_to_char() not do what you want?

I may have phrased that poorly, specifically what that function is is

let line_idx = self.contents.char_to_line(pos);
self.contents.line_to_char(line_idx)
cessen commented 6 years ago

Ah, okay. Yeah, makes sense. Also, just realized my char offset code above is completely wrong. Yikes.

Yeah, at some point it may make sense to add methods for that and other similar things. But I'd like to put those off until after I'm confident in a lean core that I can declare as 1.0.

Thanks again!