cessen / ropey

A utf8 text rope for manipulating and editing large texts.
MIT License
1.04k stars 46 forks source link

Insert to another rope #57

Closed stevefan1999-personal closed 2 years ago

stevefan1999-personal commented 2 years ago

I know that you can combine/insert a string to a rope in O(log n) time, but I wonder if it is possible to reuse the information from another rope, rather than turning it into string and then insert it back, i.e. tree merge algorithm is probably needed.

Related: #4

This is needed for a preprocessor algorithm, that we could map the parent node of any rope to a specific pointer (such as file) for better diagnostic when parser goes wrong. Another feature needed is a splice algorithm which requires this but let's keep it away from here for the next time.

cessen commented 2 years ago

The simple answer: you can use Ropey's split_off() and append() methods to insert one rope into another, reusing the memory of the nodes where possible. First split the target rope at the insertion point, then append the other rope to the first split, and finally re-append the second split.

This is needed for a preprocessor algorithm

It sounds like you're trying to use Ropey for the internals of a compiler, or something similar...? Ropey probably isn't a good fit for that, and in any case it isn't Ropey's target use case.

You're welcome to keep trying, of course. And if Ropey's existing functionality suits what you're trying to do, great. But just keep in mind that Ropey isn't really intended for that, and I'm not specifically trying to support that use case.

that we could map the parent node of any rope to a specific pointer (such as file) for better diagnostic when parser goes wrong.

I'm not sure I totally follow what you're saying. Do you mean you're somehow trying to rely on Ropey's internal node structure?

stevefan1999-personal commented 2 years ago

It sounds like you're trying to use Ropey for the internals of a compiler, or something similar...? Ropey probably isn't a good fit for that, and in any case it isn't Ropey's target use case.

I would say it probably is. I need it for effortless preprocessor caching that you might have a lot of include headers...which means the file could expand to megabytes if not gigabytes. Even a simple "hello world" program, gcc expand it to more than 200KB for me, so I think this is a very good use case for ropey.

I'm not sure I totally follow what you're saying. Do you mean you're somehow trying to rely on Ropey's internal node structure?

Just blindly splicing the header file in won't give us any reliable diagnostic information regarding the header. Let's say I will have a bimap::BiHashMap<String, ropey::Rope> that maps include files to an individual rope in bijection (if the file specified #pragma once), so we could find out what file it is and where the error location is relative to the file much easier.

cessen commented 2 years ago

I guess it's just not clear to me what the benefit of Ropey is in this case over e.g. writing a simple custom piece table. It would be far less total code than Ropey, and would be much more reliable at reusing memory. It would also make it easy to find out where any given piece of text came from.

And since you'll know exactly how the compiler will use it, you can probably simplify the piece table implementation to only handle the situations the compiler needs. I suspect you could implement it in a few hundred lines of code or so, by narrowly targeting your specific use case.

Whereas I suspect you'll need to jump through a lot of hoops to get similar-but-lesser benefits from Ropey. Ropey is good at what it does, but what it does isn't really aimed at the needs of compilers. For example, the data sharing it does isn't reliable when making a lot of distributed edits to a document, as often happens with preprocessing.

cessen commented 2 years ago

Closing, because it's been a month, and I believe the initial request was filed based on a misunderstanding of Ropey's capabilities.