cessen / ropey

A utf8 text rope for manipulating and editing large texts.
MIT License

Add a feature flag to only count LF and CRLF as line breaks. #53

Closed cessen closed 2 years ago

cessen commented 2 years ago

Most other software does not conform to the Unicode spec when it comes to line breaks. This (understandably) gives people pause when deciding whether to use Ropey or not, because although Ropey is "correct" in the way it counts line breaks, it's not compatible with other software if any line breaks other than LF/CRLF are present in a file.
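To make the mismatch concrete, here is a minimal sketch that counts line breaks both ways on the same text. The function names are made up for illustration and are not Ropey's API. The Unicode set is assumed to be LF, VT, FF, CR, NEL, LS, and PS, with CRLF treated as a single break.

```rust
/// Counts line breaks the simple way: only LF and CRLF.
/// Every CRLF ends in LF, so counting LF covers both.
pub fn count_breaks_lf_crlf(text: &str) -> usize {
    text.chars().filter(|&c| c == '\n').count()
}

/// Counts line breaks using the assumed Unicode set:
/// LF, VT, FF, CR, NEL, LS, PS, with CRLF counted once.
pub fn count_breaks_unicode(text: &str) -> usize {
    let mut count = 0;
    let mut chars = text.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            '\r' => {
                // Treat CRLF as one break by consuming the LF that follows.
                if chars.peek() == Some(&'\n') {
                    chars.next();
                }
                count += 1;
            }
            '\n' | '\u{000B}' | '\u{000C}' | '\u{0085}' | '\u{2028}' | '\u{2029}' => {
                count += 1;
            }
            _ => {}
        }
    }
    count
}
```

For a text such as "one\r\ntwo\u{2028}three\n", the first function sees two breaks and the second sees three, so line numbers drift apart from the U+2028 onward.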

We can solve this with a feature flag that implements the simpler, non-Unicode-conformant line break counting that is common to most other software.

archseer commented 2 years ago

Also related to #43

cessen commented 2 years ago

Oh! Right. I feel silly now. Basically a duplicate.

cessen commented 2 years ago

I split Ropey's str_utils module off into a separate crate, str_indices, which now also implements two additional ways of line counting: one that recognizes only LF and CRLF, and one that additionally recognizes a lone CR.

That was the "hard" part. The rest should be relatively straightforward:

  1. Add the feature flags for these two alternative approaches, which just swap out which functions are imported from str_indices.
  2. Update all of the line-based unit tests and property tests to verify against the output of those imported functions rather than hard-coded expected results.
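The two steps above could look roughly like the following sketch. The modules here are local stand-ins for str_indices so that the example is self-contained, and the names `count_breaks`, `count_line_breaks`, and `expected_line_count` are hypothetical. Only the `cr_lines` branch is shown; a Unicode branch would be swapped in the same way.

```rust
// Stand-in for the LF/CRLF-only counting functions.
#[allow(dead_code)]
mod lines_lf {
    /// Counts LF only; CRLF is covered because it ends in LF.
    pub fn count_breaks(text: &str) -> usize {
        text.bytes().filter(|&b| b == b'\n').count()
    }
}

// Stand-in for the LF/CR/CRLF counting functions.
#[allow(dead_code)]
mod lines_crlf {
    /// Counts LF, CR, and CRLF, with CRLF as a single break.
    pub fn count_breaks(text: &str) -> usize {
        let bytes = text.as_bytes();
        let mut count = 0;
        let mut i = 0;
        while i < bytes.len() {
            match bytes[i] {
                b'\r' => {
                    count += 1;
                    // Skip the LF of a CRLF pair so it is not counted twice.
                    if bytes.get(i + 1) == Some(&b'\n') {
                        i += 1;
                    }
                }
                b'\n' => count += 1,
                _ => {}
            }
            i += 1;
        }
        count
    }
}

// Step 1: the feature flag only decides which function gets imported.
#[cfg(feature = "cr_lines")]
use lines_crlf::count_breaks as count_line_breaks;
#[cfg(not(feature = "cr_lines"))]
use lines_lf::count_breaks as count_line_breaks;

/// Step 2: tests derive the expected value from the imported function
/// instead of hard-coding it, so they pass under either configuration.
pub fn expected_line_count(text: &str) -> usize {
    count_line_breaks(text) + 1
}
```

A test would then compare the rope's reported line count against `expected_line_count(text)` rather than against a literal number.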

Although straightforward, there are a lot of unit tests to update. So it's a bit of a project still.

cessen commented 2 years ago

Added in 38531e083f8cf5c5b1aeb328361895fb5a9b867c.

There are two new feature flags: cr_lines and unicode_lines.

Both features are enabled by default for backwards compatibility, but can be disabled to strip things down. Note that unicode_lines implies cr_lines, since the former is a superset of the latter.
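Because unicode_lines implies cr_lines, the two flags give three effective configurations rather than four. A small sketch of that mapping, using a hypothetical `LineBreakMode` enum that is not part of Ropey:

```rust
/// Hypothetical summary of which line breaks are recognized.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LineBreakMode {
    /// Neither flag enabled: only LF and CRLF.
    LfCrlf,
    /// cr_lines enabled: LF, CR, and CRLF.
    CrLines,
    /// unicode_lines enabled: all Unicode line breaks.
    UnicodeLines,
}

/// Maps a flag combination to its effective mode.
/// unicode_lines wins regardless of cr_lines, since it is a superset.
pub fn mode_from_flags(cr_lines: bool, unicode_lines: bool) -> LineBreakMode {
    if unicode_lines {
        LineBreakMode::UnicodeLines
    } else if cr_lines {
        LineBreakMode::CrLines
    } else {
        LineBreakMode::LfCrlf
    }
}

/// Reads the flags this crate was compiled with.
pub fn active_mode() -> LineBreakMode {
    mode_from_flags(cfg!(feature = "cr_lines"), cfg!(feature = "unicode_lines"))
}
```

With default features disabled and neither flag re-enabled, `active_mode()` would report `LfCrlf`, which is the behavior this issue asked for.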

cessen commented 2 years ago

On second thought: re-opening, just to remind myself to add proper documentation for this before the next release.

cessen commented 2 years ago

Documentation done.