Support soft wrap - Githubissues

kellytk commented 3 years ago

As discussed on Matrix.

The default behavior of Kakoune to not line wrap is confusing. Typically I'd appreciate the austere design, where until a newline is encountered, do not wrap. However I think this is an exceptional case, because there are constraints on the doc outside of the user's control, namely the viewport. There is a 'size' constraint imposed on the user, and given that, I'd default to wrapping in Helix.

cessen commented 3 years ago

I'd tentatively like to take this on, based on my experience with Led. But in case anyone gets to it before me, I want to leave some notes about Led's approach.

Here's a short demo of Led in the face of large files and crazy-long lines:

http://perm.cessen.com/2021/helix/led_demo.mp4

The core principle that enables this is to always do things in ways that keep calculations local to a given area of the text. The specific techniques I've used in Led are:

1. The view position is specified by a simple char-offset into the text, not by visual line. This lets the display code jump directly to the vicinity of the content that should be on screen, without needing any information about how it will be displayed. The wrapping etc. can then be done locally afterwards, in that vicinity, and everything positioned on screen based on that. This is efficient enough to be done on-the-fly every time it's needed (including for e.g. vertical cursor movement calculations), with no caching needed. However, with this alone it still needs to calculate soft-wrapping starting at the beginning of the line that contains the content, which obviously won't work in real time for very long lines. And that brings us to...
2. Chunking overly-long lines. This can be done with only local calculations as well, including a search for a good break point (e.g. white space, or at least grapheme boundary) in the vicinity of the proposed chunk boundary. A soft line break is then made at the end of every chunk. This places an upper-bound on how far back the soft wrapping code needs to go: at worst, it's the length of a chunk. (In Led, the chunk size is hard-coded to 4096 chars, but it would be easy to make configurable.)
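
To make the chunking idea concrete, here is a minimal sketch in Rust. It is not Led's actual code: the names `chunk_end` and `chunk_line`, the `window` parameter, and the use of a plain `&[char]` slice instead of a rope are all assumptions made for illustration, and the fallback break does not snap to a grapheme boundary as a real implementation should.

```rust
/// Hypothetical helper (not Led's code): pick the end of the chunk that starts
/// at char index `start`, preferring to break just after a whitespace char.
fn chunk_end(text: &[char], start: usize, chunk_size: usize, window: usize) -> usize {
    assert!(chunk_size > 0, "chunk_size must be at least 1");
    let proposed = start + chunk_size;
    if proposed >= text.len() {
        return text.len(); // The rest of the line fits in one chunk.
    }
    // Search backwards from the proposed boundary, but only within `window`
    // chars, so the work stays local no matter how long the line is.
    let lowest = proposed.saturating_sub(window).max(start + 1);
    let mut i = proposed;
    while i > lowest {
        if text[i - 1].is_whitespace() {
            return i; // Break right after a whitespace char.
        }
        i -= 1;
    }
    // No whitespace nearby: fall back to the hard boundary.
    // (Simplification: a real editor would snap to a grapheme boundary here.)
    proposed
}

/// Split one (possibly huge) line into chunk ranges `[start, end)`.
/// A soft line break would then be forced at the end of every chunk.
fn chunk_line(text: &[char], chunk_size: usize, window: usize) -> Vec<(usize, usize)> {
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < text.len() {
        let end = chunk_end(text, start, chunk_size, window);
        chunks.push((start, end));
        start = end;
    }
    chunks
}
```

Because each boundary depends only on the text within `window` chars of the proposed break, the soft-wrapping code never has to look back further than one chunk.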

The down sides to this approach are that A. the editor has no concept of absolute visual vertical position for use with e.g. scroll bars, and B. there are periodic soft line breaks at the chunk boundaries of over-long lines.

I don't think issue A is a problem for a console editor. And even for a GUI editor it just slightly changes the meaning of the scroll bar: you're scrolling through content rather than visual lines. Emacs actually calculates view positions this way as well, and it seems to work fine in the GUI version. It's a very subtle difference in scroll-bar behavior in most files.

Issue B is a little more annoying, but it also only kicks in for extreme situations. And those are the same situations where you're starting to make the choice between "perfect wrapping and unusable editor" or "imperfect wrapping and usable editor". Helix could also set the chunk size far higher, so it really only kicks in when it absolutely needs to.

bestouff commented 2 years ago

Just for my information, what's the progress here?

cessen commented 2 years ago

Realistically, at this point I doubt I'll get around to this any time soon. Most of my time and motivation is directed at other projects at the moment, and I expect that to be the case for a while.

If someone else wants to take this on, that would be great. I'd be happy to provide some guidance as time allows. Although this probably isn't something for a first-time contributor.

kirawi commented 2 years ago

I'll try to tackle this.

cessen commented 2 years ago

I'll repeat here what I said on the matrix channel:

I would recommend doing a first implementation that ignores the chunking aspect of things, since it will work fine without that for the large majority of files anyway (chunking is only needed for very long lines). And get that working first in one PR. This corresponds to point 1 in my description up-thread.

And then after that's working, go back and implement the chunking of very long lines (point 2 up-thread) in a separate PR.

antoyo commented 2 years ago

That might require adding new commands to be able to move down by display line (like gk and gj in vim) vs actual line.

thomas-profitt commented 2 years ago

Here's an example of how I think soft-wrapping should work: wrapping doesn't break words, it's obvious that something is soft-wrapped, and indentation is preserved. [Image: Screenshot 2022-04-08--14:35--351]

bestouff commented 2 years ago

Nice, but there may be problems with already-much-indented files, where wrapping will create a thin column stuck on the right.

cessen commented 2 years ago

@thomas-profitt I agree that soft wrapping should (optionally) preserve indentation. IMO it's really hard to read soft-wrapped source code without that feature.

@bestouff While that can happen, I still think indentation-preserving soft wrap is the better default for source code files. But certainly, it should be something that can be disabled by the user. My implementation in Led actually has two settings for soft wrap:

bool: preserve indentation on wrapping or not.
int: additional number of spaces to indent wrapped lines.

Both options can be mixed and matched. This provides a lot of flexibility in behavior for the user, and isn't especially difficult to implement.
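
As a rough illustration of how those two settings could combine, here is a sketch in Rust. It is not Led's or Helix's actual code: `WrapIndent`, `leading_indent_width`, and `continuation_indent` are made-up names, and the `min_text_width` clamp is an added assumption meant to address the "thin column stuck on the right" concern for deeply indented lines.

```rust
/// Hypothetical settings mirroring the two options described above.
#[derive(Clone, Copy)]
struct WrapIndent {
    /// Carry the line's own leading indentation onto wrapped rows.
    preserve_indent: bool,
    /// Extra columns added to every wrapped (continuation) row.
    extra_indent: usize,
}

/// Width in columns of the leading whitespace of `line`.
/// Assumption: a tab advances to the next multiple of `tab_width`.
fn leading_indent_width(line: &str, tab_width: usize) -> usize {
    let tab_width = tab_width.max(1); // Guard against a zero tab width.
    let mut col = 0;
    for ch in line.chars() {
        match ch {
            ' ' => col += 1,
            '\t' => col += tab_width - (col % tab_width),
            _ => break,
        }
    }
    col
}

/// Columns of virtual indentation to draw before each continuation row.
/// The result is clamped so that at least `min_text_width` columns remain
/// for text (this clamp is an assumption, not something from the thread).
fn continuation_indent(
    line: &str,
    opts: WrapIndent,
    tab_width: usize,
    view_width: usize,
    min_text_width: usize,
) -> usize {
    let base = if opts.preserve_indent {
        leading_indent_width(line, tab_width)
    } else {
        0
    };
    let wanted = base + opts.extra_indent;
    wanted.min(view_width.saturating_sub(min_text_width))
}
```

With `preserve_indent = false` and `extra_indent = 0` this degenerates to plain unindented wrapping, so the same code path can serve every combination of the two options.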

dylrich commented 2 years ago

I think it'd be nice if there was a somewhat convenient keybind to toggle this. I personally have come to quite like hard wrapping by default, but sometimes I want to quickly soft wrap as I am writing something or if I happen to open a file I didn't expect to be a long single line. After I'm done reading/writing, I'd want to toggle it back.

vlmutolo commented 2 years ago

With #2128 (hopefully) closing soon, I'd like to start talking about possible strategies for supporting soft wrap.

The first thing I'm thinking is that "live" hard wrap (as opposed to the patch in #2128 , which is triggered by a command) and soft wrap (which is implicitly live) are kind of the same thing. This is especially true if we want to support preserving indentation and maybe even comments in soft wrap. Though, maybe these features diverge due to one actually modifying the text and the other only the viewport. But maybe those two functions can share a base implementation of "dynamic wrapping" (?).

If this is the route we want to go, we may want to investigate patching the textwrap library to support something like incremental wrapping, where instead of returning a Vec<Cow<str>>, it gives back some kind of iterator over the changed lines. Maybe an impl Iterator<Cow<str>> or something similar. It would be great if the hypothetical incremental wrap command also supported a char range where we could specify that nothing before or after that range should change.

@kirawi I just saw that you've already started on a PR for this. What do you think of this direction? I'm mostly asking about using the textwrap crate along with some kind of patch to allow for incremental wrapping. I'm not at all familiar with how the current viewport implementation works, so maybe this is a bad idea.

EDIT: Now that I think about it more, I'm not sure how well textwrap would work on code. I was mostly thinking about prose.

mgeisler commented 2 years ago

If this is the route we want to go, we may want to investigate patching the textwrap library to support something like incremental wrapping, where instead of returning a Vec<Cow<str>>, it gives back some kind of iterator over the changed lines. Maybe an impl Iterator<Cow<str>> or something similar. It would be great if the hypothetical incremental wrap command also supported a char range where we could specify that nothing before or after that range should change.

Textwrap is actually very line-oriented: that is, it wraps multiple lines by simply wrapping them one by one. This means that you as a caller can save a lot of work if you don't ask it to wrap lines which you know haven't changed.

Perhaps I misunderstood and what you are after is a way to get back the output line-by-line? So you feed a single 200 character line to Textwrap and it gives you back an iterator which will yield the 2-3 wrapped lines? I used to have such a design, but it was complicated to make all features work together... so I changed it to return a Vec instead for simplicity.

Since it's very very fast to wrap a single line of text (I measure some 40 microseconds to wrap a line with 800 characters), I figured returning the fully wrapped result would be okay.

But I would be very happy to hear feedback on this from real-world applications :smile:

cessen commented 2 years ago

Now that I think about it more, I'm not sure how well textwrap would work on code. I was mostly thinking about prose.

Yeah. I suspect the use cases here are different enough that textwrap probably doesn't make sense for Helix. The easy parts of text wrapping are... well, easy, and don't (IMO) justify a dependency. And the hard parts of text wrapping (how we handle indentation, what are considered valid break points, etc.) are also the places where we're likely to differ from textwrap anyway.

It's also worth noting that in an editor the text wrapping code isn't just for display, it's also used for cursor movement, knowing where to place inline compiler errors, and anything else that needs to query the relationship between text offsets and screen position. Those kinds of queries could potentially be built on top of something like textwrap, but it would probably involve a fair bit of shoehorning, and it's yet another thing that differs in our use case.

(This is in no way a knock against textwrap, btw. Being targeted in your use cases rather than trying to be everything to everyone often makes for better, not worse, libraries.)

vlmutolo commented 2 years ago

We already merged/released support for the "reflow" command using the textwrap crate. I think the addition of a (relatively small) dependency was well worth it in that case. The purpose of "reflow" is to take prose-like text, such as comments and markdown, and hard-wrap it to a given line width.

Textwrap does a far better job at this than what I had proposed in the PR originally. For that use case, if we wanted to get the same quality reflow as what textwrap provides, we'd basically have to re-implement textwrap.

For other use cases, like soft wrapping the displayed text, textwrap may or may not be the right fit. I'm not sure. But we already have it in the project to use if/where it makes sense.

cessen commented 2 years ago

Ah, yeah, that makes sense.

And again, I'm not knocking textwrap at all here. In fact, I was pleasantly surprised to see that e.g. it can be configured to be zero-dependency (I'm used to library crates pulling in the world, which makes me hesitant to pull them in as dependencies even if they otherwise perfectly match my use case). And the optional dependencies it does have seem carefully chosen and worth the features they enable. I think that all speaks well of the engineering sensibilities of the author(s).

I'm just skeptical if it's the right fit for soft wrapping in Helix, for the reasons I outlined above.

kirawi commented 2 years ago

I don't think it would be the best choice for soft wrapping because graphemes would be iterated over twice: once to calculate the wrapping, and again to render the text. Though that might not be avoidable either way, now that I think about it...

mgeisler commented 2 years ago

Hi @cessen, this comment became a bit of an essay... I hope it's useful still :-)

The easy parts of text wrapping are... well, easy, and don't (IMO) justify a dependency.

Yeah, I agree: the simple case is simple. When you know the parameters of your problem, and when you're happy with the normal greedy wrapping (see the documentation of wrap_optimal_fit for an example of a different wrapping algorithm), then it's easy to write the code yourself. I made a quick-and-dirty implementation here just so that I can estimate the size overhead of using Textwrap: binary-sizes/main.rs.

By parameters of the problem, I mean things like:

Can the width become less than the width of the shortest word in your paragraph? If so, do you want to break words apart or let them stick out into the margin?
Should you support wrapping at hyphens ('-')? What about --foo-bar, where are the legal breakpoints in that word?
Wrapping at soft-hyphens ('\u{00AD}')? This is not supported by Textwrap, but I hope to add it one day.
Should emojis be handled? Textwrap can either use unicode-width for support for all of Unicode, or it can use its own trivial estimation which works for emojis, but which fails for Asian characters.
Should the available breakpoints be all ' ' characters only, or do you want to use the unicode-linebreak algorithm? How do you handle multiple spaces between words?

If you fix answers to some of these questions, the problem space shrinks dramatically and you end up with less code. The Textwrap dependencies are all optional, so you can slim it down as needed.

(This is in no way a knock against textwrap, btw. Being targeted in your use cases rather than trying to be everything to everyone often makes for better, not worse, libraries.)

Thanks, I completely get it!

Textwrap tries to be pretty configurable. It started out as a ~20 line crate which implemented the simplest and most naive wrapping you can imagine. I later added options for more and more cases.

Most recently, I made Textwrap handle proportional fonts, which you can see an example of here: https://mgeisler.github.io/textwrap. This uses JavaScript to measure the sizes of each word, but uses Textwrap to wrap the words into lines. So instead of working on a &str, Textwrap works on what I call "fragments": opaque boxes which have a width followed by whitespace. The internals operate on these fragments, and then there is a layer around that which operates on text. However, the Fragment trait is exposed on purpose to allow other programs to use it directly.

To summarize, if you want to let users transform text into wrapped lines, then Textwrap ought to be useful for that. Examples could be plain text and comments with or without indentation. Textwrap will not work for wrapping code according to an AST, and you would need to build on top of the Fragment trait if you want to wrap something more than a plain &str (such as styled text).

cessen commented 2 years ago

Hi @mgeisler,

Thanks for the essay! Ha ha. It's genuinely appreciated. :-) I've kind of ended up with an essay of my own below.

To answer your question about the parameters:

We want to use the simple greedy algorithm, not Knuth or similar. This isn't for performance (btw, kudos on your linear-time implementation!), but rather for UX: globally optimal solutions can cause the editing cursor to jump around unpredictably, because edits later in the paragraph can cause earlier parts to get re-wrapped differently. Knuth wrapping is great for final display of text, but not so much for editing.
Yes, we want to handle wrapping of words/segments that are longer than the wrapping line width.
Hyphens etc. aren't really an appropriate model for code. Inserting hyphenation is obviously a non-starter for this use case. And in terms of breaking on hyphens, there can be all kinds of punctuation/special characters in code, with varying meaning between programming languages. So it might be appropriate to break on hyphens in one language, but not another. For a first implementation, we'll probably punt on this and only break on white space. But we might get fancier in the future, using knowledge of language syntax.
Ideally, all unicode character widths should be handled appropriately for a monospace context. E.g. CJK will generally be double-width, etc. And, of course, grapheme clusters need to be handled correctly.
I imagine we'll handle multiple whitespace characters between words similarly to how I've implemented it before: you treat whitespace as being joined to the word that precedes it. This prevents wrapped whitespace from being at the start of lines whenever possible, and gives a clear single break point between words, which simplifies the code.
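
The answers above can be sketched as a small greedy wrapper in Rust. This is an illustration under stated assumptions, not code from Led or Helix: `segments` and `greedy_wrap` are hypothetical names, every char is treated as exactly one column wide (no CJK double-width, no grapheme clusters, no tabs), and trailing whitespace counts toward the row width.

```rust
/// Split `chars` into segments `[start, end)`. Each segment is a run of
/// non-whitespace plus the whitespace that follows it, which leaves exactly
/// one break opportunity between words.
fn segments(chars: &[char]) -> Vec<(usize, usize)> {
    let mut segs = Vec::new();
    let mut start = 0;
    for i in 1..chars.len() {
        // A new segment begins where whitespace is followed by non-whitespace.
        if chars[i - 1].is_whitespace() && !chars[i].is_whitespace() {
            segs.push((start, i));
            start = i;
        }
    }
    if start < chars.len() {
        segs.push((start, chars.len()));
    }
    segs
}

/// Greedy (first-fit) wrap. Returns each visual row as a char range.
/// Simplification: every char is assumed to be one column wide.
fn greedy_wrap(chars: &[char], width: usize) -> Vec<(usize, usize)> {
    assert!(width > 0, "width must be at least one column");
    let mut rows = Vec::new();
    let mut row_start = 0;
    for (seg_start, seg_end) in segments(chars) {
        // If the segment does not fit on the current row, end the row before it.
        if seg_end - row_start > width && seg_start > row_start {
            rows.push((row_start, seg_start));
            row_start = seg_start;
        }
        // A segment wider than a whole row is split at hard boundaries.
        while seg_end - row_start > width {
            rows.push((row_start, row_start + width));
            row_start += width;
        }
    }
    // Emit the final row; an empty line still occupies one (empty) row.
    if row_start < chars.len() || rows.is_empty() {
        rows.push((row_start, chars.len()));
    }
    rows
}
```

Since the algorithm only ever looks forward from the start of the current row, an edit can change the wrapping of later rows but never of earlier ones, which is the cursor-stability property that motivates greedy wrapping over an optimal-fit algorithm.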

These are all things I've implemented before in a different editor project, and as long as we handle graphemes appropriately (which is already in Helix), none of the above points are IMO the hard parts of soft wrapping in an editor.

The actual hard parts come from a different set of parameters:

Since this is a code editor, we'll want to (optionally) preserve the initial indentation of a line in the subsequent soft wrapped portions of the line. Similarly, we'll want soft-wrapped portions to (optionally) have additional indentation as well.
We need to be able to map both from text offset -> screen space position and from screen space position -> text offset. The latter in particular involves knowing what screen space positions are valid (e.g. the aforementioned soft-wrapped indentation isn't real text, and therefore isn't a valid position), and what "closest valid position" should mean, the tab stop width, etc. How we do this influences not just display but also editor behavior, so I suspect we'll want tight control over how this works.
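
A minimal Rust sketch of the two mappings, assuming the wrapping step has already produced a list of rows. The `Row` type and both function names are hypothetical, each char is assumed to occupy one column (no tabs or wide chars), and "closest valid position" is interpreted here as snapping to the nearest real char on the same row, which is only one of several reasonable choices.

```rust
/// One visual row of a soft-wrapped line (hypothetical layout type).
struct Row {
    /// Char offset of the first char shown on this row.
    start: usize,
    /// Char offset one past the last char shown on this row.
    end: usize,
    /// Columns of virtual indentation drawn before the text (not real text).
    indent: usize,
}

/// Text offset -> (row, column). Returns `None` if no row contains the offset.
fn offset_to_screen(rows: &[Row], offset: usize) -> Option<(usize, usize)> {
    rows.iter().enumerate().find_map(|(r, row)| {
        (row.start <= offset && offset < row.end)
            .then(|| (r, row.indent + offset - row.start))
    })
}

/// (row, column) -> closest valid text offset on that row. Columns inside the
/// virtual indentation snap to the row's first char; columns past the end of
/// the text snap to its last char.
fn screen_to_offset(rows: &[Row], row: usize, col: usize) -> Option<usize> {
    let r = rows.get(row)?;
    if r.start == r.end {
        return Some(r.start); // Empty row: the only candidate position.
    }
    let text_col = col.saturating_sub(r.indent);
    Some((r.start + text_col).min(r.end - 1))
}
```

Vertical cursor movement can then be expressed as `offset_to_screen`, a change of row, and `screen_to_offset`, which is why both directions need to agree on what counts as a valid position.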

Additionally, soft wrapping should be togglable, and we'll ideally want the code that handles things like character width, tab stops, text offset <-> screen space queries, etc. to be shared between wrapping and non-wrapping mode where reasonable to do so, to make it easier to keep behavior consistent. And that starts to feel a little out of place in an external text wrapping library, I think...?

I'm sure additional features could be added to textwrap to accommodate these requirements. But at a certain point, it starts to feel like we're pushing code that really belongs in Helix into textwrap just to accommodate our usage of it. And I guess, ultimately, my gut is just telling me that we're probably going to want tighter integration for soft wrapping than we're likely to get with an external library. I could be wrong, of course. But that's where I'm at, at least.

Having said all of that, aside from my maintenance of Ropey, I'm not currently an active contributor to Helix. So I guess no one should take my opinion here with too much weight, ha ha. But I am an invested user, who cares a lot about this particular feature.

mgeisler commented 2 years ago

We want to use the simple greedy algorithm, not Knuth or similar. This isn't for performance (btw, kudos on your linear-time implementation!), but rather for UX: globally optimal solutions can cause the editing cursor to jump around unpredictably,

Yeah, definitely. About the linear-time algorithm, I was as surprised as everyone else to learn that it was possible :smile: I found some Python code which I ported to Rust and it seems to work.

The actual hard parts come from a different set of parameters:

Since this is a code editor, we'll want to (optionally) preserve the initial indentation of a line in the subsequent soft wrapped portions of the line. Similarly, we'll want soft-wrapped portions to (optionally) have additional indentation as well.

This sounds like something that is outside of what Textwrap should do. Put differently, deciding on the amount of indentation to use is something I would expect the caller of Textwrap to do. So if you find that you need 12 space indentation for the first line and 16 spaces for the subsequent lines, then you can send the text to Textwrap and have it wrap with those prefixes.

In any case, I'll be happy to answer questions about what Textwrap can and cannot do — it's a very simple system at heart (as one would expect) and then it has a few layers on top to make it more flexible.

I ended up having a parallel discussion with @getreu in #2419, I hope you can all align on a good way to use Textwrap (or not) for the different parts of the editor.

aral commented 2 years ago

@mgeisler Hey Martin, I was just wondering where work on this feature’s at currently. Would be amazing to have :)

spiderman-idog commented 1 year ago

Status?

kirawi commented 1 year ago

See https://github.com/helix-editor/helix/pull/417#issuecomment-1303910195

spiderman-idog commented 1 year ago

See #417 (comment)

Thanks!

mgeisler commented 1 year ago

Hey @aral, it seems another plan has been made. I'm not directly involved with Helix development, but I'll be happy to adapt Textwrap to make it flexible enough for this use case.

kirawi commented 1 year ago

The modifications necessary to support text wrapping and virtual text are too specific to Helix, such as caching breaks. It's not a fault of textwrap.

mgeisler commented 1 year ago

Yeah, there are definitely many other factors at play here!

In particular, you would probably end up re-implementing large parts, just like I do in my Wasm demo (see https://github.com/mgeisler/textwrap/blob/master/examples/wasm/src/lib.rs). You'll be using the normal first-fit wrapping algorithm (since optimal-fit wrapping behaves funny when you use it with interactive text, see cargo run --example interactive in a Textwrap checkout) and so you can end up with simpler code by just inlining things.

Now, if you do decide to add a hard-wrap option which inserts actual \n characters in the file, then the optimal-fit wrapping could be really pretty to have. I've been using Emacs for 20 years, and I habitually press M-q (Alt-q) all the time to hard-wrap my text and comments in all sorts of files. I really ought to make that shortcut use Textwrap with the optimal-fit wrapping to see how that would look :smile:

kpa28-git commented 1 year ago

I appreciate all the people working on this. I wouldn't mind a character based unindented soft wrap (similar to what kakoune does) as a starting point.

I love helix but I have to use kakoune to edit my LaTeX files right now because helix doesn't soft wrap. It would be nice to have a basic toggleable soft wrap that could be replaced by an improved version in the future. I'd prefer many of the improvements suggested here, but I wouldn't mind something simple at first if it's a lot faster to release.

pascalkuthe commented 1 year ago

The rendering portion of text wrapping is implemented in #5008 (including proper handling of indentation and linear splitting at word boundaries, falling back to traditional softwrap when that is not possible). This PR only gets us part of the way there, as the rest of the editor still needs to be adjusted to account for the fact that a single line might take up multiple lines on screen, but it does contain a big portion of the work.

goyalyashpal commented 1 year ago

to edit my LaTeX files right now because helix doesn't soft wrap

oh lol, i was thinking same, am guilty of not hard breaking my para in LaTeX myself - probably as it made a bit harder and unclean to work with that....

But

seeing all the issues mentioned here regarding navigations etc etc...
combined with my internal will to keep line lengths in check

I have decided to:

not run after this soft wrap,
rather, opting for the "live" hard wrap[^lhw] or "automatic reflow" dynamically/in realtime i.e. while typing.

[^lhw]: @ vlmutolo at https://github.com/helix-editor/helix/issues/136#issuecomment-1109218991

In my own words:

what i meant by above "dynamic reflow" is that

how about automatically breaking and conjoining lines adhering to some specified character limit per line? (like say 73)

fantasizingly: this can be made non-constant to get some pretty dashing ASCII art flowing inside some particular shape like in inkscape. wow.

pascalkuthe commented 1 year ago

Softwrap is already implemented in #5420 and works quite well. I encourage you to try it out. Continuous hardwrap can be implemented based on the work I already did in that PR once it lands.

goyalyashpal commented 1 year ago

self hiding as offtopic

omg

goyalyashpal commented 1 year ago

Here are my word-wrap settings from notepad3.

The main takeaway is the possibility to "show near borders" and "before wrap" for the visual indicator.
I do Not use the visual indicator "near text" and "after wrap" which is what i see in the screenshots in the pr above[^ss].
the settings are explained in following comment: https://github.com/helix-editor/helix/issues/136#issuecomment-1399351899

[^ss]: screenshot at !5420 (comment)

Rewording in text here:

Wrap Indent
    ( ) no
    ( ) by 1 character
    ( ) by 2 characters
    ( ) by 1 level
    ( ) by 2 levels
    (x) as first subline
    ( ) by 1 level more than first subline

Visual indicators before wrap
    ( ) no
    ( ) show near text
    (x) show near borders

Visual indicators after wrap
    (x) no
    ( ) show near text
    ( ) show near borders

Wrap text between
    (x) words
    ( ) any glyphs

kirawi commented 1 year ago

You can disable visual indicators: editor.soft-wrap.wrap-indicator = ""

goyalyashpal commented 1 year ago

You can disable visual indicators: editor.soft-wrap.wrap-indicator = ""

it is not about disabling those, it's about where they appear. I didn't have a test sample for the text, otherwise would have shared the screenshots quite quickly.

goyalyashpal commented 1 year ago

the "show before wrap" means it's shown before breaking up, so, in the end in the previous visual line
the "show near borders" means that it's near borders of the frame, rather than clinging to the text

Combined with "show blanks" (equivalent of whitespace.render = "all") the "wrap as first subline" looks quite beautiful and distinct.

pascalkuthe commented 1 year ago

* the "show before wrap" means it's shown before breaking up, so, in the end in the previous visual line

* the "show near borders" means that it's near borders of the frame, rather than clinging to the text
Combined with "show blanks" (equivalent of `whitespace.render = "all"`) the "wrap as first subline" looks quite beautiful and distinct.

That could be added in the future and shouldn't be too hard to implement. The new positioning/rendering code implemented there is quite flexible. That being said, I am a bit hesitant to keep piling new features onto #5420, as it's already a huge PR that is hard to review and will cause breaking changes in the codebase. That kind of feature would be better in a followup PR.

goyalyashpal commented 1 year ago

as it's already a huge PR that is hard to review and will cause breaking changes in the codebase. That kind of feature would be better in a followup PR

yeah, that's good. as i said somewhere else before too, my intent is not that this be done in this xyz pr; rather how it should eventually end up. That's one more reason why i did not put these suggestions in that pr too.

a small nitpick, can you please hide the preview (by removing the preceding exclamation) of the attached image from your quote? to keep things a bit tidy :smiley: 😇

mgeisler commented 1 year ago

fantasizingly: this can be made non-constant to get some pretty dashing ASCII art flowing inside some particular shape like in inkscape. wow.

Just as an aside: Textwrap takes a list of widths when wrapping text. This allows you to do things like cut out space for figures, but you can also go further and wrap text inside circles, triangles and so on. I don't think it's very well known since I haven't created any demos with this yet :slightly_smiling_face:
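
To illustrate the idea of wrapping against a list of widths, here is a standalone Rust sketch. It deliberately does not use Textwrap's API; `wrap_with_widths` is a hypothetical function, each char is counted as one column, and a word longer than its row's width is simply allowed to overflow.

```rust
/// Greedy word wrap where row `i` may use at most `widths[i]` columns. The
/// last width is reused for any further rows. Illustration only; this is not
/// Textwrap's code or API.
fn wrap_with_widths(text: &str, widths: &[usize]) -> Vec<String> {
    assert!(!widths.is_empty(), "need at least one width");
    let mut rows: Vec<String> = Vec::new();
    let mut current = String::new();
    let mut current_len = 0;
    for word in text.split_whitespace() {
        // Width limit of the row currently being filled.
        let limit = widths[rows.len().min(widths.len() - 1)];
        let word_len = word.chars().count();
        // The `+ 1` accounts for the space that would separate the words.
        if !current.is_empty() && current_len + 1 + word_len > limit {
            rows.push(std::mem::take(&mut current));
            current_len = 0;
        }
        if !current.is_empty() {
            current.push(' ');
            current_len += 1;
        }
        current.push_str(word);
        current_len += word_len;
    }
    if !current.is_empty() {
        rows.push(current);
    }
    rows
}
```

Passing widths that grow and then shrink would make the text fill a rough diamond shape, while a single-element slice gives ordinary fixed-width wrapping.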

goyalyashpal commented 1 year ago

self hiding this as offtopic

Textwrap takes a list of widths when wrapping text. ... go further and wrap text inside circles ...

wowww, that's super nice. Exactly what i was thinking. To avoid OT here, have opened a Quick'n'dirty issue on textwrap repo: https://github.com/mgeisler/textwrap/issues/499

pascalkuthe commented 1 year ago

This didn't get tagged appropriately but with #5420 merged a capable soft wrap implementation is now available in master. Any further specific improvements on top of that should be posted as separate issues.

kpa28-git commented 1 year ago

@goyalyashpal I got distracted with some other stuff, you are right in many cases. I should probably use hard wrapping more for stylistic reasons. There is another need for it though. I like to experiment with large text sizes sometimes (I admit influenced somewhat by R). This means lines can often get cut off even if line lengths are set to something reasonable like 80 or 100. For some people with disabilities, softwrap may be a necessity.

I found out about the softwrap toggle today. It feels good to finally not need kakoune anymore! The contributors did a great job!

deg4uss3r commented 8 months ago

This is usually the first result I get when I forget where it is or how to use this setting. Leaving a breadcrumb here to the official PR that merged it: https://github.com/helix-editor/helix/pull/5420#issuecomment-1372961649

As well as the place in the docs

webdev23 commented 8 months ago

We need a command within Helix to switch soft-wrap during a running session. There are many situations where it is handy to disable. (Example: .csv)

Since the following works flawlessly:

Enable Soft-wrap:

sed -i '/^\[editor.soft-wrap\]/,/^$/ s/\(enable = \).*/\1true/' ~/.config/helix/config.toml && pkill -USR1 hx

Disable:

sed -i '/^\[editor.soft-wrap\]/,/^$/ s/\(enable = \).*/\1false/' ~/.config/helix/config.toml && pkill -USR1 hx

That should be straightforward to implement. It is needed because, as shown, it is a bit dangerous to blindly modify our user config, and that affects all running instances.

Also, binding these kinds of commands to a key in config.toml makes the config file very hard to read. TOML has some good properties, but editing it from scripts can lead to many errors. It is made to be used with a TOML library, which rules out all kinds of wizardry from the command line. In the same situation, a config.json would be way better, as the basis of a Helix API.

kirawi commented 8 months ago

You can do :toggle soft-wrap.enable. TOML was also always a stopgap solution until the plugin system lands.

kellytk commented 8 months ago

@kirawi Is it possible to bind that to a single key?

webdev23 commented 8 months ago

Oh wow, many thanks, I really missed that in the documentation. That is useful.

That means we could simply do this (works alright!)

[keys.normal.space]
W = [":toggle soft-wrap.enable", ":redraw"]

hongquan commented 2 months ago

I wish it were bound to some key in view mode (z).

helix-editor / helix

Support soft wrap #136