Closed kellytk closed 1 year ago
I'd tentatively like to take this on, based on my experience with Led. But in case anyone gets to it before me, I want to leave some notes about Led's approach.
Here's a short demo of Led in the face of large files and crazy-long lines:
http://perm.cessen.com/2021/helix/led_demo.mp4
The core principle that enables this is to always do things in ways that keep calculations local to a given area of the text. The specific techniques I've used in Led are:
char
-offset into the text, not by visual line. This lets the display code jump directly to the vicinity of the content that should be on screen, without needing any information about how it will be displayed. The wrapping etc. can then be done locally afterwards, in that vicinity, and everything positioned on screen based on that. This is efficient enough to be done on-the-fly every time it's needed (including for e.g. vertical cursor movement calculations), with no caching needed. However, with this alone it still needs to calculate soft-wrapping starting at the beginning of the line that contains the content, which obviously won't work in real time for very long lines. And that brings us to...The down sides to this approach are that A. the editor has no concept of absolute visual vertical position for use with e.g. scroll bars, and B. there are periodic soft line breaks at the chunk boundaries of over-long lines.
I don't think issue A is a problem for a console editor. And even for a GUI editor it just slightly changes the meaning of the scroll bar: you're scrolling through content rather than visual lines. Emacs actually calculates view positions this way as well, and it seems to work fine in the GUI version. It's a very subtle difference in scroll-bar behavior in most files.
Issue B is a little more annoying, but it also only kicks in for extreme situations. And those are the same situations where you're starting to make the choice between "perfect wrapping and unusable editor" or "imperfect wrapping and usable editor". Helix could also set the chunk size far higher, so it really only kicks in when it absolutely needs to.
Just for my information, what's the progress here ?
Realistically, at this point I doubt I'll get around to this any time soon. Most of my time and motivation is directed at other projects at the moment, and I expect that to be the case for a while.
If someone else wants to take this on, that would be great. I'd be happy to provide some guidance as time allows. Although this probably isn't something for a first-time contributor.
I'll try to tackle this.
I'll repeat here what I said on the matrix channel:
I would recommend doing a first implementation that ignores the chunking aspect of things, since it will work fine without that for the large majority of files anyway (chunking is only needed for very long lines). And get that working first in one PR. This corresponds to point 1 in my description up-thread.
And then after that's working, go back and implement the chunking of very long lines (point 2 up-thread) in a separate PR.
That might require adding new commands to be able to move down by display line (like gk
and gj
in vim) vs actual line.
Here's an example of how I think soft-wrapping should work, so wrapping doesn't break words, it's obvious that something is soft-wrapped and indentation is preserved
Nice but there mau be problems with already-much-indented files, where wrapping will create a thin column stuck on the right.
@thomas-profitt I agree that soft wrapping should (optionally) preserve indentation. IMO it's really hard to read soft-wrapped source code without that feature.
@bestouff While that can happen, I still think indentation-preserving soft wrap is the better default for source code files. But certainly, it should be something that can be disabled by the user. My implementation in Led actually has two settings for soft wrap:
Both options can be mixed and matched. This provides a lot of flexibility in behavior for the user, and isn't especially difficult to implement.
I think it'd be nice if there was a somewhat convenient keybind to toggle this. I personally have come to quite like hard wrapping by default, but sometimes I want to quickly soft wrap as I am writing something or if I happen to open a file I didn't expect to be a long single line. After I'm done reading/writing, I'd want to toggle it back.
With #2128 (hopefully) closing soon, I'd like to start talking about possible strategies for supporting soft wrap.
The first thing I'm thinking is that "live" hard wrap (as opposed to the patch in #2128 , which is triggered by a command) and soft wrap (which is implicitly live) are kind of the same thing. This is especially true if we want to support preserving indentation and maybe even comments in soft wrap. Though, maybe these features diverge due to one actually modifying the text and the other only the viewport. But maybe those two functions can share a base implementation of "dynamic wrapping" (?).
If this is the route we want to go, we may want to investigate patching the textwrap
library to support something like incremental wrapping, where instead of returning a Vec<Cow<str>>
, it gives back some kind of iterator over the changed lines. Maybe an impl Iterator<Cow<str>>
or something similar. It would be great if the hypothetical incremental wrap command also supported a char range where we could specify that nothing before or after that range should change.
@kirawi I just saw that you've already started on a PR for this. What do you think of this direction? I'm mostly asking about using the textwrap
crate along with some kind of patch to allow for incremental wrapping. I'm not at all familiar with how the current viewport implementation works, so maybe this is a bad idea.
EDIT: Now that I think about it more, I'm not sure how well textwrap
would work on code. I was mostly thinking about prose.
If this is the route we want to go, we may want to investigate patching the
textwrap
library to support something like incremental wrapping, where instead of returning aVec<Cow<str>>
, it gives back some kind of iterator over the changed lines. Maybe animpl Iterator<Cow<str>>
or something similar. It would be great if the hypothetical incremental wrap command also supported a char range where we could specify that nothing before or after that range should change.
Textwrap is actually very line-oriented: that is, it wraps multiple lines by simply wrapping them one by one. This means that you as a caller can save a lot of work if you don't ask it to wrap lines which you know haven't changed.
Perhaps I misunderstood and what you are after is a way to get back the output line-by-line? So you feed a single 200 character line to Textwrap and it gives you back an iterator which will yield the 2-3 wrapped lines? I used to have such a design, but it was complicated to make all features work together... so I changed it to return a Vec
instead for simplicity.
Since it's very very fast to wrap a single line of text (I measure some 40 microseconds to wrap a line with 800 characters), I figured returning the fully wrapped result would be okay.
But I would be very happy to hear feedback on this from real-world applications :smile:
Now that I think about it more, I'm not sure how well textwrap would work on code. I was mostly thinking about prose.
Yeah. I suspect the use cases here are different enough that textwrap
probably doesn't make sense for Helix. The easy parts of text wrapping are... well, easy, and don't (IMO) justify a dependency. And the hard parts of text wrapping (how we handle indentation, what are considered valid break points, etc.) are also the places where we're likely to differ from textwrap
anyway.
It's also worth noting that in an editor the text wrapping code isn't just for display, it's also used for cursor movement, knowing where to place inline compiler errors, and anything else that needs to query the relationship between text offsets and screen position. Those kinds of queries could potentially be built on top of something like textwrap
, but it would probably involve a fair bit of shoehorning, and it's yet another thing that differs in our use case.
(This is in no way a knock against textwrap
, btw. Being targeted in your use cases rather than trying to be everything to everyone often makes for better, not worse, libraries.)
We already merged/released support for the "reflow" command using the textwrap crate. I think the addition of a (relatively small) dependency was well worth it in that case. The purpose of "reflow" is to take prose-like text, such as comments and markdown, and hard-wrap it to a given line width.
Textwrap does a far better job at this than what I had proposed in the PR originally. For that use case, if we wanted to get the same quality reflow as what textwrap provides, we'd basically have to re-implement textwrap.
For other use cases, like soft wrapping the displayed text, textwrap may or may not be the right fit. I'm not sure. But we already have it in the project to use if/where it makes sense.
Ah, yeah, that makes sense.
And again, I'm not knocking textwrap
at all here. In fact, I was pleasantly surprised to see that e.g. it can be configured to be zero-dependency (I'm used to library crates pulling in the world, which makes me hesitant to pull them in as dependencies even if they otherwise perfectly match my use case). And the optional dependencies it does have seem carefully chosen and worth the features they enable. I think that all speaks well of the engineering sensibilities of the author(s).
I'm just skeptical if it's the right fit for soft wrapping in Helix, for the reasons I outlined above.
I don't think it would be the best choice for soft wrapping because graphemes would be iterated over twice: once to calculate the wrapping, and again to render the text. Though that might not be avoidable either way, now that I think about it...
Hi @cessen, this comment became a bit of an essay... I hope it's useful still :-)
The easy parts of text wrapping are... well, easy, and don't (IMO) justify a dependency.
Yeah, I agree: the simple case is simple. When you know the parameters of your problem, and when you're happy with the normal greedy wrapping (see the documentation of wrap_optimal_fit
for an example of a different wrapping algorithm), then it's easy to write the code yourself. I made a quick-and-dirty implementation here just so that I can estimate the size overhead of using Textwrap: binary-sizes/main.rs
.
By parameters of the problem, I mean things like:
'-'
)? What about --foo-bar
, where are the legal breakpoints in that word?'\u{00AD}'
)? This is not supported by Textwrap, but I hope to add it one day.' '
characters only, or do you want to use the unicode-linebreak algorithm? How do you handle multiple spaces between words?If you fix answers to some of these questions, the problem space shrinks dramatically and you end up with less code. The Textwrap dependencies are all optional, so you can slim it down as needed.
(This is in no way a knock against
textwrap
, btw. Being targeted in your use cases rather than trying to be everything to everyone often makes for better, not worse, libraries.)
Thanks, I completely get it!
Textwrap tries to be pretty configurable. It started out as a ~20 line crate which implemented the simplest and most naive wrapping you can imagine. I later added options for more and more cases.
Most recently, I made Textwrap handle proportional fonts, which you can see an example of here: https://mgeisler.github.io/textwrap. This uses JavaScript to measure the sizes of each word, but uses Textwrap to wrap the words into lines. So instead of working on a &str
, Textwrap works on what I call "fragments": opaque boxes which have a width followed by whitespace. The internals operate on these fragments, and then there is a layer around that which operate on text. However, the Fragment
trait is exposed on purpose to allow other programs to use it directly.
To summarize, if you want to let users transform text into wrapped lines, then Textwrap ought to be useful for that. Examples could be plain text and comments with or without indentation. Textwrap will not work for wrapping code according to an AST and you would need to built on top of the Fragment
trait if you want to wrap something more than a plain &str
(such as styled text).
Hi @mgeisler,
Thanks for the essay! Ha ha. It's genuinely appreciated. :-) I've kind of ended up with an essay of my own below.
To answer your question about the parameters:
These are all things I've implemented before in a different editor project, and as long as we handle graphemes appropriately (which is already in Helix), none of the above points are IMO the hard parts of soft wrapping in an editor.
The actual hard parts come from a different set of parameters:
Additionally, soft wrapping should be togglable, and we'll ideally want the code that handles things like character width, tab stops, text offset <-> screen space queries, etc. to be shared between wrapping and non-wrapping mode where reasonable to do so, to make it easier to keep behavior consistent. And that starts to feel a little out of place in an external text wrapping library, I think...?
I'm sure additional features could be added to textwrap
to accommodate these requirements. But at a certain point, it starts to feel like we're pushing code that really belongs in Helix into textwrap
just to accommodate our usage of it. And I guess, ultimately, my gut is just telling me that we're probably going to want tighter integration for soft wrapping than we're likely to get with an external library. I could be wrong, of course. But that's where I'm at, at least.
Having said all of that, aside from my maintenance of Ropey, I'm not currently an active contributor to Helix. So I guess no one should take my opinion here with too much weight, ha ha. But I am an invested user, who cares a lot about this particular feature.
- We want to use the simple greedy algorithm, not Knuth or similar. This isn't for performance (btw, kudos on your linear-time implementation!), but rather for UX: globally optimal solutions can cause the editing cursor to jump around unpredictably,
Yeah, definitely. About the linear-time algorithm, I was as surprised as everyone else to learn that it was possible :smile: I found some Python code which I ported to Rust and it seems to work.
The actual hard parts come from a different set of parameters:
- Since this is a code editor, we'll want to (optionally) preserve the initial indentation of a line in the subsequent soft wrapped portions of the line. Similarly, we'll want soft-wrapped portions to (optionally) have additional indentation as well.
This sounds like something that is outside of what Textwrap should do. Put differently, deciding on the amount of indentation to use is something I would expect the caller of Textwrap to do. So if you find that you need 12 space indentation for the first line and 16 spaces for the subsequent lines, then you can send the text to Textwrap and have it wrap with those prefixes.
In any case, I'll be happy to answer questions about what Textwrap can and cannot do — it's a very simple system at heart (as one would expect) and then it has a few layers on top to make it more flexible.
I ended up having a parallel discussion with @getreu in #2419, I hope you can all align on a good way to use Textwrap (or not) for the different parts of the editor.
@mgeisler Hey Martin, I was just wondering where work on this feature’s at currently. Would be amazing to have :)
Status?
See #417 (comment)
Thanks!
Hey @aral, it seems another plan has been made. I'm not directly involved with Helix development, but I'll be happy to adapt Textwrap to make it flexible enough for this use case.
The modifications necessary to support text wrapping and virtual text are too specific to Helix, such as caching breaks. It's not a fault of textwrap.
Yeah, there are definitely many other factors at play here!
In particular, you would probably end up re-implementing large parts, just like I do in my Wasm demo
(see https://github.com/mgeisler/textwrap/blob/master/examples/wasm/src/lib.rs). You'll be using the normal first-fit wrapping algorithm (since optimal-fit wrapping behaves funny when you use it with interactive text, see cargo run --example interactive
in a Textwrap checkout) and so you can end up with simpler code by just inlining things.
Now, if you do decide to add a hard-wrap option which inserts actual \n
characters in the file, then the optimal-fit wrapping could be really pretty to have. I've been using Emacs for 20 years, and I habitually press M-q (Alt-q) all the time to hard-wrap my text and comments in all sorts of files. I really ought to make that shortcut use Textwrap with the optimal-fit wrapping to see how that would look :smile:
I appreciate all the people working on this. I wouldn't mind a character based unindented soft wrap (similar to what kakoune does) as a starting point.
I love helix but I have to use kakoune to edit my LaTeX files right now becuase helix doesn't soft wrap. It would be nice to have a basic toggleable soft wrap that could be replaced by an improved version in the future. I'd prefer many of the improvements suggested here, but I wouldn't mind something simple at first if it's a lot faster to release.
The rendering potion of text wrapping is implemented in #5008 (including proper handling of indentation and linear splitting at word boundaries, falling back to traditional softwrap when that is not possible). This PR only gets us part of the way there as the rest of the editor still needs to be adjusted to account for the fact a single line might take up multiple lines on screen but it does contain a big portion of the work
to edit my LaTeX files right now becuase helix doesn't soft wrap
oh lol, i was thinking same, am guilty of not hard breaking my para in LaTeX myself - probably as it made a bit harder and unclean to work with that....
But
I have decided to:
"live" hard wrap
[^lhw] or "automatic reflow" dynamically/in realtime i.e. while typing. [^lhw]: @ vlmutolo at https://github.com/helix-editor/helix/issues/136#issuecomment-1109218991
In my own words:
- what i meant by above "dynamic reflow" is that
- how about automatically breaking and conjoining lines adhering to some specified character limit per line? (like say 73)
- fantasizingly: this can be made non-constant to get some pretty dashing ASCII art flowing inside some particular shape like in inkscape. wow.
Softwrap is already implemented in #5420 and works quite well. I encourage you to try it out. Continuous hardwrap can be implemented based on the work I already did in that PR once it lands.
self hiding as offtopic
118 hidden items Load more...
omg
Here is my word-wrap settings from notepad3.
[^ss]: screenshot at !5420 (comment)
Rewording in text here:
Wrap Indent
( ) no
( ) by 1 character
( ) by 2 characters
( ) by 1 level
( ) by 2 levels
(x) as first subline
( ) by 1 level more than first subline
Visual indicators before wrap
( ) no
( ) show near text
(x) show near borders
Visual indicators after wrap
(x) no
( ) show near text
( ) show near borders
Wrap text between
(x) words
( ) any glyphs
You can disable visual indicators: editor.soft-wrap.wrap-indicator = ""
You can disable visual indicators:
editor.soft-wrap.wrap-indicator = ""
it is not about disabling those, it's about where they appear. I didn't have a test sample for the text, otherwise would have shared the screenshots quite quickly.
Combined with "show blanks" (equivalent of whitespace.render = "all") the "wrap as first subline" looks quite beautiful and distinct.
* the "show before wrap" means it's shown before breaking up, so, in the end in the previous visual line * the "show near borders" mean that it's near borders of the frame, rather than clinging to the text
Combined with "show blanks" (equivalent of `whitespace.render = "all") the "wrap as first subline" looks quite beautiful and distinct.
That could be added in the future and shouln't be too hard to implement. The new positoning/rendering code implemented there is quite flexible. That being said I am a bit hesitant to keep piling new features onto #5420 as it's already a huge PR that is hard to review and will cause breaking changes in the codebase. That kind of feature would be better in a followup PR
as it's already a huge PR that is hard to review and will cause breaking changes in the codebase. That kind of feature would be better in a followup PR
yeah, that's good. as i said somewhere else before too, my intent is not that this to be done in this xyz pr; rather how it should eventually end up. That's one more reason why i did not put these suggestions in that pr too.
a small nitpick, can you please hide the preview (by removing the preceeding exclamation) of the attached image from your quote? to keep things a bit tidy :smiley: 😇
- fantasizingly: this can be made non-constant to get some pretty dashing ASCII art flowing inside some particular shape like in inkscape. wow.
Just as an aside: Textwrap takes a list of widths when wrapping text. This allows you to do things like cut out space for figures, but you also go further and wrap text inside circles, triangles and so on. I don't think it's very well known since i haven't created any demos with this yet :slightly_smiling_face:
self hiding this as offtopic
Textwrap takes a list of widths when wrapping text. ... go further and wrap text inside circles ...
wowww, that's super nice. Exactly what i was thinking. To avoid OT here, have opened a Quick'n'dirty issue on textwrap repo: https://github.com/mgeisler/textwrap/issues/499
This didn't get tagged appropriately but with #5420 merged a capable soft wrap implementation is now available in master. Any further specific improvements on top of that should be posted as separate issues.
@goyalyashpal I got distracted with some other stuff, you are right in many cases. I should probably use hard wrapping more for stylistic reasons. There is another need for it though. I like to experiment with large text sizes sometimes (I admit influenced somewhat by R). This means lines can get often cut off even if line lengths are set to something reasonable like 80 or 100. For some people with disabilities, softwrap may be a necessity.
I found out about the softwrap toggle today. It feels good to finally not need kakoune anymore! The contributors did a great job!
This is usually the first result I get when I forget where it is or how to use this setting. Leaving a breadcrumb here to the official PR that merged it: https://github.com/helix-editor/helix/pull/5420#issuecomment-1372961649
As well as the place in the docs
We need a command within Helix to switch soft-wrap during a running session. There is many situations where it is handy to disable. (Example: .csv)
Since the following works flawlessly:
Enable Soft-wrap:
sed -i '/^\[editor.soft-wrap\]/,/^$/ s/\(enable = \).*/\1true/' ~/.config/helix/config.toml && pkill -USR1 hx
Disable:
sed -i '/^\[editor.soft-wrap\]/,/^$/ s/\(enable = \).*/\1false/' ~/.config/helix/config.toml && pkill -USR1 hx
That should be straightforward to implement, because as shown, it is is a bit dangerous to blindly modify our user config, and that does affect all running instances.
Also, wrapping such kinds of commands to a key into the config.toml
makes the config file very hard to read.
TOML has some good properties, but that may lead to many errors when trying to edit it from scripts. It is made to be used with a toml library. That disallow all kinds of wizardry from the command line. In the same situation, a config.json would be way better, as a base of an Helix API.
You can do :toggle soft-wrap.enable
. TOML was also always a stopgap solution until the plugin system lands.
@kirawi Is it possible to bind that to a single key?
Oh wow, many thanks, I really missed that in the documentation. That is useful.
That mean we could simply do this (works alright!)
[keys.normal.space]
W = [":toggle soft-wrap.enable", ":redraw"]
Wish that it is bound with some key in view mode (z
).
As discussed on Matrix.