nushell / reedline

A feature-rich line editor - powering Nushell
https://docs.rs/reedline/
MIT License
547 stars 151 forks source link

Autocompletion don't take character width into account when wrapping long paths #572

Open JonathanxD opened 1 year ago

JonathanxD commented 1 year ago

Describe the bug

I found this problem while trying to complete some commands while on my home folder, and by whatever reason Nushell kept adding a new line before every option as shown in the screenshot below

image

The reason was a character that doesn't have the same width as a regular ASCII character, so even though it does fit in the terminal, Nushell thinks it doesn't:

image

It seems that Nushell relies on the number of bytes instead of the character width, and this causes this kind of problem. The worse thing is that whenever I press TAB to cycle between the options, the terminal duplicates its "prompt", as shown in the screenshot below:

image

I also recorded a clip to show this in practice:

https://user-images.githubusercontent.com/5360060/232176827-cc683bc1-55be-43bf-82ff-aed9935c5fbc.mp4

Note: while I was writing this bug report, I was able to make nushell just duplicate the previous buffer text after every tab press:

https://user-images.githubusercontent.com/5360060/232177343-876ce903-5e36-44bd-8e19-ba78d6db05c2.mp4

Note2: The new line pollution is even worse if the terminal is very thin:

image

And even though it seems that Nushell is the one wrapping the text, it's not, actually it's filling every path output with empty spaces after the name, and the Terminal Emulator (kitty in this case) is wrapping those blank spaces.

image

Yes, no one should be using a terminal with that dimension, but this is a good way to figure out the pattern of the behavior.

How to reproduce

For the wrong wrapping of variable width characters:

  1. Create some files with 2, 3 or 4-byte wide characters (4-byte wide is easier) that are visually less wide than regular ASCII characters (half-width characters for example). 1.1. The total number of bytes in the file name must be greater than the terminal columns number/width (term size) at the same time it must not fill the entire line. 1.2. You can resize the terminal emulator to reproduce it easily (with less characters as well).
  2. Try to autocomplete the path, it will wrap regardless of the path not filling the entire terminal line and will also pollute the output visually with line breaks (fish handles this using ... instead of wrapping to the next line).

For the "output" duplication:

  1. Create some files (at least one with a long name that doesn't fit the terminal)
  2. Press enter until the terminal scrolls
  3. Try to autocomplete a path, it will duplicate the output right above (if you do exactly the same way I did, it will even duplicate the previous autocompletion output)

Expected behavior

It should only wrap if the path does not fill the entire line. I've tested textwrap with those scenarios and it seems to handle all of them correctly (wrapping only when they don't visually fit the terminal). For the new line pollution, I think it should not be filling the path with blank space if it's going to wrap anyway.

As for the duplication, it should just not happen, I didn't figured out why it's doing that.

Screenshots

I've attached to the bug description, so they are all within context.

Configuration

key value
version 0.78.0
branch
commit_hash
build_os linux-x86_64
build_target x86_64-unknown-linux-gnu
rust_version rustc 1.70.0-nightly (696aaad58 2023-04-09)
rust_channel nightly-x86_64-unknown-linux-gnu
cargo_version cargo 1.70.0-nightly (0e474cfd7 2023-03-31)
build_time 2023-04-11 10:27:56 -03:00
build_rust_channel release
features default, zip
installed_plugins from parquet, hist, periodic-table, plot, regex, xyplot

I've also the build distributed in Arch Linux Official repos, it does have the exact same problem:

key value
version 0.78.0
branch makepkg
commit_hash 2ec20286371f1d90a77fdd5d82818c6c5851f090
build_os linux-x86_64
build_target x86_64-unknown-linux-gnu
rust_version rustc 1.68.2 (9eb3afe9e 2023-03-27) (Arch Linux rust 1:1.68.2-1)
cargo_version cargo 1.68.2
build_time 2023-04-05 02:09:35 +00:00
build_rust_channel release
features default, zip
installed_plugins from parquet, hist, periodic-table, plot, regex, xyplot

Additional context

No response

sholderbach commented 1 year ago

Thanks for the detailed report and investigation!

I am gonna move this bug over to reedline because I think the wrapping logic should be part of the core implementation of the Menu system. I think your suspicion is correct that we incorrectly use the .len() here. Getting the the printed length on the terminal is non-trivial as it can depend on the available fonts and the implementation of the terminal emulator. So all tools that try to do wrapping will either make an educated guess or have to probe the cursor position at some point which is highly costly.

For redraws reedline's painter needs to know the screen position where it started drawing the prompt. If we have a wrapping bug here that can scroll the terminal unexpectedly due to extra lines that might break this assumption and the prompt is redrawn in the wrong position. So that seems to be a secondary symptom.

JonathanxD commented 1 year ago

Getting the the printed length on the terminal is non-trivial as it can depend on the available fonts and the implementation of the terminal emulator. So all tools that try to do wrapping will either make an educated guess or have to probe the cursor position at some point which is highly costly.

That's what I thought initially, but after a long research that's my conclusion: Terminal emulators normally only allow the use of monospaced fonts, which means the character either fits a single cell (half-width) or two cells (full-width). Just to give some example:

Kitty

only allows monospaced fonts.

Wezterm

prefers monospaced fonts (falling back to non-monospaced/proportional fonts).

Also provides an option to control the cell_width and font stretching.

Konsole

Prefers monospaced fonts and requires to explicitly enable proportional fonts before even being able to select one.

Alacritty

I cannot find too much in its documentation, but I can find that it uses crossfont which also have a limited support for proportional fonts (non-monospaced).

It seems that one can control the rendering behavior in case of issues with the font presentation.

Conclusion

As far as I've researched, Terminal Emulators only fully supports monospaced fonts and everything else related to "what's the size/dimension of a character" is up to the Terminal emulator to decide or let the user inform (like Alacritty and Wezterm allows you to do so).

So I think the shell should not care about the font, just rely on the well defined specification of Unicode Standard Annex #11 (there's a crate for this already) to resolve the character dichotomy (fullwidth/halfwidth) and defined whether it fits in a single terminal cell or two cells (half and full width respectively), given that the terminal emulator is the one responsible for ensuring that the characters fits the cells accordingly.

Just an additional note: Apart from what the documentation says about font rendering, I've also empirically validated this against the listed Terminal Emulators, all of them handles every Monospaced font accordingly (either one or two cells). Proportional fonts, on the other hand, in fact varies, but once adjusted (on the terminal that supports this) they behave as expected with some exceptions, but those terminals never really fully supported those ones.

I think it's safe to work with those assumptions, but I may be missing some context as well.

sholderbach commented 1 year ago

Sadly UAX#11 is more of a suggestion in my experience. This can be down to how a symbol is rendered or if codepoints are joined as a grapheme. (see for example this hack in rusts error message output to remove the zero-width-joiner symbol)

A while back I tried to test how faithful the widths in unicode-width are reproduced on different terminals: https://github.com/sholderbach/unicode_measurements

It is great for the ASCII adjacent stuff (bar \t) but things fall apart with symbols or non-latin scripts quite quickly.

JonathanxD commented 1 year ago

Yes, you're absolutely right.

I've tested your project against the terminals I've listed and that's the results:

Another thing that I've noticed is the considerable difference between WezTerm and Kitty specially with this one: ๐Ÿ‘จโ€๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ฆ

WezTerm:

image

Kitty:

image

That's not exclusive to nushell, it happens with fish and zsh as well, so it's a kitty bug.

So yeah, it seems very very hard to deal with this and I think the best approach is to just deal with the additional spaces issue and not bother that much with the width. In regards to the additional spaces, if I resize the terminal to fit two screens horizontally (448 columns), nushell behaves similarly to zsh, so I think that fixing this problem should be enough.