console-rs / indicatif

A command line progress reporting library for Rust
MIT License
4.45k stars 243 forks source link

Using ⚙️ emoji in ProgressBar spinner message causes unnessecary line breaks #638

Closed TapGhoul closed 6 months ago

TapGhoul commented 7 months ago

For some reason, the width of ⚙️ (a 2-char emoji) seems to have its length mis-calculated, and as a result puts every single refresh on a new line. As seen in the Dioxus CLI when watching a build. Reproduced in both Konsole 24.02.1 and JetBrains RustRover 2023.3 EAP on Linux with both the Fira Code and Hack fonts.

Note: this only happens with ⚙️, not with ⚙.

Minimal reproduction code:

[dependencies]
indicatif = "0.17.8"
use std::time::Duration;
use indicatif::ProgressBar;

fn main() {
        let mut bar = ProgressBar::new_spinner();
        bar.enable_steady_tick(Duration::from_millis(200));
        bar.set_message("⚙️ Hi there!");
        std::thread::sleep(Duration::from_secs(5));
}
djc commented 7 months ago

You probably just need to enable some of the Cargo features for proper Unicode support?

TapGhoul commented 7 months ago

Gave that a shot - same bug. Seems to happen regardless of if improved_unicode is enabled or not.

djc commented 7 months ago

Well, maybe you can have a look at how indicatif::style::measure() fails with your example string, and if there's something that can be done to improve it?

TapGhoul commented 7 months ago

@djc The issue seems to be related to https://github.com/unicode-rs/unicode-width/issues/27 - the 0xfe0f unicode selector character turns the cog from a 1 character to a 2 character wide symbol, but it's not a universal width increase. Not really sure what to do here, but it seems like a difficult spot to be.

Additionally, in the ratatui issue that links to it, the crate https://github.com/jameslanska/unicode-display-width/ is mentioned as a possible replacement. I haven't tested yet, but it'd need to be moved into the console crate instead.

djc commented 7 months ago

Additionally, in the ratatui issue that links to it, the crate https://github.com/jameslanska/unicode-display-width/ is mentioned as a possible replacement. I haven't tested yet, but it'd need to be moved into the console crate instead.

Why would it need to be moved into the console crate? I'd be okay with taking on the extra dependency on this (only if improved_unicode is enabled).

TapGhoul commented 7 months ago

Could do, just feels weird to me to use the console crate for so few parts. But it'd just be a matter of replacing all occurrences of console::measure_text_width() with the width() method from the mentioned crate, maybe putting console::ansi::strip_ansi_codes() in before it depending on if you wanted that feature flag - I'm not sure what your intent is with dealing with ANSI there (though I'd assume you would want to strip it)

djc commented 7 months ago

It's separation of concerns -- Unicode and dealing with the console are really separate domains so using different dependencies for these makes some amount of sense to me.

TapGhoul commented 7 months ago

Works for me then. I will try to find some time in the next couple days to whip up an MR.

TapGhoul commented 7 months ago

Actually hol up - while it is in theory separation of concerns, console-rs already has unicode width, which is why I flagged up maybe doing it in console anyway. We 100% sure we don't want to put it over in console-rs's own code? I feel like the fact that Cargo.toml has console/unicode-width we should be fixing it there.

(Hi I'm still waking up, so I'm a bit slower to catch things like this :P)

djc commented 7 months ago

That's fair! Would be nice but console has had pretty slow maintenance recently. Okay, submit a PR to it and please ping me on it so I can have a look.

TapGhoul commented 7 months ago

-snip- I misread one of the internals - actually, my concerns are invalid. 2 PRs, coming up!

TapGhoul commented 7 months ago

I have created a pair of PRs that fully replace the API. Worth a note: there's a cast from a u64 to a usize in there, done to avoid changing APIs too much - also mentioned at https://github.com/console-rs/console/pull/210/files#r1552479044

TapGhoul commented 7 months ago

Continuing the conversation from the PRs here because this is more meta to the whole change. Worth a note in the original unicode-width crate:

Most shells including bash and zsh and most terminal emulators do not support the zero width joiner.

If your application needs the width as rendered by any of these systems, DO NOT use this crate. This project is more suited towards editors that wrap a web engine such as Chromium with Electron (e.g. VS Code).

Given this info, I'm not really sure what we want to be doing here at this point. They suggest one solution involving probing the terminal for information on how it prints things using cursor widths etc, but this sounds like a far more involved and not entirely portable solution.

I'm considering closing the PRs unless something better is thought of - I think the route I'm going down is just trading one evil for another, as neither option is actually portable. I want to see this fixed, but I'm no longer convinced the previous route is actually the correct one.

djc commented 7 months ago

Ah, that's fair. Maybe the right fix is to advocate that Dioxus uses the other gear? (On my Mac the gears in your OP look identical anyway -- in the browser, at least.)

TapGhoul commented 7 months ago

The gear looks identical - the problem is that it sends a bunch of newlines because it assumes the gear is 1 wide instead of 2, and as a result generates an extra padding space and thus an unnecessary newline. See here: image

TapGhoul commented 7 months ago

I'm closing the PRs, but leaving the issue open in case someone else wishes to pick up the mantle - I am short on time and consider the PRs unsuitable for fixing the issue, so I have to put this one aside.

mkenigs commented 7 months ago

Is there any way to work around this other than just not using 2-char emojis?

djc commented 7 months ago

I don't think so, but am open to any ideas you have (after reading the discussion here, in #639 and in https://github.com/console-rs/console/pull/210).

Jules-Bertholet commented 6 months ago

unicode-width 1.12 (released today) now supports U+FE0F 2-char emojis, including ⚙️ (skin tone modifiers and ZWJ emojis remain unsupported).

djc commented 6 months ago

@Jules-Bertholet that sounds great. @Silvea12 so can we close this now?

TapGhoul commented 6 months ago

that sounds great. @Silvea12 so can we close this now?

Hey! I've been meaning to get to this, but I've fallen ill with COVID. I'll get back to you on it when I'm better.

For me, a working thing is to install latest indicatif, and then running the above sample code and not seeing the extra newlines. That'll either result in this being closed, or a PR so it can be closed.

I'll get to it once I'm feeling better! Sorry for the delay.

TapGhoul commented 6 months ago

Update: I can confirm a clean install works on my system, where I was experiencing the issue. Closing this, thanks @Jules-Bertholet for the update!

djc commented 6 months ago

Thanks for the confirmation, hope you feel better soon!