unicode-rs / unicode-width

Displayed width of Unicode characters and strings according to UAX#11 rules.
https://unicode-rs.github.io/unicode-width

unicode-width should retain a semi-stable width function #64

Open Manishearth opened 1 week ago

Manishearth commented 1 week ago

As described in https://github.com/unicode-rs/unicode-width/issues/55, 0.1.13 changed the predicted widths of a lot of characters to be more accurate for things like emoji.

From that issue, and from the breakage in rustfmt (https://github.com/rust-lang/rustfmt/issues/6203), it feels useful to have an API that is explicitly stable, and documented as such. Probably one that keeps the original, less useful but still workable behavior of relying on East Asian Width properties only.

Manishearth commented 1 week ago

@Jules-Bertholet how hard would it be for you to add a width_stable?

Manishearth commented 1 week ago

I think the easy way to do this with the traits would be to add a UnicodeStableWidth trait. We can clean this up in a future breaking release.

It's actually unclear to me why we have UnicodeWidthStr and UnicodeWidthChar.
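
A rough sketch of what a separate UnicodeStableWidth trait could look like, purely for illustration. The trait name comes from the comment above; the method name `width_stable`, the helper `char_width_stable`, and the tiny hard-coded range table are assumptions and are not part of the crate. A real implementation would use generated East Asian Width tables and would also need to decide how to treat control characters.

```rust
/// Hypothetical trait; not part of unicode-width today.
pub trait UnicodeStableWidth {
    /// Width in columns, computed from a fixed, documented rule set.
    fn width_stable(&self) -> usize;
}

/// Illustrative stand-in for a real East Asian Width lookup.
/// Only a few well-known Wide/Fullwidth ranges are listed here;
/// everything else is treated as one column.
fn char_width_stable(c: char) -> usize {
    match c as u32 {
        0x1100..=0x115F       // Hangul choseong
        | 0x4E00..=0x9FFF     // CJK Unified Ideographs
        | 0xAC00..=0xD7A3     // Hangul syllables
        | 0xFF01..=0xFF60 => 2, // Fullwidth forms
        _ => 1,
    }
}

impl UnicodeStableWidth for char {
    fn width_stable(&self) -> usize {
        char_width_stable(*self)
    }
}

impl UnicodeStableWidth for str {
    fn width_stable(&self) -> usize {
        // Sum per-character widths; no emoji or sequence handling.
        self.chars().map(char_width_stable).sum()
    }
}
```

With the trait in scope, `"abc".width_stable()` and `'中'.width_stable()` would be called the same way as the existing `width()` methods, which keeps the stable variant easy to find next to the current traits.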

Jules-Bertholet commented 1 week ago

For perfect stability, we'd have to pin a Unicode version, as the underlying Unicode properties are not stable (even for assigned codepoints).

Manishearth commented 1 week ago

I'm fine with it being just EAW and only subject to change with Unicode EAW changes, which change extremely rarely.

Jules-Bertholet commented 1 week ago

We'd probably want EAW + Default_Ignorable_Code_Point + Grapheme_Extend + Hangul jungseong/jongseong (which is more or less what earlier versions of this crate did). It might be better to release it as a separate crate though?
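
To make the layering concrete, here is a minimal sketch of that rule order: zero-width categories are checked first, then East Asian Width. The function names `legacy_width` and `legacy_str_width` and both range tables are hypothetical and cover only a small illustrative subset of each property; they are not the crate's actual tables or code.

```rust
/// Inclusive code point ranges treated as zero width (illustrative subset).
const ZERO_WIDTH: &[(u32, u32)] = &[
    (0x0300, 0x036F), // Combining Diacritical Marks (Grapheme_Extend)
    (0x1160, 0x11FF), // Hangul jungseong and jongseong
    (0x200B, 0x200F), // ZWSP, ZWNJ, ZWJ, LRM, RLM (Default_Ignorable_Code_Point)
];

/// Inclusive ranges with East Asian Width W or F (illustrative subset).
const WIDE: &[(u32, u32)] = &[
    (0x1100, 0x115F), // Hangul choseong
    (0x4E00, 0x9FFF), // CJK Unified Ideographs
    (0xAC00, 0xD7A3), // Hangul syllables
    (0xFF01, 0xFF60), // Fullwidth forms
];

fn in_ranges(cp: u32, table: &[(u32, u32)]) -> bool {
    table.iter().any(|&(lo, hi)| lo <= cp && cp <= hi)
}

/// Hypothetical per-character width: zero-width rules win over EAW.
pub fn legacy_width(c: char) -> usize {
    let cp = c as u32;
    if in_ranges(cp, ZERO_WIDTH) {
        0
    } else if in_ranges(cp, WIDE) {
        2
    } else {
        1
    }
}

/// Hypothetical string width: the sum of per-character widths.
pub fn legacy_str_width(s: &str) -> usize {
    s.chars().map(legacy_width).sum()
}
```

Under these assumptions, a decomposed Hangul syllable such as U+1112 U+1161 U+11AB measures two columns, because only the leading choseong contributes width.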

Manishearth commented 1 week ago

Yeah, sorry, I meant "what we did before", especially since rustfmt relies on this crate.

I think given where this crate is in the ecosystem I'd prefer for this option to be readily available.