jquast / wcwidth

Python library that measures the width of unicode strings rendered to a terminal

`Default_Ignorable_Code_Point`s should all be zero-width #118

Open Jules-Bertholet opened 7 months ago

Jules-Bertholet commented 7 months ago

From https://www.unicode.org/faq/unsup_char.html#3:

> All default-ignorable characters should be rendered as completely invisible (and non advancing, i.e. “zero width”), if not explicitly supported in rendering.

However, this library incorrectly considers some of them, for example U+3164 HANGUL FILLER, to have non-zero width.

(There is one exception, where this library is correct in assigning a non-zero width to a Default_Ignorable_Code_Point: U+115F HANGUL CHOSEONG FILLER is meant to be combined with other Hangul jamo to form a width-2 syllable block, so it should be assigned width 2 even though it has no display on its own.)
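A minimal sketch of the proposed rule, for illustration only. It does not use or reproduce wcwidth's internals: `sketch_width`, `DEFAULT_IGNORABLE_SUBSET`, and `WIDTH_EXCEPTIONS` are hypothetical names, the range table is a small hand-copied subset of `Default_Ignorable_Code_Point` rather than the full property, and the fallback is a crude East Asian Width check.

```python
import unicodedata

# ASSUMPTION: a hand-copied, deliberately incomplete subset of the
# Default_Ignorable_Code_Point ranges (inclusive). A real table should be
# generated from Unicode's DerivedCoreProperties.txt.
DEFAULT_IGNORABLE_SUBSET = (
    (0x00AD, 0x00AD),  # SOFT HYPHEN
    (0x034F, 0x034F),  # COMBINING GRAPHEME JOINER
    (0x115F, 0x1160),  # HANGUL CHOSEONG FILLER, HANGUL JUNGSEONG FILLER
    (0x200B, 0x200F),  # ZERO WIDTH SPACE .. RIGHT-TO-LEFT MARK
    (0x3164, 0x3164),  # HANGUL FILLER
    (0xFE00, 0xFE0F),  # VARIATION SELECTOR-1 .. VARIATION SELECTOR-16
    (0xFEFF, 0xFEFF),  # ZERO WIDTH NO-BREAK SPACE
    (0xFFA0, 0xFFA0),  # HALFWIDTH HANGUL FILLER
)

# The one exception described above: U+115F combines with other jamo into a
# width-2 syllable block, so it keeps width 2.
WIDTH_EXCEPTIONS = {0x115F: 2}


def sketch_width(ch):
    """Return a rough cell width for a single character (hypothetical helper)."""
    cp = ord(ch)
    # Explicit exceptions take priority over the zero-width rule.
    if cp in WIDTH_EXCEPTIONS:
        return WIDTH_EXCEPTIONS[cp]
    # Default-ignorable code points are treated as zero-width.
    if any(lo <= cp <= hi for lo, hi in DEFAULT_IGNORABLE_SUBSET):
        return 0
    # Crude fallback: Wide/Fullwidth characters take 2 cells, everything else 1.
    # This ignores combining marks and control characters entirely.
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


if __name__ == "__main__":
    for ch in ("\u3164", "\u115f", "\u200b", "A"):
        print(f"U+{ord(ch):04X} -> {sketch_width(ch)}")
```

Under these assumptions U+3164 HANGUL FILLER comes out as width 0, while U+115F HANGUL CHOSEONG FILLER stays at width 2. Checking the exception table first keeps the zero-width rule general and makes any special cases explicit.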

jquast commented 7 months ago

Thanks. I think this is the same as your other issue: if I can get the library to treat `Default_Ignorable_Code_Point` values as zero-width, that should also fix U+3164 HANGUL FILLER. Otherwise, I can add it manually.

I agree that some jamo are meant to be combined, and this library already assumes so; see this test case:

https://github.com/jquast/wcwidth/blob/056ee4ba0df66fb33be535d8f37470685ef32ba9/tests/test_core.py#L225-L244