Status: Open. dscrofts opened this issue 8 months ago.
Hello, thanks for the report.
I was aware of this issue but there was no bug to track it. I could probably add a simple workaround here in blessed so I will try to do that soon.
I recently added support for Variation Selector-16 (U+FE0F) to wcwidth. However, blessed still gets the calculation wrong because of how it uses the library: it adds together the wcwidth.wcwidth() result for each individual codepoint.
I might,
Correct accounting for emoji that include U+FE0F is difficult: only 7 terminals supported it at last check. I wrote more about it here: https://www.jeffquast.com/post/ucs-detect-test-results/. I've also gotten pushback from the author of libvte, which is used in terminals like GNOME's; they refuse to support it at all: https://gitlab.gnome.org/GNOME/vte/-/issues/2580. So I've been a bit distracted just trying to get terminal emulators to support it, rather than having blessed support it, but I will definitely get to it soon.
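To illustrate the per-codepoint summation mentioned above, here is a minimal sketch that uses only the standard library. The helper names cell_width and naive_width are hypothetical, and unicodedata is only a rough stand-in for wcwidth.wcwidth(), so exact values may differ from what the wcwidth library reports.

```python
import unicodedata


def cell_width(ch):
    """Rough width of one code point (assumption: stdlib data is close enough)."""
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0  # combining marks and format characters; U+FE0F is "Mn"
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2  # wide and fullwidth characters
    return 1


def naive_width(text):
    """Add the width of every code point, ignoring how they combine."""
    return sum(cell_width(ch) for ch in text)


speaking = "\U0001F5E3\uFE0F"  # SPEAKING HEAD IN SILHOUETTE + VARIATION SELECTOR-16
thinking = "\U0001F914"        # THINKING FACE, wide on its own

print(naive_width(speaking))
print(naive_width(thinking))
```

Under these assumptions the U+FE0F sequence is measured as a single cell, because the base character is narrow and the selector counts as zero. A padding routine built on that sum would add one space too many on a terminal that draws the sequence two cells wide.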
Also, I could tell that this included U+FE0F from the following commands:
>>> import unicodedata
>>> list(map(unicodedata.name, '🗣️  '))
['SPEAKING HEAD IN SILHOUETTE', 'VARIATION SELECTOR-16', 'SPACE', 'SPACE']
>>> list(map(hex, map(ord, '🗣️  ')))
['0x1f5e3', '0xfe0f', '0x20', '0x20']
Also, Python's built-in formatting gets this horribly wrong in general: it is not aware of emoji, terminal sequences, or even basic East Asian characters such as Chinese or Japanese. In your case it just happens to get it right by accident :)
I wrote an issue about what it might take to get Python's built-in formatting to account for emoji correctly: https://github.com/jquast/wcwidth/issues/94
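As a small illustration of the point about built-in formatting, str.ljust pads by counting code points rather than terminal cells. The sample strings below are chosen for illustration only, and the number of cells each one occupies depends on the terminal.

```python
# str.ljust() pads to a number of code points, not terminal cells.
samples = {
    "speaking head + VS16": "\U0001F5E3\uFE0F",  # 2 code points
    "thinking face": "\U0001F914",               # 1 code point, usually 2 cells
    "two CJK characters": "\u4e2d\u6587",        # 2 code points, usually 4 cells
}

for label, text in samples.items():
    padded = text.ljust(4)
    spaces_added = len(padded) - len(text)
    print(f"{label}: len={len(text)}, spaces added={spaces_added}")
```

The speaking-head sequence happens to contain two code points and is meant to occupy two cells, which would explain why the built-in result lines up in that case. The other two samples receive more padding than their displayed width calls for.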
Just to add, I added some tests in #275 around ZWJ (Zero Width Joiner), pointing out that the width calculation gets it wrong. I will continue to work towards a solution for this. I think the wcwidth library needs a kind of iterative parser to solve this correctly in a way that can be integrated into blessed.
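One way to picture the kind of iterative parser described above is the sketch below. It is not the wcwidth or blessed implementation; cell_width and sequence_width are hypothetical names, unicodedata stands in for wcwidth.wcwidth(), and the ZWJ and U+FE0F rules are simplifying assumptions.

```python
import unicodedata

ZWJ = "\u200d"   # ZERO WIDTH JOINER
VS16 = "\ufe0f"  # VARIATION SELECTOR-16


def cell_width(ch):
    """Rough per-code-point width; a stdlib stand-in for wcwidth.wcwidth()."""
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def sequence_width(text):
    """Walk the string once, letting ZWJ and VS16 affect their neighbours."""
    total = 0
    last = 0  # width of the most recent visible code point
    chars = iter(text)
    for ch in chars:
        if ch == ZWJ:
            # Assumption: the code point joined by ZWJ adds no extra cells.
            next(chars, None)
        elif ch == VS16:
            # Assumption: VS16 widens any narrow base to two cells.
            if last == 1:
                total += 1
                last = 2
        else:
            width = cell_width(ch)
            total += width
            if width:
                last = width
    return total


family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"  # man, ZWJ, woman, ZWJ, girl
print(sequence_width(family))
print(sequence_width("\U0001F5E3\uFE0F"))
```

The design choice here is to keep state between code points (the width of the last visible character) instead of summing independent per-character results, so a joiner or selector can change how its neighbours are counted. A real implementation would likely consult the Unicode emoji sequence data rather than apply these rules to every character.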
Example:
Output (term.ljust adds one additional cell):
However, this is not consistent with all Unicode sequences. For example, changing strings to ["123", "456", "🤔 "] gives:
Output (term.ljust padding is correct):