Closed joshuarubin closed 7 years ago
You seems to break some public APIs.
You seems to break some public APIs
Yes, I said as much in the first post.
go-runewidth is depeneded on many packages. See https://godoc.org/github.com/mattn/go-runewidth?importers
So I don't want to change interface.
OK, I backed out the changes that modified the API. The only changes now are to use the unicode9 widths.
SGTM. I'll check this with some projects. please give me a time while.
I tried some projects which using go-runewidth. And seems good to me. BTW, what advantage for you? I'm thinking, to make sure, this change require a terminal which supported unicode9.
The advantage is that wide characters (including emoji, but also for east asian languages) are actually listed as being wide. These tables are used in existing projects like vim who do not require a unicode9 terminal.
For terminals that don’t have unicode9 widths, what happens today is that emoji, for example, are considered, width 1, but drawn (due to font implementations) as width 2. As a result, characters following wide characters are often hidden, or cause the previous character to be hidden, or are simply overlaid. This is obviously wrong.
This change, for terminals that don’t have unicode9 widths, will cause a discrepancy between the terminal's width choice and go-runewidth. However, things that use go-runewidth to determine spacing (projects like tcell) will position characters appropriately, effectively overriding their terminal's (wrong) width tables and fixing rendering issues.
For terminals that have unicode9 widths, this pull request just brings the widths up-to-date so that go-runewidth and the terminal agree.
There should be no problem using these tables in any terminal regardless of their support for unicode9 character widths.
Current statuses for OS/Terminal support. For my information.
https://mail.gnome.org/archives/commits-list/2016-September/msg07491.html
https://gitlab.com/gnachman/iterm2/wikis/unicodeversionswitching
Any update on merging this?
Sorry my delay. I'm okay to merge. But I worry about someone may get wrong.
Updating to latest unicode standard seems like a good thing to do.
FWIW, I tested out the old version vs the new one a few characters, and I can see there's a change. It seems more consistent than before (although still not quite what I was expecting, but that's probably my terminal's fault).
Before:
After:
This seems to have stalled again. Anything I can do to help?
@shurcooL @joshuarubin This PR will be great in CJK. But as far as I can see your image, terminal doesn't handle double width character correctly. If you are okay to merge, I'll do.
yeah, I'd like to know which unicode characters those are, but my guess is that they are either private use characters or not considered wide (because they are not considered emoji or east asian wide). if not either of those, then it may just be that the font used doesn't follow the spec.
ping @nsf termbox-go use go-runewidth. how do you think?
yeah, I'd like to know which unicode characters those are
@joshuarubin, the five characters in that screenshot were x, ✅, ❌, ☑️, and ☑.
If you are okay to merge, I'll do.
@mattn, I have absolutely no objections, as long as this implements a newer Unicode specification. It makes sense to me for go-runewidth to use the latest available Unicode spec.
It actually looks like all the widths are correct according to that test (the terminal is wrong though). The only tricky one is "☑️" which is U+2611 and U+FE0F.
Technically, U+2611 is considered an emoticon according to unicode9, but the convention is that characters below U+1f000 are considered single width as they predated the official emoticon definition (neovim).
So, plain U+2611 is drawn width 1 and calculated to be width 1 correctly.
However, U+2611 and U+FE0F (the latter being a variation selector) is drawn with column width 2 by the font, but calculated to be U+2611 which has width 1 and U+FE0F which is a non-spacing mark width 0.
It would be nice if this sequence of runes were calculated correctly, but at least this pull request is consistent with other projects like neovim and iTerm2.
@mattn Well, I think small hacks for weird runes are okay as long as they don't touch the others, if they cause some problems you can always revert the change, I don't really use CJK and termbox myself that much, it's all up to people who use these things.
@nsf if you're referring to a hack to make U+2611 followed by U+FE0F calculated as 2 columns, I would have to disagree.
We want to be as consistent as possible with other projects. Rendering problems due to inconsistencies between column calculations of varying software is very hard to deal
For example, terminal+tmux+vim (and also the shell) all have to agree on character width or else there are going to be significant rendering issues.
As demonstrated above, these tables aren't perfect, but they are consistent with other projects which I think makes good sense to prefer.
sure, consistency is good
Thanks!
Thanks for doing this!
Update runewidth to use unicode9 character width tables. This is the default in vim and neovim now, so should be safe to use in any terminal.
The
Condition
is no longer calculated usingIsEastAsian()
as terminals do not use locale to determine how wide to draw ambiguous with characters. Rather, they simply default to 1, and may offer an option to set to 2 (which is discouraged).All tests still pass. The API is very close to the way it was, but not identical due to the change in
Condition
.I recognize this is a fairly large diff, so I'd be happy to work with you in any way you feel best to get this merged.