rivo / uniseg

Unicode Text Segmentation, Word Wrapping, and String Width Calculation in Go
MIT License
585 stars 61 forks source link

Allow configuring the width of East Asian ambiguous width characters #47

Closed junegunn closed 8 months ago

junegunn commented 8 months ago

Would you be willing to accept a patch that allows changing the expected width of East Asian ambiguous width characters?

Similarly to RUNEWIDTH_EASTASIAN=1 of go-runewidth or set ambiwidth=double of Vim.

rivo commented 8 months ago

Please explain why (and where) this is needed.

junegunn commented 8 months ago

I'm replacing mattn/go-runewidth with rivo/uniseg in fzf. One difference I noticed is that it's not possible with uniseg to configure how ambiguous east asian characters should be treated. It always reports their width as 1.

But in some CJK (East Asian) fonts, those ambiguous characters take up 2 columns, so if the library reports the width as 1, users using those fonts will have rendering problems like the one shown here.

So go-runewidth provides RUNEWIDTH_EASTASIAN=1 flag so that the library reports the width as 2. Vim does the same thing via set ambiwidth=double.

I personally don't have this problem, it only seems to be an issue of CJK users on Windows environment.

rivo commented 8 months ago

As far as I remember, the reason I set them to a width of 1 was this statement in Unicode Standard Annex #11:

If [ambiguous width characters] are not used in context of the specific legacy encoding they belong to, their width resolves to narrow.

I think for now, the crucial point is this:

I personally don't have this problem, it only seems to be an issue of CJK users on Windows environment.

The way I handle these things is that I wait for someone to actually have this problem with uniseg. But before that happens, I will avoid adding additional parameters and keep the complexity as low as possible.

That's not to say I won't come back to this at some point in the future. But for now, it seems to me that this is a rare edge case which the users of this package haven't encountered yet.

junegunn commented 8 months ago

Fair enough. Thank you for your consideration. The patch is simple and I have no problem applying it to my fork, so feel free to close this.

FWIW, go-runewidth automatically detects the "wideness" by looking at the locale setting of the terminal,

but I suspect this has caused more problems than it has solved. Some users had to set RUNEWIDTH_EASTASIAN=0 to reverse it.

Granted, we don't know how many users actually benefit from the detection because they don't come and say they are well and fine.

rivo commented 8 months ago

I have no problem applying it to my fork

So you do need this feature? I thought you mentioned that you don't have this problem.

Granted, we don't know how many users actually benefit from the detection because they don't come and say they are well and fine.

That's true. In my experience, however, they will let you know when it doesn't work.

junegunn commented 8 months ago

So you do need this feature? I thought you mentioned that you don't have this problem.

Sorry, I wasn't clear enough. For my personal use, no. But as the maintainer of fzf, which is a fairly popular program, I need to provide a way to configure the behavior in case there are users on such terminal environments. But I figured you would be reluctant to merge this, which is completely understandable, so I decided to build the program using the fork for the time being. The patch is simple, so even if you decide not to merge this, I can still maintain my fork. So no pressure.


It looks like the author of go-runewidth needed this behavior himself, so he came up with the logic.

https://github.com/mattn/go-runewidth/issues/14#issuecomment-378865129

rivo commented 8 months ago

So I assume your users have run into this issue before then? Or do you want to preemptively add this feature?

I will leave a comment/request for this PR and then I might actually merge it. It sounds like it is important enough to include it.

junegunn commented 8 months ago

So I assume your users have run into this issue before then? Or do you want to preemptively add this feature?

Replacing go-runewidth with uniseg means that fzf is removing the auto-detection of the setting. So, the users who have benefited from it (don't know how many there are) will notice that the rendering is broken when they upgrade fzf. And I didn't want to tell them that they're out-of-luck, just because the new library fzf uses doesn't support the configuration. Instead, "So, yeah, the auto-detection is gone, but you can use fzf --ambidouble."

Having said that, no one has complained so far after a few days since the release. Maybe there were only a few users who benefited from the auto-detection, so no one ever ran into the problem, or maybe they figured it out themselves from the release note. I don't know.

Added the comment as you requested.

junegunn commented 8 months ago

Putting fzf aside, I think it makes sense for this library to provide the option, because we know there are terminals that behave differently.

rivo commented 8 months ago

Makes sense. I merged it.

junegunn commented 8 months ago

Thank you!