mattn / go-runewidth

wcwidth for golang
MIT License

Feature request: Add support for zero-width-joiners #20

Closed rivo closed 6 years ago

rivo commented 6 years ago

It would be great if you could add support for zero-width joiners (ZWJ). I have the following code example which doesn't work as expected:

package main

import (
    "fmt"

    runewidth "github.com/mattn/go-runewidth"
)

func main() {
    e := "👨‍👨‍👧"
    r := []rune(e)
    var widths []int
    for _, c := range r {
        widths = append(widths, runewidth.RuneWidth(c))
    }
    fmt.Printf("%s : len=%d numrunes=%d width=%d widths=%v runes=%X\n", e, len(e), len(r), runewidth.StringWidth(e), widths, r)
}

The output is:

👨‍👨‍👧 : len=18 numrunes=5 width=6 widths=[2 0 2 0 2] runes=[1F468 200D 1F468 200D 1F467]

Specifically, width should be 2 instead of 6. I found this article which explains how ZWJ sequences work. They affect not only emoji but also characters in some languages.

This came up in rivo/tview#161. It would be great if support for ZWJ could be added so I can implement support for these Unicode characters in tview. I understand that not all kinds of combinations are supported and it's probably difficult to figure out which ones are. But assuming these characters are supported will help a lot. I don't expect users to try to print ZWJ combinations which are not supported anyway.

Thanks!

rivo commented 6 years ago

Some related discussion here also: gdamore/tcell#233

mattn commented 6 years ago

Do you mind if I add a new API for this?

rivo commented 6 years ago

I'm not sure what you mean by that exactly. For me, it would be best if StringWidth() worked as it does now but took zero-width joiners into account.

mattn commented 6 years ago

Addressed in https://github.com/mattn/go-runewidth/pull/21

mattn commented 6 years ago

But I wonder whether the ZeroWidthJoiner flag should always be true, since my environment (gnome-terminal) does not support ZWJ. 👨‍👨‍👧 is 3 characters on gnome-terminal.

rivo commented 6 years ago

Thanks!

I understand it's probably difficult to make this work consistently on all terminals. @gdamore hinted at this in https://github.com/gdamore/tcell/issues/233#issuecomment-418905440. I suppose that making it work on "most" terminals will go a long way.

(I guess the question is, who uses these kinds of emoji in a terminal in the first place?)

gdamore commented 6 years ago

Damned few people. Frankly I think ZWJ is really new stuff, and a bit bleeding edge. I would not be completely unhappy if we took a wait-and-see approach for a while to let this mature. In the meantime, assuming that characters joined by ZWJ have the same cell width as their primary character is probably the best solution. We can add special cases (a database) when it is clear that the composed result always has a different width. (I think there are some cases of ZWJ used in actual languages, and not just emoji, where the joined result occupies two cells instead of just one.)

mattn commented 6 years ago

Merged https://github.com/mattn/go-runewidth/pull/21, but ZeroWidthJoiner is false by default.