mattn / go-runewidth

wcwidth for golang
MIT License

possible regression ? #32

Closed MichaelMure closed 5 years ago

MichaelMure commented 5 years ago

Hi,

Updating go-runewidth from v0.0.4 to v0.0.5 breaks my tests in https://github.com/MichaelMure/go-term-text. go-term-text is a package that does text formatting for the terminal and relies on go-runewidth to get the width of each character.

Here is an example of the output before and after: image

image

Notice that after switching to v0.0.5, the text goes further than it should. As the wrapping algorithm is unchanged, I suspect go-runewidth returns a different width for some characters. Is that possible? If so, why?

hymkor commented 5 years ago

The width of the rune \uFF0C changes from 2 to 1. I cannot judge which width is correct.


package main

import (
    "fmt"
    "github.com/mattn/go-runewidth"
)

var s = []string{
    "婞一枳郲逴靲屮蜧曀殳,掫乇峔掮傎溒兀緉冘仜。郼牪艽螗媷錵朸一詅掜豗怙刉笀丌,楀棶乇矹迡搦囷圣亍昄漚粁仈祂。覂一洳袶揙楱亍滻瘯毌,掗屮柅軡菵腩乜榵毌夯。勼哻怌婇怤灟葠雺奷朾恦扰衪岨坋誁乇芚誙腞。冇笉妺悆浂鱦賌廌灱灱觓坋佫呬耴跣兀枔蓔輈。嵅咍犴膰痭瘰机一靬涽捊矷尒玶乇,煚塈丌岰陊鉖怞戉兀甿跾觓夬侄。棩岧汌橩僁螗玎一逭舴圂衪扐衲兀,嵲媕亍衩衿溽昃夯丌侄蒰扂丱呤。毰侘妅錣廇螉仴一暀淖蚗佶庂咺丌,輀鈁乇彽洢溦洰氶乇构碨洐巿阹。",
    `    婞一枳郲逴靲屮蜧曀殳,掫乇峔掮傎溒兀緉冘仜。郼牪艽螗媷
    錵朸一詅掜豗怙刉笀丌,楀棶乇矹迡搦囷圣亍昄漚粁仈祂。覂
    一洳袶揙楱亍滻瘯毌,掗屮柅軡菵腩乜榵毌夯。勼哻怌婇怤灟
    葠雺奷朾恦扰衪岨坋誁乇芚誙腞。冇笉妺悆浂鱦賌廌灱灱觓坋
    佫呬耴跣兀枔蓔輈。嵅咍犴膰痭瘰机一靬涽捊矷尒玶乇,煚塈
    丌岰陊鉖怞戉兀甿跾觓夬侄。棩岧汌橩僁螗玎一逭舴圂衪扐衲
    兀,嵲媕亍衩衿溽昃夯丌侄蒰扂丱呤。毰侘妅錣廇螉仴一暀淖
    蚗佶庂咺丌,輀鈁乇彽洢溦洰氶乇构碨洐巿阹。`,
}

func main() {
    for _, s1 := range s {
        for _, c1 := range s1 {
            length := runewidth.RuneWidth(c1)
            fmt.Printf("len=%d [%c] \\u%X\n", length, c1, c1)
        }
    }
}

Diff of the output (< v0.0.4, > v0.0.5):
11c11
< len=2 [,] \uFF0C
---
> len=1 [,] \uFF0C
38c38
< len=2 [,] \uFF0C
---
> len=1 [,] \uFF0C
64c64
< len=2 [,] \uFF0C
---
> len=1 [,] \uFF0C
133c133
< len=2 [,] \uFF0C
---
> len=1 [,] \uFF0C
164c164
< len=2 [,] \uFF0C
---
> len=1 [,] \uFF0C
195c195
< len=2 [,] \uFF0C
---
> len=1 [,] \uFF0C
225c225
< len=2 [,] \uFF0C
---
> len=1 [,] \uFF0C
258c258
< len=2 [,] \uFF0C
---
> len=1 [,] \uFF0C
290c290
< len=2 [,] \uFF0C
---
> len=1 [,] \uFF0C
371c371
< len=2 [,] \uFF0C
---
> len=1 [,] \uFF0C
414c414
< len=2 [,] \uFF0C
---
> len=1 [,] \uFF0C
451c451
< len=2 [,] \uFF0C
---
> len=1 [,] \uFF0C

hymkor commented 5 years ago

I wrote a pull request (#33) for this issue. Would you take a look at it, @mattn?

MichaelMure commented 5 years ago

Thank you for looking into it.

I don't really understand Unicode, but shouldn't a rune called FULLWIDTH COMMA be considered to have a width of 2?

Whoops, I didn't see the update before posting.
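As background to that question: U+FF0C sits in the block of fullwidth variants of ASCII (U+FF01 to U+FF5E), each of which is its ASCII counterpart plus 0xFEE0, and the Unicode East Asian Width property classifies these characters as Fullwidth. The sketch below only illustrates that relationship using the standard library; `isFullwidthASCII`, `narrowForm` and `expectedColumns` are hypothetical helpers, not part of go-runewidth, and `expectedColumns` deliberately ignores every other wide character.

```go
package main

import "fmt"

// fullwidthOffset is the distance between a printable ASCII character
// (U+0021..U+007E) and its fullwidth variant (U+FF01..U+FF5E).
const fullwidthOffset = 0xFEE0

// isFullwidthASCII reports whether r is a fullwidth variant of a
// printable ASCII character.
func isFullwidthASCII(r rune) bool {
	return r >= 0xFF01 && r <= 0xFF5E
}

// narrowForm returns the ASCII counterpart of a fullwidth rune,
// or r unchanged when it is not in that range.
func narrowForm(r rune) rune {
	if isFullwidthASCII(r) {
		return r - fullwidthOffset
	}
	return r
}

// expectedColumns is a hypothetical helper for this illustration only:
// 2 columns for fullwidth ASCII variants, 1 for everything else.
// A real width table also has to handle CJK ideographs and more.
func expectedColumns(r rune) int {
	if isFullwidthASCII(r) {
		return 2
	}
	return 1
}

func main() {
	for _, r := range []rune{',', '\uFF0C'} {
		fmt.Printf("U+%04X %q narrow=%q columns=%d\n",
			r, r, narrowForm(r), expectedColumns(r))
	}
}
```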

MichaelMure commented 5 years ago

I can confirm that #33 solves my problem.