google / go-cmp

Package for comparing Go values in tests
BSD 3-Clause "New" or "Revised" License

Output of comparing two Japanese words is unreadable #314

Open k3forx opened 1 year ago

k3forx commented 1 year ago

Summary

When I compare two Japanese strings that differ, the output of cmp.Diff is unreadable.

Detail

Here is an example.

func Echo() string {
    return "プライベート ブランド シャツ"
}

Test code:

func TestEcho(t *testing.T) {
    expected := "プライベート ブランド ジャケット"
    actual := Echo()
    if diff := cmp.Diff(expected, actual); diff != "" {
        t.Errorf("%s result mismatch (-want, +got):\n%s", t.Name(), diff)
    }
}

cmp.Diff produces:

--- FAIL: TestEcho (0.00s)
    main_test.go:13: TestEcho result mismatch (-want, +got):
          strings.Join({
                "プライベート ブランド \xe3\x82",
        -       "\xb8ャケット",
        +       "\xb7ャツ",
          }, "")
FAIL
FAIL    cmpbug  0.347s
FAIL

The above output is not readable. I expect the following output:

❯ go test ./...
--- FAIL: TestEcho (0.00s)
    main_test.go:13: TestEcho result mismatch (-want, +got):
          string(
                "プライベート ブランド ",
        -       "ジャケット",
        +       "シャツ",
          )
FAIL
FAIL    cmpbug  0.229s
FAIL

version info

dsnet commented 1 year ago

Thanks for the bug report. We're performing the diff on byte boundaries rather than rune boundaries. We can fix the heuristic so that it uses rune diffing when the strings are valid UTF-8.