golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
122.88k stars 17.52k forks source link

x/net/idna: backward compatibility issues with change 2 #19821

Closed weppos closed 7 years ago

weppos commented 7 years ago

What version of Go are you using (go version)?

go version go1.7.3 darwin/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/weppos/go"
GORACE=""
GOROOT="/usr/local/Cellar/go/1.7.3/libexec"
GOTOOLDIR="/usr/local/Cellar/go/1.7.3/libexec/pkg/tool/darwin_amd64"
CC="clang"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/rk/llwxtd9s6jn91p2ky13tr30c0000gn/T/go-build024013537=/tmp/go-build -gno-record-gcc-switches -fno-common"
CXX="clang++"
CGO_ENABLED="1"

What did you do?

This issue is a follow up to #18567. The changes just committed in https://github.com/golang/net/commit/78ebe5c8b6a27ec13109c73c22193d618270e1e1 introduces again a compatibility error. I first discovered it thanks to the test failures in this PR: https://github.com/weppos/publicsuffix-go/pull/64

The issue is that a leading dot is stripped in the input. Here's a code to reproduce it:

package main

import (
    "fmt"
    "golang.org/x/net/idna"
)

func main() {
    var input, output string
    var err error
    input = ".example.com"
    output, err = idna.ToASCII(input)
    fmt.Println("in:\t", input)
    fmt.Println("out:\t", output)
    fmt.Println("err:\t", err)
    input = "..example.com"
    output, err = idna.ToASCII(input)
    fmt.Println("in:\t", input)
    fmt.Println("out:\t", output)
    fmt.Println("err:\t", err)
}

WIth previous version:

➜  gco f2499483f923065a842d38eb4c7f1927e6fc6e6d
➜  gotest go run idna.go
in:  .example.com
out:     .example.com
err:     <nil>
in:  ..example.com
out:     ..example.com
err:     <nil>

With latest version:

➜  gco HEAD
➜  gotest go run idna.go
in:  .example.com
out:     example.com
err:     <nil>
in:  ..example.com
out:     example.com
err:     <nil>

In order to restore the previous functionality, I had to apply a terrible workaround: https://github.com/weppos/publicsuffix-go/pull/66 I'd appreciate suggestions on how to handle this case in a better way, if this bugfix is not going to land in net/idna.

What did you expect to see?

I'd expect the following string conversion:

idna.ToASCII(".example.com") == ".example.com"
idna.ToASCII("..example.com") == "..example.com"

What did you see instead?

idna.ToASCII(".example.com") == "example.com"
idna.ToASCII("..example.com") == "example.com"

/cc @mpvl @nigeltao

mpvl commented 7 years ago

This behavior was added to pass the test set for IDNA 2008 provided by the Unicode consortium. The same behavior is displayed here: http://www.unicode.org/cldr/utility/idna.jsp?a=..example.com. It is also consistent with the behavior of Chrome.

However, one could argue that dots should not be removed and/or that having leading dots is an error. Safari has this behavior.

I suggest making an option for this which is enabled in the Lookup profile, but disabled everywhere else. This would make the Punycode profile (used by idna.ToASCII) compatibly with the old behavior.

gopherbot commented 7 years ago

CL https://golang.org/cl/44380 mentions this issue.

gopherbot commented 7 years ago

CL https://golang.org/cl/44381 mentions this issue.