golang / go

x/net/idna: support label separators other than ASCII dot #19603

Open hnakamur opened 7 years ago

hnakamur commented 7 years ago

What version of Go are you using (go version)?

go version go1.8 linux/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/admin/go"
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build229576781=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
PKG_CONFIG="pkg-config"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"

What did you do?

Passed a string "example\u3002jp" to ToASCII().

What did you expect to see?

The return value is "example.jp"

What did you see instead?

The return value was "xn--examplejp-ck3h"

https://tools.ietf.org/html/rfc3490#section-3.1

3.1 Requirements

  IDNA conformance means adherence to the following four requirements:

  1) Whenever dots are used as label separators, the following
     characters MUST be recognized as dots: U+002E (full stop), U+3002
     (ideographic full stop), U+FF0E (fullwidth full stop), U+FF61
     (halfwidth ideographic full stop).

I created a fix and added test cases at https://github.com/hnakamur/net/commit/bd2fe133f3df97090c43d065b739774b900f67c9. I also followed the steps in the Go Contribution Guide and am ready to run git mail if this fix looks good to reviewers.

Thanks!

ALTree commented 7 years ago

This looks reasonable; at this point I would just mail the patch for review (people usually don't look at patches on the issue tracker).

hnakamur commented 7 years ago

Thanks for your comment. I ran git change and git mail. https://go-review.googlesource.com/c/38284/

mpvl commented 7 years ago

This is now supported if you use any of the non-default profiles, which use the UTS #46 mappings pre-split and so will normalize the dots and many other things you should do for proper IDNA support.

Would this be sufficient or should one also allow splitting on dots when one is using the raw Punycode profile?

hnakamur commented 7 years ago

@mpvl Thanks for your comment. I've updated the test case.

https://go-review.googlesource.com/c/38284/#message-4bfc92dfa0159b20d2bd3cdf1d7249a1c324fc33

Would this be sufficient or should one also allow splitting on dots when one is using the raw Punycode profile?

After reading https://tools.ietf.org/html/rfc3490#section-4, I think we must normalize dots in the raw Punycode profile. Also, the Registration profile must normalize dots instead of returning an error like idna: disallowed rune U+006A.