golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.56k stars 17.61k forks source link

net: isDomainName rejects valid domains #17659

Open gibson042 opened 7 years ago

gibson042 commented 7 years ago

isDomainName is limited to hostname-compatible "preferred name" LDH labels, with an exception made for underscores per #1167. But RFC 2181 is clear about any octet being valid in a DNS label:

Those [length] restrictions aside, any binary string whatever can be used as the label of any resource record.

What version of Go are you using (go version)?

go1.7.1

What operating system and processor architecture are you using (go env)?

linux/amd64 GOARCH="amd64" GOHOSTARCH="amd64" GOHOSTOS="linux" GOOS="linux" CC="gcc" GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build693750050=/tmp/go-build -gno-record-gcc-switches" CXX="g++" CGO_ENABLED="1"

What did you do?

query uncommon domain names ```go package main import ( "fmt" "net" ) func main() { domains := []string{ `*.golang.org`, `\.\.\..golang.org`, `\065.golang.org`, `\000.golang.org`, } for _, domain := range domains { fmt.Printf("\n%s\n", domain) fmt.Println(net.LookupHost(domain)) } } ```

What did you expect to see?

successful DNS queries ``` *.golang.org [216.58.219.209 2607:f8b0:4006:80e::2011] \.\.\..golang.org [216.58.219.209 2607:f8b0:4006:80e::2011] \065.golang.org [216.58.219.209 2607:f8b0:4006:80e::2011] \000.golang.org [216.58.219.209 2607:f8b0:4006:80e::2011] ```

What did you see instead?

valid names were rejected ``` *.golang.org [] lookup *.golang.org: invalid domain name \.\.\..golang.org [] lookup \.\.\..golang.org: no such host \065.golang.org [216.58.219.209 2607:f8b0:4006:80e::2011] \000.golang.org [] lookup \000.golang.org: no such host ```
quentinmit commented 7 years ago

/cc @rsc in case he disagrees

rsc commented 7 years ago

The hard part here is that it seems to be the case that not any binary name should be accepted. Otherwise thing like Punycode would not exist. What is the dividing line for what can be looked up directly and what needs special processing?

rsc commented 7 years ago

/cc @mdempsky @mikioh @mpvl @pmarks-net

cznic commented 7 years ago

The hard part here is that it seems to be the case that not any binary name should be accepted.

Not according to RFC1035. However, RFC 2181 updates RFC1035 in the way @gibson042 reported, ie. a label can consist of any octets. RFC4343, Section 2 discusses handling and escaping non-ASCII labels.

mikioh commented 7 years ago

https://tools.ietf.org/html/rfc6055#section-3 shows some discussion on the use of non-LDH labels in DNS.

bradfitz commented 7 years ago

Thanks for the links, but looks like there's still no decision here. Bumping to Go 1.10.

/cc @mdempsky

gopherbot commented 6 years ago

Change https://golang.org/cl/73830 mentions this issue: net: recognize non-LDH domain names as valid per RFC 1035

gibson042 commented 6 years ago

I finally got around to fixing this—the code should clarify any open questions, but I'm also happy to address them directly. https://go-review.googlesource.com/c/go/+/73830

rsc commented 6 years ago

It looks like we are still waiting on a proposed answer for the question above: "What is the dividing line for what can be looked up directly and what needs special processing?"

Can anyone give an authoritative answer or at least make a good suggestion?

/cc @mdempsky @mikioh @mpvl @pmarks-net

pmarks-net commented 6 years ago

I don't really understand the naming rules well enough to provide a specific answer, but my general opinion is that the DNS client library should err on the side of being permissive, unless there's a compelling reason not to.

horgh commented 6 years ago

I don't have an answer to the question about what needs special processing, but I wanted to offer support for the idea of being permissive in the isDomainName() check:

While it does appear invalid, in practice in the wild there are hosts starting with hyphens. For example, -inevitablelament.tumblr.com. Many resolvers (two examples I'm aware of are glibc (some versions) and Net::DNS::Resolver) happily resolve these. Because of this, I think it would be good if the net resolver did as well.

Although looking at the linked changeset, it looks like being more permissive is the direction we're going, so that's good!

gibson042 commented 6 years ago

Because input domain names are assumed to be in common display format (as it is called in the current DNS Terminology draft), special processing is required for U+005C REVERSE SOLIDUS \ (which begins an escape sequence) and for any code point greater than U+007F DELETE (which is the highest code point with a single-octet UTF-8 representation)—and of course for U+002E FULL STOP ., (which separates labels). All other runes can be unambiguously used in a wire format QNAME, and their treatment is therefore purely a matter of taste.

Accepting all remaining visible ASCII characters lacking special meaning in the DNS should be non-controversial, and in https://go-review.googlesource.com/c/go/+/73830 I also decided to allow just about every ASCII character when backslash-prefixed (but could be talked into changing the exact set). Regardless, it MUST be possible to specify any octet with \DDD decimal escape sequences.

mikioh commented 6 years ago

What is the dividing line ...

I think that the desirable line is already described in RFC 6055.

IMHO, at least the function isDomainName should be replaced with a series of LDHLabels, ASCIILabels and NetUnicodeLabels (I don't want to use the terminology of IDNA such as A-Label and U-Label here because NetUnicodeLabels might accept non-NFC UTF-8 or UTF-16/32 labels) functions and we probably should provide some control knob to users for choosing the namespace label encoding:

If you want to add more additional control knob for IDN support, that's probably fine, but please be careful not conflating NetUnicode or other UTF stuff and IDNA-valid U-labels.

gopherbot commented 6 years ago

Change https://golang.org/cl/99623 mentions this issue: dns/dnsmessage: fix handling of non-LDH domain names

snowcrumble commented 6 years ago

In kubernetes , isDomainName reject Pods "A Record" "pod-ip-address" , for example: "172-17-0-16", as RFC 3696 section 2 requires "top-level domain names not be all-numeric", but this example has three hyphen, so I think it should not be reject.

bradfitz commented 6 years ago

@SnowCrumble, I sent https://go-review.googlesource.com/c/go/+/136900 for that.

gopherbot commented 6 years ago

Change https://golang.org/cl/136900 mentions this issue: net: don't reject domain names with only numbers and hyphens

mdempsky commented 5 years ago

Regardless, it MUST be possible to specify any octet with \DDD decimal escape sequences.

RFC 1035 § 5 is about master files, which package net does not support.

Is there another RFC that extends the requirement for supporting \DDD escapes to client resolver libraries like package net?

gibson042 commented 5 years ago

RFC 1035 § 5 is about master files, which package net does not support.

Is there another RFC that extends the requirement for supporting \DDD escapes to client resolver libraries like package net?

Over the years, the ASCII syntax defined in that section has been generalized into what is now known as "presentation format"¹. As far as I know, it is the only valid text-safe means of representing an arbitrary domain name. When you say "package net does not support" it, I think you mean that it supports only a subset—and if that is indeed intentional, then the gap should be documented and the error messages should be normalized and corrected.

But that seems like a mistake to me... what are the reasons for not allowing Lookup methods to resolve arbitrary domain names?


¹ In the context of DNS, the earliest use of that term I could find is in in RFC 3658 (before that it was still "Master File Format" as in RFC 1706, with a transition beginning sometime around RFC 2535).

bradfitz commented 5 years ago

I haven't been following this bug closely, but:

But that seems like a mistake to me... what are the reasons for not allowing Lookup methods to resolve arbitrary domain names?

Obviously we weren't trying to make Lookup fail to resolve arbitrary domain names. More than likely we tried to add some validation somewhere and got it wrong and ended up too aggressive.

bradfitz commented 5 years ago

The hard part here is that it seems to be the case that not any binary name should be accepted. Otherwise thing like Punycode would not exist. What is the dividing line for what can be looked up directly and what needs special processing?

I saw this guide to modern DNS recently which ignores all the old parts of DNS RFCs which are no longer accurate:

https://powerdns.org/hello-dns/

Maybe the answer to this question is in there somewhere.

gibson042 commented 5 years ago

Punycode exists because although any octet is permissible in a DNS label, there is no guidance on how to interpret octets 128 through 255. It is definitively not acceptable to assume that they are UTF-8, and any domain name including them, although valid, is not an Internet host name because it does not conform to the letter-digit-hyphen "preferred name syntax" (but note that underscore names such as _jabber._tcp.example.com do not conform to it either—they are not "host names" but nonetheless need to remain resolvable, which is why Go already makes an exception as of 2408a4bbbd8db78c890b9911868fb4face7558d7 to explicitly permit underscores). Punycode is a non-UTF-8 encoding of non-ASCII strings into ASCII, with the express purpose of representing internationalized domain names as true LDH host names.

But all of that, while interesting for background purposes, does not change the need to respect arbitrary octets in domain name labels as required by RFC 2181. I really don't know how to be more clear in demonstrating that such is the case.

ghost commented 3 years ago

Hello, are there any new developments on this?