golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.84k stars 17.65k forks source link

net: should expand IP address 1.1 to 1.0.0.1 #36822

Open perpetual-hydrofoil opened 4 years ago

perpetual-hydrofoil commented 4 years ago

nameserver 1.1 in /etc/resolv.conf not parsed

What did you do?

go get github.com/kevinburke/twilio-go

What did you see instead?

go get github.com/kevinburke/twilio-go: module github.com/kevinburke/twilio-go:
Get https://proxy.golang.org/github.com/kevinburke/twilio-go/@v/list:
dial tcp: lookup proxy.golang.org on [::1]:53: dial tcp [::1]:53:
connect: connection refused

Workaround

Change 1.1 to 1.0.0.1 or 1.1.1.1 in /etc/resolv.conf

# /etc/resolv.conf
nameserver 1.1
davecheney commented 4 years ago

Looking at http://man7.org/linux/man-pages/man5/resolv.conf.5.html, name server is defined to be an IPV4 or IPV6 address, nameserver 1.1 appears to be neither so I don't see how it could be used as a DNS server.

Why did you expect that Go would be able to use 1.1 as a nameserver?

perpetual-hydrofoil commented 4 years ago

1.1 should expand to 1.0.0.1

Examples:

Try putting nameserver 1.1 into resolv.conf and then ping something; your resolver should properly use that resolver.

Or, just ping 1.1 or traceroute 1.1, or ping 127.1. On Linux (but apparently not Mac), you can even ping 0 (which expands to 0.0.0.0)

CloudFlare's public nameservers are documented at: https://1.1/ (which should be a working link in your browser)

dig google.com @1.1
....
;; ANSWER SECTION:
google.com.     120 IN  A   172.217.6.142
....
;; SERVER: 1.0.0.1#53(1.0.0.1)

The expansion rules are covered here (part of POSIX / IEEE 1003.1).

http://man7.org/linux/man-pages/man3/inet_addr.3.html

#! /usr/bin/env python

import socket
print(socket.gethostbyname('1.1'))
# '1.0.0.1'

It appears that net does not properly expand and parse IPv4 in this format, so that might be the underlying cause here.

ianlancetaylor commented 4 years ago

This is part of POSIX (IEEE 1003.1).

Do you have a citation for that? I didn't think that IEEE 1003.1 talked about networking at all. Thanks.

perpetual-hydrofoil commented 4 years ago
package main

import (
    "fmt"
    "net"
    "os"
)

func main() {
    ip := net.ParseIP("1.1")
    if ip.To4() == nil {
        fmt.Printf("IPv4 expansion is not working.")
        os.Exit(1)
    }
}
rhedile commented 4 years ago

Sorry Ian, but technically he is right. Couldn't find a way to link into the relevant part of the spec. Rgds, Nigel Vickers

The Open Group Base Specifications Issue 7, 2018 edition IEEE Std 1003.1™-2017 (Revision of IEEE Std 1003.1-2008) Copyright © 2001-2018 IEEE and The Open Group

include <arpa/inet.h>

in_addr_t inet_addr(const char cp); char inet_ntoa(struct in_addr in);

DESCRIPTION The inet_addr() function shall convert the string pointed to by cp, in the standard IPv4 dotted decimal notation, to an integer value suitable for use as an Internet address.

The inet_ntoa() function shall convert the Internet host address specified by in to a string in the Internet standard dot notation.

The inet_ntoa() function need not be thread-safe.

All Internet addresses shall be returned in network order (bytes ordered from left to right).

Values specified using IPv4 dotted decimal notation take one of the following forms:

a.b.c.d When four parts are specified, each shall be interpreted as a byte of data and assigned, from left to right, to the four bytes of an Internet address. a.b.c When a three-part address is specified, the last part shall be interpreted as a 16-bit quantity and placed in the rightmost two bytes of the network address. This makes the three-part address format convenient for specifying Class B network addresses as "128.net.host". a.b When a two-part address is supplied, the last part shall be interpreted as a 24-bit quantity and placed in the rightmost three bytes of the network address. This makes the two-part address format convenient for specifying Class A network addresses as "net.host". a When only one part is given, the value shall be stored directly in the network address without any byte rearrangement. All numbers supplied as parts in IPv4 dotted decimal notation may be decimal, octal, or hexadecimal, as specified in the ISO C standard (that is, a leading 0x or 0X implies hexadecimal; otherwise, a leading '0' implies octal; otherwise, the number is interpreted as decimal).

On Tue, 28 Jan 2020 at 16:25, Ian Lance Taylor notifications@github.com wrote:

This is part of POSIX (IEEE 1003.1).

Do you have a citation for that? I didn't think that IEEE 1003.1 talked about networking at all. Thanks.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/golang/go/issues/36822?email_source=notifications&email_token=ABAB73CP5DM7F6MOV672SZ3RABE5JA5CNFSM4KMLTTQ2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEKDWG6A#issuecomment-579298168, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABAB73GUHRTR33H4XTS7SULRABE5JANCNFSM4KMLTTQQ .

ianlancetaylor commented 4 years ago

Thanks for the pointer.

davecheney commented 4 years ago

Indeed, TIL!

ghost commented 4 years ago

http://1089054054/

davecheney commented 4 years ago

Well, that's just about all the internet I can handle today. Good evening

epelc commented 4 years ago

Are there any security issues related to this potentially changing behavior in higher level packages(net/http, etc)? I think it may be a good idea to make a note about the potential for SSRF and other security problems on the release notes if/when this is changed.

ie if you are filtering some internal address like 192.168.0.2 you could potentially expand via 192.168.2 and a simple filter might not catch it. I know best practice is to use proxy hosts setup on separate networks/subnets but it's been shown in the past with major programs(github enterprise) that many webhook implementations skip this or use alternative methods.

https://www.blackhat.com/docs/us-17/thursday/us-17-Tsai-A-New-Era-Of-SSRF-Exploiting-URL-Parser-In-Trending-Programming-Languages.pdf

FatalReadError commented 4 years ago

Good point (and that's an awesome presentation.)

I'm not clear on what exact attack vector you're suggesting here -- are we talking about URL routing with embedded IP's, general input sanitation, other protocols, etc. It seems like most naive implementations that would be expecting an IPv4 at all should also enforce a dotted-quad input sanitation filter, since any developer that would just allow untrusted input without even the bare minimum of bounds checking for some sort of properly formed IP (whatever that means to the developer!) will probably have far larger issues anyway.

Even if the developer has some inscrutable take on what an IPv4 "looks" like, in practice this expansion isn't really much different from how you can drop octets in IPv6, so perhaps the behavior should match whatever is currently done for IPv6 octets.

iwdgo commented 3 years ago

One POSIX online reference is on opengroup Searching for inet_addr returns the page pasted above.

gopherbot commented 3 years ago

Change https://golang.org/cl/268259 mentions this issue: net: expand IP when octets are missing

odeke-em commented 3 years ago

Punting to Go1.17, as per guidance on the CL after the need for clarification.

AkihiroSuda commented 3 years ago

In addition, net.ParseIP() should support hex and octal form.

e.g., "0x7F.010" should correspond to "127.0.0.8".

$ wget http://0x7F.010
--2021-03-13 16:38:50--  http://0x7f.010/
Resolving 0x7f.010 (0x7f.010)... 127.0.0.8
...
AkihiroSuda commented 3 years ago

Punting to Go1.17, as per guidance on the CL after the need for clarification.

The "7.4. Rare IP Address Formats" section of RFC3986 has explanation about the rare formats, though not formal definition: https://tools.ietf.org/html/rfc3986#section-7.4

> **7.4. Rare IP Address Formats** > Although the URI syntax for IPv4address only allows the common dotted-decimal form of IPv4 address literal, many implementations that process URIs make use of platform-dependent system routines, such as gethostbyname() and inet_aton(), to translate the string literal to an actual IP address. Unfortunately, such system routines often allow and process a much larger set of formats than those described in Section 3.2.2. > For example, many implementations allow dotted forms of three numbers, wherein the last part is interpreted as a 16-bit quantity and placed in the right-most two bytes of the network address (e.g., a Class B network). Likewise, a dotted form of two numbers means that the last part is interpreted as a 24-bit quantity and placed in the right-most three bytes of the network address (Class A), and a single number (without dots) is interpreted as a 32-bit quantity and stored directly in the network address. Adding further to the confusion, some implementations allow each dotted part to be interpreted as decimal, octal, or hexadecimal, as specified in the C language (i.e., a leading 0x or 0X implies hexadecimal; a leading 0 implies octal; otherwise, the number is interpreted as decimal). > These additional IP address formats are not allowed in the URI syntax due to differences between platform implementations. However, they can become a security concern if an application attempts to filter access to resources based on the IP address in string literal format. If this filtering is performed, literals should be converted to numeric form and filtered based on the numeric value, and not on a prefix or suffix of the string form.

fis commented 2 years ago

FWIW, I don't think POSIX has really made up its mind on this consistently either:

inet_addr, inet_ntoa: https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/functions/inet_addr.html

Values specified using IPv4 dotted decimal notation take one of the following forms: a.b.c.d / a.b.c / a.b / a

inet_ntop, inet_pton: https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/functions/inet_pton.html

If the af argument of inet_pton() is AF_INET, the src string shall be in the standard IPv4 dotted-decimal form: ddd.ddd.ddd.ddd where "ddd" is a one to three digit decimal number between 0 and 255 (see inet_addr). The inet_pton() function does not accept other formats (such as the octal numbers, hexadecimal numbers, and fewer than four numbers that inet_addr() accepts).

You could argue that net.ParseIP is closer to inet_pton in spirit, given that it supports both IPv4 and IPv6, and also does not accept the "rare" formats. Sticking to a stricter syntax would also be in line with how issue #30999 was resolved.

RFC 6943 section 3.1.1 referred in that issue also talks of this distinction between "strict" and "loose" forms:

> In specifying the inet_addr() API, the Portable Operating System Interface (POSIX) standard [IEEE-1003.1] defines "IPv4 dotted decimal notation" as allowing not only strings of the form "10.0.1.2" but also allowing octal and hexadecimal, and addresses with less than four parts. For example, "10.0.258", "0xA000102", and "012.0x102" all represent the same IPv4 address in standard "IPv4 dotted decimal" notation. We will refer to this as the "loose" syntax of an IPv4 address literal. > In Section 6.1 of [RFC3493], getaddrinfo() is defined to support the same (loose) syntax as inet_addr(): [--] In contrast, Section 6.3 of the same RFC states, specifying inet_pton(): [--] > As shown above, inet_pton() uses what we will refer to as the "strict" form of an IPv4 address literal. Some platforms also use the strict form with getaddrinfo() when the AI_NUMERICHOST flag is passed to it. > Both the strict and loose forms are standard forms, and hence a protocol specification is still ambiguous if it simply defines a string to be in the "standard IPv4 dotted decimal form". [--] > [--] New protocols and data formats should similarly consider using the strict form rather than the loose form in order to better match user expectations. [--]
codewinch commented 2 years ago

The RFC 6943 reference is very interesting, especially that part you quoted:

Both the strict and loose forms are standard forms

favonia commented 1 year ago

The RFC 6943 reference is very interesting, especially that part you quoted:

Both the strict and loose forms are standard forms

The RFC 6943 mentions this because it wanted to point out the imprecise wording in other protocols, not that it was trying to defend the loose form. In fact, the RFC 6943 is arguably against using the loose form. It says:

New protocols and data formats should similarly consider using the strict form rather than the loose form in order to better match user expectations.

favonia commented 1 year ago

If we wish to parse the loose form, I personally hope we could keep a function recognizing only the strict form to implement various protocols more faithfully (including the URIs mentioned above---http://1089054054/ probably should not work, even if many implementations support it).

Two security concerns could be related here, though the first one is not limited to the loose form:

  1. The first concern is the ambiguity between "registered names" (often resolved via DNS) and IPv4 addresses. This concern always exists because even the strict form "10.1.2.3" could be a registered name. The loose form is only changing how this ambiguity is resolved. Given that different platforms might be resolving this ambiguity differently, perhaps we can never make things worse on this point. The URI standard (RFC 3986) has a rule based on the strict form, and that's why I think it's worthwhile to have a function for the strict form.

  2. The second concern is that if an application stupidly assumes that it can filter IPv4 addresses (e.g., for protecting some internal network or checking whether it's 127.0.0.1) using naive string comparison under the assumption that the net package only recognizes the strict form, the filtering can now fail. An easy "fix" is to normalize IPv4 addresses first, which means the net package should keep its current implementation of String without doing any "compression" as for IPv6 addresses, even if certain "compression" can be allowed in the loose form.

tebeka commented 11 months ago

The above PR is my understanding of the issue.