golang / go

proposal: net/v2: Listen is unfriendly to multiple address families, endpoints and subflows #9334

Open pmarks-net opened 9 years ago

pmarks-net commented 9 years ago

The following example program:

package main

import (
        "net"
)

func main() {
        net.Listen("tcp", "localhost:8080")
        select{}
}

Currently yields this result:

$ netstat -nl | grep 8080
tcp        0      0 127.0.0.1:8080          0.0.0.0:*               LISTEN

While the following result would be optimal:

$ netstat -nl | grep 8080
tcp6       0      0 127.0.0.1:8080          :::*                    LISTEN
tcp6       0      0 ::1:8080                :::*                    LISTEN

(Note that the first socket is actually dualstack, and bound to ::ffff:127.0.0.1, but that's less critical than adding the second socket bound to ::1.)

More generally, when you call net.Listen() on a hostname which resolves to multiple IPv4/IPv6 addresses, only the first IPv4 address is selected. An analogous problem occurs if you Listen("tcp", ":8080") on an operating system that doesn't support dualstack sockets: instead of returning a pair of sockets bound to [::]:8080 and 0.0.0.0:8080, you only get IPv4.

The fundamental flaw is that Listen() assumes a single socket, which is a leaky abstraction that's inappropriate for high-level things like example servers, e.g.: http://golang.org/pkg/net/#pkg-overview

Go should either adapt the Listen() API to support multiple sockets, or if that's not feasible, a new multi-socket API should be introduced, which deprecates Listen() for all cases except simple non-wildcard addresses.

bradfitz commented 9 years ago

/cc @mikioh

minux commented 9 years ago

The correct way to listen for both tcp4 and tcp6 is to leave out the hostname part. net.Listen("tcp", ":8080")

Will listen on tcp6 if it's dual-stack system.

I don't think net.Listen should automatically create multiple listeners, one for each IP that the hostname resolves to; that is a higher-level thing that the client should take care of.

For example, what if the machine adds another network interface and another IP? Should the net package track that and automatically listen on the newly added IP? This is a decision that the client program should make, not the net package.

pmarks-net commented 9 years ago

> net.Listen("tcp", ":8080") ... Will listen on tcp6 if it's dual-stack system.

That's not entirely accurate. There do exist dual-stack systems which don't support dual-stack sockets, so AF_INET and AF_INET6 must be kept separate. In that case, Listen falls back to IPv4, which is suboptimal.

> for each IP that the hostname resolves to, that is a higher-level thing

If Listen didn't accept hostnames then I'd agree with you, but hostnames are accepted, and with great power comes great responsibility. Arbitrarily picking one IP address just isn't right. If the "prefer IPv4" bug were fixed, then localhost listeners would flip to ::1 only, which would surprise a lot of people.
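
For the wildcard case, a caller who does not want to depend on dual-stack sockets can ask for the two address families explicitly. The sketch below is only an illustration: listenWildcard is a hypothetical helper, and it assumes the platform allows a "tcp4" and a "tcp6" wildcard listener to coexist on the same port.

```go
package main

import (
	"fmt"
	"net"
)

// listenWildcard is a hypothetical helper. It tries to open separate IPv4 and
// IPv6 wildcard listeners on the given port and returns whichever succeeded.
func listenWildcard(port string) ([]net.Listener, error) {
	endpoints := []struct{ network, host string }{
		{"tcp4", "0.0.0.0"},
		{"tcp6", "::"},
	}
	var lns []net.Listener
	var lastErr error
	for _, ep := range endpoints {
		ln, err := net.Listen(ep.network, net.JoinHostPort(ep.host, port))
		if err != nil {
			lastErr = err // e.g. the address family is not available
			continue
		}
		lns = append(lns, ln)
	}
	if len(lns) == 0 {
		return nil, lastErr
	}
	return lns, nil
}

func main() {
	lns, err := listenWildcard("8080")
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	for _, ln := range lns {
		fmt.Println("listening on", ln.Addr())
		ln.Close()
	}
}
```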

minux commented 9 years ago

What should happen if more IPs are added to a hostname? Should the net package automatically listen on those newly added addrs?

No matter what we do, I think the behavior will surprise some users.

Also, what do you think the Listener's Addr() method should return for a listener that listens on multiple addresses? Returning the hostname is not entirely correct, because by the time Addr() is called, the hostname might resolve to a different set of addresses. Also note that Addr represents a network endpoint address; I don't think a hostname fits here.

IMHO, the existing documentation actually precludes listening on multiple addresses with a single listener.

I have always thought that supporting hostnames in Listen is just a convenience, and if you want absolute control, you should use explicit addresses.

pmarks-net commented 9 years ago

> what do you think the Listener's Addr() method should return for a listener that listens on multiple addresses?

I don't have a good answer, and that's the fundamental flaw I was referring to. Since there's nothing sane for Addr() to return, it may be necessary to create a new MultiListener, and migrate the prominent documentation and sample code.

> What should happen if more IPs are added to a hostname?

Perhaps have a type that stores a resolved AddressSet, and another type that holds the listening sockets, where you can Read the AddressSet (to see which sockets exist) or Write it (to open new sockets and close obsolete ones). Then sufficiently-crazy users could resolve multiple hostnames, merge the results together, and dynamically update the pool of sockets without interrupting existing listeners.

But I don't care strongly about the dynamic stuff; I just think listening on localhost or ::+0.0.0.0 should be easy.

The problem is that net.Listen() is easy, popular, and wrong. The alternative (keeping track of multiple listening sockets, with dual-stack behavior that varies by OS) is so much more involved that people avoid doing the right thing, and IPv6 compatibility suffers as a result.

minux commented 9 years ago

I wouldn't call the current net.Listen wrong. It just can't handle every possible case. And I don't think a dual-stack system without IPv4-mapped IPv6 support is the common case. (Even if we add such support, how could we test it on the builders?)

See also #8124, which has more in-depth discussion of the prefer-IPv4 issue. Supporting all possible configurations is just an impossible task.

pmarks-net commented 9 years ago

> Even if we add such support, how could we test it on the builders?

I encountered the same question when working on a C networking library, and the solution was to funnel the IPV6_V6ONLY calls through a function with a test-only flag that emulates an OS without dualstack sockets:

/* Returns 1 on success (dual-stack), 0 on failure (single-stack). */
static int set_socket_dualstack(int fd) {
  if (!forbid_dualstack_sockets_for_testing) {
    const int off = 0;
    return 0 == setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &off, sizeof(off));
  } else {
    /* Force an IPv6-only socket, for testing purposes. */
    const int on = 1;
    setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &on, sizeof(on));
    return 0;
  }
}

Granted, a flag is a bit unsightly, but it's an easy way to ensure coverage of the :: to ::+0.0.0.0 fallback logic.

> Supporting all possible configurations is just an impossible task.

I don't agree with that defeatist position; getaddrinfo() with a loop solves the problem for non-pathological cases, and is significantly better than "Pick one IPv4 address."

When a standard library exposes hostname-based socket interfaces, it's the library's responsibility to follow best practices, instead of taking shortcuts. "one hostname ⇒ one socket" is a shortcut that's not obvious to users of the library, but which seems harmful to the networking ecosystem.

Pretending that no code had yet been written, I don't think a Listen(hostname) design that arbitrarily picks a single IP address would stand up to logical scrutiny.

mikioh commented 9 years ago

We perhaps need to support this eventually. In that case, the new stuff should support not only dual-stack TCP listeners but also SCTP and MPTCP listeners.

Random thoughts on API:

Random thoughts on testing:

Random thoughts on roadmap:

References:

PS: Moreover, I'd want to see what happens with the IP Stack Evolution Program: http://www.ietf.org/proceedings/91/slides/slides-91-iab-techplenary-6.pdf

mattn commented 9 years ago

Is this a related issue? #7598

mikioh commented 9 years ago

No, #7598 is a simple, Windows-specific investigation of how Windows dual IP stacks behave when we use them.

gopherbot commented 7 years ago

CL https://golang.org/cl/31931 mentions this issue.

psa commented 5 years ago

> The correct way to listen for both tcp4 and tcp6 is to leave out the hostname part. net.Listen("tcp", ":8080")

A counterpoint to this: it listens on all interfaces. I have a system with multiple interfaces, each dual-stack, and I need my application to listen on only a single interface. I got caught by this issue because I used hostname:9100 (which resolves to both IPv4 and IPv6 addresses on that interface), as the most common behaviour for a networking stack which accepts a hostname is to listen on all addresses that the hostname references at application initialization.

Another approach would be to accept the interface name and port (e.g. bge0:9100) and listen to all IPs allocated to the interface at the time of initialization.

It's rare to see software which supports the IPs changing after application start, so no matter whether the name resolution or interface approach is taken, after initial resolution I would expect that it's up to the admin to either reconfigure or restart the application if those changes are required.

Obviously, having a way of telling the system to re-check every X seconds would be a nice to have, but that's far from expected behaviour, whereas resolving all IPs at start is expected behaviour (as can be seen by the large number of references to this issue).

thockin commented 4 years ago

Has anyone written a sane example of multi-listen?

tsavola commented 4 years ago

> Has anyone written a sane example of multi-listen?

I wrote https://github.com/tsavola/listen, but I'm not too happy about the implementation.

justinclift commented 4 years ago

@tsavola With your implementation, what would need to be changed for you to be happy with it?

tsavola commented 4 years ago

> @tsavola With your implementation, what would need to be changed for you to be happy with it?

Each acceptLoop goroutine accepts underlying connections before they are requested by the application; one or more TCP connections may be established before the application calls Accept. If the listener is wrapped by a connection limiter or such, there might be a long delay before the connection is actually passed to the application--or the connection might never be seen by the application.

I would be happy if the application's Accept call would directly cause the establishment of at most one TCP connection.

a-robinson commented 3 years ago

I don't expect to tilt the scales much compared to the very impressive selection of open source projects that appear to have run into this, but add my vote to the pile in favor of allowing binding on a hostname or on a network interface name to listen to all associated IPs. It shouldn't be so hard to listen on both an IPv4 and IPv6 address associated with a given hostname (without listening on all interfaces).

halturin commented 1 year ago

are there any updates on this issue? 8 years now.

llllvvuu commented 1 year ago

> as the most common behaviour for a networking stack which accepts a hostname is to listen on all addresses that the hostname references at application initialization.

Do you have any examples of networking stdlibs that behave this way? I agree that the proposed behavior would be an improvement, but unfortunately Node.js doesn't have it either for example:

Welcome to Node.js v20.3.1.
> require('http').createServer((_, res) => res.end("hi")).listen(3000, "localhost")
; curl 127.0.0.1:3000
curl: (7) Failed to connect to 127.0.0.1 port 3000 after 6 ms: Couldn't connect to server
; curl localhost:3000
hi
; cat /etc/hosts
127.0.0.1       localhost
255.255.255.255     broadcasthost
::1                          localhost

nor does Python:

; python -m http.server --bind localhost    
Serving HTTP on ::1 port 8000 (http://[::1]:8000/) ...
; curl localhost:8000
...
; curl 127.0.0.1:8000
curl: (7) Failed to connect to 127.0.0.1 port 8000 after 6 ms: Couldn't connect to server

In Rust, the only HTTP server I can find which accepts hostname is Iron, which also resolves only to one interface:

; cd examples
; cargo run --bin hello
; curl localhost:3000
Hello world!%
; curl 127.0.0.1:3000
curl: (7) Failed to connect to 127.0.0.1 port 3000 after 6 ms: Couldn't connect to server