golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

net: Dial only tries one address, fails on IPv6-only system if that addr is IPv4 #8124

Closed gopherbot closed 9 years ago

gopherbot commented 10 years ago

by google@barrera.io:

$ go version
go version go1.2.2 linux/amd64
$ cat test.go 
package main

import "net"
import "fmt"

func main() {
        conn, err := net.Dial("tcp", "google.com:80")
        if err != nil {
                fmt.Println(err)
                return // conn is nil on error; do not write to it
        }
        defer conn.Close()
        fmt.Fprintf(conn, "GET / HTTP/1.0\r\n\r\n")
}
$ go run test.go
dial tcp 173.194.42.72:80: network is unreachable

---

This is happening on an IPv6-only host, so there should be no attempt to connect to the
above IPv4 address; the IPv6 address (i.e. the AAAA record) should be tried instead.
mikioh commented 10 years ago

Comment 1:

Status changed to Duplicate.

Merged into issue #5707.

mikioh commented 10 years ago

Comment 2:

Not really.
http://golang.org/pkg/net/#Dialer
https://groups.google.com/d/msg/golang-nuts/pLALjlQ1aTI/3pN0isO9pSwJ

Status changed to Retracted.

gopherbot commented 10 years ago

Comment 3 by google@barrera.io:

> issue #5707
This mentions broken test scenarios; this ticket is about end-user applications
themselves failing due to a bug in the core libraries. It may be due to a related bug,
though not necessarily. Also, my kernel *is not* IPv6-only, but my networks have no IPv4
addresses (except for 127.0.0.1).
> https://groups.google.com/d/msg/golang-nuts/pLALjlQ1aTI/3pN0isO9pSwJ
This simply seems to provide workarounds, usable by devs in the meantime, but the core
libraries still need fixing.
mikioh commented 10 years ago

Comment 4:

Sorry, referring to issue #5707 was a mistake.
> but the core libraries still need fixing.
Well, I don't understand what you are saying. Can you please elaborate on your use cases
and the bugs that need to be fixed? For example, if you are talking about something a bit
smarter, neither a bump-in-the-stack nor a bump-in-the-API approach but something like the
bump-in-the-host approach (http://tools.ietf.org/html/rfc6535), please explain which
feature is required for which package. Otherwise, we can do nothing for you.

Status changed to New.

gopherbot commented 10 years ago

Comment 5 by google@barrera.io:

I don't know how to make this clearer. :(
Basically, the below code should not fail if I don't have IPv4 addresses. In particular,
my network has NAT64, so this should even work with IPv4-only remote hosts, since the
DNS64 will return proper AAAA records.
I'm not sure how other languages/libraries handle this (maybe they check for the lack of
IPv4-routes?). Go seems to be attempting to communicate to the A record every time,
regardless of network configuration.
$ cat test.go 
package main
import "net"
import "fmt"
func main() {
        conn, err := net.Dial("tcp", "google.com:80")
        if err != nil {
                fmt.Println(err)
                return // conn is nil on error; do not write to it
        }
        defer conn.Close()
        fmt.Fprintf(conn, "GET / HTTP/1.0\r\n\r\n")
}
$ go run test.go
dial tcp 173.194.42.72:80: network is unreachable
mikioh commented 10 years ago

Comment 6:

> I don't know how to make this clearer. :(
Looks like it's been discussed since the birth of IPv6 and we still have no concrete
solution. Basically the problem spans several research areas, as follows:
- for layer7-3: appropriate namespace resolution, including fighting against leaky
abstractions,
- for layer7-3: connectivity and reachability measurement, with developing useful metrics,
- for layer4-3: a few clever techniques for traversing islands where various
address/protocol translation techniques are deployed.
> I'm not sure how other languages/libraries
I don't know of other libraries that behave well enough. I guess they recommend using a
combo of getaddrinfo and connect. Its equivalent in Go is a combo of LookupIP and Dial
with literal address.
> the below code should not fail if I don't have IPv4 addresses
> net.Dial("tcp", "google.com:80")
I don't think so, because, a) the internet is always broken somewhere, it's hard,
pointless to provide a diehard API, b) your node configuration is your matter, API
doesn't care, c) API already provides several control knobs; for example, you can pick
your favorite one from:
1) "tcp"+"www.google.com:http"
2) "tcp"+"173.194.117.212:http"
3) "tcp"+"[2404:6800:4004:804::1010]:http"
4) "tcp4"+"www.google.com:http"
5) "tcp4"+"173.194.117.212:http"
6) "tcp6"+"www.google.com:http"
7) "tcp6"+"[2404:6800:4004:804::1010]:http"
8) DualStack=true+"tcp"+"www.google.com:http"
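
As a rough illustration of how the network name in the list above restricts the address family, the sketch below resolves literal addresses under "tcp", "tcp4" and "tcp6". The helper name resolves is an assumption for this example, and the addresses are documentation-only literals (192.0.2.1 and 2001:db8::1), so no DNS lookup or connection is made.

```go
package main

import (
	"fmt"
	"net"
)

// resolves reports whether address is acceptable for the given network name.
// (Hypothetical helper: a thin wrapper around net.ResolveTCPAddr.)
func resolves(network, address string) bool {
	_, err := net.ResolveTCPAddr(network, address)
	return err == nil
}

func main() {
	networks := []string{"tcp", "tcp4", "tcp6"}
	addresses := []string{"192.0.2.1:80", "[2001:db8::1]:80"}
	for _, network := range networks {
		for _, address := range addresses {
			// "tcp4" is expected to reject the IPv6 literal and "tcp6" the
			// IPv4 literal, while plain "tcp" accepts both.
			fmt.Printf("%-4s %-18s ok=%v\n", network, address, resolves(network, address))
		}
	}
}
```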
> (maybe they check for the lack of IPv4-routes?)
I think dealing with connected routes and/or assigned routable addresses doesn't make
sense. How could we know that address belongs to which line: expensive LTE mobile line,
secure and censored company-VPN line, or dark/honeypot stuff. Instead you can implement
your own smart stuff that gathers environment information and selects the situational
path, and connect to the resources using net.Dial.
> Go seems to be attempting to communicate to the A record every time, regardless of
network configuration.
The current standard library prefers IPv4 over IPv6 when users are not sure which
address family is good.
mikioh commented 10 years ago

Comment 7:

Labels changed: added repo-main.

Status changed to LongTerm.

gopherbot commented 10 years ago

Comment 8 by google@barrera.io:

> I don't think so, because, a) the internet is always broken somewhere, it's hard,
pointless to provide a diehard API, b) your node configuration is your matter, API
doesn't care, c) API already provides several control knobs; for example, you can pick
your favorite one from:
a) But other languages do so. By not fixing this, I, as an end user, find that go
applications consistently "won't work", with nothing I can do about them.
b) Yes, my node is configured properly. Only go (and nodejs) applications fail. Every
other library and language out there has managed to fix this.
c) *I* can't. Not as an end user. Developers can, but I can't go around telling devs of
dozens of different applications to start working around issues that could be fixed in a
core library they all share in common.
> I think dealing with connected routes and/or assigned routable addresses doesn't make
sense. How could we know that address belongs to which line: expensive LTE mobile line,
secure and censored company-VPN line, or dark/honeypot stuff. Instead you can implement
your own smart stuff that gathers environment information and selects the situational
path, and connect to the resources using net.Dial.
That would be secondary. I think it's better if applications accidentally use a slower
connection rather than simply fail with no possible workaround. If I have *no* IPv4
routes, it's safe to assume IPv6 should be defaulted.
Why not just try the AAAA record if the A one failed? Once both fail, consider it a
failure. That's rather simple, and would not add any new issues or regressions.
mikioh commented 10 years ago

Comment 9:

Again, please elaborate on your issue, use cases, which feature is required for which
package if possible.
You are only saying "should not fail", "other languages do something", or "why not just
try". You never explain the details of the environment in which your issue happens. We
can guess that:
- perhaps you can get both A+AAAA records to the resource,
- perhaps your environment doesn't deploy DNS64-like namespace resolution mechanism,
- perhaps you have IPv6 transport,
- perhaps you don't have IPv4 transport, not even with private addresses,
- perhaps your environment doesn't deploy NAT64/464XLAT/MAP-T-like transport translation
mechanism,
- perhaps your environment doesn't deploy 6to4/6rd/Teredo/ISATAP/MAP-E-like transport
tunneling mechanism,
- perhaps your node implements and runs both IPv4 and IPv6 stacks,
- perhaps your gai.conf or other configuration prefers IPv6 address family,
- perhaps your go standard library is built with CGO_ENABLED=1 (no use of builtin DNS
resolver).

Then, when you run net.Dial("tcp", "google.com:80") in your application:
- glibc or similar's getaddrinfo uses IPv6 transport for DNS queries,
- net.LookupIP returns A+AAAA records,
- your application doesn't use a Happy Eyeballs-like solution, i.e. net.Dialer{DualStack:
true},
- net.Dial attempts a single call to syscall.Connect with the A record,
- net.Dial returns a "network is unreachable" error.
But all of the above are just my dumb guesses. Even if those guesses are about right,
there are several solutions in several areas you can use. What we need, or at least what
I need, is to pin down your issue first and then develop a solution; if that solution
requires an API change, it must not be a breaking change, to keep the promise of API
compatibility: http://tip.golang.org/doc/go1compat.
> Only go (and nodejs) applications fails.
Interesting, can you show us, explain how other language or its standard library can
resolve this issue?
gopherbot commented 10 years ago

Comment 10 by google@barrera.io:

Oh my, I never noticed this last comment. Sorry for the delay replying.
> Again, please elaborate on your issue, use cases, which feature is required for which
package if possible.
I expect this code to work:
    net.Dial("tcp", "google.com:80")
On a system that has the network connectivity to successfully run:
    curl google.com
----
This was actually pointed out to me recently, when coming across another golang
application: http://golang.org/src/pkg/net/ipsock.go#L75
The library prefers IPv4 and does not retry anything. If one fails, the next address
should be tried. Or at least an address of a different family.
Also, it doesn't even care which address the system resolver returns first (mine prefers
IPv6 results).
Sorry that I can't really contribute more in-depth details: I'm not even a go developer,
I'm an end user who has consistently found that applications developed in golang
constantly fail to open any network connections, but I'm in no way proficient with the
language or its API. All I perceive (as an end user) is that applications developed in
go "don't have network capabilities" on IPv6 nodes (which includes all my home
computers).
> Interesting, can you show us, explain how other language or its standard library can
resolve this issue?
I honestly don't have the understanding to follow such low-level code in most languages.
I believe that some actually retry different addresses before giving up. E.g. if the
resolver returns IPv4 and IPv6 addresses, they're all tried, rather than trying just one
and giving up.
mikioh commented 10 years ago

Comment 11:

Thanks for the comment. If I understand correctly, you don't need a fix that covers
all the leaky abstractions in networking (between transport-layer and network-layer
namespaces, between IPv4 and IPv6 connectivity, and so on); what you really need is just
a fix for issue #8453 (and #8455), right?
If so, please give your star to issues #8453 and #8455, thanks.
gopherbot commented 9 years ago

Comment 12 by hugoosvaldobarrera:

After carefully looking at those issues (#8453, #8455), I just noticed that they won't
fix the problem here since they both refer to dual-stacked hosts. This scenario is on
IPv6-only hosts.
To be more precise, here's the exact problem:
The following code is rather common in go-based applications.
    $ cat test.go 
    package main
    import "net"
    import "fmt"
    func main() {
            conn, err := net.Dial("tcp", "google.com:80")
            if err != nil {
                    fmt.Println(err)
                    return // conn is nil on error; do not write to it
            }
            defer conn.Close()
            fmt.Fprintf(conn, "GET / HTTP/1.0\r\n\r\n")
    }
    $ go run test.go
    dial tcp 173.194.42.72:80: network is unreachable
My host has no IPv4 connectivity (but my network *does* have NAT64):
    $ ifconfig wlan0
    wlan0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 169.254.234.197  netmask 255.255.0.0  broadcast 169.254.255.255
            inet6 fe80::bae8:56ff:fe18:7bf6  prefixlen 64  scopeid 0x20<link>
            inet6 2800:40:7aa:1:bae8:56ff:fe18:7bf6  prefixlen 64  scopeid 0x0<global>
            ether b8:e8:56:18:7b:f6  txqueuelen 1000  (Ethernet)
            RX packets 125144  bytes 154240526 (147.0 MiB)
            RX errors 0  dropped 0  overruns 0  frame 10925
            TX packets 92115  bytes 15505924 (14.7 MiB)
            TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
            device interrupt 18
The above code should not return an error, since the remote host *is* reachable:
    $ curl -I www.google.com
    HTTP/1.1 302 Found
    Cache-Control: private
    Content-Type: text/html; charset=UTF-8
    Location: http://www.google.com.ar/?gfe_rd=cr&ei=mqGAVNS1J8aFwASZjoCoAw
    Content-Length: 262
    Date: Thu, 04 Dec 2014 18:02:02 GMT
    Server: GFE/2.0
    Alternate-Protocol: 80:quic,p=0.02
As is, common go applications have no network capabilities on IPv6-only hosts when using
net.Dial, which is commonly used. There's little motivation for developers out there to
use LookupIP and Dial, because the documentation of net.Dial seems to indicate that the
effect would be the same.
We also can't have all this sort of low-level code repeated across every application out
there - these things are usually handled by the language core and/or the networking
library.
pmarks-net commented 9 years ago

I agree that Issue #8453 will fix this problem, provided that the solution is to call getaddrinfo() and loop over the resulting addresses.

On an IPv6-only client, getaddrinfo will sort the IPv6 address first. Even if it doesn't, the lack of a default route will cause IPv4 connections to fail, so the loop will switch to IPv6 almost immediately.

rsc commented 9 years ago

Closing as duplicate of #8453.

WhyNotHugo commented 9 years ago

Not quite the same as #5707.
This is not an IPv6-only kernel: I have IPv4 loopback, and some of my devices have link-local IPv4 addresses (e.g. 169.254.103.45).

I only have IPv6 internet connectivity, and my OS is set to always prefer IPv6.

The difference lies in that detecting a lack of IPv4 capabilities would fix #5707, but not this issue.

mikioh commented 9 years ago

@hobarrera,

See #8847. For go1.4 and below, you can use net.Dialer{DualStack: true}.Dial(...).

WhyNotHugo commented 9 years ago

> For go1.4 and below, you can use net.Dialer{DualStack: true}.Dial(...).

Like I said on the original issue in google-code, this sort of advice does not apply to me. I'm not the developer of the failing applications, but merely a user.

Having to understand, track down, and manually change the source, and then manually build any software that I use, is not an option.

> See #8847.

#8453 (to which #8847 refers) would seem to fix the issue (and finally, go applications can communicate over the Internet!).

I wonder though: Why is this labelled LongTerm? Internet connectivity seems as critical as can be in 2015.

mikioh commented 9 years ago

@hobarrera,

This issue is already merged into #8453. If you want to discuss your opinion, use case, whatever, please use golang-nuts.