libp2p / specs

Technical specifications for the libp2p networking stack
https://libp2p.io
1.56k stars 273 forks source link

identify: clarify that `listenAddrs` should be external addresses #597

Closed thomaseizinger closed 7 months ago

thomaseizinger commented 10 months ago

When taken literally, the spec currently demands to include the listen addresses in this field. That is pretty useless unless the node is listening on an interface that is directly connected to the public Internet.

In most cases, a node will be behind NAT which might have port-forwarding configured to be externally reachable. In either case, the local listening addresses are not going to help other peers in connecting.

Please consider this change in isolation of other PRs / issues around the identify spec. Ideally, the entire spec would be overhauled because it e.g. doesn't mention peer records (also see #347). However, not all implementations support peer records in identify and effort on extending the spec has stalled due to various issues.

It would be great if we could just clarify this one element in isolation without opening the can of worms that the spec overhaul is. Thank you!

cc interest group: @vyzo @yusefnapora @tomaka @richardschneider @Stebalien @bigs

Related: https://github.com/libp2p/rust-libp2p/issues/4010.

tomaka commented 10 months ago

I've always been extremely reluctant to do that because it kind of breaks private networks (say, running a network of several nodes on your localhost machine). The fact that it breaks private network isn't super problematic by itself. What is problematic is that is breaks private networks silently. Anyone trying to run a private network will break their teeth trying to figure out why it's not working.

The text in this PR says "the addresses on which the peer is deemed to be reachable by other peers", but for this reason that is not the same thing as external addresses, like the PR implies it is.

One way to solve this could be to still return local addresses if the peer that is sending the identify request is local as well. However, that might interact very badly with things such as relaying, and I'm personally in favor of the requesting side of identify filtering out the addresses that it can't reach. Having a protocol be simple and deterministic is very much desirable.

Besides saving a few hundred bytes of bandwidth every minute, which really seems more than negligible to me, what is the motivation behind this change?

marten-seemann commented 10 months ago

I agree with @tomaka. What "external" means should depend on what peer you're talking to. In fact, we implemented this in go-libp2p: https://github.com/libp2p/go-libp2p/pull/2300 (and some follow-up PRs).

Using listen addresses is bad terminology though. You could be listening on 0.0.0.0, but that's never an acceptable address to advertise (127.0.0.1 would be, if you're connected to another node on localhost).

b-zee commented 5 months ago

I'm personally in favor of the requesting side of identify filtering out the addresses that it can't reach.

I think that is indeed good design. Each node has the responsibility to filter out nonsense. No information can be trusted, so we verify.

Besides saving a few hundred bytes of bandwidth every minute, which really seems more than negligible to me, what is the motivation behind this change?

One aspect I do not see mentioned is privacy/security. Members of our community do not like leaking their private networking information. My node gives out a bunch of listen addresses, like my WLAN, Ethernet, and private VPN subnet setup, etc.

I think the spec should at least acknowledge the possibility of wanting to be selective with what is shared.

MarcoPolo commented 5 months ago

Let's try not to continue discussion on this closed PR, but I'll respond here since I'm hoping I can solve your problem.

One aspect I do not see mentioned is privacy/security...

I don't think this is an issue the spec needs to define. Although I could see a version of the spec making recommendations.

This sounds like an implementation decision. Maybe bring up the ability to filter the advertised listenAddrs as a feature request to your preferred libp2p implementation? It seems like a reasonable request that wouldn't be hard to implement.