libp2p / specs

Technical specifications for the libp2p networking stack
https://libp2p.io
1.58k stars 275 forks source link

AutoNAT: Clarify cases in which response status `E_DIAL_ERROR` should be returned #411

Closed elenaf9 closed 2 years ago

elenaf9 commented 2 years ago

Follow up on https://github.com/libp2p/rust-libp2p/pull/2618#discussion_r867529638.

Concerning the ResponseStatus E_DIAL_ERROR, the AutoNAT spec currently only states that:

If all dials fail, the receiver sends a DialResponse message with the ResponseStatus E_DIAL_ERROR.

However, the Go implementation also returns DialErrors without even trying to the dial a remote peer. Concretely, this happens when skip_dial returns true for the client's observed address, which is the case if it is e.g. a relayed or private address. I don't think that this should count as an dial error since the server never actually tried to dial the client. Instead a DialRefused should be returned (which would also match the returned statustext "refusing to dial peer with blocked observed address"_).

If folks disagree and are in favor of returning a DialError, this should be part of the spec.

marten-seemann commented 2 years ago

I agree that it should be a DialRefused, the DialError has caused us a lot of headache in the past: It's not possible to give your peer a single address, and just probe that single address.

We can clarify this in the spec, however, we can't change the way that already deployed Go implementations behave. This means that probing a single address won't be possible until all nodes have upgraded (which, at the current speed of upgrades in the IPFS network means a couple of years). This was one of the reasons for Autonat v2 proposal.

elenaf9 commented 2 years ago

I think if we agree that DialError should only be returned if an actual dial attempt was made then the current specs are fine. I just wanted to avoid that a DialError it is returned for a case that is not specified. I'm happy to do a PR to go-libp2p to change it there (so that at least the upgrading nodes are aligned with this), unless you think this could cause problems in the IPFS network?

mxinden commented 2 years ago

I'm happy to do a PR to go-libp2p to change it there (so that at least the upgrading nodes are aligned with this)

I am in favor of this.

marten-seemann commented 2 years ago

@elenaf9 Let's do it! And add a comment that we can't rely on this error code since all versions before v0.20.0 sent the wrong error code.

MarcoPolo commented 2 years ago

It's not possible to give your peer a single address, and just probe that single address.

I don't follow. Wouldn't you be able to set the dial message to have only one address?

I think this change could be backwards compatible since older nodes only check if the response was OK or not. https://github.com/libp2p/go-libp2p/blob/393e3518b397a1ae88fd825c85f892956868543f/p2p/host/autonat/client.go#L79

marten-seemann commented 2 years ago

It's not possible to give your peer a single address, and just probe that single address.

I don't follow. Wouldn't you be able to set the dial message to have only one address?

I should have been more precise. A positive response will be actionable, but when you receive a DialError, you don't know if that's because the peer tried to dial you and failed, or the peer just didn't support the transport (for example).

elenaf9 commented 2 years ago

Closing this since we agree that there is only one case in which E_DIAL_ERROR should be returned, which is the one that is already covered in the spec. With libp2p/go-libp2p#1527 and https://github.com/libp2p/rust-libp2p/pull/2618 this is now also changed in the Go and Rust implementations.