lightninglabs / neutrino

Privacy-Preserving Bitcoin Light Client
MIT License
894 stars 182 forks source link

Don't hammer disconnecting peers #299

Closed buck54321 closed 3 months ago

buck54321 commented 3 months ago

A peer disconnect can trigger immediate reconnects to the same peer regardless of the nature of the disconnect.

Starting at (*Peer).Disconnect, closing the quit channel causes WaitForDisconnect to return in peerDoneHandler, which sends a message to handleDonePeerMsg. For non-persistent peers, this can cause two simultanous calls to GetNewAddress. Once more directly via NewConnReq, and once via (*ConnManager).Remove sending a handleDisconnected to the (*ConnManager).connHandler and (*ConnManager).handleFailedConn, which calls NewConnReq again. So a disconnecting peer can trigger two new connection requests immediately, and there's nothing preventing the same address from being used.

The newAddressFunc used by *server in btcd may also benefit from this change. I think it might be the reason for the // TODO: duplicate oneshots? comment.

Chinwendu20 commented 3 months ago

If I understand correctly, you are saying that disconnecting a peer can lead to immediate reconnection. You went further to state that this connection request happens twice for non persistent peers as disconnecting them would lead to a call to GetNewAddressFunc twice?

  1. As a result of the call to the Remove method of the connmanager
  2. Then here as well when a NewConnReq method is called directly

Right? For the first case, this specifically occurs during a call to the handleFailedConn here

Are my following correctly?

buck54321 commented 3 months ago

Yes. That's correct.

Chinwendu20 commented 3 months ago

Thanks, I do not think the first case happens though because of this conditional: https://github.com/btcsuite/btcd/blob/6b197d38d745048c5fe2a895010c9c0a4d9ab2a6/connmgr/connmanager.go#L319-L322

The Remove method sets the retry bool as false: https://github.com/btcsuite/btcd/blob/6b197d38d745048c5fe2a895010c9c0a4d9ab2a6/connmgr/connmanager.go#L489

buck54321 commented 3 months ago

Oooh. OK. Hmm. I'm seeing little clusters of calls to the same address. Up to 5 at a time unmetered on a loosely connected testnet. I've traced the clusters to mostly handleFailedConn and handleDonePeerMsg buy maybe I'm misunderstanding the nature. Lemme look a little closer.

buck54321 commented 3 months ago

I can't reliably reproduce. Will reopen if I figure something out.