libp2p / go-libp2p

libp2p implementation in Go
MIT License

swarm: better backoff logic #1554

Open Stebalien opened 7 years ago

Stebalien commented 7 years ago
  1. We should try to distinguish between local failures and remote failures. At the very least, we should be resetting our backoffs when new links/routes come online.
  2. We should probably be backing off on a per-multiaddr basis, not a per-peer basis, unless we establish a connection to the peer and it tells us to go away (that needs a new protocol; related to https://github.com/libp2p/go-libp2p/issues/238).

Came up in: https://github.com/libp2p/go-libp2p-kad-dht/issues/96

mishto commented 6 years ago

Can we expose baseBackoffTime and maxBackoffTime? The default values are arbitrary, and different applications may want different settings.

Stebalien commented 6 years ago

Fair enough. Also, it looks like our backoffs aren't actually exponential...

Stebalien commented 6 years ago

This will be fixed in a large refactor/simplification that's coming down the pipe.

Stebalien commented 6 years ago

Note to self: Refund backoff "tries" after a period of time. Currently, if we go to max-backoff, wait an hour, and then fail a single dial, we'll wait the max backoff again. We should, instead, notice that an hour has passed and forget all the previous failures.

Code:

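    now := time.Now()
    // Only refund once more than BackoffBase has elapsed since the backoff
    // expired; otherwise math.Sqrt would be given a negative value.
    if sinceLast := now.Sub(bp.until); sinceLast > BackoffBase {
        // Refund backoff time at the same rate.
        refund := int(math.Sqrt(float64(sinceLast-BackoffBase) / float64(BackoffCoef)))
        if refund < bp.tries {
            bp.tries -= refund
        } else {
            bp.tries = 0
        }
    }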

Not going to do this now because we have so many other changes in the pipeline and we may want to discuss this.

mishto commented 6 years ago

Sounds good, thanks.

Stebalien commented 4 years ago

Working through all the different backoff cases:

Stebalien commented 4 years ago

Status: While @petar's patches are likely the right way to go in the future, they introduce quite a few new interfaces that'll need to be discussed. In the interest of getting a fast fix in, @willscott is implementing (#191) a dumb version that just backs off full addresses inside the swarm itself without changing core libp2p interfaces.

That gives us some breathing room.