paritytech / polkadot-sdk

The Parity Polkadot Blockchain SDK
https://polkadot.network/

Request/Response adaptive timeouts #778

Open rphmeier opened 1 year ago

rphmeier commented 1 year ago

At the moment, the maximum timeout and response size are set once for the entire request/response protocol. When a request is made with enough context, it should instead be possible to configure the maximum timeout and response size for that particular request. This would let us write more sophisticated networking protocols.
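
To make the idea concrete, here is a minimal sketch of per-request limits bounded by the protocol-wide ones. The names `ProtocolLimits`, `RequestLimits` and `clamp` are hypothetical and are not existing polkadot-sdk or Substrate APIs; the sketch only shows the shape such a configuration could take.

```rust
use std::time::Duration;

/// Hypothetical per-request limits chosen by the caller, who knows the context.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RequestLimits {
    pub timeout: Duration,
    pub max_response_size: u64,
}

/// Hypothetical protocol-wide caps, i.e. what is configured per protocol today.
#[derive(Debug, Clone, Copy)]
pub struct ProtocolLimits {
    pub max_timeout: Duration,
    pub max_response_size: u64,
}

impl ProtocolLimits {
    /// Accept the caller's per-request limits, but never exceed the protocol caps.
    pub fn clamp(&self, requested: RequestLimits) -> RequestLimits {
        RequestLimits {
            timeout: requested.timeout.min(self.max_timeout),
            max_response_size: requested.max_response_size.min(self.max_response_size),
        }
    }
}
```

Keeping the protocol-wide values as an upper bound means existing protocols behave as before, while callers with more context can ask for something tighter.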

eskimor commented 1 year ago

What are the exact goals - as in why is the timeout important/needed? The timeout in Substrate is a hard cap: if it is hit, we cancel the existing download even if it has already reached like 99%, wasting all the effort. The timeout has the potential to completely cripple a protocol if set too tight (no peer able to provide the data within the timeout).

What we used in the past, e.g. in availability-recovery, is the notion of a soft timeout: when it is hit, we start additional parallel requests but leave the old ones running. Would that be applicable here?
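
A rough sketch of the soft-timeout idea, using plain threads and a channel as a stand-in for in-flight network requests. This is not the availability-recovery code; `fetch_with_soft_timeout` and its signature are made up for illustration, and the hard timeout is deliberately left out.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Try the given requests in order. If no response arrives within
/// `soft_timeout`, start the next request in parallel and keep the earlier
/// ones running. Returns the first successful response and the index of the
/// request that produced it, or `None` if every request failed.
pub fn fetch_with_soft_timeout<T, F>(requests: Vec<F>, soft_timeout: Duration) -> Option<(usize, T)>
where
    T: Send + 'static,
    F: FnOnce() -> Option<T> + Send + 'static,
{
    let (tx, rx) = mpsc::channel::<(usize, Option<T>)>();
    let mut pending = requests.into_iter().enumerate().peekable();
    let mut in_flight = 0usize;

    loop {
        // Start one more request if any is left; give up once nothing is running.
        if let Some((idx, request)) = pending.next() {
            let tx = tx.clone();
            thread::spawn(move || {
                // The receiver may already be gone if another request won.
                let _ = tx.send((idx, request()));
            });
            in_flight += 1;
        } else if in_flight == 0 {
            return None;
        }

        // Only apply the soft timeout while there is another request to start.
        // A real implementation would still enforce a hard timeout here.
        let received = if pending.peek().is_some() {
            rx.recv_timeout(soft_timeout).ok()
        } else {
            rx.recv().ok()
        };

        match received {
            Some((idx, Some(response))) => return Some((idx, response)),
            Some((_, None)) => in_flight -= 1, // that request failed, try the next
            None => {}                         // soft timeout hit, old requests keep running
        }
    }
}
```

The slow request is never cancelled, so if it finishes before the newly started ones, its work is not wasted.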

eskimor commented 1 year ago

Depending on the exact requirements, another option is to play with queue sizes. For good distribution of load, if queue sizes on honest nodes are relatively small, we will get an error immediately when the peer is under load and don't have to waste time waiting for the timeout. In this scenario the timeout mostly exists to limit the harm malicious peers*) can do, hence we should be able to make it relatively generous.

*) and long latency between two particular peers.
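
For illustration, a small bounded queue that rejects immediately when full could look roughly like this. It uses `std::sync::mpsc::sync_channel` as a stand-in for the real incoming-request queue; `RequestQueue` and `Rejected` are hypothetical names, not existing types.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Why an incoming request was not queued.
#[derive(Debug, PartialEq, Eq)]
pub enum Rejected {
    /// The queue is full: the node is under load, the requester should try another peer.
    Busy,
    /// Nobody is processing the queue any more.
    Closed,
}

/// Hypothetical bounded queue for incoming requests.
pub struct RequestQueue<R> {
    sender: SyncSender<R>,
}

impl<R> RequestQueue<R> {
    /// Create a queue holding at most `capacity` requests (use `capacity >= 1`).
    pub fn new(capacity: usize) -> (Self, Receiver<R>) {
        let (sender, receiver) = sync_channel(capacity);
        (Self { sender }, receiver)
    }

    /// Never blocks: either the request is queued or the caller learns right away why not.
    pub fn try_enqueue(&self, request: R) -> Result<(), Rejected> {
        self.sender.try_send(request).map_err(|err| match err {
            TrySendError::Full(_) => Rejected::Busy,
            TrySendError::Disconnected(_) => Rejected::Closed,
        })
    }
}
```

With a small `capacity`, a loaded peer answers `Busy` at once, so the requester can move on without waiting for any timeout.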

burdges commented 1 year ago

We could have subchunks in availability, complete with a deeper Merkle proof, if parablock size were ever problematic for downloads.

rphmeier commented 1 year ago

Yes, we can build higher-level timeout logic on top. The main goal is to do stuff like exponential back-off on requests and attempts and start with low timeouts with certain peers and move to higher ones.
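
As a sketch of what that higher-level logic might look like, here is a timeout schedule that starts low and grows geometrically up to a cap. `TimeoutBackoff` is a hypothetical helper, and the initial value, factor and cap are arbitrary assumptions rather than proposed values.

```rust
use std::time::Duration;

/// Hypothetical per-peer timeout schedule: start low, grow on each attempt.
#[derive(Debug, Clone)]
pub struct TimeoutBackoff {
    next: Duration,
    max: Duration,
    factor: u32,
}

impl TimeoutBackoff {
    /// `initial` is the timeout for the first attempt, `max` the cap that is never exceeded.
    pub fn new(initial: Duration, max: Duration, factor: u32) -> Self {
        Self { next: initial.min(max), max, factor }
    }

    /// Timeout to use for the next attempt; later calls return larger values up to `max`.
    pub fn next_timeout(&mut self) -> Duration {
        let current = self.next;
        self.next = self.next.saturating_mul(self.factor).min(self.max);
        current
    }
}
```

With `TimeoutBackoff::new(Duration::from_millis(500), Duration::from_secs(3), 2)`, successive attempts would use 500 ms, 1 s, 2 s and then 3 s from there on. This only works if the low-level request API accepts a timeout per request.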

eskimor commented 1 year ago

Ok, I was actually aiming at one level deeper. Why do we want that exponential back-off, starting with low timeouts?

rphmeier commented 1 year ago

It may be useful in paritytech/polkadot#5999, for instance. Exponential back-off or other back-offs are useful in general as a tool in networking protocols, so it's good to make sure the low-level code can support such things.