paritytech / polkadot-sdk

The Parity Polkadot Blockchain SDK
https://polkadot.com/

authority-discovery: Publishing records strategy on DHT failures #3823

Open lexnv opened 8 months ago

lexnv commented 8 months ago

The authority-discovery will publish the DHT records (containing IP addresses) in the following manner:

At the moment, this does not handle DHT failures at all. There has also been a problem with resetting the DHT timers: the republish timers (1h) were always advanced on success, causing the DHT records not to be republished: https://github.com/paritytech/polkadot-sdk/pull/3764.

The proposed strategy for publishing records sooner:

This strategy aims to advertise DHT records more aggressively and gracefully handle DHT failures only after a threshold.
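A minimal sketch of how the proposed schedule could be expressed, using the numbers from the proposal as quoted further down the thread (a burst at 1, 2 and 4 minutes, a 30% failure threshold, a 10-minute retry interval). All names here (`PublishStats`, `burst_schedule`, `next_publish_delay`) are hypothetical and not part of authority-discovery. The sketch also assumes the burst delays are measured from the hourly trigger and that "failure rate" counts whole publish attempts:

```rust
use std::time::Duration;

// Hypothetical constants, taken from the numbers in the proposal.
const BURST_DELAYS_MIN: [u64; 3] = [1, 2, 4];
const FAILURE_RATE_THRESHOLD: f64 = 0.30;
const RETRY_INTERVAL: Duration = Duration::from_secs(10 * 60);
const REGULAR_INTERVAL: Duration = Duration::from_secs(60 * 60);

/// Hypothetical bookkeeping of publish outcomes (one entry per publish attempt).
#[derive(Default, Debug)]
pub struct PublishStats {
    succeeded: u32,
    failed: u32,
}

impl PublishStats {
    /// Record the outcome of a single publish attempt.
    pub fn record(&mut self, ok: bool) {
        if ok {
            self.succeeded += 1;
        } else {
            self.failed += 1;
        }
    }

    /// Fraction of attempts that failed; 0.0 when nothing was recorded yet.
    pub fn failure_rate(&self) -> f64 {
        let total = self.succeeded + self.failed;
        if total == 0 {
            0.0
        } else {
            f64::from(self.failed) / f64::from(total)
        }
    }
}

/// Delays of the initial burst, assumed to be measured from the hourly
/// trigger (or from a key change).
pub fn burst_schedule() -> Vec<Duration> {
    BURST_DELAYS_MIN
        .iter()
        .map(|&m| Duration::from_secs(m * 60))
        .collect()
}

/// Delay until the next publish after the burst: retry sooner when the
/// failure rate is above the threshold, otherwise wait the regular hour.
pub fn next_publish_delay(stats: &PublishStats) -> Duration {
    if stats.failure_rate() > FAILURE_RATE_THRESHOLD {
        RETRY_INTERVAL
    } else {
        REGULAR_INTERVAL
    }
}
```

With one failure out of three attempts the failure rate is above 30%, so `next_publish_delay` returns the 10-minute interval. With one failure out of four it falls back to the regular hour.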

Would love to hear your thoughts on this 🙏

cc @paritytech/networking @bkchr @alexggh

dmitry-markin commented 8 months ago
  • Keep track of how many records are successfully and unsuccessfully published

    • If failure rate > 30%: publish again at 10-minute intervals

What do you mean by 30% failure rate? The quorum currently is 100% of the replication factor: https://github.com/paritytech/polkadot-sdk/blob/e88d1cb79315792a3dbccb6bdef2543093ecaf5b/substrate/client/network/src/discovery.rs#L404

  • Once every hour (or when keys change)

    • publish the records 3 times, at 1 minute, 2 minutes, and 4 minutes

If we successfully published a record, i.e., the 20 closest peers were reached, I don't think there is a point in repeating the publishing.
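To illustrate why a per-peer failure rate does not apply here, a rough sketch of how a quorum turns per-peer results into a single success or failure per record. The `Quorum` enum below is a local stand-in modeled loosely on the quorum options Kademlia implementations typically offer. It is not the type used in `discovery.rs`, and `put_succeeded` is a hypothetical helper:

```rust
use std::num::NonZeroUsize;

/// Replication factor mentioned in the thread (the 20 closest peers).
pub const REPLICATION_FACTOR: usize = 20;

/// Local stand-in for a Kademlia quorum setting (hypothetical, for illustration).
pub enum Quorum {
    One,
    Majority,
    All,
    N(NonZeroUsize),
}

impl Quorum {
    /// Number of peers that must store the record for the put to count as successful.
    pub fn required(&self, replication: usize) -> usize {
        match self {
            Quorum::One => 1,
            Quorum::Majority => replication / 2 + 1,
            Quorum::All => replication,
            Quorum::N(n) => n.get().min(replication),
        }
    }
}

/// A put is reported as one outcome per record: success only if enough
/// peers stored it, failure otherwise.
pub fn put_succeeded(quorum: &Quorum, replication: usize, peers_stored: usize) -> bool {
    peers_stored >= quorum.required(replication)
}
```

Under a quorum of 100% of the replication factor, 19 out of 20 peers storing the record still counts as a failed publish, so the only observable outcome is success or failure per record.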

lexnv commented 8 months ago

Missed that. I was under the impression that we get one notification per peer. Thanks for the info 🙏