libp2p / go-libp2p-kad-dht

A Kademlia DHT implementation on go-libp2p
https://github.com/libp2p/specs/tree/master/kad-dht
MIT License

Allow the ProviderManager to have more parallelism #729

Open aschmahmann opened 3 years ago

aschmahmann commented 3 years ago

The ProviderManager has a single event loop for managing all requests (puts, gets, etc.) https://github.com/libp2p/go-libp2p-kad-dht/blob/06918c87f618c38cb8b5819a31d9c475e154f7ee/providers/providers_manager.go#L113

The major operations in the event loop, such as adding and getting providers, block the loop and can take a long time (e.g. a network-based datastore lookup): https://github.com/libp2p/go-libp2p-kad-dht/blob/06918c87f618c38cb8b5819a31d9c475e154f7ee/providers/providers_manager.go#L135-L153

This can leave the ProviderManager backlogged and slow down network responses.

If we instead allow parallelism for the calls made within the event loop, for example by letting many adds or gets run at the same time, users could reduce response latencies by increasing the resources they use.

Stebalien commented 3 years ago

Also related: https://github.com/libp2p/go-libp2p-kad-dht/issues/675. But yeah, given high-latency blockstore operations, this should be parallelized.

Stebalien commented 3 years ago

Note: parallelism is a problem for gets, not puts (ish). The datastore is an "auto batching" datastore, so puts are effectively done in parallel.

I say "ish" because we'd need to have a "put" queue to make this fully parallel, otherwise we'll block every time we flush.

Stebalien commented 3 years ago

I guess the main issue here is that gets are blocking puts.

aschmahmann commented 3 years ago

Agreed. Generally the "event loop" pattern tends to fall apart when items inside the loop are blocking, so it'd be nice to have worker pools to keep the event loop clear.