celestiaorg / celestia-core


Investigate initial timeouts on DHT / DAS #377

Closed liamsi closed 3 years ago

liamsi commented 3 years ago

Summary

We observed the following behavior (first me, later confirmed by @Wondertan): when spinning up a lazyledger validator node on DigitalOcean and starting a light client locally, DAS for the light client times out.

We currently work around this by adding the full node's IPFS multiaddress to the light client's bootstrap nodes, but it is important:

Wondertan commented 3 years ago

After extensive investigation, the issue of DAS content resolution timing out turned out to be relatively trivial to diagnose but not straightforward to fix.

TL;DR: Block production is much faster than DHT providing, leading to a constantly growing gap between the DASing node and the network.

Investigation

Key Takeaways

Solutions

Proper

The most valuable contribution to resolving this is potentially https://github.com/lazyledger/lazyledger-core/issues/378, where we store the DAHeader in IPFS and rely on a single DataHash instead of multiple roots. Furthermore, we should start providing the DataHash synchronously to meet the first takeaway, and the DataHash only to satisfy the second. However, we may also consider keeping asynchronous providing for all the remaining nodes and leaves in the background, to contribute to subtree observability just in case. Optimistically, that providing operation should finish before it is the node's turn to propose and start providing again.
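A minimal sketch of that providing order, assuming a generic content-routing interface; `ContentRouter`, `ProvideBlock`, `dataHashCID`, and `innerNodeCIDs` are hypothetical names for illustration, not existing lazyledger-core identifiers:

```go
package provide

import (
	"context"
	"log"
	"time"
)

// ContentRouter is a hypothetical stand-in for the DHT providing operation.
type ContentRouter interface {
	// Provide announces to the network that this node can serve the given key.
	Provide(ctx context.Context, key string) error
}

// ProvideBlock announces the single DataHash synchronously, so the block is
// discoverable before the proposer moves on, and pushes the remaining inner
// nodes/leaves in the background purely to improve subtree observability.
func ProvideBlock(ctx context.Context, r ContentRouter, dataHashCID string, innerNodeCIDs []string) error {
	// Synchronous: the one key DASing light clients need to find the block.
	syncCtx, cancel := context.WithTimeout(ctx, time.Minute)
	defer cancel()
	if err := r.Provide(syncCtx, dataHashCID); err != nil {
		return err
	}

	// Asynchronous: ideally finishes before this node's next turn to propose.
	go func() {
		for _, c := range innerNodeCIDs {
			if err := r.Provide(ctx, c); err != nil {
				log.Printf("background provide of %s: %v", c, err)
			}
		}
	}()
	return nil
}
```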

Fortunately, the recently discussed new DHT update comes into play here as well. More info here. It also contributes to the third takeaway by introducing a new DHT node type with full routing tables that can short-circuit long queries. However, I need to look more deeply into the implementation to understand all the features and possible tradeoffs before relying on it.

Quicker for MVP

The proper solution would take too much time for the MVP. Thus we need to come up with something short-term and, ideally, not too time-consuming:

Wondertan commented 3 years ago

Even after testing with Bitswap providing disabled, we won't achieve ~30 secs to announce all DAH roots at max block size (yet to be proven). Thus, the issue remains.

I can now confirm that even with manual sync providing and all IPFS/Bitswap async providing disabled, I still get a similar and impractical ~3 mins to announce 32 roots to the network.

Wondertan commented 3 years ago

For the MVP case, we can also rely on rows only. This workaround halves the providing time (~1.5 min), which I observed in practice.

Wondertan commented 3 years ago

Some more info and explanation regarding the new DHT client mentioned in the proper solution, and how it can help solve this specific case of long-lasting providing. To understand why it helps, we should first understand how it and the regular client work.

Let's start with an explanation of how kDHT searching works. Imagine a circle of dots grouped into buckets of size k (how the circle is formed is out of scope here), where each dot is a network node storing some part of the global key-to-value mappings. When any dot in the circle wants to find or put a value for a key, it:

  1. gets the dots closest to the key from its own bucket,
  2. queries them for the closest dots in their buckets,
  3. repeats step 2 recursively until it finally reaches the dot holding the value, to get or set it.

So the basic DHT client suffers from having to make these multiple hops towards the closest dots, and those hops are the main reason it takes so much time to provide/put something on the DHT.
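To make the hop counting concrete, here is a self-contained toy sketch of that iterative lookup. IDs are small integers and closeness is XOR distance, as in Kademlia; all names here are illustrative and unrelated to the real go-libp2p-kad-dht code:

```go
package main

import (
	"fmt"
	"sort"
)

const k = 3 // bucket size: how many closest dots each node knows/returns

// node is one "dot" in the circle; peers are the dots in its buckets.
type node struct {
	id    uint16
	peers []uint16
}

// closest returns up to k of the given ids, ordered by XOR distance to target.
func closest(ids []uint16, target uint16) []uint16 {
	sorted := append([]uint16(nil), ids...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i]^target < sorted[j]^target
	})
	if len(sorted) > k {
		sorted = sorted[:k]
	}
	return sorted
}

// lookup walks towards target starting from start's own bucket (step 1),
// querying the closest dots for their closest dots (step 2) and repeating
// until no closer dot is found (step 3). It returns the number of hops made.
func lookup(network map[uint16]*node, start, target uint16) (hops int) {
	current := closest(network[start].peers, target)
	visited := map[uint16]bool{start: true}

	for {
		var next []uint16
		for _, id := range current {
			if visited[id] {
				continue
			}
			visited[id] = true
			hops++ // one network round trip per newly contacted dot
			next = append(next, closest(network[id].peers, target)...)
		}
		if len(next) == 0 {
			return hops
		}
		candidate := closest(append(next, current...), target)
		if candidate[0]^target >= current[0]^target {
			return hops // no progress towards the key: lookup converged
		}
		current = candidate
	}
}

func main() {
	// A tiny network where every dot only knows a few others, so reaching
	// a far-away key requires several hops.
	network := map[uint16]*node{
		1:   {1, []uint16{2, 4, 8}},
		2:   {2, []uint16{1, 4, 16}},
		4:   {4, []uint16{2, 8, 32}},
		8:   {8, []uint16{4, 16, 64}},
		16:  {16, []uint16{8, 32, 64}},
		32:  {32, []uint16{16, 64, 100}},
		64:  {64, []uint16{32, 100, 1}},
		100: {100, []uint16{64, 32, 16}},
	}
	fmt.Println("hops from dot 1 to key 100:", lookup(network, 1, 100))
}
```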

Instead of keeping only some portion of the key/value mappings, the new DHT client periodically crawls and syncs the whole network. This allows it to make 0 hops and to do set or get operations directly with the relevant dots. Comparable to blockchain state syncing, this DHT client also requires some time to instantiate and download the whole network state. Luckily, we can already rely on practical results showing providing times of <3 sec. Furthermore, the new client also handles the case of disappearing and unreliable DHT nodes, as it remembers what they were providing, preserving content discoverability. However, keeping a copy of the full DHT network/routing table on the node is not cheap; still, a proposer's interest aligns with fast providing and preserving solid content discoverability, so that's a valid tradeoff.
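Continuing the toy sketch above (my illustration, not project code): with a fully synced view of the network, choosing which dots to contact becomes a purely local operation, so the multi-hop walk disappears. `allIDs` stands in for the result of the periodic crawl:

```go
// 0 hops: with the whole network crawled and cached locally, the provider
// simply picks the k closest dots to the key from its own view and sends
// the provider records to them directly.
func lookupWithFullTable(allIDs []uint16, target uint16) []uint16 {
	return closest(allIDs, target)
}
```

The cost is the memory and crawl traffic needed to keep that local view fresh, which is exactly the tradeoff mentioned above.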

musalbas commented 3 years ago

Can we verify that, after a node downloads data from a peer it has discovered via the DHT, it will maintain a connection to that peer using Bitswap?

liamsi commented 3 years ago

Edited the opening comment to reflect this sub-task and hid both of our comments to keep this focused.

Wondertan commented 3 years ago

> to understand if we would see these timeouts during consensus too

Setting this to done.

liamsi commented 3 years ago

Thanks! Can you update your comment above to include a sentence or two about the result? Otherwise it is hard to see what the outcome of this was.

Wondertan commented 3 years ago

For our DHT case, we have the Provide and GetProviders operations. On the IPFS network, a Provide operation can take up to 3 mins, which was the main cause of this issue; GetProviders can take up to 1 min, but often takes less than 10 secs. For networks of fewer than 20 nodes, both operations should take less than 10 secs, as the bucket size is 20 and no hops are expected.
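As a rough illustration of how these bounds could be applied, here is a hedged sketch that wraps both operations in timeout contexts derived from the numbers above. The `Router` interface is a hypothetical stand-in mirroring the shape of these operations, not the actual libp2p API:

```go
package dastiming

import (
	"context"
	"time"
)

// Router is a hypothetical stand-in for the two DHT content-routing
// operations discussed above, not the real libp2p interface.
type Router interface {
	Provide(ctx context.Context, key string) error                  // announce that we hold the data
	GetProviders(ctx context.Context, key string) ([]string, error) // find who holds it
}

// Upper bounds observed on the public IPFS network (per the numbers above);
// networks smaller than the bucket size of 20 should stay well under 10s.
const (
	provideTimeout      = 3 * time.Minute
	getProvidersTimeout = 1 * time.Minute
)

// TimedProvide bounds Provide by its observed worst case and reports how
// long the announcement actually took.
func TimedProvide(ctx context.Context, r Router, key string) (time.Duration, error) {
	ctx, cancel := context.WithTimeout(ctx, provideTimeout)
	defer cancel()
	start := time.Now()
	err := r.Provide(ctx, key)
	return time.Since(start), err
}

// TimedGetProviders does the same for the lookup side.
func TimedGetProviders(ctx context.Context, r Router, key string) ([]string, time.Duration, error) {
	ctx, cancel := context.WithTimeout(ctx, getProvidersTimeout)
	defer cancel()
	start := time.Now()
	provs, err := r.GetProviders(ctx, key)
	return provs, time.Since(start), err
}
```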

@liamsi, those timings are mostly inevitable and apply in any case, so if used with consensus they would show up there as well. Fortunately for us, we decided to go with the push approach.

Closing this; further work and info is now here: https://github.com/lazyledger/lazyledger-core/issues/395