When walking an IPLD dag, the DagWalker prefetches the next 5-15 siblings of each node it fetches. Unfortunately, this means we don't always look ahead as much as we can.
For example, consider the following tree:
o
/|\
o o o
/| | |\
o o o o o
The current walker will fetch as follows:
Fetch the first layer:
?
/|\
o o o
/| | |\
o o o o o
When we get the root, fetch it's first 10 children in parallel:
x
/|\
? ? ?
/| | |\
o o o o o
When we get the first block, fetch it's first 10 children in parallel.
x
/|\
x ? ?
/| | |\
? ? o o o
However, given this graph, we're now pre-fetching at most 4 blocks. Worse, once we fetch the first two leaf nodes, we'll end up in the following state:
x
/|\
x x ?
/| | |\
x x o o o
At this point, we haven't started pre-fetching the next node. That means we'll have to pause before we can continue.
What should we be doing? At a minimum, we should be prefetching at least 10 nodes at each layer. Ideally, we'd prefetch 10 nodes at each layer at a time until we've prefetched N nodes total (DFS order), where N is configurable.
When walking an IPLD dag, the DagWalker prefetches the next 5-15 siblings of each node it fetches. Unfortunately, this means we don't always look ahead as much as we can.
For example, consider the following tree:
The current walker will fetch as follows:
However, given this graph, we're now pre-fetching at most 4 blocks. Worse, once we fetch the first two leaf nodes, we'll end up in the following state:
At this point, we haven't started pre-fetching the next node. That means we'll have to pause before we can continue.
What should we be doing? At a minimum, we should be prefetching at least 10 nodes at each layer. Ideally, we'd prefetch 10 nodes at each layer at a time until we've prefetched N nodes total (DFS order), where N is configurable.