ipfs / go-ipld-format

IPLD Node and Resolver interfaces in Go
https://github.com/ipld/ipld
MIT License
64 stars 26 forks source link

Better read-ahead when walking. #53

Open Stebalien opened 5 years ago

Stebalien commented 5 years ago

When walking an IPLD dag, the DagWalker prefetches the next 5-15 siblings of each node it fetches. Unfortunately, this means we don't always look ahead as much as we can.

For example, consider the following tree:

    o
   /|\
  o o o
 /| | |\
o o o o o

The current walker will fetch as follows:

  1. Fetch the first layer:
    ?
   /|\
  o o o
 /| | |\
o o o o o
  1. When we get the root, fetch it's first 10 children in parallel:
    x
   /|\
  ? ? ?
 /| | |\
o o o o o
  1. When we get the first block, fetch it's first 10 children in parallel.
    x
   /|\
  x ? ?
 /| | |\
? ? o o o

However, given this graph, we're now pre-fetching at most 4 blocks. Worse, once we fetch the first two leaf nodes, we'll end up in the following state:

    x
   /|\
  x x ?
 /| | |\
x x o o o

At this point, we haven't started pre-fetching the next node. That means we'll have to pause before we can continue.


What should we be doing? At a minimum, we should be prefetching at least 10 nodes at each layer. Ideally, we'd prefetch 10 nodes at each layer at a time until we've prefetched N nodes total (DFS order), where N is configurable.