rs-ipfs / rust-ipfs

The InterPlanetary File System (IPFS), implemented in Rust.
Apache License 2.0

ipfs::unixfs::cat prefetching or prefetching in general #341

Open koivunej opened 4 years ago

koivunej commented 4 years ago

A larger issue: when, for example, cat'ing a UnixFS file, we are currently doing it the slowest possible way:

  1. fetch a block
  2. yield file contents
  3. inspect its links
  4. select next applicable link, goto 1

A faster way would be to fetch more blocks ahead of time, but of course the right strategy is not so clear. It might be a good idea to study how go-ipfs and js-ipfs have solved this issue.
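To make the difference concrete, here is a minimal sketch of a bounded lookahead, using only the standard library. Everything in it is hypothetical and not rust-ipfs code: `Block`, `Store`, `sample_store` and `cat_with_prefetch` are made-up names, blocks are keyed by plain integers instead of CIDs, and a sleeping thread stands in for a network fetch. With `window = 0` it degenerates into the fetch, yield, inspect, goto 1 loop described above.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical stand-in for a UnixFS block: file bytes plus ordered child links.
#[derive(Clone)]
struct Block {
    data: Vec<u8>,
    links: Vec<u32>,
}

/// Hypothetical block store; the sleep imitates network latency.
struct Store(HashMap<u32, Block>);

impl Store {
    fn fetch(&self, id: u32) -> Block {
        thread::sleep(Duration::from_millis(10));
        self.0[&id].clone()
    }
}

/// A link we still have to visit: either untouched or already being fetched.
enum Pending {
    Queued(u32),
    InFlight(JoinHandle<Block>),
}

/// Starts fetching one block on its own thread.
fn start(store: &Arc<Store>, id: u32) -> JoinHandle<Block> {
    let store = Arc::clone(store);
    thread::spawn(move || store.fetch(id))
}

/// Walks the DAG depth-first in link order, keeping fetches running for up to
/// `window` of the next pending links while the current block is awaited.
fn cat_with_prefetch(store: Arc<Store>, root: u32, window: usize) -> Vec<u8> {
    let mut out = Vec::new();
    let mut pending = VecDeque::new();
    pending.push_back(Pending::Queued(root));

    while let Some(next) = pending.pop_front() {
        let handle = match next {
            Pending::Queued(id) => start(&store, id),
            Pending::InFlight(handle) => handle,
        };
        let block = handle.join().expect("fetch thread panicked");
        out.extend_from_slice(&block.data);

        // Children go in front of whatever was already pending, in link order.
        for &link in block.links.iter().rev() {
            pending.push_front(Pending::Queued(link));
        }
        // Make sure the first `window` pending links are being fetched.
        for slot in pending.iter_mut().take(window) {
            if let Pending::Queued(id) = *slot {
                *slot = Pending::InFlight(start(&store, id));
            }
        }
    }
    out
}

/// Small made-up DAG: 0 -> [1, 2], 1 -> [3, 4], 2 -> [5]; only leaves carry data.
fn sample_store() -> Store {
    let node = |data: &str, links: &[u32]| Block {
        data: data.as_bytes().to_vec(),
        links: links.to_vec(),
    };
    let mut blocks = HashMap::new();
    blocks.insert(0, node("", &[1, 2]));
    blocks.insert(1, node("", &[3, 4]));
    blocks.insert(2, node("", &[5]));
    blocks.insert(3, node("hel", &[]));
    blocks.insert(4, node("lo ", &[]));
    blocks.insert(5, node("world", &[]));
    Store(blocks)
}
```

Calling `cat_with_prefetch(Arc::new(sample_store()), 0, 4)` yields the same bytes as a window of 0, only with sibling fetches overlapping. The open question from this issue remains: how large the window should be and when to start filling it.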

WIP code in https://github.com/koivunej/rust-ipfs/tree/refs_to_ipfs_streaming_input. In the branch I made Ipfs::refs take in a Stream of IpfsPath, which I then used to start refs for the pending links in different ways. Totally unbounded traversal was of course the fastest, probably limited only by how much tokio's stdout adapter could push. I did not immediately figure out any wiser strategy, even though I played around with starting to prefetch only when the list of links started contracting, and with doing it in reverse so as to download the beginning and the end at the same time.
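As an illustration of the last idea only, one possible reading of "the beginning and the end at the same time" is to reorder the pending links so that prefetches alternate between the front and the back of the list. The helper below is hypothetical (`two_ended_order` is a made-up name and is not taken from the WIP branch); it only computes the order in which fetches would be started.

```rust
/// Hypothetical helper: returns the links in the order first, last, second,
/// second-to-last, and so on, so that prefetching works inwards from both ends.
fn two_ended_order<T: Clone>(links: &[T]) -> Vec<T> {
    let mut order = Vec::with_capacity(links.len());
    let (mut lo, mut hi) = (0, links.len());
    while lo < hi {
        // Take the next link from the front...
        order.push(links[lo].clone());
        lo += 1;
        // ...then, if anything is left, the next one from the back.
        if lo < hi {
            hi -= 1;
            order.push(links[hi].clone());
        }
    }
    order
}
```

The file contents would still have to be yielded in the original link order, so blocks fetched from the end need to be buffered until the traversal reaches them.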

koivunej commented 4 years ago

Related and possible cause: https://github.com/rs-ipfs/rust-ipfs/blob/347bf8af2fe3d4b32d576a65e77350060626eddc/src/p2p/behaviour.rs#L456-L462

Testing the prefetching without starting any queries (just commenting the self.kademlia.get_providers line out) could be a proper next step.