Optimizing Filecoin Retrieval TTFB

Currently, when a CID is not in the local cache, we retrieve data from Filecoin with the following steps:

1. Query the indexer/Estuary.
2. Query every provider in the returned results in parallel, but wait for all query responses to come back.
3. Retrieve sequentially, in the order given by a sorting function.
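The three steps above can be sketched roughly as follows. This is only an illustration, not the autoretrieve code: `FAKE_PROVIDERS`, `query_provider`, `sort_key`, and the `failing` parameter are hypothetical stand-ins, and the indexer/Estuary lookup is assumed to have already produced the list of candidate providers.

```python
import asyncio
from dataclasses import dataclass

# Simulated network, purely illustrative: provider -> (query latency in s, price).
FAKE_PROVIDERS = {"f01": (0.03, 5), "f02": (0.01, 0), "f03": (0.02, 2)}


@dataclass
class QueryResponse:
    provider: str
    price: int  # hypothetical units; 0 means free


async def query_provider(provider):
    """Stand-in for a retrieval query sent to one provider."""
    delay, price = FAKE_PROVIDERS[provider]
    await asyncio.sleep(delay)
    return QueryResponse(provider, price)


def sort_key(resp):
    """Hypothetical sorting function: cheapest first."""
    return resp.price


async def retrieve_baseline(candidates, failing=frozenset()):
    # Step 2: query every provider in parallel, but block until all have answered.
    responses = await asyncio.gather(*(query_provider(p) for p in candidates))
    # Step 3: attempt retrievals one at a time, in sorted order.
    attempted = []
    for resp in sorted(responses, key=sort_key):
        attempted.append(resp.provider)
        if resp.provider not in failing:  # stand-in for a successful transfer
            return resp.provider, attempted
    return None, attempted
```

In this sketch no retrieval can start until the slowest query (0.03 s for `f01`) has returned, even though the free provider `f02` answered first. That wait is the part of TTFB this issue is about.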

There are a couple of ways we can optimize this:

- The Filecoin Indexer results should, at minimum, say whether the deal is verified. We can treat a verified deal as a proxy for "likely free" and skip the query for that provider. The results also contain the PieceCID, which we could pass to the provider so that it does not have to look it up.
- If query responses start coming back that already meet the best criteria of our sorting function (anything that is free, for example), we could kick off our first retrieval right away and sort the remaining responses as they come in.
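A minimal sketch of both ideas together, under stated assumptions: the `Candidate` fields, `query`, and `plan_retrieval` are hypothetical names, latencies and prices are simulated, and "best" is taken to mean a price of zero. Verified candidates are assumed free and never queried. The real transfer would be launched at the point where `started` is set.

```python
import asyncio
import heapq
from dataclasses import dataclass


@dataclass
class Candidate:
    provider: str
    verified: bool   # assumed to be reported by the indexer
    latency: float   # simulated query round trip, in seconds
    price: int       # simulated asking price; 0 means free


async def query(c):
    """Stand-in for a retrieval query to one provider."""
    await asyncio.sleep(c.latency)
    return c.provider, c.price


async def plan_retrieval(candidates):
    started = None   # provider whose retrieval is kicked off early
    fallbacks = []   # min-heap of (price, provider), kept ordered as responses arrive
    tasks = []
    for c in candidates:
        if c.verified:
            # Verified deal: treat as "likely free" and skip the query entirely.
            if started is None:
                started = c.provider
            else:
                heapq.heappush(fallbacks, (0, c.provider))  # assumed price 0
        else:
            tasks.append(asyncio.create_task(query(c)))
    # Handle query responses in arrival order instead of waiting for all of them.
    for fut in asyncio.as_completed(tasks):
        provider, price = await fut
        if started is None and price == 0:
            started = provider  # real code would start the transfer here
        else:
            heapq.heappush(fallbacks, (price, provider))
    if started is None and fallbacks:
        # Nothing met the best criteria: fall back to the cheapest response.
        started = heapq.heappop(fallbacks)[1]
    return started, [provider for _, provider in sorted(fallbacks)]
```

For three unverified candidates where the free one answers second, the sketch picks that provider as soon as its response arrives and keeps the other two as ordered fallbacks. It still drains the remaining queries before returning, which a real implementation would do in the background while the first transfer runs.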

One other thing to factor in is how we want to abstract the additional data returned by the indexer that doesn't come from Estuary (I think). Honestly, we should think about this problem in general, since, for example, Estuary can have a different "Root CID" while the index is always the same.
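One possible shape for such an abstraction is a single candidate record in which the indexer-only fields are optional. This is a sketch only: the type, its field names, and `needs_query` are hypothetical, and it assumes (as the paragraph above does, tentatively) that the verified flag and PieceCID come only from the indexer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RetrievalCandidate:
    source: str                      # "indexer" or "estuary"
    provider: str
    root_cid: str                    # kept per candidate, since sources may disagree
    verified: Optional[bool] = None  # None: the source did not report it
    piece_cid: Optional[str] = None  # None: the source did not report it


def needs_query(c):
    """Query unless the source positively reported a verified deal."""
    return c.verified is not True
```

With this shape, a candidate from a source that says nothing about verification is simply queried as it is today, so the optimization degrades gracefully.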

application-research/autoretrieve, issue #102