Goals

This issue is to start a discussion with @elijaharita and the Estuary team. Essentially, the autoretrieve retrieval code is quickly becoming the most battle-tested general-purpose client for finding and retrieving content from Filecoin SPs. Bedrock wants to extract it and use it for other projects. Rather than force people to rewrite the indexer -> SP query -> SP retrieval flow, and then debug and optimize it over and over, we want common code that anyone can use while benefiting from the Bedrock team's ongoing maintenance and improvement of it.

In the immediate future, we think this code can be used as is to support Saturn in falling back to SPs for content.

In the not so near future, there's a lot we can do.

For one, we love Filclient, but the minimal Filecoin client is not so minimal anymore, and most of its code deals with storage. We'd like to write some well-tested, minimal baseline code for retrieval from Filecoin via Graphsync + Data transfer.

Also, the Bedrock team manages the retrieval protocols exposed by Boost, and we will continue to innovate in this space. Rather than forcing everyone to figure out how to update every time we introduce an improvement, we want folks to use a client we provide and then just reap the benefits of ready-made updates.

This brings us back to autoretrieve and the Estuary team's usage. We'd like to understand whether the Estuary team would be OK with this approach and would want to come along with us. The benefit here would be a massive amount of work off your plate -- you can focus on innovating on just the autoretrieve code that works for Estuary, and you'd be able to stop worrying about triaging and fixing individual retrieval problems -- instead you'd have the team with the most experience with these protocols in the PL network doing it for you. And of course, to be clear, you'd still have write/merge access to our repos. The downside is that you'd have less code under your direct ownership and final control, and perhaps that's a dealbreaker. If that's the case, we still want to move forward, but we'll probably just extract the code and use it independently (though we may need to fork autoretrieve itself at some point so we can iterate on a single set of code).

What Concretely Would Move

The filecoin directory
Some endpoint stuff (probably just the indexer version)
Some metrics stuff that's retrieval specific

cc: @kylehuntsman @rvagg

application-research / autoretrieve

Extracting retrieval code to a library #143
