Open hannahhoward opened 3 years ago
Cc @whyrusleeping @dirkmc @aarshkshah1992 @raulk for review
I want to point out why you want this AS WELL AS very good selectors.
We already have a StopAt selector in latest versions of go-ipld-prime: https://github.com/ipld/go-ipld-prime/pull/214
However, particularly in the retrieval case, the client may not know how to assemble this selector ahead of time. If I make a deal for a complex DAG with several missing pieces, for a client to retrieve this with a selector they need to know ahead of time what pieces are missing. This is pretty tricky to communicate -- or it adds overhead to discovery mechanisms.
It seems ideal to still be able to serve a "not quite complete retrieval" as a fallback.
We're interested in this feature. It would make packing bigger-than-a-sector DAGs into sector-sized deals much simpler, since we wouldn't have to deal with "complete-subDAG" constraints. We could just pack the maximum number of blocks possible into each deal and let the retriever know that it should retrieve X deals to get the complete thing.
If doing partial retrievals makes sense for the client, then let that be an "application" constraint to consider while packing things into deals, but not a mandatory one.
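As a rough illustration of the packing described above, here is a minimal Go sketch that fills deals greedily in traversal order with no attempt to keep sub-DAGs whole. The Block type, the PackBlocks function, and the byte capacity are all hypothetical names made up for this example; real packing would also have to account for CAR framing and piece padding, which are ignored here.

```go
package main

import "fmt"

// Block is a hypothetical stand-in for an IPLD block: an ID plus its size in bytes.
type Block struct {
	ID   string
	Size int
}

// PackBlocks fills deals greedily in the order the blocks are given.
// A new deal is started whenever the next block would overflow the capacity.
// It returns an error if a single block cannot fit in any deal.
func PackBlocks(blocks []Block, capacity int) ([][]Block, error) {
	var deals [][]Block
	var current []Block
	used := 0
	for _, b := range blocks {
		if b.Size > capacity {
			return nil, fmt.Errorf("block %s (%d bytes) exceeds deal capacity %d", b.ID, b.Size, capacity)
		}
		if used+b.Size > capacity {
			// Close the current deal and start a new one.
			deals = append(deals, current)
			current = nil
			used = 0
		}
		current = append(current, b)
		used += b.Size
	}
	if len(current) > 0 {
		deals = append(deals, current)
	}
	return deals, nil
}

func main() {
	blocks := []Block{{"a", 40}, {"b", 40}, {"c", 30}, {"d", 60}, {"e", 10}}
	deals, err := PackBlocks(blocks, 100)
	if err != nil {
		panic(err)
	}
	for i, d := range deals {
		fmt.Printf("deal %d: %v\n", i, d)
	}
}
```

Each resulting deal holds an arbitrary slice of the DAG, which is exactly the case where a provider can only ever serve a partial response for the root CID.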
What is the motivation behind this feature request? Is your feature request related to a problem? Please describe.
Let's say I want to store a large existing IPLD dataset larger than a sector on Filecoin. Currently, we face several obstacles:
Let's consider what we'd like to be possible:
We also already have alternate storage clients, like Estuary, whose proposed deals are failing because they are trying to send partial DAG data to miners.
Describe the solution you'd like
Fortunately, our underlying transport protocol for data transfer, Graphsync, can serve requests where the peer sending the data only has part of the DAG expressed by the requested CID+Selector. The Graphsync responder knows how to communicate to the requestor what it served and what it didn't, and the requestor knows how to process this information and still verify the response.
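To make the idea concrete, here is a simplified Go sketch of a responder walking a DAG it only partly holds and reporting which blocks it served and which links it could not follow. This is not the Graphsync API: Node, Response, and Serve are hypothetical names, string IDs stand in for CIDs, and the selector is assumed to be "explore everything".

```go
package main

import "fmt"

// Node is a hypothetical, simplified DAG node: a payload plus the IDs of its children.
type Node struct {
	Data  string
	Links []string
}

// Response records what the responder could and could not serve.
type Response struct {
	Served  []string // IDs of blocks sent, in traversal order
	Missing []string // IDs of links whose blocks were not held locally
}

// Serve walks the DAG depth-first from root using only the blocks in store.
// A link whose block is absent is recorded as missing and not descended into.
func Serve(store map[string]Node, root string) Response {
	var resp Response
	seen := map[string]bool{}
	var walk func(id string)
	walk = func(id string) {
		if seen[id] {
			return
		}
		seen[id] = true
		node, ok := store[id]
		if !ok {
			resp.Missing = append(resp.Missing, id)
			return
		}
		resp.Served = append(resp.Served, id)
		for _, child := range node.Links {
			walk(child)
		}
	}
	walk(root)
	return resp
}

func main() {
	store := map[string]Node{
		"root": {Data: "r", Links: []string{"a", "b"}},
		"a":    {Data: "a", Links: []string{"c"}},
		// "b" and "c" are not held by this responder.
	}
	resp := Serve(store, "root")
	fmt.Println("served:", resp.Served)
	fmt.Println("missing:", resp.Missing)
	fmt.Println("partial:", len(resp.Missing) > 0)
}
```

A requestor that receives both lists can still verify every served block against its link and knows exactly where the response stops short of the full DAG.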
Currently, the go-data-transfer library fails all transfers where the entire request root + IPLD selector is not served.
I propose that we allow a data transfer to complete successfully when it has served only a partial response.
My proposed bubbling up to Lotus is as follows:
- PartiallyCompleted: for when a transfer is done sending/receiving but the entire DAG was not served (plus possibly some additional events that put it in this state).
- ClientEventDataTransferComplete: when go-data-transfer ends in PartiallyCompleted (the same event emitted when data transfer ends in Completed); otherwise unchanged.
- ProviderEventDataTransferCompleted: when go-data-transfer ends in PartiallyCompleted (the same event emitted when data transfer ends in Completed); otherwise unchanged. The CommP calculation will be run on the received CAR file for the partial DAG, and as long as it matches the Storage Proposal, the deal will continue as planned.
- ClientEventPartiallyComplete: when data transfer ends with the PartiallyCompleted status. This will trigger analogous "Partial" states for DealStatusCheckComplete and DealStatusFinalizingBlockstore, which will transition to DealStatusPartiallyComplete as the retrieval client's final status.
- DealStatusPartiallyCompleting and then DealStatusPartiallyCompleted when CleanupDeal is finished.
Describe alternatives you've considered
See above. While selectors are potentially a path forward, they have several limitations, and the path to achieving a desirable result through them is long.
Additional context