filecoin-project / go-legs

Does the legwork for go-data-transfer
Apache License 2.0

`dtsync` never calls block hook if blocks are found locally #161

Closed masih closed 2 years ago

masih commented 2 years ago

The block hook mechanism in `dtsync` is tied to graphsync's IncomingBlockhook listener: the block hook is called whenever blocks are received over a graphsync exchange, and not otherwise.

Since #155, if blocks are present locally they are not synced over graphsync and the sync call completes immediately. This is great for avoiding duplicate downloads of the same blocks; however, the contract of the block hook is that it is called for all CIDs encountered during the sync. Other clients, such as storetheindex, rely on this contract, for example to compile the list of advertisements that need to be processed.

Ideally, the sync mechanism should continue to avoid re-downloading blocks that are present locally, while still calling the block hook with the CIDs that would have been encountered had the blocks not been present locally.

This issue is blocking the upgrade of go-legs in storetheindex.

willscott commented 2 years ago

🤔 i wonder if there's a different completion handler we can use to know the transfer is finished - back-filling via blockhook seems awkward if things aren't actually coming in as blocks

masih commented 2 years ago

I know... the block hook is quite cumbersome to manage in the syncing logic. Changing that is a major refactor. My initial thought is to run the selector over the local linksystem to list the CIDs in the DAG that would have been synced if the block wasn't found locally and call the blockhook with that list.