Filecoin writes proofs into CAR files which are hashed, so we need their contents to be deterministic.
The way Filecoin currently generates those CARv1 files is via v1's selective writer API, which ensures canonical ordering via traversals, and also deduplicates by CID: https://github.com/ipld/go-car/blob/71cfa2fc2a619d646606373c5946282934270bd4/selectivecar.go#L229-L230
For Ignite's current project, they receive blocks via graphsync, which ensures the order of blocks as per the IPLD selector, just like v1's selective writer. However, we might receive duplicate blocks from a client. When graphsync receives blocks they end up getting "Put" to our carv2 read-write blockstore.
If we want to be compatible, we should support deduplicating by CID. I propose a ReadWrite blockstore option for it, like DeduplicateByCID; if one calls Put on the same CID twice, the second call will simply do nothing and return a nil error.
In the future we could satisfy this need by porting Selective Writers to carv2 (https://github.com/ipld/go-car/issues/104), but that can't happen for another month or two.
I could also ask Ignite to implement a Blockstore wrapper that does this deduplication on Put calls, but deduplicating by CID also seems like a reasonable opt-in feature that others might want in the future. It wouldn't make the API significantly more complex or the read-write blockstore significantly slower, either.
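For comparison, here is a rough sketch of what the wrapper alternative could look like on the caller's side. Everything in it is hypothetical: `Block`, `Putter`, `DedupPutter`, and `appendLog` are stand-ins invented for illustration, the CID is modelled as a plain string so the example has no dependencies, and none of this is the go-car API. It is also not safe for concurrent use; a real version would need a mutex around the seen set.

```go
package main

import "fmt"

// Block is a hypothetical stand-in for a real block type; the CID is a
// plain string here so the sketch needs no external packages.
type Block struct {
	CID  string
	Data []byte
}

// Putter is a hypothetical, reduced view of a blockstore: only Put.
type Putter interface {
	Put(Block) error
}

// DedupPutter wraps a Putter and silently skips blocks whose CID has
// already been written through it.
type DedupPutter struct {
	inner Putter
	seen  map[string]struct{}
}

// NewDedupPutter returns a wrapper with an empty seen set.
func NewDedupPutter(inner Putter) *DedupPutter {
	return &DedupPutter{inner: inner, seen: make(map[string]struct{})}
}

// Put forwards the block to the inner store the first time a CID is seen.
// A repeated CID is a no-op that returns a nil error.
func (d *DedupPutter) Put(b Block) error {
	if _, ok := d.seen[b.CID]; ok {
		return nil
	}
	if err := d.inner.Put(b); err != nil {
		// The CID is not marked as seen, so a later retry can still write it.
		return err
	}
	d.seen[b.CID] = struct{}{}
	return nil
}

// appendLog is a toy append-only store that records blocks in arrival order.
type appendLog struct {
	blocks []Block
}

func (a *appendLog) Put(b Block) error {
	a.blocks = append(a.blocks, b)
	return nil
}

func main() {
	sink := &appendLog{}
	store := NewDedupPutter(sink)
	for _, c := range []string{"cid-a", "cid-b", "cid-a"} {
		if err := store.Put(Block{CID: c}); err != nil {
			panic(err)
		}
	}
	fmt.Println(len(sink.blocks)) // prints 2: the repeated "cid-a" was skipped
}
```

The first occurrence of each CID keeps its position, so the order graphsync delivers is preserved in the output. The cost is one set lookup per Put, which is why doing the same thing inside the read-write blockstore (which already tracks written CIDs for its index) seems cheap as well.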