application-research / estuary

A custom IPFS/Filecoin node that makes it easy to pin IPFS content and make Filecoin deals.
https://docs.estuary.tech

Advertise child CIDs for retrieval #54

Open brendalee opened 2 years ago

brendalee commented 2 years ago

Currently, Estuary advertises only root CIDs. Some customers (such as the zarr-wg) have retrieval needs that require them to be able to retrieve child CID content quickly.

whyrusleeping commented 2 years ago

Do we need all child content advertised (even individual leaf nodes), or do we want things advertised at the file level? And what do we think is a good UX here? Should it be a flag that gets set per blob they upload? I assume they are pinning directories, which we could crawl through to enumerate all the files and advertise those.
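
To make the two options concrete, here is a minimal sketch of the crawl described above. It uses a toy in-memory DAG, not Estuary's real data model: the `Node` class, the `kind` field, and the `bafy-...` identifiers are all hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Toy stand-in for a UnixFS DAG node (hypothetical, not Estuary's types)."""
    cid: str
    kind: str  # "dir", "file", or "chunk"
    children: List["Node"] = field(default_factory=list)


def file_roots(node):
    """Yield the root CID of every file under node, without descending into chunks."""
    if node.kind == "file":
        yield node.cid
    elif node.kind == "dir":
        for child in node.children:
            yield from file_roots(child)


def all_blocks(node):
    """Yield every CID in the DAG, including directories and leaf chunks."""
    yield node.cid
    for child in node.children:
        yield from all_blocks(child)


# A pinned directory with one file at the top level and one in a subdirectory.
pinned = Node("bafy-root", "dir", [
    Node("bafy-a", "file", [Node("bafy-a0", "chunk"), Node("bafy-a1", "chunk")]),
    Node("bafy-sub", "dir", [
        Node("bafy-b", "file", [Node("bafy-b0", "chunk")]),
    ]),
])

print(list(file_roots(pinned)))       # file-level advertising
print(len(list(all_blocks(pinned))))  # every block, leaf nodes included
```

In this toy case, file-level advertising adds two CIDs on top of the root, while advertising everything means seven. The gap grows with file size, since each large file contributes many chunk blocks.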

sheriflouis-FF commented 2 years ago

Given that this was created for a particular use case, I would say we want things advertised at the file level. They are using the ipfs pin remote add command to pin directories. I cannot answer the question about "a flag that gets set per blob"; what we are trying to achieve is that a user can list and access a file within a directory, and can track the child CID within Estuary (whether it is pinned, and whether it has a deal on Filecoin).

brendalee commented 2 years ago

Would advertising only files break some retrievals, though? Based on how retrievals in IPFS work today, it seems that if we don't advertise all child content, there can be cases where someone has already retrieved part of a file but, when trying to get the next "piece" of it, needs to look up that specific child CID, because the client isn't smart enough to traverse back up and figure out which file the piece belongs to.
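
The concern can be illustrated with a small sketch. It assumes a simplified model in which a lookup succeeds only for CIDs that were explicitly announced; the strategy names `"files"` and `"all"`, the function names, and the `bafy-...` identifiers are hypothetical. It also ignores the case where the client is already connected to the node holding the data and can ask it directly.

```python
# Hypothetical toy model: one file root and the chunk blocks it links to.
file_dag = {"bafy-file": ["bafy-chunk0", "bafy-chunk1", "bafy-chunk2"]}


def advertised(dag, strategy):
    """Return the set of CIDs announced under a (hypothetical) strategy."""
    cids = set(dag)  # file roots are announced under both strategies
    if strategy == "all":
        for chunks in dag.values():
            cids.update(chunks)
    return cids


def can_find_provider(cid, announced):
    """In this model, a lookup succeeds only for announced CIDs."""
    return cid in announced


# A client that already holds chunk0 and chunk1 looks up chunk2 by its own CID.
print(can_find_provider("bafy-chunk2", advertised(file_dag, "files")))  # False
print(can_find_provider("bafy-chunk2", advertised(file_dag, "all")))    # True
```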

stastnypremysl commented 2 years ago

It would be great if Estuary advertised all child CIDs. Right now it doesn't behave the way a normal IPFS node is expected to, which is confusing.

whyrusleeping commented 2 years ago

@stastnypremysl what is your use case for this? I don't imagine you actually want every last block to be advertised; more likely what you want is 'all the roots of files in this directory I've pinned', or something to that effect.

stastnypremysl commented 2 years ago

It's about file deduplication.

E.g., take a large CSV dataset of temperatures that only ever grows. With child CID propagation, everyone downloading a newer version of the dataset will be able to download part of it from Estuary (the blocks it shares with the version already pinned there).
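
A rough sketch of why this works for an append-only file, assuming fixed-size chunking and using plain SHA-256 digests as stand-ins for CIDs. The 16-byte chunk size and the sample rows are made up for illustration; real chunk sizes are far larger.

```python
import hashlib


def chunk_ids(data, size=16):
    """Split data into fixed-size chunks and return one SHA-256 hex digest per chunk."""
    return [
        hashlib.sha256(data[i:i + size]).hexdigest()
        for i in range(0, len(data), size)
    ]


# Older version of the dataset, assumed to be the one already pinned.
old = b"2021-11-01,3.2\n2021-11-02,4.1\n2021-11-03,2.9\n"
# Newer version: the same rows plus one appended row.
new = old + b"2021-11-04,5.0\n"

old_ids = chunk_ids(old)
new_ids = chunk_ids(new)

shared = [c for c in new_ids if c in set(old_ids)]
missing = [c for c in new_ids if c not in set(old_ids)]

print(len(new_ids), "chunks in the new version")
print(len(shared), "already available from the old version")
print(len(missing), "must come from elsewhere")
```

Only the full chunks of the unchanged prefix are shared. The last, partial chunk of the old version changes once data is appended, so it has to be fetched again along with the new tail. Those shared chunks are only discoverable on their own if the node advertises them, which is the point of the request.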

lidel commented 2 years ago

cc https://github.com/ipfs/go-ipfs/issues/8676 (proposal to have smarter Reprovider.Strategy for UnixFS DAGs)

corinne-antonia commented 2 years ago

@lidel @brendalee Where do you think we are on this issue?

brendalee commented 2 years ago

We haven't had many clients asking for this in the past few months.