non-duplicating pinned storage for large datasets #426

Closed by tmbdev 4 years ago

tmbdev commented 4 years ago

My understanding is that if I "ipfs add" a file with the default store, its contents are effectively duplicated, because the data is stored again in the underlying IPFS storage layer. When pinning large datasets, the original files plus that second copy may need more space than the available storage.

It would make sense to have a storage layer that refers to byte ranges in existing files on disk, so that the data is not stored twice. Does such a storage layer already exist?

Stebalien commented 4 years ago

Yes, the filestore: https://github.com/ipfs/go-ipfs/blob/master/docs/experimental-features.md#ipfs-filestore

It's experimental because, if you delete the underlying file, the data will disappear from IPFS.
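
To make that trade-off concrete, here is a minimal Python sketch of a block store that records byte ranges in an existing file instead of copying the data. It is a toy illustration of the idea only, not the filestore's actual implementation; `ReferenceStore`, `demo`, and the chunk size are hypothetical names and values chosen for this example.

```python
import hashlib
import os
import tempfile

CHUNK_SIZE = 256 * 1024  # illustrative chunk size, chosen for this sketch


class ReferenceStore:
    """Toy block store that records where each block lives instead of copying it."""

    def __init__(self):
        # block hash -> (file path, byte offset, length)
        self.index = {}

    def add(self, path, chunk_size=CHUNK_SIZE):
        """Index a file in place and return the ordered list of block hashes."""
        hashes = []
        offset = 0
        with open(path, "rb") as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                digest = hashlib.sha256(chunk).hexdigest()
                self.index[digest] = (path, offset, len(chunk))
                hashes.append(digest)
                offset += len(chunk)
        return hashes

    def get(self, digest):
        """Read a block back from the original file and verify its hash."""
        path, offset, length = self.index[digest]
        with open(path, "rb") as f:
            f.seek(offset)
            data = f.read(length)
        if hashlib.sha256(data).hexdigest() != digest:
            raise IOError("backing file changed: " + path)
        return data


def demo():
    """Index a temporary file, read a block, then delete the file and retry."""
    store = ReferenceStore()
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"a" * 10 + b"b" * 10)
    hashes = store.add(tmp.name, chunk_size=10)
    first = store.get(hashes[0])
    os.remove(tmp.name)
    try:
        store.get(hashes[0])
        still_readable = True
    except FileNotFoundError:
        still_readable = False
    return first, still_readable
```

The index holds only paths, offsets, and lengths, so it stays small however large the dataset is. The cost is the one described above: once the backing file is removed, `get` has nothing left to read.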

tmbdev commented 4 years ago

Thanks. I think this is a really important feature. The "urlstore" is also very useful, since it lets people pin and distribute files that are kept in public cloud buckets by running a small IPFS server in the cloud.
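
For readers who want to try the two features, the sketch below builds the command lines in Python. The config keys and the `--nocopy` flag are the ones I recall from the experimental-features document linked above, and `ipfs urlstore add` is the form go-ipfs offered around the time of this thread. Treat all of them as assumptions and check that document for your version. The function names, the file path, and the URL are hypothetical.

```python
import shutil
import subprocess


def filestore_commands(path):
    """Commands to enable the filestore and add a file without copying it."""
    return [
        ["ipfs", "config", "--json", "Experimental.FilestoreEnabled", "true"],
        ["ipfs", "add", "--nocopy", path],
    ]


def urlstore_commands(url):
    """Commands to enable the urlstore and add content by URL.

    Assumption: `ipfs urlstore add` is the older form of this command and
    may not exist in newer releases.
    """
    return [
        ["ipfs", "config", "--json", "Experimental.UrlstoreEnabled", "true"],
        ["ipfs", "urlstore", "add", url],
    ]


def run_all(commands, dry_run=True):
    """Print each command, or execute it when dry_run is False and ipfs is installed."""
    for argv in commands:
        if dry_run or shutil.which("ipfs") is None:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    run_all(filestore_commands("/data/big-dataset.tar"))
    run_all(urlstore_commands("https://example.com/bucket/big-dataset.tar"))
```

The default is a dry run that only prints the commands, so nothing in the local IPFS configuration changes until `dry_run=False` is passed explicitly.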