storj-archived / sips

Storj Improvement Proposals.
GNU General Public License v3.0

Data deduplication with erasure encoding #26

Open MeijeSibbel opened 7 years ago

MeijeSibbel commented 7 years ago

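Combine erasure coding with data deduplication to reduce the overall redundancy in stored data while increasing the redundancy of unique data. Deduplication removes repeated copies of the same data, and erasure coding then adds deliberate redundancy to the unique data that remains. Deduplication also reduces network transfer, since data that is already stored does not need to be uploaded again.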

http://shiftleft.com/mirrors/www.hpl.hp.com/personal/Mark_Lillibridge/Reliability/website_draft.pdf
https://en.wikipedia.org/wiki/Data_deduplication
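
To make the idea concrete, here is a minimal sketch, not a proposed implementation. It assumes fixed-size chunks addressed by their SHA-256 hash, and it uses a single XOR parity shard as a stand-in for erasure coding; a real system would likely use a stronger code such as Reed-Solomon. The names `DedupStore`, `CHUNK_SIZE`, and `xor_parity` are hypothetical and do not come from any Storj code.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical, deliberately tiny so the example is easy to follow


class DedupStore:
    """Content-addressed chunk store: identical chunks are kept only once."""

    def __init__(self):
        self.chunks = {}  # SHA-256 hex digest -> chunk bytes

    def put(self, data: bytes) -> list:
        """Split data into chunks, store unseen ones, return the list of digests."""
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # Only chunks that are not already present take up new space.
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def get(self, recipe: list) -> bytes:
        """Rebuild the original data from its list of digests."""
        return b"".join(self.chunks[d] for d in recipe)


def xor_parity(shards: list) -> bytes:
    """XOR equal-length shards together (a toy single-parity erasure code)."""
    parity = bytearray(max(len(s) for s in shards))
    for shard in shards:
        for i, byte in enumerate(shard):
            parity[i] ^= byte
    return bytes(parity)


store = DedupStore()
first = store.put(b"AAAABBBBAAAA")   # "AAAA" appears twice but is stored once
second = store.put(b"BBBBCCCC")      # "BBBB" is already stored

unique = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(unique)          # redundancy added only for unique chunks

# Any one lost shard can be rebuilt from the remaining shards plus parity.
recovered = xor_parity([b"AAAA", b"CCCC", parity])
```

In this sketch, five uploaded chunks collapse to three stored chunks, and the parity shard protects those three against the loss of any one of them.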

braydonf commented 7 years ago

Any compression would need to happen before encryption, since compressing already-encrypted data has little benefit. Right now this is handled by zipping/tar-ing files before uploading.
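
A rough illustration of why the order matters, as a sketch only. The cipher here is a toy SHA-256 counter keystream written for this example (hypothetical `toy_keystream_xor`); it is not secure and is not what Storj uses. It just produces random-looking output the way real ciphertext does.

```python
import hashlib
import zlib


def toy_keystream_xor(data: bytes, key: bytes) -> bytes:
    """Toy stream cipher for illustration only. NOT secure.

    XORs data with a keystream built from SHA-256(key || block offset).
    Applying it twice with the same key returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


key = b"example key"                              # hypothetical key
plain = b"storj improvement proposal " * 2000     # highly repetitive input

# Compress first, then encrypt: the compressor sees the repetition.
compress_then_encrypt = toy_keystream_xor(zlib.compress(plain), key)

# Encrypt first, then compress: the compressor sees random-looking bytes.
encrypt_then_compress = zlib.compress(toy_keystream_xor(plain, key))

print("original:             ", len(plain))
print("compress then encrypt:", len(compress_then_encrypt))
print("encrypt then compress:", len(encrypt_then_compress))

# Round trip for the compress-then-encrypt path.
restored = zlib.decompress(toy_keystream_xor(compress_then_encrypt, key))
```

The compress-then-encrypt result is a small fraction of the input, while the encrypt-then-compress result is about the same size as the input.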