Open donothesitate opened 7 years ago
Note to self: move to ipfs/importers repo when we make that
Version information:
go-ipfs version: 0.4.4
Type: Feature, Enhancement
Priority: P4
Area: Tools, Importer
Description:
Tarsnap's chunker (see Source below) is better suited for maximizing the deduplication ratio than the current Rabin chunker.
Using smaller chunks with faster convergence yields greater space savings; depending on the dataset, the benefit over Rabin can be substantial.
The mean chunk size used by tarsnap is 64k.
Source: https://github.com/Tarsnap/tarsnap/blob/master/tar/multitape/chunkify.h https://github.com/Tarsnap/tarsnap/blob/master/tar/multitape/chunkify.c
Related: https://moinakg.wordpress.com/2012/11/11/inside-content-defined-chunking-in-pcompress/ https://moinakg.wordpress.com/2012/11/15/inside-content-defined-chunking-in-pcompress-part-2/