attic-labs / noms

The versioned, forkable, syncable database
Apache License 2.0

Introduce Sloppy #3631

ghost closed this 7 years ago

ghost commented 7 years ago

Introduce Sloppy, a function that estimates how well data will compress under snappy, so that the rolling hash can more reliably produce chunks of a given target size after compression.

Fixes #2763, Fixes #3369

ghost commented 7 years ago

FYI, tests of encoding structured data show that this reduces the number of chunks by about 35-40% and the overall size on disk by about 10-15%.