Open flyingzumwalt opened 7 years ago
@flyingzumwalt a good way to try this is to create file-format-specific Rabin chunking.
- `FORM:` and other chunk markers
- `{` `}`, since it is used for functions in C-likes
- `;` is tricky, as JS will auto-convert newlines to it
- `end` or similar to signify end of block

Keywords: Content Defined, Chunking, Deduplication
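To make the file-format-specific idea above concrete, here is a minimal sketch that cuts a byte stream right after syntax markers such as `}` or `;`. It is plain delimiter splitting, not Rabin fingerprinting, and the function name `split_on_markers`, the default marker set, and the `min_size` parameter are all hypothetical choices for illustration, not anything from the existing chunker.

```python
from typing import Iterable, List


def split_on_markers(
    data: bytes,
    markers: Iterable[bytes] = (b"}", b";"),  # assumed marker set, C-like source
    min_size: int = 16,                        # assumed lower bound on chunk size
) -> List[bytes]:
    """Cut `data` immediately after any marker, skipping cuts that would
    produce a chunk shorter than `min_size` bytes."""
    marker_list = [m for m in markers if m]  # drop empty markers (would never advance)
    chunks: List[bytes] = []
    start = 0  # start of the chunk being built
    i = 0      # scan position
    n = len(data)
    while i < n:
        hit = next((m for m in marker_list if data.startswith(m, i)), None)
        if hit is None:
            i += 1
            continue
        end = i + len(hit)
        if end - start >= min_size:
            chunks.append(data[start:end])
            start = end
        i = end
    if start < n:
        chunks.append(data[start:])  # trailing bytes after the last cut
    return chunks
```

Because a cut depends only on nearby content, an edit inside one function body should leave the chunks for the other functions unchanged, which is the property deduplication relies on.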
It might be good to do some research on FastCDC and Asymmetric Extremum (AE), which have low computational overhead.
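For anyone who wants a feel for why Asymmetric Extremum is cheap, here is a rough, simplified sketch of the idea as I understand it: track the position of the largest value seen since the chunk started, and cut once `window` further bytes have passed without a larger one. This version compares single bytes rather than multi-byte values, and `ae_chunks` and `window` are illustrative names only, so treat it as an approximation and not a reference implementation.

```python
from typing import List


def ae_chunks(data: bytes, window: int = 64) -> List[bytes]:
    """Simplified asymmetric-extremum chunking over single-byte values."""
    chunks: List[bytes] = []
    n = len(data)
    start = 0
    while start < n:
        max_val = data[start]  # largest byte seen in the current chunk
        max_pos = start        # where that byte was seen
        cut = n                # default: rest of the input is one chunk
        i = start + 1
        while i < n:
            if data[i] <= max_val:
                # No new maximum for `window` bytes: cut after this byte.
                if i == max_pos + window:
                    cut = i + 1
                    break
            else:
                max_val = data[i]
                max_pos = i
            i += 1
        chunks.append(data[start:cut])
        start = cut
    return chunks
```

There is no rolling hash here, only one comparison per byte, which is where the low overhead comes from. Every chunk except possibly the last is at least `window + 1` bytes long.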
Based on the tests in #137, the Rabin chunker isn't actually providing any real deduplication benefits. It's also really slow.