Wyng currently marks all-zero chunks in the manifest without saving a corresponding data chunk file. This could be extended to chunks filled with other repeating patterns 8, 16, or 32 bits wide.
The send-time test for such chunks could be simple: check whether the chunk's first byte(s) repeat for the remainder of the chunk. For the vast majority of data chunks, the test would fail after comparing only the first few bytes, so it would finish quickly.
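A minimal sketch of such a test follows, to make the idea concrete. It is not Wyng's actual code: the function name `repeating_pattern` and the `widths` parameter are hypothetical, and it assumes a chunk is available as a `bytes` object whose length is a multiple of the pattern width.

```python
def repeating_pattern(chunk, widths=(1, 2, 4)):
    """Return the shortest 1-, 2- or 4-byte pattern that fills the whole
    chunk, or None if the chunk is not a pure repetition.

    Hypothetical helper for illustration; not part of Wyng.
    """
    for w in widths:
        # Need at least two full repetitions, and a whole number of them.
        if len(chunk) < 2 * w or len(chunk) % w:
            continue
        pat = chunk[:w]
        # Cheap early exit: ordinary data almost always differs right here,
        # so the full-chunk comparison below is rarely reached.
        if chunk[w:2 * w] != pat:
            continue
        # A sequence has period w exactly when it equals itself shifted by w.
        if chunk[w:] == chunk[:-w]:
            return bytes(pat)
    return None
```

An all-zero chunk is reported as the one-byte pattern `b"\x00"`, so the existing zero-chunk case becomes a special case of the same test. Widths are tried from narrowest to widest, so a chunk of repeated `0xFF` bytes is reported as an 8-bit pattern rather than a 16- or 32-bit one.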
Before considering implementation, scan some volumes to generate histograms of different patterns and pattern sizes to see if the space savings could be substantial.
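The scan could be sketched roughly as below. This is only an illustration under stated assumptions: `find_pattern` and `pattern_histogram` are hypothetical names, the volume is read as a plain binary stream, and the 64 KiB `chunk_size` default is a placeholder that should be set to whatever chunk size the archive actually uses.

```python
from collections import Counter


def find_pattern(chunk, widths=(1, 2, 4)):
    """Return the shortest repeating 1/2/4-byte pattern, or None."""
    for w in widths:
        if len(chunk) < 2 * w or len(chunk) % w:
            continue
        # Check the second repetition first so most chunks are rejected early.
        if chunk[w:2 * w] == chunk[:w] and chunk[w:] == chunk[:-w]:
            return bytes(chunk[:w])
    return None


def pattern_histogram(stream, chunk_size=65536):
    """Count pattern-filled chunks in a binary stream.

    Returns (counts, total): counts maps each pattern (as bytes) to the
    number of chunks it fills; total is the number of chunks read.
    """
    counts = Counter()
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += 1
        pat = find_pattern(chunk)
        if pat is not None:
            counts[pat] += 1
    return counts, total
```

Running it over a volume opened with `open(path, "rb")` and comparing the count for `b"\x00"` against the sum of all other counts would show how much space non-zero patterns could save beyond the existing all-zero handling. Grouping the keys by `len(pattern)` gives the per-width breakdown.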