google / open-vcdiff

An encoder/decoder for the VCDIFF (RFC3284) format
Apache License 2.0

Add option to optimize VCDIFF decoder when VCD_TARGET will not be used as source segment #9

Closed Steelskin closed 10 years ago

Steelskin commented 10 years ago

Original issue 9 created by openvcdiff on 2008-09-11T23:24:54.000Z:

The open-vcdiff decoder currently supports the full VCDIFF format as specified in RFC 3284. The open-vcdiff encoder uses only a subset of the features available in that format.

One example is using target data as the source segment. RFC 3284 allows each delta window to specify one source segment to be referenced by COPY instructions. This source segment can come either from the dictionary (also known as the source file) or from anywhere in the previously decoded target file. The latter possibility means that the decoder must either keep the entire previously decoded target file in memory or be prepared to load any given portion of it into memory on demand.
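To make the two cases concrete, here is a minimal sketch of how a decoder might classify a window's source segment from its Win_Indicator byte. The `VCD_SOURCE` and `VCD_TARGET` bit values come from RFC 3284, section 4.2; the `SourceKind` enum and the `ClassifySourceSegment` function are hypothetical names made up for illustration and are not part of open-vcdiff.

```cpp
#include <cstdint>

// Win_Indicator bit flags as defined in RFC 3284, section 4.2.
constexpr std::uint8_t VCD_SOURCE = 0x01;  // source segment taken from the dictionary
constexpr std::uint8_t VCD_TARGET = 0x02;  // source segment taken from earlier target data

// Hypothetical type for illustration only; not an open-vcdiff type.
enum class SourceKind { kNone, kDictionary, kPreviousTarget, kInvalid };

// Classifies the source segment of a delta window from its Win_Indicator byte.
// Any other bits in the byte are ignored by this sketch.
constexpr SourceKind ClassifySourceSegment(std::uint8_t win_indicator) {
  return ((win_indicator & VCD_SOURCE) && (win_indicator & VCD_TARGET))
             ? SourceKind::kInvalid  // the RFC allows at most one of the two bits
         : (win_indicator & VCD_SOURCE) ? SourceKind::kDictionary
         : (win_indicator & VCD_TARGET) ? SourceKind::kPreviousTarget
                                        : SourceKind::kNone;
}
```

Only the `kPreviousTarget` case forces the decoder to keep earlier output reachable; the other cases need nothing beyond the dictionary and the window currently being decoded.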

The open-vcdiff encoder will never use the previously decoded target file as the source window. It always uses a source window that starts at offset 0 of the dictionary and includes the entire dictionary contents.

If the decoder were guaranteed that the encoder would never send a delta window that uses previously decoded target data as its source window, it would no longer have to keep the previously decoded target file in memory. That would save a significant amount of memory and eliminate one memcpy operation.

Proposal: add a decoder option that prohibits the use of VCD_TARGET as the source window type. If this option were enabled and a delta window containing the VCD_TARGET flag were passed to the decoder, it would fail with a decoding error.
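A rough sketch of the proposed check, assuming the option is a simple boolean consulted when each window header is parsed. The names `kVcdTargetFlag`, `WindowCheck`, and `CheckWindowIndicator` are hypothetical and chosen only for this example; they do not describe the actual open-vcdiff interface. Only the flag value 0x02 is taken from RFC 3284.

```cpp
#include <cstdint>

// VCD_TARGET bit of the Win_Indicator byte (RFC 3284, section 4.2).
constexpr std::uint8_t kVcdTargetFlag = 0x02;

// Hypothetical result type for illustration only.
enum class WindowCheck { kOk, kErrorVcdTargetProhibited };

// Returns an error if the window requests previously decoded target data as
// its source segment while the (hypothetical) option forbids it.
constexpr WindowCheck CheckWindowIndicator(std::uint8_t win_indicator,
                                           bool allow_vcd_target) {
  return (!allow_vcd_target && (win_indicator & kVcdTargetFlag) != 0)
             ? WindowCheck::kErrorVcdTargetProhibited
             : WindowCheck::kOk;
}
```

With `allow_vcd_target` set to false, a decoder built this way could release each target window as soon as it has been written out, since no later window is allowed to refer back to it.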

The Xdelta encoder (another open-source VCDIFF package) does not use VCD_TARGET either, according to its author.

A restriction has been added to the Appendix of the SDCH protocol (http://sdch.googlegroups.com/web/Shared_Dictionary_Compression_over_HTTP.pdf) to prohibit the use of VCD_TARGET for the VCDIFF source segment in the context of SDCH.