Open Jakz opened 7 years ago
OK, it looks like I found the issue by inspecting the resulting VCDIFF with the `printdelta` subcommand. The problem is that the input must remain valid until it is consumed, and when (and whether) that happens depends on how you feed input to the encoder.
So according to comments:
`xd3_encode_input` call and can be released after that call

For the last point I'd like to give an example to check whether I got it right. Suppose the window size is 10 and you feed chunks of 3, 3, 3, 3. The first three chunks are copied into an internal buffer (so the caller can release them). On the fourth chunk the buffered total reaches 12, so a full window of 10 is moved to `avail_in` and processed, and the leftover of 2 stays buffered.
Now, if you feed a chunk that is larger than the window size, will it be buffered and split into window-sized pieces in any case? In other words, if you have a leftover and you don't flush it, you can no longer encode a chunk larger than the window in a single call. Is this correct?
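The arithmetic of the example above can be sketched with a toy model. This is not xdelta3 code and says nothing about what the library really does internally: it only encodes the assumption stated in the question, namely that short chunks accumulate in a buffer and a window is emitted whenever the buffered total reaches the window size. All names (`toy_encoder`, `toy_feed`, `toy_flush`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the buffering scenario described in the post.
 * NOT xdelta3: every name here is made up for illustration. */
typedef struct {
    size_t window;    /* window size in bytes */
    size_t buffered;  /* bytes currently held in the internal buffer */
    size_t windows;   /* number of windows emitted so far */
} toy_encoder;

void toy_init(toy_encoder *e, size_t window)
{
    assert(window > 0);
    e->window = window;
    e->buffered = 0;
    e->windows = 0;
}

/* Feed one chunk of `chunk` bytes.
 * Returns how many full windows this call produced. */
size_t toy_feed(toy_encoder *e, size_t chunk)
{
    size_t produced = 0;
    e->buffered += chunk;
    /* Assumption: every time a full window is available it is processed,
     * and whatever is left over stays buffered. */
    while (e->buffered >= e->window) {
        e->buffered -= e->window;
        produced++;
    }
    e->windows += produced;
    return produced;
}

/* Flush at EOF: a non-empty leftover becomes one final short window.
 * Returns the size of that last window (0 if nothing was buffered). */
size_t toy_flush(toy_encoder *e)
{
    size_t last = e->buffered;
    if (last > 0) {
        e->windows++;
    }
    e->buffered = 0;
    return last;
}
```

With a window of 10, calling `toy_feed` with 3 four times produces no window on the first three calls and one window on the fourth, leaving 2 bytes buffered. Feeding a chunk of 25 afterwards would, under the same assumption, be split across two more windows with 7 bytes left over, which is the behaviour the question asks about.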
I'm trying to integrate xdelta3 into a program I'm writing. I tried to follow all the guidelines from the wiki and the comments in the code, and I stepped through the code itself to understand how it works, but I'm not able to reliably produce good patches and I don't understand why.
The xdelta3 code is integrated into a pipe-like structure in which a `process()` function is called; that function should read the available input from a memory buffer and produce its result in an output buffer (which is then managed outside that code).

I'm able to produce a good patch only when `windowSize == inputSize`, so that just one window is created. If I reduce the window size, the patch is still generated, but when applied to the source file it produces a file that differs from the input file. The differences are somewhat predictable, so there must be something wrong in how I understand the algorithm.
My understanding is that you start with some input, then `xd3_encode_input` keeps returning `XD3_INPUT` until a full window is ready (or `XD3_FLUSH` is set, as at EOF). Then it returns `XD3_WINSTART`, then it fetches source data if available (by returning `XD3_GETSRCBLK` as many times as it needs). Then it returns `XD3_OUTPUT` multiple times to let the caller save all the produced output, and finally `XD3_WINFINISH`. The loop goes on this way until you reach EOF.

Starting from this, my code is the following:
and my process function is the following:
Now the code is quite similar to `xd3_process_stream`, but it doesn't work when the input size is greater than the window size. For example, if I feed it a 32 KB source.bin and input.bin (which is source.bin with 1024 random bytes modified) with a window size of 16 KB, the resulting patch generates a new file whose first 16 KB differ from the original input and whose last 16 KB are identical. So this is surely related to how I'm interfacing with the encoder, but I'm not able to understand what it could be.
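An observation like "the first 16 KB differ, the last 16 KB are identical" can be made precise with a small comparison helper that reports which windows of the reconstructed file mismatch the expected input. The sketch below is a debugging aid only, not part of xdelta3; it assumes both files are already loaded into memory and have the same length, and the names `window_report` and `compare_by_window` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Result of a window-by-window comparison (hypothetical helper type). */
typedef struct {
    size_t total;      /* number of windows covering the data */
    size_t bad_count;  /* how many windows contain at least one mismatch */
    size_t first_bad;  /* index of the first mismatching window,
                          or `total` if every window matches */
} window_report;

/* Compare `expected` and `actual` (both `len` bytes long) in pieces of
 * `window` bytes; the last piece may be shorter. */
window_report compare_by_window(const unsigned char *expected,
                                const unsigned char *actual,
                                size_t len, size_t window)
{
    window_report r;
    assert(window > 0);
    r.total = (len + window - 1) / window;  /* round up */
    r.bad_count = 0;
    r.first_bad = r.total;
    for (size_t w = 0; w < r.total; w++) {
        size_t start = w * window;
        size_t size = (len - start < window) ? len - start : window;
        if (memcmp(expected + start, actual + start, size) != 0) {
            if (r.bad_count == 0) {
                r.first_bad = w;
            }
            r.bad_count++;
        }
    }
    return r;
}
```

For the 32 KB case with a 16 KB window described above, a report with `total == 2`, `bad_count == 1` and `first_bad == 0` would confirm that only the first window is affected, which points at how that window's input was handed to the encoder rather than at the data itself.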