jmacd / xdelta

open-source binary diff, delta/differential compression tools, VCDIFF/RFC 3284 delta compression
http://xdelta.org

Problem with decoding here address #253

Closed by ivan386 5 years ago

ivan386 commented 5 years ago

I am trying to decode the addresses produced by xdelta3 with these Lua functions:

    local decode = function(copy_mode, copy_size)
        local address, mode_name, address_value, cache_value  = decode_address(address_stream, copy_mode, address_cache)
        update_target( address_cache, copy_size )

        local source_address = ( address_cache.segment_length + address ) % address_cache.segment_length
        source_address = address_cache.segment_position + source_address
        return source_address, mode_name, copy_mode, address_value, address, cache_value
    end

function decode_address(address_stream, copy_mode, address_cache)
    local address
    local mode_name
    local same_index
    local address_value
    local cache_value
    if copy_mode == 0 then
        mode_name = "self"
        address = read_int( address_stream )
    elseif copy_mode == 1 then
        mode_name = "here"
        address_value = read_int( address_stream )
        address = address_cache.target_address - address_value
    elseif copy_mode < 2 + address_cache.near_size then
        mode_name = "near"
        local near_index = copy_mode - 2
        address_value = read_int( address_stream )
        cache_value = address_cache.near[near_index]
        address = cache_value + address_value
    elseif copy_mode < 2 + address_cache.near_size + address_cache.same_size then
        mode_name = "same"
        local same_mode = copy_mode - (2 + address_cache.near_size)
        address_value = read_byte( address_stream )
        same_index = same_mode * 256 + address_value
        address = address_cache.same[same_index]
        if not address then
            debug_values()
            assert(address)
        end
    end

    return cache_update( address_cache,  address), mode_name, address_value, cache_value
end
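
The functions above call cache_update, which the post does not show. As a point of reference, here is a minimal sketch of the update step described in RFC 3284 section 5.3, assuming 0-based near and same tables as the indexing above suggests. The names new_cache and next_slot are assumptions for illustration and are not taken from the author's script.

```lua
-- Sketch of the RFC 3284 section 5.3 address cache. Field names follow the
-- post where possible; new_cache and next_slot are hypothetical.
local function new_cache(near_size, same_size)
    local cache = {
        near = {}, same = {},
        near_size = near_size, same_size = same_size,
        next_slot = 0,
    }
    -- Both caches start out filled with zeros.
    for i = 0, near_size - 1 do cache.near[i] = 0 end
    for i = 0, same_size * 256 - 1 do cache.same[i] = 0 end
    return cache
end

local function cache_update(cache, address)
    if cache.near_size > 0 then
        -- The near cache is a ring buffer of the most recent addresses.
        cache.near[cache.next_slot] = address
        cache.next_slot = (cache.next_slot + 1) % cache.near_size
    end
    if cache.same_size > 0 then
        -- The same cache is indexed by the address modulo its size.
        cache.same[address % (cache.same_size * 256)] = address
    end
    return address
end

-- 4 near slots and 3 * 256 same slots are the default VCDIFF cache sizes.
local cache = new_cache(4, 3)
cache_update(cache, 259048338)
print(cache.near[0], cache.next_slot)
```

Because every decoded address is stored in both caches, an address computed incorrectly in one instruction also shifts the results of later near and same instructions, as the CV: -10049 entries in the output below show.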

In the output below:

RA - result address (result of the self, here, near or same parts)
CV - cache value (value from the near or same cache)
AV - address value (value from the address stream)

First problem: negative address (RA: -10049)

VCDIFF copy window length:    259058387
VCDIFF copy window offset:    1076781

My code decodes: 147733 035 CPY_1 159 S@1066732 here RA: -10049 AV: 157782

here     address stream value (AV)      result address(RA)
147733 - 157782                       = -10049

Is it correct to fix it with this code?

        local source_address = ( address_cache.segment_length + address ) % address_cache.segment_length
        source_address = address_cache.segment_position + source_address

(% is the modulo operator)

  (copy window length + result address) % copy window length = offset from window start
  (259058387          + (-10049)      ) % 259058387          = 259048338

  offset from window start + copy window offset = source address
  259048338                + 1076781            = 260125119

VCDIFF version:               0
VCDIFF header size:           5
VCDIFF header indicator:      none
VCDIFF secondary compressor:  lzma
VCDIFF window number:         0
VCDIFF window indicator:      VCD_SOURCE VCD_ADLER32 
VCDIFF adler32 checksum:      BB755749
VCDIFF copy window length:    259058387
VCDIFF copy window offset:    1076781
VCDIFF delta encoding length: 1233613
VCDIFF target window length:  8388608
VCDIFF data section length:   982615
VCDIFF inst section length:   174085
VCDIFF addr section length:   76895
  Offset Code Type1 Size1  @Addr1 + Type2 Size2 @Addr2

----------------xdelta3 out----------------------
  147614 099  CPY_5     90 S@1171318
  147704 001  ADD       29        
  147733 035  CPY_1    159 S@260125119
  147892 005  ADD        4        
  147896 067  CPY_3    184 S@260129736
  148080 005  ADD        4        
  148084 000  RUN      184        
  148268 001  ADD       21        
  148289 000  RUN      167        
  148456 001  ADD       31        
  148487 000  RUN      157        
  148644 005  ADD        4        
  148648 067  CPY_3     27 S@260129547
  148675 009  ADD        8        
  148683 051  CPY_2    149 S@1308269
  148832 005  ADD        4        
------------------------------------------------

---------------my script out--------------------
  147614 099  CPY_5     90 S@1171318    near    RA: 94537   AV: 368 CV: 94169
  147704 001  ADD       29        
  147733 035  CPY_1    159 S@260125119  here    RA: -10049  AV: 157782
  147892 005  ADD        4        
  147896 067  CPY_3    184 S@260129736  near    RA: -5432   AV: 4617    CV: -10049
  148080 005  ADD        4        
  148084 000  RUN      184        
  148268 001  ADD       21        
  148289 000  RUN      167        
  148456 001  ADD       31        
  148487 000  RUN      157        
  148644 005  ADD        4        
  148648 067  CPY_3     27 S@260129547  near    RA: -5621   AV: 4428    CV: -10049
  148675 009  ADD        8        
  148683 051  CPY_2    149 S@1308269    near    RA: 231488  AV: 136951  CV: 94537
  148832 005  ADD        4        
  148836 051  CPY_2    184 S@1308418    near    RA: 231637  AV: 149 CV: 231488
  149020 005  ADD        4        
------------------------------------------------

Second problem: wrong address

VCDIFF copy window offset:    124004297

xdelta3 decodes: 134227622 035 CPY_1 162 S@284037645

My code decodes: 134227622 035 CPY_1 162 S@258221863 here RA: 134217566 AV: 10056

If I try to encode the address that xdelta3 reports back into an address stream value, the result is negative.

         here         address(S@)  copy window offset
xdelta3: 134227622 - (284037645  - 124004297         ) = -25805726
                      RA: 160033348

my:      134227622 - (258221863  - 124004297         ) = 10056 (AV: 10056)
                      RA: 134217566
----------------------------------------------------

VCDIFF window number:         16
VCDIFF window indicator:      VCD_SOURCE VCD_ADLER32 
VCDIFF adler32 checksum:      8C1E6B12
VCDIFF window at offset:      134217728
VCDIFF copy window length:    160033510
VCDIFF copy window offset:    124004297
VCDIFF delta encoding length: 596916
VCDIFF target window length:  8388608
VCDIFF data section length:   330685
VCDIFF inst section length:   183279
VCDIFF addr section length:   82934
  Offset Code Type1 Size1  @Addr1 + Type2 Size2 @Addr2

----------------xdelta3 out----------------------
  134227622 035  CPY_1    162 S@284037645
  134227784 005  ADD        4        
  134227788 035  CPY_1    184 S@276980818
  134227972 005  ADD        4        
  134227976 000  RUN      184        
  134228160 005  ADD        4        
  134228164 051  CPY_2     28 S@277928225
  134228192 008  ADD        7        
  134228199 099  CPY_5    149 S@284037645
  134228348 005  ADD        4        
  134228352 035  CPY_1    184 S@276980805
  134228536 005  ADD        4        
  134228540 000  RUN      184        
  134228724 005  ADD        4        
  134228728 051  CPY_2     40 S@277174342
  134228768 008  ADD        7        
  134228775 083  CPY_4    137 S@284037645
  134228912 005  ADD        4        
  134228916 035  CPY_1    184 S@276980793
  134229100 001  ADD       21        
------------------------------------------------

---------------my script out--------------------
  134227622 035  CPY_1    162 S@258221863   here    RA: 134217566   AV: 10056
  134227784 005  ADD        4        
  134227788 035  CPY_1    184 S@251165036   here    RA: 127160739   AV: 7067049
  134227972 005  ADD        4        
  134227976 000  RUN      184        
  134228160 005  ADD        4        
  134228164 051  CPY_2     28 S@252112443   near    RA: 128108146   AV: 947407  CV: 127160739
  134228192 008  ADD        7        
  134228199 099  CPY_5    149 S@258221863   near    RA: 134217566   AV: 0   CV: 134217566
  134228348 005  ADD        4        
  134228352 035  CPY_1    184 S@251165023   here    RA: 127160726   AV: 7067626
  134228536 005  ADD        4        
  134228540 000  RUN      184        
  134228724 005  ADD        4        
  134228728 051  CPY_2     40 S@251358560   near    RA: 127354263   AV: 193524  CV: 127160739
  134228768 008  ADD        7        
  134228775 083  CPY_4    137 S@258221863   near    RA: 134217566   AV: 0   CV: 134217566
  134228912 005  ADD        4        
  134228916 035  CPY_1    184 S@251165011   here    RA: 127160714   AV: 7068202
  134229100 001  ADD       21      
------------------------------------------------
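
The two cases above can be compared numerically. The sketch below applies the modulo workaround from the post to the first CPY_1 instruction of each dump and checks the result against the S@ value printed by xdelta3. The function name workaround_source_address and the cases table are hypothetical; all numbers are copied from the dumps.

```lua
-- The modulo workaround as posted; the function name is hypothetical.
local function workaround_source_address(segment_length, segment_position, address)
    local source_address = (segment_length + address) % segment_length
    return segment_position + source_address
end

-- address = target offset - AV, as the posted "here" branch computes it.
local cases = {
    { name = "window 0", target = 147733, av = 157782,
      segment_length = 259058387, segment_position = 1076781,
      xdelta3 = 260125119 },
    { name = "window 16", target = 134227622, av = 10056,
      segment_length = 160033510, segment_position = 124004297,
      xdelta3 = 284037645 },
}

for _, c in ipairs(cases) do
    local got = workaround_source_address(
        c.segment_length, c.segment_position, c.target - c.av)
    print(c.name, got, c.xdelta3, got == c.xdelta3 and "match" or "mismatch")
end
```

For these two samples the workaround reproduces the xdelta3 address in window 0 but not in window 16, where it yields the S@258221863 value shown in the script output. That suggests the modulo step masks the problem instead of fixing it.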
ivan386 commented 5 years ago

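My mistake: here must start from segment_length, i.e. the current position in the target window has to be offset by the length of the source segment.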

https://github.com/ivan386/lua-print-vcdiff/commit/11aea97ec5b985ba082853a450aac1a1863b2ab3#diff-bc00a9f0ccd2c192e0cebbe9a405d298R211
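
As a worked check of that correction, the sketch below recomputes the two CPY_1 addresses from this thread with here counted from segment_length. It is an illustration under stated assumptions, not the code from the linked commit: decode_here and the window tables are hypothetical names, and window_offset = 0 for window 0 is an assumption because the first dump does not print a "window at offset" line.

```lua
-- Hypothetical helper illustrating the corrected "here" mode; the names are
-- not taken from xdelta3 or lua-print-vcdiff.
local function decode_here(win, target_offset, address_value)
    -- Position inside the current target window.
    local position = target_offset - win.window_offset
    -- "here" lives in the combined space: source segment first, then target.
    local here = win.segment_length + position
    local address = here - address_value
    assert(address >= 0 and address < here, "address outside the combined space")
    if address < win.segment_length then
        -- Copy from the source segment: translate to a source file offset.
        return address, "source", win.segment_position + address
    end
    -- Copy from data already decoded in this target window.
    return address, "target", win.window_offset + (address - win.segment_length)
end

-- Values from the two dumps in this thread.
local window0 = {
    window_offset = 0,  -- assumption: not printed in the first dump
    segment_length = 259058387, segment_position = 1076781,
}
local window16 = {
    window_offset = 134217728,
    segment_length = 160033510, segment_position = 124004297,
}

print(decode_here(window0, 147733, 157782))
print(decode_here(window16, 134227622, 10056))
```

Both calls report a source copy, at offsets 260125119 and 284037645, which are the S@ values xdelta3 prints for these instructions. No modulo step is needed, and the addresses stored in the near and same caches stay non-negative.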