projg2 / squashdelta

Create efficient deltas (patches) between two SquashFS images
BSD 2-Clause "Simplified" License

Can't create deltas of big changes #3

Closed Viterkim closed 6 years ago

Viterkim commented 6 years ago

When I try to create a delta between two SquashFS files compressed with LZ4, I get an error (it does not happen with smaller files). The difference between the files is around 70 MB.

squashdelta base/rootfs.lz4.squashfs.1 base/rootfs.lz4.squashfs.3 newfile.patch
Source: base/rootfs.lz4.squashfs.1
Reading inodes...
Read 2663 inodes in 13 blocks.
Hashing 13 inode blocks...
Reading fragment table...
Read 294 fragments in 1 blocks.
Hashing 1 fragment table blocks...
Hashing 773 data blocks...
Total: 787 compressed blocks.

Target: base/rootfs.lz4.squashfs.3
Reading inodes...
Read 5467 inodes in 25 blocks.
Hashing 25 inode blocks...
Reading fragment table...
Read 536 fragments in 2 blocks.
Hashing 2 fragment table blocks...
Hashing 1640 data blocks...
Total: 1667 compressed blocks.

Unique blocks found: 573 in source and 1453 in target.
Writing expanded source file...
Writing expanded target file...
Program terminated abnormally:
    write() failed
    at temporary file for target
    errno: Bad address

mgorny commented 6 years ago

Could you publish those files somewhere? I suppose it might be easier for me to debug it if I can reproduce it myself.

Viterkim commented 6 years ago

https://drive.google.com/open?id=1Zp0x0y2iE2jBqYZhc8QyvqQM_dFXSZwh Here ya go

mgorny commented 6 years ago

Thanks. I can reproduce it, and I'll try to find time to debug it today.

Viterkim commented 6 years ago

Thank you so much! I'm trying to compare delta solutions and this just randomly popped up. Is it true that the software doesn't support lzma/gzip by the way?

mgorny commented 6 years ago

Yep, see #2. It's not like it'd be hard to do, I just never had a compelling reason to do it.

mgorny commented 6 years ago

Looks like it's getting EFAULT when trying to write() a buffer obtained from mmap(). I think some recent change in the kernel and/or glibc breaks it. I suppose the best way forward would be to stop playing with mmap() and read the file the old-school way.
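
For illustration only, the "old-school" approach could look roughly like the sketch below: data is read into an ordinary heap buffer with pread() and then written out, so write() never receives a pointer into a mapped region. This is not squashdelta's actual code (which is C++); `copy_range`, `write_all`, and `CHUNK_SIZE` are hypothetical names made up for this example.

```python
import os

CHUNK_SIZE = 64 * 1024  # arbitrary buffer size chosen for this sketch


def write_all(fd, data):
    """Write all of data to fd, retrying after short writes."""
    view = memoryview(data)
    while len(view) > 0:
        written = os.write(fd, view)
        view = view[written:]


def copy_range(src_fd, dst_fd, offset, length):
    """Copy length bytes starting at offset from src_fd to dst_fd.

    Each chunk is read into a regular in-memory buffer with pread()
    rather than being written straight out of an mmap()ed region.
    """
    remaining = length
    while remaining > 0:
        chunk = os.pread(src_fd, min(CHUNK_SIZE, remaining), offset)
        if not chunk:
            # The file ended before the requested range was complete.
            raise EOFError("unexpected end of file at offset %d" % offset)
        write_all(dst_fd, chunk)
        offset += len(chunk)
        remaining -= len(chunk)
```

A side effect of this style is that a bad offset or length shows up as an explicit end-of-file error instead of a fault reported by the kernel.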

Viterkim commented 6 years ago

Sure, thanks for the update! Unfortunately I don't have the skillset to add the functionality myself, so I guess I'll just use lz4 for now :)

Viterkim commented 6 years ago

A stupid question: You create deltas with squashdelta file1 file2 delta, but how do you apply the delta file to a file?

mgorny commented 6 years ago

You use squashmerge for that. The idea was that merging is easier, so we can write a tool in plain C rather than C++.

mgorny commented 6 years ago

OK. So I've finally done a big regutting of squashdelta and got rid of the mmap I/O. Now the 'bad address' error is replaced by an EOF exception. Which means there's either something wrong with the file or with the parser, and the whole regutting was entirely unnecessary ;-).

Viterkim commented 6 years ago

Oh, feelsbadman. I'm sorry if it's the file... I did try it on multiple big files, so maybe it's an edge case in the files I produce?

mgorny commented 6 years ago

No, I think it's memory corruption somewhere in my code.

mgorny commented 6 years ago

I'm making progress. It seems that I failed to account for the possibility of the same block being used in multiple inodes. Furthermore, it seems that there are blocks with zero length which I have to investigate.
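
A minimal sketch of the idea, not the actual fix: when collecting data blocks from all inodes, each distinct block should be counted once even if several inodes point at it, and entries with zero length should be skipped. `BlockRef` and `unique_blocks` are hypothetical names for this example. As far as I know, a zero size in a SquashFS block list marks a sparse block with no stored data, but treat that as an assumption here.

```python
from collections import namedtuple

# Hypothetical record for one data block referenced by an inode:
# byte offset in the image and compressed length on disk.
BlockRef = namedtuple("BlockRef", ["offset", "length"])


def unique_blocks(inode_block_lists):
    """Return each distinct non-empty block once, ordered by offset."""
    seen = set()
    for blocks in inode_block_lists:
        for block in blocks:
            if block.length == 0:
                # Nothing is stored on disk for this entry, so there is
                # nothing to hash or expand.
                continue
            # A set keeps one copy even when several inodes share a block.
            seen.add(block)
    return sorted(seen)
```

For example, if two inodes both reference the block at offset 96, `unique_blocks` returns it once, so it would be expanded once rather than twice.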

mgorny commented 6 years ago

The good news is that I've managed to fix the underlying issue. The bad news is that in the meantime I've discovered that LZ4 output is unstable, and you can't merge the result anymore since the data now compresses to a larger size than the original.

mgorny commented 6 years ago

(i.e. the new version reduces compression)
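
To illustrate why this matters: the delta is computed over expanded (decompressed) blocks, so merging only works if recompressing a block reproduces the original bytes exactly. The sketch below shows one possible round-trip check. It is an assumption-laden example, not something squashdelta is known to do: zlib stands in for LZ4 (the Python standard library has no LZ4 module), a different compression level stands in for a different compressor version, and `recompresses_identically` and `split_blocks` are hypothetical names.

```python
import zlib


def recompresses_identically(compressed_block, level):
    """True if decompressing and recompressing reproduces the block exactly."""
    raw = zlib.decompress(compressed_block)
    return zlib.compress(raw, level) == compressed_block


def split_blocks(compressed_blocks, level):
    """Partition blocks into those safe to expand and those to keep compressed."""
    expandable, keep_as_is = [], []
    for block in compressed_blocks:
        if recompresses_identically(block, level):
            expandable.append(block)
        else:
            # Recompression would not give back the same bytes, so the
            # block has to travel through the delta in compressed form.
            keep_as_is.append(block)
    return expandable, keep_as_is
```

Blocks that fail the check can still be diffed in compressed form, at the cost of a larger patch.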