darlinghq / darling-dmg

FUSE module for .dmg files (containing an HFS+ filesystem)
http://www.darlinghq.org
GNU General Public License v3.0

wrong adc compression algorithm #72

Closed jief666 closed 6 years ago

jief666 commented 6 years ago

The ADC decompression function in adc.c wasn't designed to handle block boundaries. One problem occurs when the input buffer ends with a partial chunk. Another, bigger problem is that the previous 65535 bytes of output must always be kept available before decompression resumes, in case a chunk looks back into them.

The solution used here is to read into a buffer that is twice 65535 bytes (in fact, I used 65536) plus the maximum size of a chunk. Each time decompression reaches 65536*2, the second part of the buffer is shifted to the beginning.
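The buffer shifting described above can be sketched roughly as follows. This is not the code from adc.c: the class and constant names are hypothetical, and the 128-byte bound on one chunk's output is an assumption made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical names; this is an illustration, not darling-dmg's adc.c.
constexpr std::size_t kWindow = 65536;   // lookback history to keep available
constexpr std::size_t kMaxChunk = 128;   // assumed max output of a single chunk

class SlidingOutput {
public:
    SlidingOutput() : buf_(2 * kWindow + kMaxChunk), pos_(0), total_(0) {}

    // Append literal bytes produced by one chunk.
    void appendLiteral(const uint8_t* data, std::size_t len) {
        assert(len <= kMaxChunk);
        std::memcpy(&buf_[pos_], data, len);
        advance(len);
    }

    // Copy `len` bytes starting `distance` bytes behind the write position.
    // Copied byte by byte because source and destination may overlap.
    void appendLookback(std::size_t distance, std::size_t len) {
        assert(len <= kMaxChunk);
        assert(distance >= 1 && distance <= kWindow && distance <= pos_);
        for (std::size_t i = 0; i < len; ++i)
            buf_[pos_ + i] = buf_[pos_ + i - distance];
        advance(len);
    }

    // Byte located `back` bytes behind the write position (1 = last written).
    uint8_t last(std::size_t back) const {
        assert(back >= 1 && back <= pos_);
        return buf_[pos_ - back];
    }

    std::size_t total() const { return total_; }

private:
    void advance(std::size_t len) {
        pos_ += len;
        total_ += len;
        if (pos_ >= 2 * kWindow) {
            // Shift the second part of the buffer to the beginning, so that
            // at least kWindow bytes of history stay in front of pos_.
            std::memmove(&buf_[0], &buf_[kWindow], pos_ - kWindow);
            pos_ -= kWindow;
        }
    }

    std::vector<uint8_t> buf_;
    std::size_t pos_;    // write position inside buf_
    std::size_t total_;  // total bytes produced so far
};
```

Because the buffer has room for one more chunk past 65536*2, a chunk can always be written in full before the shift happens, and after the shift at least 65536 bytes of history remain in front of the write position.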

Because decompress now takes a 3rd argument (the offset), I've refactored a bit so that the decompress methods of DMGDecompressor_Zlib and DMGDecompressor_Bzip2 also take a 3rd argument. I thought it would be more consistent to have offset and count, like in all the read methods.
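To illustrate what an offset-and-count decompress interface could look like, here is a minimal sketch. The class names, the parameter order and the integer types are assumptions made for this example, not the actual darling-dmg declarations; the "stored" decompressor just copies bytes so the semantics are easy to check.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical interface, modelled on a read(buf, count, offset) style call.
class Decompressor {
public:
    virtual ~Decompressor() = default;
    // Write up to `count` decompressed bytes, starting at decompressed
    // position `offset`, into `output`. Returns the number of bytes written.
    virtual int32_t decompress(void* output, int32_t count, int64_t offset) = 0;
};

// Trivial implementation for uncompressed data, for illustration only.
class StoredDecompressor : public Decompressor {
public:
    explicit StoredDecompressor(std::vector<uint8_t> data)
        : data_(std::move(data)) {}

    int32_t decompress(void* output, int32_t count, int64_t offset) override {
        const int64_t size = static_cast<int64_t>(data_.size());
        if (count <= 0 || offset < 0 || offset >= size)
            return 0;  // nothing to deliver at or past the end
        const int32_t n =
            static_cast<int32_t>(std::min<int64_t>(count, size - offset));
        std::memcpy(output, data_.data() + offset, static_cast<std::size_t>(n));
        return n;
    }

private:
    std::vector<uint8_t> data_;
};
```

With this shape, a caller can ask for any window of the decompressed stream, and a short return value signals that the end of the data was reached.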

tomkoen commented 6 years ago

@jief666 off_t is always 32-bit in Visual Studio, and on *nix/mac it's 64-bit if __USE_FILE_OFFSET64 is defined. Wouldn't it be better to use int32_t instead? The decompress methods don't need 64-bit numbers.

jief666 commented 6 years ago

off_t is 32 bits on 64-bit Windows??? How are the Windows functions that take offsets defined, like the lseek equivalent? Decompress methods need 64-bit offsets for big files > 2GB, don't they?

tomkoen commented 6 years ago

Absolutely 32-bit. I checked the headers.

typedef long off_t;

off_t is from the *nix world, Visual Studio doesn't use it for file offsets.

For big files one has to use _lseeki64 with __int64 type:

long _lseek(
    int fd,
    long offset,
    int origin
);
__int64 _lseeki64(
    int fd,
    __int64 offset,
    int origin
);

> Decompress methods need 64-bit offsets for big files > 2GB, don't they?

Then int64_t would be good.
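As a small sketch of the point about fixed-width types: a dedicated 64-bit offset alias has the same size on every platform, whereas a 32-bit `long` cannot represent offsets of 2 GiB or more. The names `FileOffset`, `kTwoGiB` and `fitsIn32Bits` are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical alias: a file offset type whose width does not depend on
// the platform's definition of off_t.
using FileOffset = int64_t;
static_assert(sizeof(FileOffset) == 8, "offsets must be 64-bit everywhere");

// First offset that no longer fits in a signed 32-bit integer.
constexpr FileOffset kTwoGiB = FileOffset(1) << 31;

// True if `offset` could be stored in a signed 32-bit type without loss.
constexpr bool fitsIn32Bits(FileOffset offset) {
    return offset >= std::numeric_limits<int32_t>::min() &&
           offset <= std::numeric_limits<int32_t>::max();
}
```

Any offset for which `fitsIn32Bits` returns false would be truncated if passed through a 32-bit `off_t` or `long`.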

tomkoen commented 6 years ago

FileReader.cpp uses

int32_t read(void* buf, int32_t count, uint64_t offset) override;

jief666 commented 6 years ago

I used off_t by habit. I'll update that.

tomkoen commented 6 years ago

https://github.com/darlinghq/darling-dmg/pull/74

jief666 commented 6 years ago

Good that you did, thanks. This can be closed, I think.