RJVB / afsctool

This is a version of "brkirch"'s afsctool utility that allows end-users to leverage HFS+ compression.
https://brkirch.wordpress.com/afsctool
GNU General Public License v3.0
187 stars 18 forks source link

Add LZFSE support #39

Closed MegaByte closed 1 year ago

MegaByte commented 3 years ago

This adds LZFSE support using the reference library. Interface is nearly identical to LZVN. Also fixes removing resource fork when decompressing LZFSE. The chunk overlap detection code seemed to be working incorrectly, so I removed it.

RJVB commented 3 years ago

Thanks, I'll have a look, but as a matter of principle:

The chunk overlap detection code seemed to be working incorrectly, so I removed it.

Please document what didn't work and why you decided not to fix it (= why you think it is of no importance).

MegaByte commented 3 years ago

The code was triggering warnings on the LZFSE code, but not the LZVN code. Despite the warnings, the files I tested all decompressed correctly. When I looked at the values compared in each block for LZVN, ((UInt32)currBlock)[-1] and prevLast were always 0. I did not discover exactly what is going wrong, but I saw that the warning prints ((UInt32)currBlock)[-sizeof(UInt32)] rather than ((UInt32*)currBlock)[-1] so I am guessing there is an indexing logic error. I see the overall intent of the check, but I was curious as to how an overlap like that could actually occur and was hoping for more context there before trying to restore the warning.

RJVB commented 3 years ago

Thanks for the additional info. You're probably right that the overlap shouldn't occur but it's been several years now since I wrote this code and I haven't had to look at it. I presume I left in the check as a precaution against data loss,

I am guessing there is an indexing logic error

In that case it would have to be one that causes the test the succeed always in addition to being superfluous, without triggering a SEGV itself...

RJVB commented 3 years ago

When I looked at the values compared in each block for LZVN, ((UInt32*)currBlock)[-1] and prevLast were always 0.

Not impossible; the test checks for misalignment in the compressed output buffer that is being generated.

I did not discover exactly what is going wrong, but I saw that the warning prints ((UInt32)currBlock)[-sizeof(UInt32)] rather than ((UInt32)currBlock)[-1] so I am guessing there is an indexing logic error.

Yes, I think the warning should show the values being compared.

RJVB commented 3 years ago

Don't bother fixing the conflict, I have already done that locally. However, could I motivate you to implement a proper decompression code path, rather than letting the OS handle it? With that users of OS versions < 10.11 can also access LZFSE-compressed files, which would be nice.

MegaByte commented 3 years ago

I could take a look. FWIW, I've used the native path to verify that everything has been working correctly -- I compressed several million files on my machine including all my applications and it's worked flawlessly. I also tried purposely corrupting the encoding to confirm that it indeed does not write bad compression if it was to happen (assuming you don't turn off validation).

RJVB commented 3 years ago

Good job!

I could take a look. FWIW, I've used the native path to verify that everything has been working correctly

I'll be honest and admit that I took the native path because I was lazy, not as a way to test if everything worked correctly ... for that I could have used any external comparison tool ;)

RJVB commented 3 years ago

I could take a look.

BTW, if I were to do this I'd start with the compressed-in-the-xattr case, where you don't have to handle looping over the chunk table. I would probably drop the fall-through even on OS versions that support the compression type natively because 1) the same behaviour everywhere makes support easier 2) you already have the xattr in memory, so why read the file again (using fread())...

ylluminate commented 1 year ago

Bump

RJVB commented 1 year ago

LZFSE support has been included since v1.7.2 - apologies for forgetting to close this PR.