Open gingerbeardman opened 1 year ago
This reproduces consistently for me with version of playonmac 4.4.3 which is the current most recent version for me.
Set a breakpoint on the message output: before afsctool reverts the compression, the file ends up containing all zeros
Focusing on one of the files which show this behavior: Contents/Resources/unix/wine/lib64/wine/api-ms-win-core-registry-l2-2-0.dll
The file fits in one block. It seems that when compressed with zlib, the file is compressed into the resource fork (it isn't compressed enough to fit in the decmpfs xattr). However, when compressed with lzfse, it does compress enough to fit into the decmpfs xattr.
It's not clear why this is a problem though. I've verified that the decmpfs xattr written is exactly the same when it's compressing a fresh version of the dll and when compressing after it's already been compressed and decompressed, it's not clear what could be carrying over to change things: with the same decmpfs xattr contents, enabling the compressed flag in one case works, and in the other case results in a file which reports to contain all zeros.
It really feels like a macos bug, I don't see any differences I can observe in the file before/after a compress+decompress cycle, but we're getting different behavior.
I believe this is a macos bug: if a file has been compressed into a resource fork, uncompressing the file, and recompressing directly into the decmpfs xattr, the file will appear as all zeros (of the correct len).
I've created a minimal reproduction: https://gist.github.com/Dr-Emann/3d0f272d08c68a889fa368597a10cea7. If SKIP_RFORK_COMPRESSION_FINISH
is defined (with -DSKIP_RFORK_COMPRESSION_FINISH=1
when compiling the c code), the file will be correct. Without the define, the file will instead be filled with zeros.
It's not clear why this is a problem though. I've verified that the decmpfs xattr written is exactly the same when it's compressing a fresh version of the dll and when compressing after it's already been compressed and decompressed
So you have verified that the compressed content is correct when you read it back back in? Theoretically that should mean that it decompresses to the expected original content but did you verify that?
Sadly I can't work on this myself because 10.9 doesn't support LZFSE compressed files (so it's also a bit hard to wrap my head about what's going on). We could implement the decompression inside afsctool but that won't arrange anything if we can still produce compressed files that appear to be filled with zeroes to anyone else.
Unless of course that implementation reproduces the issue...
What I don't get is that the verification step doesn't catch this problem and rewrites the file from the backup...
Anyway, if you do feel that this is an OS bug: how up-to-date is your OS? If you're on a still supported version you could file a radar about this, and see what Apple have to say. My hardware is limited to 10.13 even if I wanted to upgrade so there's no point for me to file a ticket when the 1st request is going to be that I install an up-to-date OS version...
Should we just disable compression into the decmpfs xattr?
I'm a bit surprised that LZFSE can compress better than zlib ... did you try the ultimate zlib compression (-9)?!
Results are compressing an app copied fresh out of the DMG.
❯ time afsctool -c -T ZLIB -9 PlayOnMac.app
/Users/matt/Downloads/2023-04-26/PlayOnMac.app:
Number of HFS+/APFS compressed files: 4561
================
CPU 98%
TIME 2:26.47
❯ time afsctool -c -T LZFSE PlayOnMac.app
/Users/matt/Downloads/2023-04-26/PlayOnMac.app:
Number of HFS+/APFS compressed files: 4570
================
CPU 97%
TIME 1:05.79
I'm using a MacBook Pro (M1 Pro CPU, 16GB RAM).
OK, my bad, I don't ever use that compression...
No worries! It's an eye-opener, for sure.
No worries! It's an eye-opener, for sure.
Especially the part where a straight-C compressor outperforms (speed-wise) zlib which does have SIMD optimisations. I'll try to compare the timing difference on my system which has CloudFlare's highly optimised version.
Please do, I love a good benchmark.
I wonder how much ARM optimisation Apple have done to these algos. I'm betting "a lot"!
LZFSE is highly optimized for ARM so not big surprise
LZFSE is highly optimized for ARM so not big surprise
You mean such that even plain C compiles to something highly efficient?
on Intel mac:
❯ time afsctool -c -T ZLIB -9 PlayOnMac.app 2m52s 650 MB
❯ time afsctool -c -T LZFSE PlayOnMac.app 1m40s 599 MB
so looks like LZFSE is just well done:)
I think there's some confusion.
After a file has had compression turned on with a resource fork, compressing the same file into the decmpfs header and turning the compression flag on results in a file that appears to be filled with zeros. The verification step DOES catch this (the original content is not all zeros), and correctly reverts compression.
I verified that the same decmpfs header content works if the file has not already been compressed to a resource fork in the past, but fails if the file has been compressed to a resource fork in the past.
There's nothing compressor specific: it should be possible to find a file which compresses to the resource fork at zlib -1, and fits in the decmpfs header at zlib -9, and it would exhibit the same behavior, compressing with zlib -1, uncompressing, then compressing with zlib -9 would end up with a file containing all zeros (and validation would revert, if enabled).
I'm running macos 13.3.1, the newest version. Feedback reported through the feedback assistant as FB12146617
I verified that the same decmpfs header content works if the file has not already been compressed to a resource fork in the past, but fails if the file has been compressed to a resource fork in the past.
So the issue is triggered by the following sequence?
1) compress a "new" (uncompressed) file such that its contents end up in the resource fork (e.g. with ZIP) 2) uncompress the file 3) recompress the resulting file with LZFSE so that its contents go into the xattr
(Please confirm that you can NOT skip step 2; afsctool should reject files that are already HFS-compressed.)
What happens if you replace step 2 with something like cat $file1 > $file2
, i.e. an operation that should remove the entire resource fork? I have never checked if afsctool removes the resource fork if it doesn't contain anything else (and I'm not at my Mac right now).
Indeed, it should be possible to create a file that will just be too large when zipped with a lower compression level, to see if that triggers the same bug.
Yes, those steps reproduce the issue. I can confirm that skipping step two does not reproduce the problem.
Replacing step 2 with cat $file1 > $file2
, then compressing $file2 with lzfse does not reproduce the problem. Inserting it as a step between 2 and 3 (or using cp
between step 2 and 3) also does not reproduce the problem: The sha256sum of the file is the same after each step and it seems to be in some way actually tied to the file identity.
sha256sum of the file is the same after each step and it seems to be in some way actually tied to the file identity.
This smells of APFS copy in place. I'll try to test on an HFS Extended partition.
This smells of APFS copy in place.
Yes, but it could also hint at my hunch that it is somehow tied to something to do with the resource fork, possibly something as stupid as not cleaning it up entirely. Except that removeResourceFork
does seem to get set when it should, so the xattr and the resourcefork bits always get removed when needed. Do you know of tools to inspect those aspects of files? The xattr
command on my OS doesn't list any attributes on files that I know to be compressed...
See the commit message of #8b89d8f20e8dbb3efddbbca1f9f7852318727d87
It looks like inodes have some kind of memory?!?!
Repro
afsctool -c /Applications/PlayOnMac.app
(no errors)afsctool -d /Applications/PlayOnMac.app
(no errors)afsctool -c -T LZFSE /Applications/PlayOnMac.app
(errors!)Comparison
afsctool -c -T LZFSE /Applications/PlayOnMac.app
(no errors)Errors