Closed gingerbeardman closed 6 years ago
Thanks for the info... I can't do much with it though. Apparently Apple added support for new types of compression; I was unaware of that. But without a test system and exact knowledge of how to decompress, I'm not even going to try to implement support. Too risky.
To be honest, decompression is really the least interesting feature. You can do that also by simply copying the file or directory with a command that doesn't preserve the compression.
You're right, feel free to close
I found a note about type 8 here: http://newosxbook.com/tools/hfsleuth.html#DOWNLOAD
11/11/2016
- Development resumed,
- support for file compression (type 4, zlib) added
- support for file compression type 8 (for MacOS compression as of 10.10)
Last time I looked at other compression types I got the distinct impression that they're not of interest to end users, except possibly for decompression (but remember my earlier comment about that).
I also can't do much with references that don't provide source code!
Yes, it was meant more as a note to collect more info on this
Ah, sorry I did mean to post the source link: http://newosxbook.com/files/hfsleuth.tar
I saw that link, but the archive only contains binaries and a man page.
MAYBE type 8 compression is bzip2-based (which would be useful)?
i edited my comment to remove the hfsleuth download: it is not open source.
Yes, I think type 8 is another form of compression, and I am almost finished putting a table together.
Some types are Apple's own open-source LZFSE: https://github.com/lzfse/lzfse (macOS 10.11 and iOS 9 onwards)
My thinking is that it may provide better/faster compression than zlib?
That's the claim. Apple can make that true because its implementations are apparently tightly tuned for the hardware.
ALL are supported on both HFS+ and APFS
1 /* No compression; in xattr */
2 /* (unused) */
3 /* ZLIB; in xattr */
4 /* ZLIB (64K chunked); in rsrc fork */
5 /* (specifies de-dup within the generation store) */
6 /* (unused) */
7 /* LZVN; in xattr */
8 /* LZVN (64K chunked); in rsrc fork */
9 /* uncompressed; in xattr */
10 /* uncompressed (64K chunked); in rsrc fork */
11 /* LZFSE; in xattr */
12 /* LZFSE; in rsrc fork */
0x80000001 /* faulting file: deprecated */
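For anyone who wants to use the table programmatically, here is a minimal sketch that transcribes it into a lookup. The names `CompressionType`, `COMPRESSION_TYPES` and `describe` are made up for illustration and do not come from afsctool or Apple's headers; the entries only restate what the table says.

```python
from typing import NamedTuple, Optional


class CompressionType(NamedTuple):
    algorithm: Optional[str]  # None means the payload is stored uncompressed
    storage: str              # "xattr" or "rsrc" (resource fork)
    chunked: bool             # True only where the table says "64K chunked"


# Transcribed from the table in this thread. IDs 2, 5 and 6 (unused or
# special purpose) are deliberately left out.
COMPRESSION_TYPES = {
    1: CompressionType(None, "xattr", False),
    3: CompressionType("zlib", "xattr", False),
    4: CompressionType("zlib", "rsrc", True),
    7: CompressionType("lzvn", "xattr", False),
    8: CompressionType("lzvn", "rsrc", True),
    9: CompressionType(None, "xattr", False),
    10: CompressionType(None, "rsrc", True),
    11: CompressionType("lzfse", "xattr", False),
    12: CompressionType("lzfse", "rsrc", False),  # table does not mention chunking
}


def describe(type_id: int) -> str:
    """Return a one-line human-readable description of a compression type ID."""
    info = COMPRESSION_TYPES.get(type_id)
    if info is None:
        return f"type {type_id}: unused or unknown"
    algorithm = info.algorithm or "uncompressed"
    where = "resource fork" if info.storage == "rsrc" else "extended attribute"
    chunk = ", 64K chunked" if info.chunked else ""
    return f"type {type_id}: {algorithm} in {where}{chunk}"
```

Calling `describe(8)` gives a description of the LZVN resource-fork type, while IDs missing from the dictionary are reported as unused or unknown.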
LZFSE also used for compression of DMGs. man hdiutil
[macOS 10.11; 2015]
References:
A quick test with lzfse on a typical big shared library showed slightly better and much faster compression compared to gzip and even CloudFlare's accelerated zlib.
However,
7 /* LZVN in xattr */
8 /* LZVN in rsrc fork */
9 /* DECMPFS_TYPE_RAW_ATTR (LZFSE) */
10 /* DECMPFS_TYPE_RAW_RSRC (LZFSE) */
Could you add the OS versions which add support for the various types and also if they're supported in HFS or only in APFS?
I hesitate to add compression support for types that aren't supported across the board. Decompression is another matter, but I'm a bit surprised that Apple would make it necessary to turn to 3rd party tools in order to read files on a HFS+ volume that was used on a more recent OS version. If those "new" types are in fact APFS specific that would change things - a bit.
I'm not exactly looking forward to diving in and adding (de)compressors, but suggestions which types to support and how (command line options, or automatic choice of an optimal algorithm, etc) welcome.
ps: LZFSE also used for compression of DMGs.
man hdiutil
From what OS version?
Thanks for the link to the apfs-fuse repo, btw. Could be useful.
sure, i've edited the above table. will continue to do so
LZFSE was backported from 10.11.0 to 10.10.2 (Jan 2015)
I'm just checking whether LZVN was also backported.
OK, I am quite confident that the versions/dates above are accurate.
There was no backporting to 10.8
LZFSE was back-ported from 10.11.0 to 10.10.2 (Jan 2015)
I want to support at least down to 10.9 (which I'm running myself).
I'm just checking whether LZVN was also back-ported.
Isn't that a compression protocol implemented in the Mach kernel? The description in the repo you pointed to is rather vague.
Yes, LZVN is used to compress the kernel
https://opensource.apple.com/source/copyfile/copyfile-138/copyfile.c.auto.html
Boom! A pretty complete list. I'll update my list here tomorrow.
Yes, LZVN is used to compress the kennel
Well, in that case ... it can go to the doghouse! ;)
Kudos ... I hate reading Apple code, and there's a LOT of it in that file (but judging from the name I guess it does everything that could interest me) ...
I can't take the credit, somebody else dug up the link to that code.
But it would be great if the "best" compression type for each OS could be supported? Do you think that you might plan to do that?
Finding the code is one thing, extracting the information from it another.
I've been thinking about decompression support in the back of my head, and about starting to prepare the code for additional compression types (with ideally runtime detection of the OS version so a single binary can work everywhere but only compress to types that are supported locally).
2 existential problems:
which lead to the bigger question whether or not it'd be really unwise to continue NOT to rewrite almost the entire code...
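The runtime-detection idea mentioned above could look roughly like the following sketch. The version thresholds are assumptions: the LZVN and LZFSE values follow what is reported in this thread (which also mentions a possible LZFSE back-port to 10.10.2, ignored here), and the zlib value is the release generally credited with introducing HFS+ compression. `MIN_OS_VERSION`, `parse_version`, `usable_algorithms` and `local_algorithms` are hypothetical names, not afsctool functions.

```python
import platform

# ASSUMED minimum OS version per algorithm; see the caveats in the text above.
MIN_OS_VERSION = {
    "zlib": (10, 6),
    "lzvn": (10, 9),
    "lzfse": (10, 11),
}


def parse_version(text):
    """Turn a string such as '10.9.5' into a comparable tuple (10, 9, 5)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def usable_algorithms(os_version):
    """List the algorithms whose assumed minimum version is satisfied."""
    return sorted(
        name for name, minimum in MIN_OS_VERSION.items() if os_version >= minimum
    )


def local_algorithms():
    """Detect the running macOS version; returns [] on other platforms."""
    release = platform.mac_ver()[0]  # empty string when not running on macOS
    if not release:
        return []
    return usable_algorithms(parse_version(release))
```

A single binary could call something like `local_algorithms()` once at startup and refuse to compress with anything outside that list, while still attempting to read every type.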
Table in https://github.com/RJVB/afsctool/issues/6#issuecomment-374620414 updated
So, apparently I should be able to generate HFS/LZVN-compressed files and still read them afterwards, on 10.9.5? The presence of AppleFSCompressionTypeLZVN.kext indeed seems to confirm that.
That would make it a bit more appealing to start hacking around, esp. if the required library is already part of the system. Is that indeed the case (MacPorts doesn't seem to have a port:lzvn)?
Correct, you are good for LZVN on 10.9.
The code to deal with LZVN is part of the system, but I'm not sure if it's still private and/or easily accessible. Further reading: https://pikeralpha.wordpress.com/2014/11/01/lzvn-encode/ and related blog posts.
So there is a reverse engineered version: https://github.com/Piker-Alpha/LZVN (encode/decode)
I want to compress a file using LZFSE on an HFS+ filesystem. What could I do? ditto always uses zlib. I am on 10.11.
@lilin007007 you would have to add this feature yourself, if @RJVB is not planning to do so
@gingerbeardman I need some files in LZFSE format for testing. How do I create these files?
Adding new features isn't a high priority for me at the moment indeed. I am planning to reorganise the code a bit so it should become easier to add different compression formats. Summer (heat) is about over so I can start thinking about such things again :)
What I'm not really planning (yet) is upgrading my system (too much apprehension that too many things will break and I will spend weeks figuring out how to get the new OS to work the way I want it to)...
Just to be clear: afsctool is a utility to apply or remove filesystem-level compression. It is not the tool for the job if you just want to compress a file or two, for instance to copy or send them in compressed form. The way HFS+ and APFS compression work is that you can use the compressed files as if they weren't compressed - but they lose the compression as soon as you write to them, or copy them with a traditional utility. Technically speaking: the operating system will decompress these compressed files for you when you try to access their content, but it will not (re)compress them.
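To see whether a given file currently carries filesystem-level compression, one option is to look at its BSD file flags. The sketch below uses Python's `os.lstat` and the `stat.UF_COMPRESSED` constant; `is_fs_compressed` is a made-up helper name. It assumes that `st_flags` is only populated on macOS/BSD, so on other systems it simply reports False.

```python
import os
import stat


def is_fs_compressed(path):
    """Report whether the UF_COMPRESSED flag is set on a file.

    st_flags is only available on macOS/BSD; elsewhere the attribute is
    missing and this function falls back to 0 (not compressed).
    """
    info = os.lstat(path)
    flags = getattr(info, "st_flags", 0)
    return bool(flags & stat.UF_COMPRESSED)
```

Checking a file before and after appending to it, or before and after copying it with a traditional utility, is a simple way to observe the loss of compression described above.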
@RJVB I need a file with filesystem-level LZFSE compression, created with this tool or another one. What could I do?
@lilin007007 I think you need another tool: https://github.com/lzfse/lzfse
@lilin007007 @RJVB there's also a built in tool.
See: https://www.keolo.com/blog/post/terminal-commands/
Compressing a single file using LZFSE
compression_tool -encode -i File.txt -o File.lzfse
Decompressing a single file using LZFSE
compression_tool -decode -i File.lzfse -o File.txt
Compressing a folder using LZFSE
yaa archive -d Folder -o File.yaa
Decompressing a folder using LZFSE
yaa extract -i File.yaa -d Folder
@gingerbeardman None of them work; I tested these. What are yaa and compression_tool? I cannot find them.
They are built-in macOS command line tools; use them in Terminal. I'm on High Sierra and they work for me.
@gingerbeardman Thank you, but they do not do filesystem-level compression. I need a tool like ditto.
Maybe you should tell us more about what you need it for, and why you need lzfse compression?
@RJVB I need some files to learn the on-disk format of LZFSE.
If you want to study lzfse itself you would be better off installing the lzfse reference implementation from https://github.com/lzfse/lzfse. Studying the lzfse format on disk will be a lot easier that way because you can access the compressed content with standard tools like od. That won't be possible with filesystem-level compression.
If what you are interested in is the actual bytes stored on disk after applying HFS filesystem compression then you can also get very far already with the current afsctool and the reference LZFSE implementation. A HFS-compressed file can take two forms, regardless of the compression type being used:
lzfse -encode -i foo -o foo.lzfse
The only difference may be that foo.lzfse has a small header ("magic") that identifies it as having been compressed by lzfse (and not by gzip, for instance), plus a small trailer that marks the end of the compressed data. Matt may be able to say if those magic words are included in HFS compression or not. I'm attaching an example lzfse-compressed file (replace the .txt extension with .lzfse after downloading!). You can see that it starts with the magic word bvx2 and ends with bvx$.
example.txt
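A quick way to check for those magic words without od is a few lines of Python. This is only a sketch: `inspect_stream` and `BLOCK_MAGICS` are hypothetical names, the bvx2 and bvx$ markers are the ones described above, and the other block magics are an assumption based on my reading of the reference implementation's headers.

```python
END_OF_STREAM = b"bvx$"

BLOCK_MAGICS = {
    b"bvx2": "LZFSE compressed block",
    # ASSUMED from the lzfse reference sources, not confirmed in this thread:
    b"bvx1": "LZFSE compressed block (older header layout)",
    b"bvx-": "uncompressed block",
    b"bvxn": "LZVN compressed block",
}


def inspect_stream(data):
    """Look at the first and last four bytes of a candidate lzfse stream."""
    return {
        "first_block": BLOCK_MAGICS.get(data[:4]),  # None if unrecognised
        "has_end_marker": data.endswith(END_OF_STREAM),
    }
```

Reading the attached example with `open(path, "rb").read()` and passing the bytes to `inspect_stream` should report a recognised first block and an end marker, if the file is what the comment above describes.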
@gingerbeardman could you please attach a small DMG containing a file that uses compression type 7 and one using compression type 8, made on a recent OS? If you want, add types 11 and 12 too, just to be exhaustive.
Here's one containing 7&8: 7and8.dmg.zip
One .plist and one .xml file, both taken from Safari Technology Preview Release 64 (Safari 12.1, WebKit 13607.1.3.3) on macOS 10.13.6 (17G65)
Just trying to think of a way to search for any files using 11 & 12...
What kind of filesystem is on that dmg, APFS? I get a "no mountable filesystems" error...
Ah, yes, APFS is the default. Let me fix/redo it.
Earlier download is now updated.
Thanks, this one works. And indeed I can read the files on OS X 10.9 :)
So... beta-testers required!
I have an initial implementation of LZVN support, but for now it works only up to a certain file size. I got the impression that the format didn't use chunking, but apparently it does, and I have no idea how. It doesn't work to use the same chunking approach as with ZLIB (i.e. compress 64Kb chunks); the 8.xml file from Matt's example dmg uses a single chunk, for instance. I'm going to need a few larger LZVN-hfs-compressed example files to try and make sense of that.
This requires an external LZVN implementation, preferably my fork (github:RJVB/lzvn) because that has a few fixes.
Of course I will be happy to test. The least I can do seeing as I got you into this mess 😉
I'll package up a range of 7&8 files with different sizes.
Type 7 files won't be necessary, probably; check first. Remember, the choice between types 7 and 8 is made automatically.
No hurries, btw.
It's easy for you to get them: just try using afsctool to decompress the latest Safari Technology Preview app.
But I should be able to get to it tomorrow.
Get them, easy, use them, impossible. The installer detects my OS is too old, and bails. And I don't trust the extraction utilities I have to preserve HFS compression.
I've been looking at the sleuthkit sources which have (proper!) implementations of zlib and lzvn decompression. The example file I had was a very special case where the table of chunk offsets looked just like a simple header of a non-chunked data buffer :-/
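To illustrate the general idea of chunked compression with a table of chunk offsets, here is a sketch using zlib and 64 KiB chunks. It does not reproduce the actual resource-fork layout that HFS+ uses for any compression type; the (offset, length) table format and the function names are invented for this example.

```python
import zlib

CHUNK_SIZE = 64 * 1024  # 64 KiB of uncompressed data per chunk


def compress_chunked(data):
    """Compress data chunk by chunk; return (offset table, compressed blob)."""
    table = []
    parts = []
    offset = 0
    for start in range(0, len(data), CHUNK_SIZE):
        packed = zlib.compress(data[start:start + CHUNK_SIZE])
        table.append((offset, len(packed)))  # where this chunk sits in the blob
        parts.append(packed)
        offset += len(packed)
    return table, b"".join(parts)


def decompress_chunked(table, blob):
    """Rebuild the original data by decompressing each chunk in table order."""
    return b"".join(
        zlib.decompress(blob[offset:offset + size]) for offset, size in table
    )
```

A file smaller than 64 KiB yields a table with a single entry, which is the kind of special case that can make the table look like a plain header in front of one non-chunked buffer.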
I think I beat you to it :P
After squinting a bit (very much) at the sleuthkit code and mumbling to myself how I'd add proper lzvn compression without rewriting all of the compressFile() function I think I have now nailed it. Really, this time; cmp reports no differences between compressed copies and the originals for
I do see this in the System.log:
Sep 17 20:09:31 Portia kernel[0]: decmpfs.c:1386:decmpfs_read_compressed: cluster_copy_upl_data err 14
Sep 17 20:09:31 Portia kernel[0]: decmpfs.c:1409:decmpfs_read_compressed: err 14
I have no idea what that means; it only happens with files that are well over 64Kb (but I haven't determined the exact size).
Great! Sorry I ended up driving most of yesterday.
You can use an app called Pacifist to extract single files out of Mac pkg/installers.
I've also been looking into sleuthkit. How do you find it?