RJVB / afsctool

This is a version of "brkirch"'s afsctool utility that allows end-users to leverage HFS+ compression.
https://brkirch.wordpress.com/afsctool
GNU General Public License v3.0

unknown compression types 7 & 8 #6

Closed gingerbeardman closed 6 years ago

gingerbeardman commented 6 years ago
  1. Install Safari Technology Preview on macOS High Sierra https://developer.apple.com/safari/technology-preview/
  2. Copy app to test directory
  3. Terminal: $ afsctool -d "Safari Technology Preview.app"/
/Users/matt/Downloads/2017-11-27/test/Safari Technology Preview.app//Contents/Frameworks/JavaScriptCore.framework/Versions/A/Resources/jsc: Decompression failed; unknown compression type 8
/Users/matt/Downloads/2017-11-27/test/Safari Technology Preview.app//Contents/Frameworks/JavaScriptCore.framework/Versions/A/Resources/Info.plist: Decompression failed; unknown compression type 7
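
For readers wondering where the type number in these messages comes from: to my knowledge it is stored in the header of the `com.apple.decmpfs` extended attribute, which Apple's `decmpfs.h` describes as a 4-byte magic, a 32-bit compression type and a 64-bit uncompressed size, all little-endian. The sketch below parses such a header from raw bytes. The function name is hypothetical, the layout is taken from my reading of that header file rather than from this thread, and how you obtain the attribute bytes is left out on purpose.

```python
import struct

# Assumption: 16-byte little-endian header as described in Apple's decmpfs.h.
# The magic 'cmpf' appears as b"fpmc" when read byte by byte from disk.
DECMPFS_MAGIC = b"fpmc"
HEADER_FORMAT = "<4sIQ"  # magic, compression_type, uncompressed_size


def parse_decmpfs_header(raw: bytes):
    """Return (compression_type, uncompressed_size) from xattr bytes."""
    size = struct.calcsize(HEADER_FORMAT)
    if len(raw) < size:
        raise ValueError(f"header needs {size} bytes, got {len(raw)}")
    magic, ctype, usize = struct.unpack(HEADER_FORMAT, raw[:size])
    if magic != DECMPFS_MAGIC:
        raise ValueError(f"unexpected magic {magic!r}")
    return ctype, usize


# Synthetic example: a header announcing type 8 and 100000 bytes.
sample = struct.pack(HEADER_FORMAT, DECMPFS_MAGIC, 8, 100_000)
print(parse_decmpfs_header(sample))
```

A tool that only knows types 3 and 4 would reach the "unknown compression type" branch for the 7 and 8 values seen above.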
RJVB commented 6 years ago

Thanks for the info... I can't do much with it, though. Apparently Apple added support for new types of compression; I was unaware of that. But without a test system and exact knowledge of how to decompress them, I'm not even going to try to implement support. Too risky.

To be honest, decompression is really the least interesting feature. You can do that also by simply copying the file or directory with a command that doesn't preserve the compression.

gingerbeardman commented 6 years ago

You're right, feel free to close

gingerbeardman commented 6 years ago

I found a note about type 8 here: http://newosxbook.com/tools/hfsleuth.html#DOWNLOAD

11/11/2016

  • Development resumed,
  • support for file compression (type 4, zlib) added
  • support for file compression type 8 (for MacOS compression as of 10.10)
RJVB commented 6 years ago

Last time I looked at other compression types I got the distinct impression that they're not of interest to end users, except possibly for decompression (but remember my earlier comment about that).

I also can't do much with references that don't provide source code!

gingerbeardman commented 6 years ago

Yes, it was meant more as a note to collect more info on this

RJVB commented 6 years ago

> Ah, sorry I did mean to post the source link: http://newosxbook.com/files/hfsleuth.tar

I saw that link, but the archive only contains binaries and a man page.

MAYBE type 8 compression is bzip2-based (which would be useful)?

gingerbeardman commented 6 years ago

I edited my comment to remove the hfsleuth download: it is not open source.

Yes, I think type 8 is another form of compression, and I am almost finished putting a table together.

Some types are Apple's own open-source LZFSE: https://github.com/lzfse/lzfse (macOS 10.11 and iOS 9 onwards)

My thinking is that may provide better/faster compression than zlib?

RJVB commented 6 years ago

> My thinking is that may provide better/faster compression than zlib?

That's the claim. Apple can make that true because its implementations are apparently tightly tuned for the hardware.

gingerbeardman commented 6 years ago

ALL are supported on both HFS+ and APFS

10.6 (mid 2009)

1  /* No compression; in xattr */
2  /* (unused) */
3  /* ZLIB; in xattr */
4  /* ZLIB (64K chunked); in rsrc fork */
5  /* (specifies de-dup within the generation store) */
6  /* (unused) */

10.10 (mid 2014) + 10.9.5 backport (late 2014)

7  /* LZVN; in xattr */
8  /* LZVN (64K chunked); in rsrc fork */

10.11 (mid 2015) + 10.10.2 backport (early 2015)

9  /* uncompressed; in xattr */
10 /* uncompressed (64K chunked); in rsrc fork */
11 /* LZFSE; in xattr */
12 /* LZFSE; in rsrc fork */

other

0x80000001 /* faulting file: deprecated */
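
As a rough illustration, the table above can be turned into a lookup that a tool could use to produce friendlier messages than "unknown compression type 8". This is only a sketch: the dictionary and function names are made up here, and the entries simply restate the table (types 2, 5 and 6 are left out because the table marks them as unused or special).

```python
# Compression type -> (algorithm, where the payload lives), per the table above.
DECMPFS_TYPES = {
    1: ("none", "xattr"),
    3: ("zlib", "xattr"),
    4: ("zlib", "rsrc"),
    7: ("lzvn", "xattr"),
    8: ("lzvn", "rsrc"),
    9: ("none", "xattr"),
    10: ("none", "rsrc"),
    11: ("lzfse", "xattr"),
    12: ("lzfse", "rsrc"),
}


def describe(type_id: int) -> str:
    """Return a short human-readable description of a compression type."""
    entry = DECMPFS_TYPES.get(type_id)
    if entry is None:
        return f"unknown compression type {type_id}"
    algorithm, storage = entry
    return f"type {type_id}: {algorithm} in {storage}"


for t in (4, 7, 8, 13):
    print(describe(t))
```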

LZFSE also used for compression of DMGs. man hdiutil [macOS 10.11; 2015]

References:

RJVB commented 6 years ago

A quick test with lzfse on a typical big shared library showed slightly better and much faster compression compared to gzip and even CloudFlare's accelerated zlib.

However,

7  /* LZVN in xattr */
8  /* LZVN in rsrc fork */
9  /* DECMPFS_TYPE_RAW_ATTR (LZFSE) */
10 /* DECMPFS_TYPE_RAW_RSRC (LZFSE) */

Could you add the OS versions which add support for the various types and also if they're supported in HFS or only in APFS?

I hesitate to add compression support for types that aren't supported across the board. Decompression is another matter, but I'm a bit surprised that Apple would make it necessary to turn to 3rd party tools in order to read files on a HFS+ volume that was used on a more recent OS version. If those "new" types are in fact APFS specific that would change things - a bit.

I'm not exactly looking forward to diving in and adding (de)compressors, but suggestions on which types to support and how (command-line options, automatic choice of an optimal algorithm, etc.) are welcome.

ps: LZFSE also used for compression of DMGs. man hdiutil

From what OS version?

Thanks for the link to the apfs-fuse repo, btw. Could be useful.

gingerbeardman commented 6 years ago

sure, i've edited the above table. will continue to do so

gingerbeardman commented 6 years ago

LZFSE was backported from 10.11.0 to 10.10.2 (Jan 2015)

I'm just checking whether LZVN was also backported.

gingerbeardman commented 6 years ago

OK, I am quite confident that the versions/dates above are accurate.

There was no backporting to 10.8

RJVB commented 6 years ago

> LZFSE was back-ported from 10.11.0 to 10.10.2 (Jan 2015)

I want to support at least down to 10.9 (which I'm running myself).

> I'm just checking whether LZVN was also back-ported.

Isn't that a compression protocol implemented in the Mach kernel? The description in the repo you pointed to is rather vague.

gingerbeardman commented 6 years ago

Yes, LZVN is used to compress the kernel

gingerbeardman commented 6 years ago

https://opensource.apple.com/source/copyfile/copyfile-138/copyfile.c.auto.html

Boom! A pretty complete list. I'll update my list here tomorrow.

RJVB commented 6 years ago

> Yes, LZVN is used to compress the kennel

Well, in that case ... it can go to the doghouse! ;)

RJVB commented 6 years ago

> Boom! A pretty complete list. I'll update my list here tomorrow.

Kudos ... I hate reading Apple code, and there's a LOT of it in that file (but judging from the name I guess it does everything that could interest me) ...

gingerbeardman commented 6 years ago

I can't take the credit, somebody else dug up the link to that code.

But it would be great if the "best" compression type for each OS could be supported? Do you think that you might plan to do that?

RJVB commented 6 years ago

> I can't take the credit, somebody else dug up the link to that code.

Finding the code is one thing, extracting the information from it another.

> But it would be great if the "best" compression type for each OS could be supported? Do you think that you might plan to do that?

I've been thinking about decompression support in the back of my head, and about starting to prepare the code for additional compression types (with ideally runtime detection of the OS version so a single binary can work everywhere but only compress to types that are supported locally).

2 existential problems:

which lead to the bigger question whether or not it'd be really unwise to continue NOT to rewrite almost the entire code...
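
The runtime detection idea mentioned above could look roughly like the sketch below. It assumes the minimum versions from the table earlier in this thread (zlib from 10.6, LZVN from 10.9.5, LZFSE from 10.10.2); the names `MIN_VERSION`, `parse_version` and `supported_algorithms` are invented for the illustration, and afsctool itself is written in C, so this only shows the logic.

```python
import platform

# (minimum macOS version, algorithm), taken from the table in this thread.
MIN_VERSION = [
    ((10, 6, 0), "zlib"),
    ((10, 9, 5), "lzvn"),
    ((10, 10, 2), "lzfse"),
]


def parse_version(text: str):
    """Turn '10.9.5' into (10, 9, 5); missing parts become 0."""
    parts = [int(p) for p in text.split(".") if p.isdigit()]
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def supported_algorithms(version):
    """List the algorithms the given OS version should be able to read."""
    return [algo for minimum, algo in MIN_VERSION if version >= minimum]


# platform.mac_ver() returns an empty release string when not on macOS,
# which parses to (0, 0, 0) and therefore yields an empty list.
local = parse_version(platform.mac_ver()[0])
print(local, supported_algorithms(local))
```

A single binary could then offer only the compressors in that list, while still attempting decompression for any type it knows.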

gingerbeardman commented 6 years ago

Table in https://github.com/RJVB/afsctool/issues/6#issuecomment-374620414 updated

RJVB commented 6 years ago

> Table https://github.com/RJVB/afsctool/issues/6#issuecomment-374620414 updated

So, apparently I should be able to generate HFS/LZVN-compressed files and still read them afterwards, on 10.9.5? The presence of AppleFSCompressionTypeLZVN.kext indeed seems to confirm that.

That would make it a bit more appealing to start hacking around, esp. if the required library is already part of the system. Is that indeed the case? (MacPorts doesn't seem to have a port:lzvn.)

gingerbeardman commented 6 years ago

Correct, you are good for LZVN on 10.9.

The code to deal with LZVN is part of the system, but I'm not sure if it's still private and/or easily accessible. Further reading: https://pikeralpha.wordpress.com/2014/11/01/lzvn-encode/ and related blog posts.

So there is a reverse engineered version: https://github.com/Piker-Alpha/LZVN (encode/decode)

lilin007007 commented 6 years ago

I want to compress a file using LZFSE on an HFS+ filesystem. What could I do? ditto always uses zlib. I am on 10.11.

gingerbeardman commented 6 years ago

> I want to compress a file using LZFSE on an HFS+ filesystem. What could I do?

@lilin007007 you would have to add this feature yourself, if @RJVB is not planning to do so

lilin007007 commented 6 years ago

@gingerbeardman I need some files in LZFSE format for testing. How do I create these files?

RJVB commented 6 years ago

Adding new features isn't a high priority for me at the moment indeed. I am planning to reorganise the code a bit so it should become easier to add different compression formats. Summer (heat) is about over so I can start thinking about such things again :)

What I'm not really planning (yet) is upgrading my system (too much apprehension that too many things will break and I will spend weeks figuring out how to get the new OS to work the way I want it to)...

RJVB commented 6 years ago

Just to be clear: afsctool is a utility to apply or remove filesystem-level compression. It is not the tool for the job if you just want to compress a file or two, for instance to copy or send them in compressed form. The way HFS+ and APFS compression work is that you can use the compressed files as if they weren't compressed - but they lose the compression as soon as you write to them, or copy them with a traditional utility. Technically speaking: the operating system will decompress these compressed files for you when you try to access their content, but it will not (re)compress them.
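
One way to see this transparency from user space: a compressed file reads like any other, but as far as I know macOS marks it with the BSD file flag `UF_COMPRESSED`, which `ls -lO` shows as "compressed". The sketch below checks that flag through Python's `os.stat`. The helper names are made up, and on systems without `st_flags` (for example Linux) the check simply reports False.

```python
import os
import stat


def has_compressed_flag(st_flags: int) -> bool:
    """True when the UF_COMPRESSED bit is set in a st_flags value."""
    return bool(st_flags & stat.UF_COMPRESSED)


def is_fs_compressed(path: str) -> bool:
    """Check a file on disk; st_flags only exists on BSD-like systems."""
    info = os.stat(path, follow_symlinks=False)
    return has_compressed_flag(getattr(info, "st_flags", 0))


print(is_fs_compressed("."))
```

Running it on a file before and after writing to it should show the flag disappearing, which matches the behaviour described above.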

lilin007007 commented 6 years ago

@RJVB I need a file with filesystem-level LZFSE compression, created with this tool or another one. What could I do?

gingerbeardman commented 6 years ago

@lilin007007 I think you need another tool: https://github.com/lzfse/lzfse

gingerbeardman commented 6 years ago

@lilin007007 @RJVB there's also a built in tool.

See: https://www.keolo.com/blog/post/terminal-commands/

Compressing a single file using LZFSE:
compression_tool -encode -i File.txt -o File.lzfse

Decompressing a single file using LZFSE:
compression_tool -decode -i File.lzfse -o File.txt

Compressing a folder using LZFSE:
yaa archive -d Folder -o File.yaa

Decompressing a folder using LZFSE:
yaa extract -i File.yaa -d Folder

lilin007007 commented 6 years ago

@gingerbeardman None of them work; I tested these. What are yaa and compression_tool? I cannot find them.

gingerbeardman commented 6 years ago

They are built-in macOS command-line tools; use them in Terminal. I'm on High Sierra and they work for me.

lilin007007 commented 6 years ago

@gingerbeardman Thank you, but they do not do system-level compression. I need a tool like ditto.

RJVB commented 6 years ago

Maybe you should tell us more about what you need it for, and why you need lzfse compression?

lilin007007 commented 6 years ago

@RJVB I need some files to learn the on-disk format of LZFSE.

RJVB commented 6 years ago

If you want to study lzfse itself you would be better off installing the lzfse reference implementation from https://github.com/lzfse/lzfse. Studying the lzfse format on disk will be a lot easier that way because you can access the compressed content with standard tools like od. That won't be possible with filesystem level compression.

If what you are interested in is the actual bytes stored on disk after applying HFS filesystem compression then you can also get very far already with the current afsctool and the reference LZFSE implementation. A HFS-compressed file can take two forms, regardless of the compression type being used:

I'm attaching an example lzfse-compressed file (replace the .txt extension with .lzfse after downloading!). You can see that it starts with the magic word bvx2 and ends with bvx$. example.txt
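
A quick check for the markers mentioned above might look like this. It is a minimal sketch that only tests for the `bvx2` start and `bvx$` end seen in the attached example; the function name is hypothetical, and the reference implementation defines other block magics as well, so a stream that fails this test is not necessarily invalid.

```python
LZFSE_START = b"bvx2"  # start marker seen in the example file
LZFSE_END = b"bvx$"    # end-of-stream marker


def looks_like_lzfse(data: bytes) -> bool:
    """True when data starts with bvx2 and ends with bvx$."""
    return (
        len(data) >= len(LZFSE_START) + len(LZFSE_END)
        and data.startswith(LZFSE_START)
        and data.endswith(LZFSE_END)
    )


print(looks_like_lzfse(b"bvx2" + b"\x00" * 16 + b"bvx$"))
print(looks_like_lzfse(b"plain text"))
```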

RJVB commented 6 years ago

@gingerbeardman could you please attach a small DMG containing a file that uses compression type 7 and one using compression type 8, made on a recent OS? If you want, add types 11 and 12 too, just to be exhaustive.

gingerbeardman commented 6 years ago

Here's one containing 7&8: 7and8.dmg.zip

One .plist and one .xml file, both taken from Safari Technology Preview Release 64 (Safari 12.1, WebKit 13607.1.3.3) on macOS 10.13.6 (17G65)

Just trying to think of a way to search for any files using 11 & 12...

RJVB commented 6 years ago

What kind of filesystem is on that dmg, APFS? I get a "no mountable filesystems" error...

gingerbeardman commented 6 years ago

Ah, yes, APFS is the default. Let me fix/redo it.

gingerbeardman commented 6 years ago

Earlier download is now updated.

RJVB commented 6 years ago

Thanks, this one works. And indeed I can read the files on OS X 10.9 :)

RJVB commented 6 years ago

So... beta-testers required!

I have an initial implementation of LZVN support, but for now it works only up to a certain file size. I got the impression that the format didn't use chunking, but apparently it does, and I have no idea how. It doesn't work to use the same chunking approach as with ZLIB (i.e. compress 64 KB chunks); the 8.xml file from Matt's example dmg uses a single chunk, for instance. I'm going to need a few larger LZVN-hfs-compressed example files to try and make sense of that.

This requires an external LZVN implementation, preferably my fork (github:RJVB/lzvn) because that has a few fixes.

gingerbeardman commented 6 years ago

Of course I will be happy to test. The least I can do seeing as I got you into this mess 😉

I'll package up a range of 7&8 files with different sizes.

RJVB commented 6 years ago

> I'll zip up a range of 7&8 files with different sizes.

Type 7 files won't be necessary, probably; check first. Remember, the choice between types 7 and 8 is made automatically.

No hurries, btw.

gingerbeardman commented 6 years ago

It's easy for you to get them, I just try using afsctool to decompress the latest Safari Technology Preview app.

But I should be able to get to it tomorrow.

RJVB commented 6 years ago

> It's easy for you to get them

Get them, easy, use them, impossible. The installer detects my OS is too old, and bails. And I don't trust the extraction utilities I have to preserve HFS compression.

I've been looking at the sleuthkit sources which have (proper!) implementations of zlib and lzvn decompression. The example file I had was a very special case where the table of chunk offsets looked just like a simple header of a non-chunked data buffer :-/
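
To make the "table of chunk offsets" idea concrete, here is a toy sketch of chunked compression with an offset table in front of the data. It uses zlib from the Python standard library because LZVN is not available there, and the layout (little-endian 32-bit offsets, the first of which doubles as the table length) is an assumption chosen for the illustration, not a description of what afsctool or the kernel actually writes. The function names are invented.

```python
import struct
import zlib

CHUNK = 64 * 1024  # 64 KB of uncompressed data per chunk


def pack_chunks(data: bytes) -> bytes:
    """Compress data in 64 KB chunks and prepend a table of offsets."""
    chunks = [zlib.compress(data[i:i + CHUNK]) for i in range(0, len(data), CHUNK)]
    offsets = [4 * (len(chunks) + 1)]  # first chunk starts right after the table
    for chunk in chunks:
        offsets.append(offsets[-1] + len(chunk))
    table = struct.pack(f"<{len(offsets)}I", *offsets)
    return table + b"".join(chunks)


def unpack_chunks(blob: bytes) -> bytes:
    """Inverse of pack_chunks: read the table, then inflate each chunk."""
    (first,) = struct.unpack_from("<I", blob, 0)
    offsets = struct.unpack_from(f"<{first // 4}I", blob, 0)
    return b"".join(zlib.decompress(blob[a:b]) for a, b in zip(offsets, offsets[1:]))


payload = bytes(range(256)) * 1000  # 256000 bytes, so four chunks
assert unpack_chunks(pack_chunks(payload)) == payload
```

With a single chunk the table holds just two numbers, which is easy to mistake for a small fixed header, much like the special case described above.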

RJVB commented 6 years ago

I think I beat you to it :P

After squinting a bit (very much) at the sleuthkit code and mumbling to myself how I'd add proper lzvn compression without rewriting all of the compressFile() function I think I have now nailed it. Really, this time; cmp reports no differences between compressed copies and the originals for

I do see this in the System.log:

Sep 17 20:09:31 Portia kernel[0]: decmpfs.c:1386:decmpfs_read_compressed: cluster_copy_upl_data err 14
Sep 17 20:09:31 Portia kernel[0]: decmpfs.c:1409:decmpfs_read_compressed: err 14

I have no idea what that means; it only happens with files that are well over 64 KB (but I haven't determined the exact size).

gingerbeardman commented 6 years ago

Great! Sorry I ended up driving most of yesterday.

You can use an app called Pacifist to extract single files out of Mac pkg/installers.

I've also been looking into sleuthkit. How do you find it?