simsong / bulk_extractor

This is the development tree. Production downloads are at:
https://github.com/simsong/bulk_extractor/releases

Decode lzfse and lzvn #181

Open jonstewart opened 3 years ago

jonstewart commented 3 years ago

LZFSE and LZVN are two codecs created and used by Apple. Both are used in HFS+ and APFS for transparent file compression. Based on very limited, rough personal anecdata from a few years ago, around half of the files on an HFS+ filesystem will typically be compressed (often base install files).

A reference implementation library can be found here: https://github.com/lzfse/lzfse (3-clause BSD).

The Sleuth Kit (TSK) also has relevant code: https://github.com/sleuthkit/sleuthkit/blob/develop/tsk/fs/lzvn.c, https://github.com/sleuthkit/sleuthkit/blob/develop/tsk/fs/decmpfs.c

It may make sense to combine LZFSE/LZVN decoding with a scanner that decodes other, similar codecs. That would pay off if the codecs share utility functions or, especially, if there is a fast way to scan sbufs for potentially compressed data, so that a single pass identifies candidate block offsets for multiple codecs. Whether this is feasible will depend on the particulars.
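As a rough illustration of the single-pass idea, here is a minimal sketch that walks a byte buffer once and records offsets that look like LZFSE block headers. It assumes the 4-byte block magics from the reference implementation (`bvx-`, `bvx1`, `bvx2`, `bvxn`, `bvx$`). The names `BlockHit` and `find_lzfse_magics` are hypothetical and not part of bulk_extractor or the lzfse library, and it takes a plain pointer and length rather than a real sbuf.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical result type for this sketch; not a bulk_extractor type.
struct BlockHit {
    std::size_t offset;  // byte offset of the 4-byte magic within the buffer
    char kind;           // fourth magic byte: '-', '1', '2', 'n', or '$'
};

// Scan buf[0..len) once and report every offset where an LZFSE block
// magic ("bvx" followed by one of - 1 2 n $) appears.
// Assumption: magics are taken from the reference lzfse implementation.
std::vector<BlockHit> find_lzfse_magics(const std::uint8_t* buf, std::size_t len) {
    std::vector<BlockHit> hits;
    for (std::size_t i = 0; i + 4 <= len; ++i) {
        // Cheap prefix test first, so most positions are rejected quickly.
        if (buf[i] != 'b' || buf[i + 1] != 'v' || buf[i + 2] != 'x') continue;
        const char k = static_cast<char>(buf[i + 3]);
        if (k == '-' || k == '1' || k == '2' || k == 'n' || k == '$') {
            hits.push_back({i, k});
        }
    }
    return hits;
}
```

Each hit is only a candidate: the bytes that follow would still have to be handed to a decoder and the result validated. A multi-codec version could test the same position against other codecs' signatures inside the same loop. Note also that this only finds framed blocks; LZVN data stored without the `bvxn` header would need a different heuristic.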