sepinf-inc / IPED

IPED Digital Forensic Tool. It is an open source software that can be used to process and analyze digital evidence, often seized at crime scenes by law enforcement or in a corporate investigation by private examiners.

Export files to a more forensic sound format in triage mode #1715

Open lfcnassif opened 1 year ago

lfcnassif commented 1 year ago

As explained on https://github.com/sepinf-inc/IPED/issues/1714#issuecomment-1591259306 ZIP is not the best format to collect files in triage situations.

One option is exporting to AD1, since it was reverse engineered by @gfd2020 and we already have reading code for it. Writing code would still need to be implemented...

Another option is using the AFF format, whose specification is open. I think I already saw a Java library to handle it, but AFAIK it doesn't support all features and I don't remember if writing is supported.

lfcnassif commented 1 year ago

This is the official AFFv4 repo: https://github.com/aff4

The Java implementation is read-only. The Python implementation says the logical image support is new, but further down it says the writing support is broken (not sure if that refers to physical images, logical images or both...)

lfcnassif commented 1 year ago

> https://www.dmares.com/maresware/articles/copy_that.htm This and other articles on this website are very good on forensic zip files. I talked with the author a few years ago. In summary, WinRAR is the best at keeping metadata. But I saw that 7-Zip with the .wim format and some parameters can keep good information, for example: "C:\Program Files\7-Zip\7z.exe" a -twim -sccUTF-8 -bb0 -bse0 -bsp2 -spf -ssp -sns "d:\test-folder\test.wim" "f:\folder*"

Thanks @rafael844, that's useful info. I'm replying in the proper issue.

I did some searching and found that the bgzip format could be an option, since it allows random seeks into compressed data: http://www.htslib.org/doc/bgzip.html

There is also a Java reader implementation: https://github.com/vivimice/bgzf-randreader But I haven't found a writer yet...
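
To illustrate why a blocked gzip layout allows random seeks, here is a minimal Python sketch of the general idea. It is not the real BGZF format and not IPED code: the block size, the function names `write_blocked` and `read_at`, and the in-memory index are all assumptions made for the example. The data is compressed as independent gzip members, and an index of offsets lets a reader decompress only the blocks it needs.

```python
import gzip
import io

# Hypothetical block size for this sketch (uncompressed bytes per block).
BLOCK_SIZE = 64 * 1024


def write_blocked(data, out):
    """Write data as concatenated gzip members, one per block.

    Returns an index: a list of (uncompressed_offset, compressed_offset) pairs.
    """
    index = []
    for start in range(0, len(data), BLOCK_SIZE):
        index.append((start, out.tell()))
        out.write(gzip.compress(data[start:start + BLOCK_SIZE]))
    return index


def read_at(src, index, offset, length):
    """Read `length` uncompressed bytes starting at `offset`.

    Only the blocks that overlap the requested range are decompressed.
    """
    result = bytearray()
    first = offset // BLOCK_SIZE
    for i in range(first, len(index)):
        if len(result) >= length:
            break
        ustart, cstart = index[i]
        src.seek(cstart)
        if i + 1 < len(index):
            raw = src.read(index[i + 1][1] - cstart)
        else:
            raw = src.read()  # last block: read to end of stream
        block = gzip.decompress(raw)
        skip = max(0, offset - ustart)  # non-zero only in the first block
        result += block[skip:skip + length - len(result)]
    return bytes(result)


if __name__ == "__main__":
    payload = bytes(range(256)) * 1000
    buf = io.BytesIO()
    idx = write_blocked(payload, buf)
    print(read_at(buf, idx, 70000, 16) == payload[70000:70016])
```

Because the output is just a sequence of gzip members, the whole stream can still be decompressed sequentially by ordinary gzip tools. A real writer would also need to persist the index or embed block sizes in the block headers.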

As for storing file metadata, if we are going to rely on a non-standard format (like AD1), another option is to create our own format (which I would like to avoid, if possible...) and borrow an idea from AFF4: write a specific file into the container that stores metadata about the acquired files.
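
A rough Python sketch of that "metadata file inside the container" idea follows. Everything here is an assumption for illustration: the container is a plain ZIP, the manifest name `metadata.json`, the `files/<n>/` entry layout, the recorded fields and the function name `export_with_manifest` are hypothetical and not an AFF4 or IPED layout.

```python
import hashlib
import json
import os
import zipfile

# Hypothetical name of the metadata entry stored inside the container.
MANIFEST_NAME = "metadata.json"


def export_with_manifest(paths, container_path):
    """Copy files into a ZIP and add a JSON manifest describing them."""
    manifest = []
    with zipfile.ZipFile(container_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for i, path in enumerate(paths):
            st = os.stat(path)
            # Whole-file read keeps the sketch short; a real tool would stream.
            with open(path, "rb") as f:
                content = f.read()
            # Numbered folders avoid collisions between equal base names.
            arcname = "files/%d/%s" % (i, os.path.basename(path))
            zf.writestr(arcname, content)
            manifest.append({
                "name": arcname,
                "original_path": os.path.abspath(path),
                "size": st.st_size,
                "modified_ns": st.st_mtime_ns,
                "accessed_ns": st.st_atime_ns,
                "sha256": hashlib.sha256(content).hexdigest(),
            })
        zf.writestr(MANIFEST_NAME, json.dumps(manifest, indent=2))
    return manifest
```

The point of the design is that the original path, timestamps and hash survive independently of what the container format itself can represent.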

lfcnassif commented 1 year ago

Just found a Java reader and writer implementation for bgzip: https://github.com/samtools/htsjdk/blob/master/src/main/java/htsjdk/samtools/util/BlockCompressedInputStream.java https://github.com/samtools/htsjdk/blob/master/src/main/java/htsjdk/samtools/util/BlockCompressedOutputStream.java

License seems ok.

rafael844 commented 1 year ago

At the end of the article I linked above, they talk about Alternate Data Streams. WinRAR and 7-Zip (.wim) can handle them. I don't know if bgzip can, or even if it's important to have this in IPED, but it's good to know.

lfcnassif commented 1 year ago

bgzip and ordinary zip files can support ADS, since it is possible to put several items with the exact same name and path into a ZIP (although IPED adds a suffix to ADS streams). Windows Explorer gives an error when opening zip files that have 2+ entries with the same name/path, but other tools, like 7-Zip and IPED, open them fine.
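
As a small demonstration that a ZIP can hold several entries with the same name, here is a sketch using Python's standard `zipfile` module. It is not IPED code; the helper names `write_with_duplicates` and `read_all` and the sample entry name are made up for the example. Python only emits a warning when a duplicate name is written, and each entry can still be read back individually through its `ZipInfo` object.

```python
import io
import warnings
import zipfile


def write_with_duplicates(entries):
    """Write (name, data) pairs into an in-memory ZIP, duplicates allowed."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf, warnings.catch_warnings():
        # zipfile issues a UserWarning ("Duplicate name") but still writes.
        warnings.simplefilter("ignore", UserWarning)
        for name, data in entries:
            zf.writestr(name, data)
    return buf.getvalue()


def read_all(blob):
    """Return every entry as (name, data), including same-named ones."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        # Reading via ZipInfo selects the exact entry; reading by name
        # would only return the last entry with that name.
        return [(info.filename, zf.read(info)) for info in zf.infolist()]


if __name__ == "__main__":
    blob = write_with_duplicates([
        ("doc.txt", b"main stream"),
        ("doc.txt", b"alternate data stream"),  # same name and path
    ])
    for name, data in read_all(blob):
        print(name, data)
```

Note that looking an entry up by name alone is ambiguous in such an archive, which is one reason to add a suffix to ADS entries instead.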

rafael844 commented 1 year ago

Great. As far as I remember, to compress a file with 7-Zip while keeping its ADS, which could contain hidden text, we had to use that .wim format. With WinRAR we have to set command-line parameters or enable it in the advanced configuration.

Other software like WinZip doesn't keep it. As far as I know, only forensic tools such as FTK or EnCase do. For reading/decompressing maybe all of them work, but I don't know much about compressing.