log2timeline / dfvfs

Digital Forensics Virtual File System (dfVFS)
Apache License 2.0

Improve archive file support #86

Open joachimmetz opened 8 years ago

joachimmetz commented 8 years ago

Improve archive file support for:

pombredanne commented 7 years ago

FYI, I have fairly extensive support for archive extraction through a libarchive ctypes binding, 7zip subprocesses, and the Python stdlib here: https://github.com/nexB/scancode-toolkit/tree/develop/src/extractcode It may or may not be useful to you, and it may give you some ideas.

joachimmetz commented 7 years ago

@pombredanne thanks, I'll give it a look

joachimmetz commented 3 years ago

https://github.com/libarchive/libarchive The license is a bit messy: mainly BSD 2-clause, with some BSD 3-clause and public domain parts.

Then there appear to be 3 different projects that provide Python bindings

Fedora and Ubuntu ship python3-libarchive-c. Unfortunately it is public domain with no FOSS license, so treat it as an optional dependency for now.

It supports reading from file-like objects:

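import libarchive

# Iterate over the entries of an archive read from an open file-like object.
with open('syslog.zip', 'rb') as f:
    with libarchive.stream_reader(f) as r:
        for e in r:
            print(e)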

Looks like a seekable_stream_reader was added very recently https://github.com/Changaco/python-libarchive-c/blame/master/libarchive/read.py#L142
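
A rough sketch of what "optional dependency" could look like in practice, not dfVFS code: try to import the binding, and fall back to the stdlib when it is missing. The helper name `list_archive_entries` is made up for illustration, the fallback only handles ZIP, and the set of exceptions caught at import time is an assumption about how the binding can fail when the shared libarchive library is not installed. The `hasattr` check is there because more than one project installs a module called `libarchive`.

```python
import io
import zipfile

try:
    # python-libarchive-c; treated as optional. Assumption: the import can
    # also fail if the shared libarchive library itself cannot be loaded.
    import libarchive
except (ImportError, OSError, AttributeError):
    libarchive = None


def list_archive_entries(file_object):
    """Return the path names of the entries in an archive.

    Hypothetical helper: file_object is a binary file-like object
    positioned at the start of the archive.
    """
    # Only use the binding if it is the one that provides stream_reader.
    if libarchive is not None and hasattr(libarchive, 'stream_reader'):
        with libarchive.stream_reader(file_object) as archive:
            # Entries are only valid while iterating, so copy the names out.
            return [entry.pathname for entry in archive]

    # Fallback without libarchive: stdlib zipfile, ZIP archives only.
    with zipfile.ZipFile(file_object) as archive:
        return archive.namelist()
```

Calling `list_archive_entries(open('syslog.zip', 'rb'))` would then work with or without the binding installed, at the cost of only supporting ZIP in the fallback path.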