dtrx-py / dtrx

Do The Right Extraction
GNU General Public License v3.0

Extensibility of file types; en- and disable them for a call #55

Open phmarek opened 7 months ago

phmarek commented 7 months ago

Thanks for this tool!

It would be great if it could use additional configuration (/etc/dtrx.conf or a .d directory) to specify how to extract other kinds of data - like .jar, .odt, .docx, etc.

Then it might be useful to enable/disable recursive extraction of these for individual calls - sometimes I want to extract everything (virus scanning), other times text data (.odt) should be ignored.
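
A minimal sketch of the idea, assuming a handler table keyed by file extension and a per-call set of disabled types. Everything here is hypothetical: EXTRA_HANDLERS, pick_handler and build_command are illustration names, not part of dtrx, and the table stands in for whatever /etc/dtrx.conf or a .d directory would provide.

```python
from pathlib import PurePath

# Hypothetical handler table (not dtrx's real configuration format).
# In the proposal this would be loaded from /etc/dtrx.conf or a .d directory.
EXTRA_HANDLERS = {
    ".odt": ["unzip", "-q", "{archive}", "-d", "{dest}"],
    ".docx": ["unzip", "-q", "{archive}", "-d", "{dest}"],
}


def pick_handler(filename, disabled=frozenset()):
    """Return the command template for filename, or None if the type is
    unknown or has been disabled for this call."""
    ext = PurePath(filename).suffix.lower()
    if ext in disabled:
        return None
    return EXTRA_HANDLERS.get(ext)


def build_command(filename, dest, disabled=frozenset()):
    """Fill in the template for one archive; None means 'leave it alone'."""
    template = pick_handler(filename, disabled)
    if template is None:
        return None
    return [part.format(archive=filename, dest=dest) for part in template]


# Virus scanning: everything enabled, so the .odt is unpacked.
scan_cmd = build_command("report.odt", "out")
# Text-only run: .odt disabled for this call, so it is skipped.
skip_cmd = build_command("report.odt", "out", disabled={".odt"})
```

The point of passing the disabled set per call is that the configured handlers stay the same while each invocation decides which of them apply.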

phmarek commented 7 months ago

Ah, I see that it already does extract .jar. Great!

The only small problem I see is that dtrx -r -v ... reports only local filenames. For example, when extracting a .deb I get a paragraph for changelog.gz to changelog, but there's no hint which changelog (from which directory) it refers to.

That might matter if an archive contains multiple files with the same name in different directories, e.g. an x.zip containing g/y.zip and h/y.zip: the output lines (sub-structure) for the two y.zip files can't be associated with the directory each one came from.
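
To make the x.zip example concrete, here is a small sketch that builds such an archive in memory and lists every member with its full path, including the path of the enclosing archive. It uses only Python's standard zipfile module and is not how dtrx works internally; walk_nested, make_zip and the "!/" separator are assumptions chosen for illustration.

```python
import io
import zipfile


def walk_nested(data, prefix=""):
    """Yield the full path of every member, descending into nested .zip files."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for name in archive.namelist():
            full = prefix + name
            yield full
            if name.lower().endswith(".zip"):
                # Carry the parent path along so nested members stay distinguishable.
                yield from walk_nested(archive.read(name), full + "!/")


def make_zip(members):
    """Build a zip archive in memory from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        for name, content in members.items():
            archive.writestr(name, content)
    return buf.getvalue()


# Recreate the case from the comment: x.zip containing g/y.zip and h/y.zip.
inner_g = make_zip({"notes.txt": b"from g"})
inner_h = make_zip({"notes.txt": b"from h"})
outer = make_zip({"g/y.zip": inner_g, "h/y.zip": inner_h})

paths = list(walk_nested(outer, "x.zip!/"))
for path in paths:
    print(path)
```

With the prefix carried through the recursion, the two notes.txt entries are reported under x.zip!/g/y.zip!/ and x.zip!/h/y.zip!/ respectively, which is the kind of hint the verbose output is missing.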