onekey-sec / unblob

Extract files from any kind of container formats
https://unblob.org

fix(processing): delete successfully processed files. #766

Closed qkaiser closed 5 months ago

qkaiser commented 5 months ago

Delete the source file after extraction if the extraction was successful and the chunk being extracted covers the whole file.

Technical notes: A call to extract() does not return anything if everything went well (no unhandled exception, no extraction errors). Under those conditions, we could delete the source file if the --keep-extracted-chunks option is not set.

However, relying on keep-extracted-chunks to make this decision is not OK, since we're not technically operating on a chunk here. A chunk is something that was carved out of a file; here we're operating on a file that was extracted from an archive or filesystem.
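
A minimal sketch of the deletion rule described above, not the actual unblob implementation. Every name here (Chunk, covers_whole_file, should_delete_source, delete_if_processed, and the delete_enabled flag) is hypothetical and only illustrates the conditions: no extraction errors, the chunk spans the whole file, and a dedicated opt-in that is separate from --keep-extracted-chunks.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Sequence


@dataclass(frozen=True)
class Chunk:
    """Hypothetical stand-in for a chunk: the byte range [start_offset, end_offset)."""

    start_offset: int
    end_offset: int


def covers_whole_file(chunk: Chunk, file_size: int) -> bool:
    """True when the chunk starts at offset 0 and ends at the end of the file."""
    return chunk.start_offset == 0 and chunk.end_offset == file_size


def should_delete_source(
    chunk: Chunk,
    file_size: int,
    errors: Sequence[object],
    delete_enabled: bool,
) -> bool:
    """Decide whether the source file may be removed after extraction.

    delete_enabled is an assumed, dedicated opt-in; it is deliberately not
    derived from --keep-extracted-chunks.
    """
    return delete_enabled and not errors and covers_whole_file(chunk, file_size)


def delete_if_processed(
    path: Path,
    chunk: Chunk,
    errors: Sequence[object],
    delete_enabled: bool,
) -> bool:
    """Delete path when the rule above holds; return whether it was deleted."""
    if should_delete_source(chunk, path.stat().st_size, errors, delete_enabled):
        path.unlink()
        return True
    return False
```

Keeping the decision in a pure function (should_delete_source) separate from the filesystem side effect makes the rule easy to test without touching any files.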

Closely related to https://github.com/onekey-sec/unblob/issues/326, triggered by https://github.com/onekey-sec/unblob/discussions/687

martonilles commented 5 months ago

I would default to not deleting source files, as we can easily lose information and structure by doing so (there might be a reference to a file which is further extracted and hence deleted).

Also, I'm not sure we would benefit much from deleting files, compared to the complexity it adds.

qkaiser commented 5 months ago

https://github.com/onekey-sec/unblob/discussions/687#discussioncomment-8494360