sosreport / sos

A unified tool for collecting system logs and other debug information
http://sos.rtfd.org
GNU General Public License v2.0

Support extraction of flat gz/xz and archive files with sos clean. #3029

Open NikhilKakade-1 opened 2 years ago

NikhilKakade-1 commented 2 years ago

@TurboTurtle / @pmoravec, is it possible to support extraction of flat .gz/.xz files and archive files with sos clean? I see both are marked as TODO:

obvious_removes = [
    r'.*\.gz$',  # TODO: support flat gz/xz extraction
    r'.*\.xz$',
    r'.*\.bzip2$',
    r'.*\.tar\..*',  # TODO: support archive unpacking
    r'.*\.txz$',
    r'.*\.tgz$',
    r'.*\.bin$',
    r'.*\.journal$',
    r'.*\~$'
]

ref: https://github.com/sosreport/sos/blob/main/sos/cleaner/archives/__init__.py#L367
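For readers unfamiliar with this list, here is a minimal sketch of how such patterns could be applied to file names. The patterns are copied from the snippet above; the helper name `is_obvious_remove` and the use of `re.match` are assumptions for illustration, not necessarily how sos applies the list.

```python
import re

# Patterns copied from the obvious_removes list quoted above.
OBVIOUS_REMOVES = [
    r'.*\.gz$',
    r'.*\.xz$',
    r'.*\.bzip2$',
    r'.*\.tar\..*',
    r'.*\.txz$',
    r'.*\.tgz$',
    r'.*\.bin$',
    r'.*\.journal$',
    r'.*\~$',
]


def is_obvious_remove(filename):
    """Hypothetical helper: True if any pattern matches the file name."""
    return any(re.match(pattern, filename) for pattern in OBVIOUS_REMOVES)


if __name__ == '__main__':
    for name in ('var/log/messages.1.gz', 'logs.tar.bz2', 'etc/hosts'):
        print(name, is_obvious_remove(name))
```

With these patterns, any file ending in .gz or .xz matches regardless of whether it contains plain text, which is why flat compressed logs fall under the TODO.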

TurboTurtle commented 2 years ago

This TODO has been present for a while. While it is technically possible, we need to decide if we want to take on the maintenance of nested extraction and repacking. There are a few hurdles we'd need to overcome, in no particular order:

And there's more we'd find along the way in implementing this. Let's start by identifying your use case - what needs compressed file obfuscation today?

NikhilKakade-1 commented 2 years ago

Yeah, I see.

A few plugins collect additional debug logs as .zip files, and in principle a plugin could use any other compression format as well.

The use case is straightforward: it would be nice to have a mechanism in place to support obfuscating such files. Note: these files are currently identified as binary files and removed from the final archive, which can be prevented with the --keep-binary-files option.

https://github.com/sosreport/sos/blob/a02e871e998eb0b031f35ef8db1fce800100ba09/sos/report/plugins/hpssm.py#L70
https://github.com/sosreport/sos/blob/ead8d48d12c84b88dee981efceb9c80dce7af418/sos/report/plugins/dellrac.py#L46
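To make the request concrete, here is a rough sketch of what handling a flat (non-tar) .gz or .xz text file might look like using only the Python standard library. It is not sos code: the function name `obfuscate_flat_compressed` and the `obfuscate_line` callable are hypothetical stand-ins for whatever the cleaner would actually run on each line, and nested archives such as .tar.gz or .zip would need separate handling.

```python
import gzip
import lzma

# Suffix -> stdlib module able to open that flat compressed format.
_OPENERS = {'.gz': gzip, '.xz': lzma}


def obfuscate_flat_compressed(path, obfuscate_line):
    """Decompress a flat .gz/.xz text file, obfuscate every line, and
    rewrite it in place using the same compression format.

    obfuscate_line is a hypothetical callable (str -> str) standing in
    for the cleaner's per-line obfuscation. Returns the number of lines
    processed.
    """
    for suffix, module in _OPENERS.items():
        if path.endswith(suffix):
            break
    else:
        raise ValueError('unsupported compression format: %s' % path)

    # Read everything first because the file is rewritten in place.
    # This keeps the sketch short but holds the whole file in memory.
    with module.open(path, 'rt', encoding='utf-8', errors='replace') as src:
        lines = [obfuscate_line(line) for line in src]

    with module.open(path, 'wt', encoding='utf-8') as dst:
        dst.writelines(lines)
    return len(lines)
```

A call such as `obfuscate_flat_compressed('messages.1.gz', lambda line: line.replace('alpha', 'host0'))` would rewrite the file with the substitution applied. The sketch ignores the questions a real implementation would face, such as preserving file metadata, handling very large files, and detecting content that is not text.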