fkie-cad / fact_extractor

Standalone Utility for FACT-like extraction
GNU General Public License v3.0

Add option to exclude files based on their path #50

Closed JRomainG closed 4 years ago

JRomainG commented 4 years ago

This adds an exclude option to the unpacker to exclude files based on their path. A list of glob patterns can be provided, and files matching any of those patterns won't be unpacked or included in the output.
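
As a rough illustration of the matching behavior described above, here is a minimal sketch using Python's standard `fnmatch` module. The helper names `is_excluded` and `filter_excluded` and the example paths are hypothetical; they are not taken from the actual change.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def is_excluded(path: str, patterns: Iterable[str]) -> bool:
    """Return True if the path matches any of the glob patterns."""
    # fnmatchcase compares case-sensitively on every platform;
    # note that '*' also matches path separators here.
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def filter_excluded(paths: Iterable[str], patterns: Iterable[str]) -> List[str]:
    """Keep only the paths that match none of the exclude patterns."""
    patterns = list(patterns)
    return [path for path in paths if not is_excluded(path, patterns)]


# Hypothetical paths, for illustration only.
paths = ['/fw/bin/busybox', '/fw/usr/share/doc/README', '/fw/etc/passwd']
kept = filter_excluded(paths, ['*/usr/share/doc/*'])
```

In this sketch a file is dropped as soon as a single pattern matches, which mirrors the "matching any of those patterns" wording above.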

I found this useful when unpacking different files with the same structure, in which I knew some folders could be ignored, which saved a fair amount of time.

As it doesn't make too much sense to provide a default value for this option, I didn't include it in main.cfg. However, if this is merged, I think it would be best to add an entry in the wiki to mention its existence and behavior.

JRomainG commented 4 years ago

The changes you mentioned should now be implemented. I'll work on adding some tests and fixing the conflicts with upstream.

jstucke commented 4 years ago

I tried running the tests in your branch but I get some errors, e.g.

>           get_unpack_status(file_path, binary, extracted_files, meta_data, self.config)
E           UnboundLocalError: local variable 'binary' referenced before assignment

unpacker/unpack.py:48: UnboundLocalError

Also, there seems to be a lot of debugging output (I'm not sure where it is coming from).

JRomainG commented 4 years ago

It seems like I screwed up when merging master into this branch through the GitHub interface; I guess I should have just done it locally. The merge duplicated some lines that had been moved when #49 was merged.
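
To illustrate how a duplicated line can produce this kind of error, here is a small sketch. The functions `describe` and `unpack_sketch` are hypothetical stand-ins, not the real code in unpacker/unpack.py; the assumption is only that a stray copy of a call ended up above the line that assigns `binary`.

```python
def describe(file_path: str, binary: bytes) -> str:
    """Hypothetical stand-in for a call that needs the file contents."""
    return f'{file_path}: {len(binary)} bytes'


def unpack_sketch(file_path: str) -> str:
    # Stray duplicate left behind by a bad merge: 'binary' is used here
    # before the assignment below has run.
    status = describe(file_path, binary)
    binary = b'\x00' * 4
    status = describe(file_path, binary)
    return status


try:
    unpack_sketch('firmware.bin')
    raised = False
except UnboundLocalError:
    # Python treats 'binary' as a local name because it is assigned
    # somewhere in the function, so the early use fails at runtime.
    raised = True
```

Removing the duplicated call (the first `describe` line) is enough to make the sketch run without the error.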

Regarding the additional debug output, I'm not sure where it is coming from either; the changes aren't supposed to add any logging. Do you have an example of what's being logged?

jstucke commented 4 years ago

I did a bit of digging and the logging output seems to be related to pytest in combination with pytest-flake8 and failing tests.

Two tests are still failing. They need to be adjusted for the new metadata field:

________________________________________ TestUnpackerPluginPostscript.test_extraction ________________________________________

self = <plugins.unpacking.xerox.test.test_postscript.TestUnpackerPluginPostscript testMethod=test_extraction>

    def test_extraction(self):
        files, meta_data = self.unpacker.extract_files_from_file(TEST_FILE, self.tmp_dir.name)
        self.assertEqual(meta_data['plugin_used'], 'Postscript', 'wrong plugin selected')
        self.assertEqual(meta_data['Title'], 'Firmware Update', 'meta data not set correctly')
        self.assertEqual(meta_data['ReleaseVersions'], 'vx=10.80,ps=4.19.0,net=44.38,eng=26.P.1.4.19.0')
        self.assertEqual(meta_data['encoding_overhead'], 0.25, 'encoding overhead not correct')
>       self.assertEqual(len(meta_data.keys()), 10, 'number of found meta data not correct')
E       AssertionError: 11 != 10 : number of found meta data not correct

fact_extractor/plugins/unpacking/xerox/test/test_postscript.py:26: AssertionError
___________________________________________ TestUnpackerPluginZlib.test_extraction ___________________________________________

self = <plugins.unpacking.zlib.test.test_plugin_zlib.TestUnpackerPluginZlib testMethod=test_extraction>

    def test_extraction(self):
        in_file = os.path.join(TEST_DATA_DIR, 'test.zlib')
        files, meta_data = self.unpacker.extract_files_from_file(in_file, self.tmp_dir.name)
        self.assertEqual(len(files), 1, 'number of extracted files not correct')
        self.assertEqual(files[0], os.path.join(self.tmp_dir.name, 'zlib_decompressed'), 'file name not correct')
        file_binary = get_binary_from_file(files[0])
        file_hash = get_sha256(file_binary)
        self.assertEqual(file_hash, 'e429103649e24ca126077bfb38cce8c57cc913a966d7e36356e4fe0513ab02c4')
>       self.assertEqual(len(meta_data.keys()), 3, 'more or fewer than standard keys in meta dict')
E       AssertionError: 4 != 3 : more or fewer than standard keys in meta dict

fact_extractor/plugins/unpacking/zlib/test/test_plugin_zlib.py:24: AssertionError
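
The two failures follow the same pattern: a test asserts the exact number of metadata keys, so one extra key breaks it. A minimal sketch of that effect is below. Only `plugin_used` appears in the tests above; `plugin_version`, `output`, `new_field`, and the helper `extract_sketch` are hypothetical names chosen for illustration.

```python
# Hypothetical set of keys a plugin is expected to return.
STANDARD_KEYS = {'plugin_used', 'plugin_version', 'output'}


def extract_sketch(with_new_field: bool) -> dict:
    """Build a metadata dict, optionally with one additional field."""
    meta_data = {'plugin_used': 'zlib', 'plugin_version': '0.1', 'output': ''}
    if with_new_field:
        # Stand-in for the metadata field added by the change.
        meta_data['new_field'] = []
    return meta_data


count_before = len(extract_sketch(False).keys())
count_after = len(extract_sketch(True).keys())

# An exact-count assertion written against count_before now fails,
# while a check that only the expected keys are present still holds.
missing = STANDARD_KEYS - set(extract_sketch(True))
```

Either updating the expected count or checking for the expected keys would make such a test pass again; which one fits better is a design choice for the test suite.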

codecov[bot] commented 4 years ago

Codecov Report

Merging #50 into master will increase coverage by 0.10%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master      #50      +/-   ##
==========================================
+ Coverage   88.98%   89.09%   +0.10%     
==========================================
  Files         116      116              
  Lines        3305     3338      +33     
==========================================
+ Hits         2941     2974      +33     
  Misses        364      364              
Impacted Files                                         Coverage Δ
fact_extractor/helperFunctions/statistics.py           100.00% <100.00%> (ø)
...or/plugins/unpacking/xerox/test/test_postscript.py  100.00% <100.00%> (ø)
...or/plugins/unpacking/zlib/test/test_plugin_zlib.py  100.00% <100.00%> (ø)
fact_extractor/test/unit/unpacker/test_unpacker.py     100.00% <100.00%> (ø)
fact_extractor/unpacker/unpack.py                      92.42% <100.00%> (+0.62%)
fact_extractor/unpacker/unpackBase.py                  96.38% <100.00%> (+0.79%)
