onekey-sec / unblob

Extract files from any kind of container formats
https://unblob.org

fix(handlers): handle dangling symlinks in MultiFile handlers. #773

Closed: qkaiser closed this 7 months ago

qkaiser commented 7 months ago

MultiFile handlers would collect the files in a directory whose names match a specific schema, without checking whether those files actually exist.

For example, a directory could contain dangling symlinks whose names match the glob search. The multi-file handlers would then raise FileNotFoundError when they tried to access those paths.
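
A minimal sketch of the failure mode and the filtering idea, using only the standard library. The helper name `existing_matches` and the `data.gz.*` file names are made up for illustration and are not part of unblob. It relies on `Path.exists()` following symlinks, so it returns False for a dangling link even though the glob still lists it.

```python
import os
import tempfile
from pathlib import Path


def existing_matches(directory: Path, pattern: str) -> list[Path]:
    """Hypothetical helper: sorted glob matches whose target really exists."""
    # Path.exists() follows symlinks, so a dangling symlink is filtered out.
    return sorted(p for p in directory.glob(pattern) if p.exists())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "data.gz.001").write_bytes(b"payload")
    # Dangling symlink: the name matches the pattern, the target is missing.
    os.symlink(root / "missing", root / "data.gz.002")

    # The glob matches on names only, so the dangling link is listed too.
    raw = sorted(p.name for p in root.glob("data.gz.*"))
    kept = [p.name for p in existing_matches(root, "data.gz.*")]

    # Reading through the dangling link is what triggers the exception.
    try:
        (root / "data.gz.002").read_bytes()
        dangling_raises = False
    except FileNotFoundError:
        dangling_raises = True
```

Here `raw` contains both names while `kept` contains only the regular file, which is why the existence check has to happen after the glob rather than being assumed.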

Resolves #771

AndrewFasano commented 7 months ago

With this patch (testing on the same FW from #771) I get a new error:

2024-02-15 19:36.16 [warning  ] Unhandled Exception during multi file calculation handler=multi-gzip path=FW_RT_N66U_C1_300438510000.zip_extract/Firmware_Release/RT-N66U_C1_3.0.0.4_385_10000-gd8ccd3c.trx_extract/part1_extract/0-40448000.squashfs_v4_le_extract/www/fb_data.tgz.gz.part.d pid=65 severity=<Severity.ERROR: 'ERROR'>
Traceback (most recent call last):
  File "/unblob/unblob/processing.py", line 377, in _calculate_multifile
    return dir_handler.calculate_multifile(path)
  File "/unblob/unblob/handlers/compression/gzip.py", line 176, in calculate_multifile
    if file != paths[0]:
IndexError: list index out of range

AndrewFasano commented 7 months ago

With these changes plus the following check to fix the new error, the test FW extracts correctly.

--- a/unblob/handlers/compression/gzip.py
+++ b/unblob/handlers/compression/gzip.py
@@ -170,6 +170,9 @@ class MultiVolumeGzipHandler(DirectoryHandler):
             [p for p in file.parent.glob(f"{file.stem}.*") if p.resolve().exists()]
         )

+        if not len(paths):
+            return None
+
         # we 'discard' paths that are not the first in the ordered list,
         # otherwise we will end up with colliding reports, one for every
         # path in the list.
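
The guard in the diff above can be exercised in isolation with a small standalone sketch. The function `first_volume` and the `archive.gz.*` names are hypothetical stand-ins, not unblob's handler API. The sketch only mirrors the glob-and-filter expression shown in the diff and assumes that returning None is an acceptable "nothing to do" result.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def first_volume(file: Path) -> Optional[Path]:
    """Hypothetical helper: first existing sibling that shares file's stem."""
    paths = sorted(
        p for p in file.parent.glob(f"{file.stem}.*") if p.resolve().exists()
    )
    # Filtering can remove every match (for example when only dangling
    # symlinks are present), and paths[0] on an empty list raises IndexError.
    if not paths:
        return None
    return paths[0]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    link = root / "archive.gz.001"
    os.symlink(root / "missing", link)  # dangling symlink, the only match

    only_dangling = first_volume(link)  # nothing survives the filter

    (root / "archive.gz.002").write_bytes(b"x")
    found = first_volume(link)  # the regular file is now the first match
    found_name = found.name if found else None
```

With only the dangling link present the helper returns None instead of failing on `paths[0]`. Once a real file with the same stem exists, that file is returned.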

qkaiser commented 7 months ago

Good catch, did not think of it.

qkaiser commented 7 months ago

Added integration test files reproducing the situation described in #771.