Open mehrabiworkmail opened 8 months ago
I prefer to avoid compiler flags for cases like this: It makes testing much more complex (do we now need to build and test both versions?) and means that only the few people who are able to compile their own library can benefit.
There is precedent elsewhere in libarchive for skipping unrecognized or malformed data with a warning. I think that would be the better approach here. (Obviously, the best answer is to implement support where possible.)
I have created a pull request for this issue that contains my kAdditionalHeaderInfo skip code (just in case someone is interested). It is closed for now as I intend to implement support for the header instead. Will reopen the pull-request once it is done.
When attempting to use libarchive to extract a corpus of malicious SFX files, I found that libarchive fails to extract many SFX files that the 7z tool can extract just fine. I investigated the issue and found the following to be one of the causes.
After reading
archive_read_support_format_7zip.c
, I realized that libarchive currently does not support the following:kAdditionalStreamsInfo
header type (and extracting data from that stream, for that matter)kName
values fromFilesInfo
that haveexternal
flag set (i.e., the data is external to the header)when the
external` flag is setUnfortunately, it seems that when libarchive encountes one such data/header, it simply gives up processing the file, even if it is able to extract data from the main stream just fine. This might not tbe desired behaviour, depending on the application (e.g., for processing malicious files it is often desirable to extract as much info as possible, even if it means skipping some unsupported files)
Obviously, the ideal solution is to implement support for these header/data types. However, a workaround is to make libarchive skip such data/header types, but only if the users so chooses by setting specific compiler flags. Below are screenshots of such a solution I implemented.
Firstly, as the
kAdditionalStreamsInfo
header has basically the same structure as thekMainStreamsInfo
header, we can reuse the latter's code to read the former:Similarly, libarchive already has partial support for skipping external data in the
read_Times
function in the same file (archive_read_support_format_7zip.c:2753). So, we can simply reuse the same code to skip external values in other places:Here's the git patch for my solution: 0002-7z-optional-compiler-flag-to-skip-additonalstreamsin.patch
Please let me know if you have any feedback or other suggestions.