Closed scottpas closed 1 year ago
Here's another example file:
12361b94bae2da00f0215d8a22674066dd4198d3c5795c3dfdad605b3a15ffb5 (on MalwareBazaar)
It's an MSI file that contains a CAB with an embedded malicious DLL. The CAB and DLL aren't extracted, even with Deep Scan enabled. I also enabled continue_after_extract and extract_executable_sections.
hachoir-subfile /tmp/12361b94bae2da00f0215d8a22674066dd4198d3c5795c3dfdad605b3a15ffb5.msi
[+] Start search on 861184 bytes (841.0 KB)
[+] File at 0 size=861184 (841.0 KB): Microsoft Office document
[+] File at 61888 size=318 (318 bytes): Microsoft Windows icon: 16x16x0
[+] File at 62208 size=318 (318 bytes): Microsoft Windows icon: 16x16x0
[+] File at 77312 size=105056 (102.6 KB): Microsoft Bitmap version 3
[+] File at 188928 size=671943 (656.2 KB): Microsoft Cabinet archive
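For anyone who wants to act on a report like the one above, here is a minimal sketch that parses the `[+] File at ... size=...` lines and slices the matching byte range out of the original file. The names `Hit`, `parse_hits` and `carve` are made up for illustration and are not part of hachoir. It assumes each subfile is stored contiguously at the reported offset, and it skips any line that does not report a size.

```python
import re
from typing import List, NamedTuple

# Matches lines such as:
# [+] File at 188928 size=671943 (656.2 KB): Microsoft Cabinet archive
HIT_RE = re.compile(r"^\[\+\] File at (\d+) size=(\d+) \([^)]*\): (.+)$")


class Hit(NamedTuple):
    """One subfile reported by hachoir-subfile (hypothetical helper type)."""
    offset: int
    size: int
    description: str


def parse_hits(report: str) -> List[Hit]:
    """Collect offset, size and description from hachoir-subfile text output."""
    hits = []
    for line in report.splitlines():
        match = HIT_RE.match(line.strip())
        if match:  # lines without a size (or other status lines) are ignored
            hits.append(Hit(int(match.group(1)), int(match.group(2)), match.group(3)))
    return hits


def carve(data: bytes, hit: Hit) -> bytes:
    """Return the byte range a hit describes; assumes the subfile is contiguous."""
    return data[hit.offset:hit.offset + hit.size]
```

With the report above, the last hit would describe the cabinet archive, and `carve(msi_bytes, hits[-1])` would return its raw bytes for further extraction.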
Oletools was able to extract the CAB and then Extract was able to extract the DLL.
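As a rough illustration of why the CAB is findable at all (this is not how Oletools or Extract work internally), a cabinet can be located by its `MSCF` signature, with its length read from the `cbCabinet` header field. The function name `find_cabinets` is hypothetical. An MSI is an OLE compound file, so its streams can be split across non-adjacent sectors; this sketch only works when the cabinet happens to be stored contiguously, as it appears to be in this sample.

```python
import struct
from typing import Iterator, Tuple

CAB_MAGIC = b"MSCF"


def find_cabinets(data: bytes) -> Iterator[Tuple[int, int]]:
    """Yield (offset, size) for each plausible cabinet header found in data."""
    pos = data.find(CAB_MAGIC)
    while pos != -1:
        header = data[pos:pos + 12]
        if len(header) == 12:
            # Bytes 4-7 are a reserved field (expected to be zero); bytes 8-11
            # hold cbCabinet, the total cabinet size, little-endian 32-bit.
            reserved1, cb_cabinet = struct.unpack_from("<II", header, 4)
            if reserved1 == 0 and 0 < cb_cabinet <= len(data) - pos:
                yield pos, cb_cabinet
        pos = data.find(CAB_MAGIC, pos + 1)
```

Each `(offset, size)` pair can then be sliced out of the file and handed to a CAB extractor.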
Example hash: c7dd490adb297b7f529950778b5a426e8068ea2df58be5d8fd49fe55b5331e28
hachoir-subfile output:
...
When running this same file through AL, the PNG file is not extracted. An extracted OLE object contains the PNG file, but the image itself does not appear anywhere in the AL output.
DocPreview was able to create a render that coincidentally matches the PNG extracted using hachoir-subfile (AL itself wasn't able to extract that PNG).
While I do see value in enhancing the service with a tool that's able to extract subfiles from the original file's stream, I'm not sure if this lib does a stellar job.
For instance, I was hoping it would extract the image from this simple Word doc but it seems to interpret the entire file as a ZIP: d1720ff15ba5a134415875a35cbe203777cf389d87f6a79aacc801ea543cddae
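from hachoir.stream import FileInputStream
from hachoir.subfile.search import SearchSubfile

# Open the sample and search the whole stream (offset 0, no size limit)
stream = FileInputStream("d1720ff15ba5a134415875a35cbe203777cf389d87f6a79aacc801ea543cddae")
subfile = SearchSubfile(stream, 0, None)
# Load every available parser (no category or parser-ID filter)
subfile.loadParsers(None, None)
# Write any carved subfiles to the "pop" directory
subfile.setOutput("pop")
subfile.main()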
[+] Start search on 1210982 bytes (1.2 MB)
[+] File at 0 size=1210982 (1.2 MB): ZIP archive (don't copy whole file)
[+] End of search -- offset=1210982 (1.2 MB)
True
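The result above is probably expected if the sample is an OOXML (.docx) document: the whole file is a ZIP container and its members are typically compressed, so a signature scan over the raw bytes would not see the PNG header. A minimal sketch of the alternative, assuming the sample really is a .docx and that its images sit under `word/media/`, is to open the container with the standard library instead. The function name `extract_docx_media` is made up for illustration, and the sketch has no protection against oversized or malicious archives.

```python
import io
import zipfile
from typing import Dict


def extract_docx_media(data: bytes) -> Dict[str, bytes]:
    """Return {member name: content} for entries under word/media/ in a .docx."""
    media = {}
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            # Embedded images in a Word OOXML document normally live here.
            if info.filename.startswith("word/media/") and not info.is_dir():
                media[info.filename] = archive.read(info)
    return media
```

Calling it on the document's bytes returns the decompressed images keyed by their path inside the archive.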
But if you come across any samples of interest or find any tooling that would be better to integrate with, I'd be happy to investigate! 😀
@scottpas thoughts?
Going to close issue for now. If there's still interest in this, feel free to reopen 😀
Perhaps this could be a deep scan feature, since it may add a lot of artifacts that people may not care much about.
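To make the deep scan idea concrete, here is a hypothetical sketch of how carved subfiles could be gated and filtered so that routine submissions are not flooded with artifacts. Nothing here is an existing service option: `select_subfiles`, `MIN_SIZE` and `NOISY_PREFIXES` are invented for illustration, and the threshold and prefix list are arbitrary assumptions.

```python
from typing import Iterable, List, Tuple

# (offset, size, description), e.g. taken from hachoir-subfile output.
Subfile = Tuple[int, int, str]

# Hypothetical threshold and noise list; tune or drop as needed.
MIN_SIZE = 1024
NOISY_PREFIXES = ("Microsoft Windows icon", "Microsoft Bitmap")


def select_subfiles(found: Iterable[Subfile], deep_scan: bool) -> List[Subfile]:
    """Keep carved subfiles only on deep scans, dropping small or noisy hits."""
    if not deep_scan:
        return []
    return [
        hit for hit in found
        if hit[0] > 0                      # offset 0 is the submitted file itself
        and hit[1] >= MIN_SIZE             # skip tiny hits such as 318-byte icons
        and not hit[2].startswith(NOISY_PREFIXES)
    ]
```

Applied to the MSI report earlier in this thread, a filter like this would keep only the cabinet archive on a deep scan and nothing on a regular scan.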