Cisco-Talos / clamav

ClamAV - Documentation is here: https://docs.clamav.net
https://www.clamav.net/
GNU General Public License v2.0

ClamAV cannot handle .7z with BZip2 contents #542

Open HenkPoley opened 2 years ago

HenkPoley commented 2 years ago

This issue has been sent to Cisco PSIRT before, handled as PSIRT-0722824177, but deemed not a security issue.

It is currently used by malware spammers to evade archive scanning. A similar issue might exist with Lzip.

Describe the bug

ClamAV updated to the 7zip 9.20 C source code back in 2011. Meanwhile 7zip has added support for the BZip2 compression algorithm in the 7z container format. ClamAV currently lacks support for BZip2 in the 7z container. A couple of filter methods to improve compression of executables also seem to be missing.

This can be used to evade archive scanning. For example, this in the wild sample from an email: https://www.virustotal.com/gui/file/7aa6ae387fbd82b552280466f730efa88c6ac2ee36ad1932557675f571bfe902/detection

This simple signature (Windows .lnk magic bytes + 'powershell.exe' in UTF16BE) fails to detect the .lnk file inside the above 7zip archive, which is compressed with Method = BZip2:

Windows_LNK_magic_bytes_with_powershell.exe.942:0:0:4c00000001140200*0070006f007700650072007300680065006c006c002e006500780065
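To make the signature above easier to read, here is a minimal Python sketch that splits it into its fields and decodes the second byte pattern. It assumes the basic `.ndb` layout `Name:TargetType:Offset:HexSignature` and only handles the `*` wildcard (any number of bytes); the helper names `parse_ndb` and `matches` are made up for illustration and are not part of ClamAV.

```python
SIGNATURE = (
    "Windows_LNK_magic_bytes_with_powershell.exe.942:0:0:"
    "4c00000001140200*0070006f007700650072007300680065006c006c002e006500780065"
)


def parse_ndb(line):
    """Split a basic .ndb line into (name, target_type, offset, byte patterns).

    Assumption: no optional trailing fields and a numeric offset.
    """
    name, target_type, offset, hex_sig = line.split(":", 3)
    # '*' separates byte patterns that may be any distance apart.
    parts = [bytes.fromhex(p) for p in hex_sig.split("*")]
    return name, int(target_type), int(offset), parts


def matches(data, offset, parts):
    """Return True if the first pattern sits at `offset` and the rest follow in order."""
    if not data.startswith(parts[0], offset):
        return False
    pos = offset + len(parts[0])
    for part in parts[1:]:
        idx = data.find(part, pos)
        if idx < 0:
            return False
        pos = idx + len(part)
    return True


name, target_type, offset, parts = parse_ndb(SIGNATURE)
print(name)
print(parts[0].hex())               # start of the .lnk header
print(parts[1].decode("utf-16-be"))  # the text pattern in the signature
```

The point of the issue is that this matching never gets a chance to run on the .lnk file, because extraction from the BZip2-compressed 7z fails first.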

Relevant clamscan --debug -d lnk.ndb sample.7z output:

LibClamAV debug: cli_7unz: extracting ONEYHANC02055500_draft_20220117688077/ONEYHANC02055500_draft_20220117688077.lnk
LibClamAV debug: CDBNAME:CL_TYPE_7Z:0:ONEYHANC02055500_draft_20220117688077/ONEYHANC02055500_draft_20220117688077.lnk:0:2568:0:1:3847976810:(nil)
LibClamAV debug: cli_unz: extraction failed with 4
LibClamAV debug: cli_7unz: extracting ONEYHANC02055500_draft_20220117688077/ONEYHANC02055500_draft_20220117688077.pdf
LibClamAV debug: CDBNAME:CL_TYPE_7Z:0:ONEYHANC02055500_draft_20220117688077/ONEYHANC02055500_draft_20220117688077.pdf:0:25600:0:2:2474246062:(nil)
LibClamAV debug: cli_unz: extraction failed with 3
LibClamAV debug: cli_7unz: crc mismatch

How to reproduce the problem

Create a 7zip file containing a malware file that ClamAV detects, using BZip2 compression. Any compression method other than LZMA/LZMA2, PPMD, or Copy/Store should also work.
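The reproduction steps could be scripted roughly as follows. This is a sketch, assuming the `7z` command-line tool and `clamscan` are on the PATH; the file names `detected_sample.bin` and `sample.7z` are placeholders, and the helper names are made up for illustration. `-m0=BZip2` is the 7-Zip switch that selects BZip2 as the compression method, and the `clamscan` invocation mirrors the one quoted above.

```python
import subprocess


def build_7z_command(archive, files, method="BZip2"):
    """Command line that adds `files` to `archive` using the given 7z method."""
    return ["7z", "a", f"-m0={method}", archive, *files]


def build_clamscan_command(archive, database="lnk.ndb"):
    """Command line that scans `archive` with a single signature database."""
    return ["clamscan", "--debug", "-d", database, archive]


def run_repro(sample="detected_sample.bin", archive="sample.7z"):
    """Create the archive, then scan it. Requires 7z and clamscan to be installed."""
    subprocess.run(build_7z_command(archive, [sample]), check=True)
    # clamscan exits non-zero when it finds something, so do not use check=True here.
    return subprocess.run(build_clamscan_command(archive), check=False).returncode
```

Calling `run_repro()` with a sample that ClamAV detects when scanned directly should show whether the detection survives the BZip2-compressed 7z container.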

micahsnyder commented 2 years ago

Hi @HenkPoley,

Yup, we agree with Cisco PSIRT that it isn't a security issue. It's more like a feature request that would improve detection efficacy.

It might be that the 7z container with BZip2 support is only coded in the C++ part of 7zip: https://github.com/kornelski/7z/blob/999cf2599051f70fd92893363384243b1633fc75/CPP/7zip/Archive/7z/7zHeader.h#L113

Our vendored LZMA-SDK source is really dated, but the real issue is that the C version lacks some features, including bz2 compression support. We will have to swap the C version for the C++ version to get bz2 support.
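For readers following along, the gap can be pictured as a lookup of 7z coder method IDs. The sketch below is illustrative only: the numeric IDs are the values 7-Zip is understood to use for these methods (check them against the linked 7zHeader.h before relying on them), the "supported" set is taken from the list of working methods in the report above, and the names `METHOD_NAMES`, `EXTRACTABLE`, and `describe_method` are hypothetical, not ClamAV or LZMA-SDK identifiers.

```python
# 7z coder method IDs (assumed values; verify against 7zHeader.h).
METHOD_NAMES = {
    0x00: "Copy",
    0x21: "LZMA2",
    0x030101: "LZMA",
    0x030401: "PPMD",
    0x040202: "BZip2",
}

# Methods the report says currently extract fine.
EXTRACTABLE = {"Copy", "LZMA", "LZMA2", "PPMD"}


def describe_method(method_id):
    """Return (name, extractable) for a 7z coder method ID."""
    name = METHOD_NAMES.get(method_id, f"unknown (0x{method_id:x})")
    return name, name in EXTRACTABLE


for method_id in (0x030101, 0x040202):
    name, ok = describe_method(method_id)
    print(f"{name}: {'extractable' if ok else 'extraction fails, contents not scanned'}")
```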

We have a ticket in our internal Jira backlog to track this issue as well, so I'll link this issue in there. For internal reference: CLAM-1635

Thanks for the report, Micah

HenkPoley commented 2 years ago

Just a follow-up question, since it was a bit "hidden" inside the ticket.

Are you also tracking LZIP support? It's a much more obscure archive format, but I'm seeing malware being emailed in that compression format. To be fair, currently only 6 AVs do client-side LZIP scanning (ESET-NOD32, Kaspersky, Lionic, Microsoft, ZoneAlarm by Check Point, Zoner). So ClamAV would be early on the scene 😅

micahsnyder commented 2 years ago

@HenkPoley We're not tracking LZIP support. I honestly hadn't heard of it before. A quick search shows it's 14 years old, so not that early on the scene 😅.

There are no references to lzip inside the C++ LZMA-SDK source. A request for LZIP support would then be a completely separate GitHub issue (if you feel there is a strong need for this), but based on the reaction I got from asking our malware research team about it, I don't think it's something they will push for.

HenkPoley commented 2 years ago

Essentially some AVE_MARIA sender is stuffing everything through convertio for their "tender" and "invoice" emails. Their malware tarballs have 'convertio' as username/group 🙈.
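The username/group observation can be checked with a short Python sketch like the one below. It assumes the tarball has already been obtained in a form Python's `tarfile` can open (plain, gzip, bz2, or xz; it does not read lzip), and the function names are made up for illustration.

```python
import io
import tarfile


def tar_owner_names(data):
    """Return the set of (uname, gname) pairs recorded in a tar archive."""
    owners = set()
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:*") as tar:
        for member in tar:
            owners.add((member.uname, member.gname))
    return owners


def has_convertio_owner(data):
    """True if any member lists 'convertio' as its user or group name."""
    return any("convertio" in pair for pair in tar_owner_names(data))
```

A mail filter could use `has_convertio_owner(attachment_bytes)` as one weak signal among others; legitimate files converted with the same service would carry the same owner names.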

Convertio supports LZIP. So the malware peddlers try that. I've put up some filters to detect LZIP signs, and indeed it doesn't seem too popular. But it seems a bit silly to have known malware slip by due to the compression format.
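A filter for "LZIP signs" can be as simple as checking the file header. The sketch below is not the actual filter mentioned above; it assumes the lzip member header layout of a 4-byte `LZIP` magic string, a 1-byte version number, and a 1-byte coded dictionary size, and it only accepts version 1. The function name is hypothetical.

```python
LZIP_MAGIC = b"LZIP"


def looks_like_lzip(data):
    """Heuristic check for an lzip member header at the start of `data`."""
    if len(data) < 6:
        return False  # too short to hold magic + version + dictionary size
    # Byte 4 is the format version; this sketch accepts only version 1.
    return data[:4] == LZIP_MAGIC and data[4] == 1
```

This only flags the container format. It says nothing about whether the contents are malicious, so it is useful mainly for routing such attachments to closer inspection.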

Another good option is for the Cisco-Talos team to get a tap on Convertio's firehose 👍

I understand that it would be another issue than 7zip 🤭