Cisco-Talos / clamav

ClamAV - Documentation is here: https://docs.clamav.net
https://www.clamav.net/
GNU General Public License v2.0

Failure extracting base64 encoded image attached to email in HTML CSS #1323

Open JAF84 opened 1 month ago

JAF84 commented 1 month ago

ClamAV fails to extract a base64-encoded image that is embedded in the CSS of an HTML email.

The attached mail.zip contains two files, part.html and part2.html.

The only difference between them is a newline character before the "base64".

br Johannes

micahsnyder commented 1 month ago

Hi Johannes,

I reviewed the contents of mail.zip and see two HTML files, part.html and part2.html.

It looks to me like you're reporting an issue with ClamAV failing to extract a PNG file embedded in the HTML as base64-encoded data in the CSS. We added support for extracting that in ClamAV 1.1.

part.html is the one that fails to extract the image, while part2.html correctly extracts it. The difference is that part.html has some whitespace (a new line, which is normalized into a single space) in the mime arguments.

The diff of the two files shows it clearly (note that part2 is on the left): [screenshot of the diff]

The clamscan --debug output also shows where this fails, because the mime argument has that space in it: [screenshot of the debug output]

We'll need to add some logic in there to strip any whitespace in the mime args. I think that'll fix it. The code in question is right here in mbox.c: [screenshot of the code]
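For illustration only, here is a minimal sketch of the kind of whitespace stripping described above. It is not the code from mbox.c and not a proposed patch: `strip_whitespace` and `mime_args_is_base64` are hypothetical helper names, the 128-byte buffer is an arbitrary assumption, and the comparison is case-sensitive for simplicity.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of ClamAV): copy `src` into `dst`,
 * dropping every whitespace character (spaces, tabs, newlines).
 * Always NUL-terminates when dst_size > 0 and truncates if `dst` is
 * too small. Returns the number of characters written, excluding the NUL.
 */
static size_t strip_whitespace(const char *src, char *dst, size_t dst_size)
{
    size_t n = 0;

    if (dst_size == 0) {
        return 0;
    }
    for (; *src != '\0' && n + 1 < dst_size; src++) {
        if (!isspace((unsigned char)*src)) {
            dst[n++] = *src;
        }
    }
    dst[n] = '\0';
    return n;
}

/*
 * Hypothetical check: returns 1 if the mime arguments of a data: URI
 * (the text between "data:" and the comma) end in ";base64" once
 * whitespace has been removed, 0 otherwise.
 */
static int mime_args_is_base64(const char *args)
{
    static const char suffix[] = ";base64";
    char clean[128]; /* assumed upper bound for this sketch */
    size_t len = strip_whitespace(args, clean, sizeof(clean));
    size_t slen = sizeof(suffix) - 1;

    return len >= slen && strcmp(clean + len - slen, suffix) == 0;
}
```

With this sketch, `mime_args_is_base64("image/png;base64")` and `mime_args_is_base64("image/png; base64")` both return 1, so arguments that contain the normalized space are treated the same as arguments without it.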

I don't have time to work on this right now as I'm fighting other fires. Going to mark this as a bug for now.

JAF84 commented 1 month ago

Hello Micah,

Thank you, yes, this is exactly the issue.

part.html is the original (bad) mail file. part2.html is my modified version, made just to demonstrate how it should work.

br johannes