m-lw closed this issue 7 months ago
ClamAV removes the soft break during normalization. The 'problem' is that it then also converts everything to lowercase, so "DocuSign" becomes "docusign", which no longer matches the case-sensitive first subsignature (446f63755369676e, i.e. "DocuSign") of each logical signature.
Your Test3 signature does match because one of the lines in the original email has 'abcd' unbroken, along with the camelcase DocuSign.
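To see why only Test3 matches, it helps to decode the hex subsignatures and check them against two forms of the body that appear in the dumps in this thread. The following is a minimal Python sketch that only does plain substring checks; it is not how ClamAV's matcher works, and the variable and function names are made up for illustration.

```python
# Subsignatures from quoted-printable.ldb, before ::i was added.
SIGS = {
    "Test1": ["446f63755369676e", "616263642e696f2f61626364"],
    "Test2": ["446f63755369676e", "616263642e"],
    "Test3": ["446f63755369676e", "61626364"],
}

# Mail part as dumped with the line break still inside the URL.
raw_part = b'<A href="https://a\nbcd.io/abcd/x#x">\nDocuSign</A>'
# Normalized text: break removed, everything lowercased.
normalized = b"https://abcd.io/abcd/x#x docusign"


def matches(subsigs, data):
    """True if every hex subsignature occurs in data (case-sensitive)."""
    return all(bytes.fromhex(s) in data for s in subsigs)


for name, subsigs in SIGS.items():
    decoded = [bytes.fromhex(s).decode("ascii") for s in subsigs]
    print(name, decoded,
          "raw:", matches(subsigs, raw_part),
          "normalized:", matches(subsigs, normalized))
```

In this toy model only Test3 is satisfied by the raw part ("DocuSign" plus the unbroken "abcd" in the path), and nothing is satisfied by the lowercased text.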
If you add ::i to the end of each of the first subsignatures, all the signatures do end up matching because it is no longer looking explicitly for the camelcase.
~/Security$ clamscan -d quoted-printable.ldb --no-summary -z quoted-printable.eml
~/Security/quoted-printable.eml: Test1.UNOFFICIAL FOUND
~/Security/quoted-printable.eml: Test2.UNOFFICIAL FOUND
~/Security/quoted-printable.eml: Test3.UNOFFICIAL FOUND
~/Security/quoted-printable.eml: Test1.UNOFFICIAL FOUND
~/Security/quoted-printable.eml: Test2.UNOFFICIAL FOUND
~/Security/quoted-printable.eml: Test3.UNOFFICIAL FOUND
~/Security/quoted-printable.eml: Test3.UNOFFICIAL FOUND
~/Security/quoted-printable.eml: Test3.UNOFFICIAL FOUND
~/Security/quoted-printable.eml: Test3.UNOFFICIAL FOUND
~/Security$ cat quoted-printable.ldb
Test1;Engine:81-255,Target:0;0&1;446f63755369676e::i;616263642e696f2f61626364
Test2;Engine:81-255,Target:0;0&1;446f63755369676e::i;616263642e
Test3;Engine:81-255,Target:0;0&1;446f63755369676e::i;61626364
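The effect of ::i can be illustrated with a small Python sketch. It parses the three lines above and understands only what they use: hex subsignatures that must all match (0&1) and the optional ::i modifier. The helper names are hypothetical and this is a toy model, not ClamAV's signature engine.

```python
LDB = """\
Test1;Engine:81-255,Target:0;0&1;446f63755369676e::i;616263642e696f2f61626364
Test2;Engine:81-255,Target:0;0&1;446f63755369676e::i;616263642e
Test3;Engine:81-255,Target:0;0&1;446f63755369676e::i;61626364
"""


def parse_line(line):
    """Split one logical signature into (name, [(pattern, nocase), ...])."""
    name, _target, _expr, *subsigs = line.split(";")
    parsed = []
    for sub in subsigs:
        hexpart, _, modifiers = sub.partition("::")
        parsed.append((bytes.fromhex(hexpart), "i" in modifiers))
    return name, parsed


def sub_matches(pattern, nocase, data):
    """Substring check, ignoring ASCII case when nocase is set."""
    if nocase:
        return pattern.lower() in data.lower()
    return pattern in data


def scan(data):
    """Names of signatures whose subsignatures all occur in data."""
    hits = []
    for line in LDB.splitlines():
        name, subsigs = parse_line(line)
        if all(sub_matches(p, nocase, data) for p, nocase in subsigs):
            hits.append(name)
    return hits


normalized = b"https://abcd.io/abcd/x#x docusign"
print(scan(normalized))  # ['Test1', 'Test2', 'Test3']
```

The expression 0&1 is treated here simply as "all subsignatures must match"; other logical operators are not handled.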
For reference, these are the files that get created by the different normalizations that occur while scanning that message:
~/Security/20220304_130643-quoted-printable.eml.4cc5473e67/quoted-printable.eml.58567c8191$ for f in `find . -type f`; do echo "== $f =="; cat $f; echo; done
== ./textportion.84662458bb/html-tmp.45d0147f18/notags.html ==
https://abcd.io/abcd/x#x docusign
== ./textportion.84662458bb/html-tmp.45d0147f18/nocomment.html ==
<html><head><title></title></head><body><a href="https://abcd.io/abcd/x#x">docusign</a></body></html>
== ./clamav-305ac395b94a192af4df0026248a2883.tmp.0356e26b8c/html-tmp.ff53d1161f/notags.html ==
https://a bcd.io/abcd/x#x docusign
== ./clamav-305ac395b94a192af4df0026248a2883.tmp.0356e26b8c/html-tmp.ff53d1161f/nocomment.html ==
<html><head><title></title></head><body><a href="https://a bcd.io/abcd/x#x">docusign</a></body></html>
== ./mail-tmp.759b8bbb97/clamav-305ac395b94a192af4df0026248a2883.tmp ==
<HTML><HEAD><TITLE></TITLE></HEAD>
<body>
<A href="https://a
bcd.io/abcd/x#x">
DocuSign</A>
</BODY></HTML>
== ./mail-tmp.759b8bbb97/clamav-45d2dba2bdf8852ab9fc5250097d7ed7.tmp ==
<HTML><HEAD><TITLE></TITLE></HEAD>
<body>
<A href="https://abcd.io/abcd/x#x">
DocuSign</A>
</BODY></HTML>
There's definitely something odd going on because if you use --normalize=no on the clamscan command line, all 3 signatures will match correctly. It seems like the original file (or the reconstructed mail file) doesn't get scanned if normalization is allowed.
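One way to compare the two modes is to collect the distinct signature names reported by each run. The Python sketch below assumes clamscan is on PATH and that the files named in this thread are in the current directory; the helper names are made up for illustration. The flags are the ones already used in this thread.

```python
import subprocess


def detected_names(output):
    """Distinct signature names from clamscan 'path: Name FOUND' lines."""
    names = set()
    for line in output.splitlines():
        if line.endswith(" FOUND"):
            # The name sits between the last ': ' and the trailing ' FOUND'.
            names.add(line[: -len(" FOUND")].rsplit(": ", 1)[-1])
    return names


def run_scan(extra_args=()):
    """Run clamscan on the sample and return the set of matched names."""
    cmd = ["clamscan", "-d", "quoted-printable.ldb", "--no-summary", "-z",
           *extra_args, "quoted-printable.eml"]
    # clamscan exits with status 1 when something is found, so the
    # return code is deliberately not treated as an error here.
    result = subprocess.run(cmd, capture_output=True, text=True)
    return detected_names(result.stdout)


# Usage (needs clamscan and the sample files):
#   print(sorted(run_scan()))
#   print(sorted(run_scan(["--normalize=no"])))
```

Because -z reports every match, the same name can appear several times in the raw output; reducing it to a set makes the two runs easy to compare.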
Here are some logical signatures with two subsignatures that fail to match a quoted printable email where one of the lines is split with a soft line break ('=' at the end of a line).
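For readers unfamiliar with the encoding: in quoted-printable, an '=' at the very end of a line is a soft line break, and a decoder removes both it and the newline that follows. Here is a small Python illustration using the standard quopri module; the sample body is modelled on the URL in this report rather than copied from quoted-printable.eml.

```python
import quopri

# "=3D" is an encoded '=', while the bare '=' before the newline
# is a soft line break that should disappear on decoding.
encoded = b'<A href=3D"https://a=\nbcd.io/abcd/x#x">\nDocuSign</A>\n'
decoded = quopri.decodestring(encoded)

print(decoded.decode("ascii"))
print(b"abcd.io/abcd" in encoded)  # False: the URL is split in the raw body
print(b"abcd.io/abcd" in decoded)  # True: the soft break has been removed
```

A signature for "abcd.io/abcd" can therefore only match once the body has been decoded, not against the raw encoded lines.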
Scan quoted-printable.eml with the signatures quoted-printable.ldb from quoted-printable.zip:
It only matches the Test3 rule, but should match Test1 and Test2 as well.
Running it without the soft line break matches all three rules as expected:
This is the output of clamconf -n: