oletools - python tools to analyze MS OLE2 files (Structured Storage, Compound File Binary Format) and MS Office documents, for malware analysis, forensics and debugging.
Affected tool:
olevba and mraptor (maybe others as well that I haven't used)
Describe the bug
Suspicious keywords (e.g. "create") that appear only in comments cause false positives.
File/Malware sample to reproduce the bug
Sub test()
'I love to create
MsgBox "Hello world"
End Sub
This issue was already mentioned in https://github.com/decalage2/oletools/issues/90, but I think the problem deserves a specific issue.
Currently, when matching suspicious keywords, there is no attempt to distinguish a regular line of code from a comment, e.g.:
https://github.com/decalage2/oletools/blob/168a92d7c53d972f499356bda7d3335c61710eec/oletools/olevba.py#L2201
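A minimal sketch of the false positive, assuming the existing check is a case-insensitive whole-word search over the entire module source. The pattern is inferred from the proposed pattern in this issue rather than copied from olevba.py, and `current_check` is a hypothetical helper name:

```python
import re

# The sample from this issue: the keyword appears only inside a comment.
SAMPLE = (
    "Sub test()\n"
    "'I love to create\n"
    'MsgBox "Hello world"\n'
    "End Sub\n"
)

def current_check(keyword, vba_code):
    # Assumed form of the existing check: whole-word, case-insensitive,
    # searched anywhere in the source, comment lines included.
    pattern = r'(?i)\b' + re.escape(keyword) + r'\b'
    return re.search(pattern, vba_code) is not None

print(current_check('create', SAMPLE))  # True: the comment alone triggers a match
```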
I think there are a few ways we could avoid false positives related to comments. One of them would be to edit the pattern to look like this:
r'(?i)^(?:[^']|\b).*\b' + re.escape(keyword) + r'\b'
The key here is that ^(?:[^']|\b).* will not match if the line starts with an apostrophe ('). The |\b is necessary, otherwise the pattern would not match when the keyword is at the start of the line: https://regex101.com/r/CUI2V3/1
Alternatively, another option would be to remove all comment lines from vba_code before running the regex.
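Both options can be sketched roughly as follows. This is an illustration under stated assumptions, not the olevba implementation: `proposed_check` and `strip_comment_lines` are hypothetical helper names, and `re.MULTILINE` is added here on the assumption that `^` should anchor at every line of the module rather than only at the start of the string.

```python
import re

SAMPLE = (
    "Sub test()\n"
    "'I love to create\n"
    'MsgBox "Hello world"\n'
    "End Sub\n"
)

def proposed_check(keyword, vba_code):
    # Pattern from the proposal. re.MULTILINE is an assumption made here so
    # that ^ matches at the start of each line, not just the whole string.
    pattern = r"(?i)^(?:[^']|\b).*\b" + re.escape(keyword) + r'\b'
    return re.search(pattern, vba_code, re.MULTILINE) is not None

def strip_comment_lines(vba_code):
    # Alternative: drop every line whose first non-blank character is an
    # apostrophe, then run the keyword search on what is left.
    kept = [line for line in vba_code.splitlines()
            if not line.lstrip().startswith("'")]
    return '\n'.join(kept)

print(proposed_check('create', SAMPLE))        # comment line is skipped
print(proposed_check('create', 'Create foo'))  # keyword at line start still matches
print(strip_comment_lines(SAMPLE))
```

The proposed pattern only inspects the first character of each line, so an indented comment or a trailing comment after code would still match. The comment-stripping sketch handles indentation, but it does not handle trailing comments or `Rem` comments.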
How To Reproduce the bug
Run olevba on the sample.
Expected behavior
No threat detected.