Comments are causing false positives

This issue was already mentioned in https://github.com/decalage2/oletools/issues/90, but I think the problem deserves a specific issue.

Currently, for matching suspicious keywords, there is no attempt to distinguish a regular line of code from a comment:

e.g.: https://github.com/decalage2/oletools/blob/168a92d7c53d972f499356bda7d3335c61710eec/oletools/olevba.py#L2201

I think there are a few ways to avoid false positives caused by comments. One of them would be to change the pattern to something like this: r"(?i)^(?:[^']|\b).*\b" + re.escape(keyword) + r"\b"

The key here is that ^(?:[^']|\b).* will not match if the line starts with an apostrophe ('). The |\b alternative is necessary; otherwise the pattern would not match when the keyword is at the very start of the line: https://regex101.com/r/CUI2V3/1
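
To make the behaviour easier to check, here is a minimal sketch that applies the proposed pattern one line at a time. The function names are made up for illustration and are not part of olevba. It also shows one limitation: an indented comment (as in the sample below) still matches, because the leading space satisfies [^']. The second pattern is a hypothetical variant, not something olevba does today, that only matches when no apostrophe appears before the keyword on the line. It assumes the apostrophe is never inside a string literal, and it ignores Rem comments.

```python
import re


def proposed_pattern(keyword):
    # Pattern from this issue. A double-quoted raw string is used so the
    # apostrophe inside the character class does not end the literal.
    return re.compile(r"(?i)^(?:[^']|\b).*\b" + re.escape(keyword) + r"\b")


def indent_aware_pattern(keyword):
    # Hypothetical variant (assumption, not olevba code): reject the line
    # as soon as an apostrophe appears anywhere before the keyword.
    return re.compile(r"(?i)^[^']*\b" + re.escape(keyword) + r"\b")


def matching_lines(pattern, vba_code):
    # Apply the pattern to each line separately, so ^ always means
    # "start of this line" without needing the MULTILINE flag.
    return [line for line in vba_code.splitlines() if pattern.search(line)]


sample = "Sub test()\n    'I love to create\n    MsgBox \"Hello world\"\nEnd Sub"

# The proposed pattern skips a comment that starts in column 0 ...
print(matching_lines(proposed_pattern("create"), "'I love to create"))
# ... but still flags the indented comment in the sample.
print(matching_lines(proposed_pattern("create"), sample))
# The variant ignores the indented comment as well.
print(matching_lines(indent_aware_pattern("create"), sample))
```

Matching line by line is a design choice of this sketch: with the MULTILINE flag on the whole source, [^'] could also consume a newline and pull the following comment line into the match.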

Alternatively, another option would be to remove all comment lines from vba_code before running the regex.
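
A rough sketch of that second option, assuming vba_code is a plain string holding the macro source. The helper name strip_comment_lines is hypothetical and does not exist in oletools. It only drops lines that consist entirely of a comment (a leading apostrophe or the Rem keyword); trailing comments after a statement are left untouched.

```python
import re

# A line is treated as a comment line when its first non-blank token is an
# apostrophe or the Rem keyword (case-insensitive).
COMMENT_LINE = re.compile(r"(?i)^\s*(?:'|rem\b)")


def strip_comment_lines(vba_code):
    # Hypothetical helper: keep every line that is not a pure comment line.
    kept = [line for line in vba_code.splitlines()
            if not COMMENT_LINE.match(line)]
    return "\n".join(kept)


sample = "Sub test()\n    'I love to create\n    MsgBox \"Hello world\"\nEnd Sub"
cleaned = strip_comment_lines(sample)
print(cleaned)
```

Running the keyword regex on cleaned instead of the raw source would keep the existing patterns unchanged, at the cost of one extra pass over the code.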

Affected tools: olevba and mraptor (possibly others that I haven't used)

Describe the bug: Suspicious keywords (e.g. "create") in comments cause false positives.

File/Malware sample to reproduce the bug:

Sub test()
    'I love to create
    MsgBox "Hello world"
End Sub

How To Reproduce the bug: Run olevba on the sample.

Expected behavior: No threat detected

Comments are causing false positives #817