log2timeline / plaso

Super timeline all the things
https://plaso.readthedocs.io
Apache License 2.0

l2t fails to parse yara rules file that yara parses OK #921

Closed: ghost closed this issue 8 years ago

ghost commented 8 years ago

1.5 RC1 from GIFT PPA on Ubuntu 14.04LTS 64 bit (vm, fresh install, all updates)

log2timeline is failing to parse a yara rules file (generated by clamav_to_yara.py) that yara parses OK.

(clamav_to_yara.py does generate one invalid yara rule from the current clamav main.ndb which I had to manually remove before yara would parse the file - so that rule isn't causing the problem)

# wget http://db.local.clamav.net/main.cvd
# sigtool -u main.cvd
# python clamav_to_yara.py -f main.ndb -o main.yara

(Remove from main.yara the one rule that yara refuses to parse: Win_Trojan_EOL_1)

# yara main.yara (somefile) .... succeeds
# log2timeline.py --yara_rules main.yara --debug --log-file winxp2yara.txt results/winxp2yara.plaso images/WinXP2.E01 
Traceback (most recent call last):
  File "/usr/bin/log2timeline.py", line 782, in <module>
    if not Main():
  File "/usr/bin/log2timeline.py", line 736, in Main
    if not tool.ParseArguments():
  File "/usr/bin/log2timeline.py", line 527, in ParseArguments
    self.ParseOptions(options)
  File "/usr/bin/log2timeline.py", line 549, in ParseOptions
    self._ParseExtractionOptions(options)
  File "/usr/lib/python2.7/dist-packages/plaso/cli/extraction_tool.py", line 101, in _ParseExtractionOptions
    yara_rules_path, exception))
plaso.lib.errors.BadConfigObject: Unable to parse Yara rules in: main.yara, error was: internal fatal error

No log file is generated.

ghost commented 8 years ago

The yara file generated by clamav_to_yara.py from current clamav signatures is just over one million lines long and 28 megabytes in size.

I carried out a quick 'binary search' of the file size. Fails on approx 500,000 lines. Works on approx 250,000 lines.

This suggests the problem arises either from the file size or from something between lines 250,000 and 500,000.

I would not have thought reading and parsing a 28 MB file would be an issue, even if it did have a million lines.

And, yes, I am trying to stress test yara support. :) But running the clamav sigs will be a common thing people will try...
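
The manual 'binary search' described above can be automated. What follows is a minimal sketch, not code from plaso or yara: `split_rules`, `first_failing_rule` and `compiles_with_yara` are hypothetical helper names. It assumes every rule starts at the beginning of a line with the keyword `rule`, and that a prefix of the file stops compiling as soon as it contains the offending rule. If the failure were caused purely by total size, the index returned would only mark where the threshold is crossed.

```python
import re


def split_rules(text):
    """Naively split YARA source into one string per rule.

    Assumption: every rule starts at the beginning of a line with the
    keyword 'rule'. Anything before the first rule is dropped.
    """
    starts = [m.start() for m in re.finditer(r"^rule\s", text, re.MULTILINE)]
    starts.append(len(text))
    return [text[a:b] for a, b in zip(starts, starts[1:])]


def first_failing_rule(rules, compiles):
    """Return the index of the first rule that breaks compilation, or None.

    `compiles` is a callable taking YARA source and returning True/False.
    """
    if compiles("".join(rules)):
        return None
    # Invariant: the first `lo` rules compile, the first `hi` rules do not.
    lo, hi = 0, len(rules)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if compiles("".join(rules[:mid])):
            lo = mid
        else:
            hi = mid
    return hi - 1


def compiles_with_yara(source):
    """Check a source string with yara-python (imported lazily)."""
    import yara

    try:
        yara.compile(source=source)
    except yara.SyntaxError:
        return False
    return True
```

With yara-python installed, `first_failing_rule(split_rules(open("main.yara").read()), compiles_with_yara)` would need roughly 20 compilations for a file of a million rules, each on a prefix of the file, so it may still take a while.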

joachimmetz commented 8 years ago

Thanks for the testing and the report, assigning to @Onager

Onager commented 8 years ago

I can't reproduce this - yara ran fine with the clam rules I downloaded a few minutes ago.

As the message suggests, this is an internal error in Yara, not in Plaso. I suspect there was some bad rule that caused Yara to crash.

ghost commented 8 years ago

How can there be a 'bad rule' if the standalone yara program parsed the rule file ok?

ghost commented 8 years ago

I have repeated this test on two different machines, one Ubuntu 14.04LTS and the other Ubuntu 16.04LTS. Same outcome:

Traceback (most recent call last):
  File "/usr/bin/log2timeline.py", line 782, in <module>
    if not Main():
  File "/usr/bin/log2timeline.py", line 736, in Main
    if not tool.ParseArguments():
  File "/usr/bin/log2timeline.py", line 527, in ParseArguments
    self.ParseOptions(options)
  File "/usr/bin/log2timeline.py", line 549, in ParseOptions
    self._ParseExtractionOptions(options)
  File "/usr/lib/python2.7/dist-packages/plaso/cli/extraction_tool.py", line 101, in _ParseExtractionOptions
    yara_rules_path, exception))
plaso.lib.errors.BadConfigObject: Unable to parse Yara rules in: main.yara, error was: internal fatal error

This is the yara file that caused the crash:

main.yara.gz

joachimmetz commented 8 years ago

Per chat with @Onager this might be caused by memory constraints.

joachimmetz commented 8 years ago

With yara-python 3.4.0 I get the same issue with the file main.yara.gz, as you can test by running:

import yara

# Read the whole rules file into memory and compile it from the string.
with open("main.yara") as f:
    b = f.read()
yara.compile(source=b)

This is a yara-specific issue, since there is no plaso code in the snippet.

joachimmetz commented 8 years ago

Same issue with yara-python 3.5.0, both 64-bit compiles

https://github.com/VirusTotal/yara/blob/a5f86fd365ac85e9af4e66b371b6b20117986e57/libyara/compiler.c#L948

Maybe related issue: https://github.com/VirusTotal/yara/issues/492

ghost commented 8 years ago

I confirm that I get the same failure on main.yara with this snippet:

Type "help", "copyright", "credits" or "license" for more information.
>>> import yara
>>> f = open("main.yara")
>>> b = f.read()
>>> yara.compile(source=b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
yara.SyntaxError: internal fatal error

However, if you pass the file path to yara.compile directly rather than reading the rules in as a string, you still get a failure, but this time yara gives a much more helpful error message that points you directly to the problem:

>>> import yara
>>> yara.compile("main.yara")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
yara.SyntaxError: main.yara(1008531): internal fatal error

At line 1008531 of main.yara:


rule Pdf_Exploit_Agent_35529
{
strings:
    $a0 = { 4a4249472333324465636f6465[0-100]73747265616d0d0a????????(40|41|42|43|44|45|46|47|48|49|4a|4b|4c|4d|4e|4f)(31|30|29|28|27|26|25|24|23|22|21|20|19|18|17|16|15|14|13|12|11|10|09|08|07|06|05|04|03|02|01|00)??(ba|bb|bc|bd|be|bf|c0|c1|c2|c3|c4|c5|c6|c7|c8|c9|ca|cb|cc|cd|ce|cf|d0|d1|d2|d3|d4|d5|d6|d7|d8|d9|da|db|dc|dd|de|df|e0|e1|e2|e3|e4|e5|e6|e7|e8|e9|ea|eb|ec|ed|ee|ef|f0|f1|f2|f3|f4|f5|f6|f7|f8|f9|fa|fb|fc|fd|fe|ff) }

condition:
    $a0
}

This confirms that the core issue is the same as https://github.com/VirusTotal/yara/issues/492.
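
The `file(line): reason` message shown above can be turned into a direct look-up of the reported line. This is a rough sketch under the assumption that the message always has the shape seen in this thread (`main.yara(1008531): internal fatal error`); `parse_yara_error` and `read_line` are hypothetical helpers, not part of yara-python.

```python
import re

# Matches messages such as "main.yara(1008531): internal fatal error".
_LOCATION = re.compile(r"^(?P<path>.+)\((?P<line>\d+)\): (?P<reason>.*)$")


def parse_yara_error(message):
    """Return (path, line_number, reason), or None if there is no location."""
    match = _LOCATION.match(message)
    if not match:
        return None
    return match.group("path"), int(match.group("line")), match.group("reason")


def read_line(path, line_number):
    """Return the 1-based line `line_number` of `path` without loading it all."""
    with open(path) as handle:
        for number, line in enumerate(handle, start=1):
            if number == line_number:
                return line.rstrip("\n")
    return None
```

For example, `parse_yara_error(str(exception))` on the exception from the session above would give the path and line number to pass to `read_line`. The message produced when compiling from a string carries no location, so the parser returns `None` for it.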

I suggest this plaso issue be re-opened and changed from bug to enhancement: modify yara compilation in plaso to read directly from the file so as to give more useful error messages.
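
A rough illustration of that suggestion, and not plaso's actual code: `RulesError` and `compile_rules_file` are hypothetical names, and the only real library call assumed is `yara.compile(filepath=...)` from yara-python. The compile function is injectable so the sketch can be exercised without yara installed.

```python
class RulesError(Exception):
    """Hypothetical error type standing in for the tool's own exception."""


def compile_rules_file(path, compile_func=None):
    """Compile YARA rules from a file path, keeping yara's file(line) detail.

    compile_func defaults to yara.compile from yara-python (imported lazily).
    """
    if compile_func is None:
        import yara

        compile_func = yara.compile
    try:
        # Compiling by path lets yara report "file(line): reason".
        return compile_func(filepath=path)
    except Exception as exception:
        # Broad on purpose for this sketch; real code would likely narrow
        # this to yara.SyntaxError, as seen in the tracebacks above.
        raise RulesError(
            "Unable to parse Yara rules in: {0}, error was: {1}".format(
                path, exception)
        ) from exception
```

With this shape, the user-facing message would carry the line number from yara rather than only "internal fatal error".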

joachimmetz commented 8 years ago

That is still a very cryptic error message for most users. I'll have a chat with Victor from VirusTotal to see if the error reporting can be improved for both cases.

ghost commented 8 years ago

FTR, my tests running clamav (main + daily) -> yara rules (with several rules that cause yara to crash removed) seem successful.