Closed ghost closed 8 years ago
The yara file generated by clamav_to_yara.py from current clamav signatures is just over one million lines long and 28 megabytes in size.
I carried out a quick 'binary search' on the file size: it fails at approx. 500,000 lines and works at approx. 250,000 lines.
This suggests the problem arises either from the file size or from something between lines 250,000 and 500,000.
I would not have thought reading and parsing a 28 MB file would be an issue, even with a million lines.
And, yes, I am trying to stress test yara support. :) But running the clamav sigs will be a common thing people will try...
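For anyone who wants to repeat the 'binary search' on whole rules rather than raw line counts, here is a minimal sketch. It assumes the generated file contains only plain rules that each start with a line beginning with `rule ` (as clamav_to_yara.py output appears to), and that a prefix which fails keeps failing when more rules are added. The helper names `split_rules`, `first_failing_rule` and `compiles_with_yara` are made up for illustration.

```python
def split_rules(text):
    """Split rule file text into one string per rule.

    Assumption: every rule starts with a line beginning with "rule ".
    """
    rules, current = [], []
    for line in text.splitlines():
        if line.startswith("rule ") and current:
            rules.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        rules.append("\n".join(current))
    return rules


def first_failing_rule(rules, compiles):
    """Return the index of the first rule whose inclusion breaks compilation.

    `compiles` is a callable taking a list of rules and returning True or
    False. Returns None if the complete list compiles.
    """
    if compiles(rules):
        return None
    # Invariant: rules[:low] compiles, rules[:high] does not.
    low, high = 0, len(rules)
    while high - low > 1:
        middle = (low + high) // 2
        if compiles(rules[:middle]):
            low = middle
        else:
            high = middle
    return high - 1


def compiles_with_yara(rules):
    """Predicate backed by yara-python (imported lazily)."""
    import yara
    try:
        yara.compile(source="\n".join(rules))
    except yara.Error:
        return False
    return True
```

Something like `first_failing_rule(split_rules(open("main.yara").read()), compiles_with_yara)` would then need roughly log2(number of rules) compilations. If the failure turns out to depend on total size rather than on one rule, the index returned marks the size threshold instead.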
Thanks for the testing and the report, assigning to @Onager
I can't reproduce this - yara ran fine with the clam rules I downloaded a few minutes ago.
As the message suggests, this is an internal error in Yara, not in Plaso. I suspect there was some bad rule that caused Yara to crash.
How can there be a 'bad rule' if the standalone yara program parsed the rule file ok?
I have repeated this test on two different machines, one Ubuntu 14.04LTS and the other Ubuntu 16.04LTS. Same outcome:
Traceback (most recent call last):
  File "/usr/bin/log2timeline.py", line 782, in <module>
    if not Main():
  File "/usr/bin/log2timeline.py", line 736, in Main
    if not tool.ParseArguments():
  File "/usr/bin/log2timeline.py", line 527, in ParseArguments
    self.ParseOptions(options)
  File "/usr/bin/log2timeline.py", line 549, in ParseOptions
    self._ParseExtractionOptions(options)
  File "/usr/lib/python2.7/dist-packages/plaso/cli/extraction_tool.py", line 101, in _ParseExtractionOptions
    yara_rules_path, exception))
plaso.lib.errors.BadConfigObject: Unable to parse Yara rules in: main.yara, error was: internal fatal error
This is the yara file that caused the crash:
Per chat with @Onager this might be caused by memory constraints.
With yara-python 3.4.0 I have the same issue with the file main.yara.gz, as you can test by running:
import yara
f = open("main.yara")
b = f.read()
yara.compile(source=b)
This is a yara-specific issue, since there is no plaso code in the snippet.
Same issue with yara-python 3.5.0, both 64-bit compiles
Maybe related issue: https://github.com/VirusTotal/yara/issues/492
I confirm that I get the same failure on main.yara with this snippet:
Type "help", "copyright", "credits" or "license" for more information.
>>> import yara
>>> f = open("main.yara")
>>> b = f.read()
>>> yara.compile(source=b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
yara.SyntaxError: internal fatal error
However, if you call yara.compile directly on the file rather than reading the rules in as a string, you still get a failure, but this time yara gives a much more helpful error message that points directly to the problem:
>>> import yara
>>> yara.compile("main.yara")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
yara.SyntaxError: main.yara(1008531): internal fatal error
At line 1008531 of main.yara:
rule Pdf_Exploit_Agent_35529
{
strings:
$a0 = { 4a4249472333324465636f6465[0-100]73747265616d0d0a????????(40|41|42|43|44|45|46|47|48|49|4a|4b|4c|4d|4e|4f)(31|30|29|28|27|26|25|24|23|22|21|20|19|18|17|16|15|14|13|12|11|10|09|08|07|06|05|04|03|02|01|00)??(ba|bb|bc|bd|be|bf|c0|c1|c2|c3|c4|c5|c6|c7|c8|c9|ca|cb|cc|cd|ce|cf|d0|d1|d2|d3|d4|d5|d6|d7|d8|d9|da|db|dc|dd|de|df|e0|e1|e2|e3|e4|e5|e6|e7|e8|e9|ea|eb|ec|ed|ee|ef|f0|f1|f2|f3|f4|f5|f6|f7|f8|f9|fa|fb|fc|fd|fe|ff) }
condition:
$a0
}
This confirms that the core issue is the same as https://github.com/VirusTotal/yara/issues/492.
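Going from the error message to the offending rule can be scripted. The sketch below assumes the message has the `file(line): text` shape seen above, and that each rule starts with a line beginning with `rule ` and ends with a line containing only `}`. The helper names are hypothetical.

```python
import re

# Matches messages of the form "main.yara(1008531): internal fatal error".
ERROR_LOCATION = re.compile(r"^(?P<path>.+)\((?P<line>\d+)\): (?P<message>.*)$")


def parse_error_location(error_text):
    """Return (path, line_number, message), or None if no location is present."""
    match = ERROR_LOCATION.match(error_text)
    if not match:
        return None
    return match.group("path"), int(match.group("line")), match.group("message")


def rule_around_line(lines, line_number):
    """Return the text of the rule containing the 1-based line_number.

    Assumption: a rule starts at a line beginning with "rule " and ends at
    a line that contains only "}".
    """
    index = line_number - 1
    if not 0 <= index < len(lines):
        raise ValueError("line number outside of file")
    start = index
    while start > 0 and not lines[start].startswith("rule "):
        start -= 1
    end = index
    while end < len(lines) - 1 and lines[end].strip() != "}":
        end += 1
    return "\n".join(lines[start:end + 1])
```

With the file read via `open(path).read().splitlines()`, passing the parsed line number to `rule_around_line` prints the rule to inspect or remove.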
I suggest this plaso issue be re-opened and changed from bug to enhancement: modify the yara compilation in plaso to read directly from the file so as to give more useful error messages.
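As a rough illustration of that suggestion (not plaso's actual code), compiling by path could look like the sketch below. `yara.compile(filepath=...)` is the yara-python call used in the session above, written with its keyword argument. `RulesError`, `compile_rules_file` and the injectable `compile_function` parameter are hypothetical names, added so the wrapper can be exercised without yara installed.

```python
class RulesError(Exception):
    """Hypothetical error type standing in for the caller's own exception."""


def compile_rules_file(path, compile_function=None):
    """Compile a rules file by path so errors carry the file name and line.

    compile_function defaults to yara.compile and is a parameter only to
    allow testing without yara-python.
    """
    if compile_function is None:
        import yara  # yara-python, imported lazily
        compile_function = yara.compile
    try:
        # Passing the path (not the file contents) lets yara report
        # "file(line): message" instead of only the bare message.
        return compile_function(filepath=path)
    except Exception as exception:  # yara.Error in practice
        raise RulesError(
            "Unable to parse Yara rules in: {0:s}, error was: {1!s}".format(
                path, exception))
```

With the rules file from this report, the resulting message would be expected to include `main.yara(1008531)` rather than only `internal fatal error`.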
Still a very cryptic error message for most users. I'll have a chat with Victor from VirusTotal to see if the error reporting can be improved for both cases.
FTR, my tests running clamav (main + daily) -> yara rules (with several rules that cause yara to crash removed) seem successful.
1.5 RC1 from GIFT PPA on Ubuntu 14.04LTS 64 bit (vm, fresh install, all updates)
log2timeline is failing to parse a yara rules file (generated by clamav_to_yara.py) that yara parses OK.
(clamav_to_yara.py does generate one invalid yara rule from the current clamav main.ndb which I had to manually remove before yara would parse the file - so that rule isn't causing the problem)
No log file is generated.