odin1314 / yara-project

Automatically exported from code.google.com/p/yara-project
Apache License 2.0

Defect #35

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Load the signatures in the attached .yara file
2. Run yara against suitable test files (see discussion)
3. Wait (and wait and wait)

What is the expected output? What do you see instead?

Expected output is a list of matches. Actual output is nothing, but the
performance meter shows 100% CPU until the yara instance is killed.

What version of the product are you using? On what operating system?
1.6 (latest) on OS X 10.6

Please provide any additional information below.

I need an easy way to determine which data file or files are causing the
problem. I derived the signatures in question using the clamav to yara script
(see http://resources.infosecinstitute.com/malware-analysis-clamav-yara/ ). My
test data is a corpus of some 5000 emails with a total volume of 600MB, and I
run yara on this directory using the -r option. Clamav finds no malware in this
corpus; I am using it primarily for performance and timing data. To ensure
some output, I have a trivial signature that looks for messages with both html
and jpeg attachments, there being about 20 of these scattered through the
corpus.

If I use the trivial signature alone, I find those messages. If I add any of
the signatures in the attached file, yara hangs while processing the corpus.
In all but one case, it hangs prior to finding the first jpeg/html file, but
this is several hundred files into the list. In the remaining case, the first
known file is identified prior to the hang.

For obvious reasons, I cannot share the corpus, but if yara provided an option
to list files as it processes them, I could pinpoint the problematic file for
each signature and provide either the file or a sanitized version that causes
the problem. The signatures themselves may be sufficient to help locate the
problem, as most seem to combine a data-sensitive 2 byte prefix with a [0-255]
jump.
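
The per-file listing asked for here can be approximated from outside yara with
a small wrapper. The following is a minimal sketch, not a yara feature: the
function name find_slow_files, the timeout value, and the example rule file
name are all hypothetical. It assumes only that the scanner is invoked as a
command with the target file as its last argument, which is how the yara
command line takes a rules file followed by a target.

```python
import subprocess
import sys
from pathlib import Path


def find_slow_files(scan_cmd, root, timeout_s=30.0):
    """Run scan_cmd once per file under root and return the files that time out.

    scan_cmd is a list such as ["yara", "rules.yara"] (hypothetical names);
    the path of each file is appended as the final argument.
    """
    slow = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        # List each file as it is processed, so a hang can be attributed.
        print(path, file=sys.stderr, flush=True)
        try:
            subprocess.run(
                scan_cmd + [str(path)],
                timeout=timeout_s,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising, so the loop continues.
            slow.append(path)
    return slow
```

A call such as find_slow_files(["yara", "rules.yara"], "corpus/") would then
return the paths whose scan did not finish within the timeout. Scanning one
file per process is slower than a single recursive run, but it isolates each
file and keeps one hang from blocking the rest of the corpus.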

Original issue reported on code.google.com by johnmchu...@gmail.com on 26 Jan 2012 at 7:12

Attachments:

GoogleCodeExporter commented 9 years ago
These rules are an example of the worst-case scenario for yara. The [0-255]
jump is simply too wide; such jumps are allowed, but they aren't recommended
because they are really slow. If the first part of the strings were longer,
that probably wouldn't be a problem, but these [0-255] jumps are preceded by
just two or three bytes. Whenever yara finds a match for those 2-3 bytes (and
this should happen a lot), it has to scan the following 255 bytes trying to
match the remaining part of the string; that's what makes the scanning so slow.
The problem is even worse with the string containing two of those jumps with
just b801000000 in between, which is a very common pattern in executable
files because it is the Intel opcode for mov eax,1. So the problem here is not
with some particular file in your dataset; the problem is the rule.
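
The cost described above can be made concrete with a rough model. The sketch
below is an illustration under a stated assumption, namely a naive matcher
that retries the tail at every offset allowed by the jump after each prefix
hit. It is not yara's actual matching algorithm, and the function name
jump_scan_cost is hypothetical.

```python
def jump_scan_cost(data, prefix, max_jump=255):
    """Estimate the work of a naive prefix + [0-max_jump] jump matcher.

    Returns (prefix_hits, tail_offsets_tried). After every prefix hit, the
    tail may start at any of max_jump + 1 offsets (jump lengths 0..max_jump),
    limited by the bytes remaining in the data.
    """
    hits = 0
    tried = 0
    pos = data.find(prefix)
    while pos != -1:
        hits += 1
        tail_start = pos + len(prefix)
        remaining = max(0, len(data) - tail_start)
        tried += min(max_jump + 1, remaining)
        pos = data.find(prefix, pos + 1)  # overlapping hits are counted too
    return hits, tried
```

With a two-byte prefix such as 5657, every hit adds up to 256 further
positions to examine, so the total grows with the number of prefix hits rather
than with the number of real matches.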

My recommendation in this case would be to remove the first bytes of the
strings, including the jump. For example, the string
5657[0-255]b801000000f7d0ffc081c00100000081e889402626 is better written as
b801000000f7d0ffc081c00100000081e889402626. The first two bytes don't make a
lot of difference in the signature, and the remaining string is long enough to
be significant. With shorter strings like 505f[0-255]505e81e800764000, this
approach carries a lot more risk, because the remaining string is much shorter.
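
The rewrite suggested above can be sketched as a small helper for converted
signatures. This is only an illustration: the function name
strip_leading_jump and the min_tail_bytes threshold of 16 are assumptions
chosen so that the two example strings behave as described, not values taken
from yara or from the conversion script.

```python
import re

# Matches a jump such as [0-255] inside a hex signature.
JUMP = re.compile(r"\[\d+-\d+\]")


def strip_leading_jump(hex_string, min_tail_bytes=16):
    """Drop the bytes before the first jump, and the jump itself.

    The string is returned unchanged when it has no jump, or when the bytes
    that follow the first jump are too few to be a significant signature.
    """
    parts = JUMP.split(hex_string, maxsplit=1)
    if len(parts) == 1:
        return hex_string  # no jump present
    rest = parts[1]
    # Only the bytes up to the next jump (if any) anchor the new string.
    first_segment = JUMP.split(rest, maxsplit=1)[0]
    if len(first_segment) // 2 < min_tail_bytes:
        return hex_string  # remaining string too short; leave for manual review
    return rest


long_sig = "5657[0-255]b801000000f7d0ffc081c00100000081e889402626"
short_sig = "505f[0-255]505e81e800764000"
print(strip_leading_jump(long_sig))
print(strip_leading_jump(short_sig))
```

The first string has 21 bytes after the jump and is shortened; the second has
only 8 and is left as it is, so that a person can decide whether the shorter
signature is acceptable.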

Original comment by plus...@gmail.com on 26 Jan 2012 at 9:11

GoogleCodeExporter commented 9 years ago
After testing the given rule file with some files, I noticed that it was in
fact taking even longer than I would expect from the reasons discussed above.
Actually, the scan never ended, even with small files. So you were right: YARA
was not simply taking a long time, it was hanging completely. The issue has
been fixed in r136.

Original comment by plus...@gmail.com on 31 Jan 2012 at 3:27
