GoogleCodeExporter closed this issue 9 years ago
These rules are a worst-case scenario for YARA. The [0-255] jump is simply too
wide. Such jumps are allowed, but they aren't recommended because they are
really slow. If the first part of each string were longer, this probably
wouldn't be a problem, but these [0-255] jumps are preceded by just two or
three bytes. Whenever YARA finds a match for those 2-3 bytes (and this should
happen a lot), it has to scan the following 255 bytes trying to match the
remaining part of the string; that's what makes the scanning so slow. The
problem is even worse with the string containing two of those jumps with just
b801000000 in between, which is a very common pattern in executable files
because it is the Intel opcode for mov eax,1. So the problem here is not with
some particular file in your dataset; the problem is the rule.
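To get a feel for the cost described above, here is a rough Python sketch. It is only a naive model, not YARA's actual matching algorithm, and the function name naive_jump_scan is made up for illustration. It counts how many bytes a simple matcher would have to look at after every hit on the short prefix.

```python
import random


def naive_jump_scan(data: bytes, prefix: bytes, suffix: bytes, max_jump: int = 255):
    """Naive model of matching `prefix [0-max_jump] suffix` against data.

    Returns (matches, window_bytes_examined). Illustrative only; this is an
    assumption about the general shape of the work, not YARA's real engine.
    """
    matches = 0
    examined = 0
    start = data.find(prefix)
    while start != -1:
        window_start = start + len(prefix)
        # The suffix may begin anywhere from 0 to max_jump bytes after the prefix.
        window_end = min(len(data), window_start + max_jump + len(suffix))
        window = data[window_start:window_end]
        examined += len(window)
        if suffix in window:
            matches += 1
        # Continue from the next byte so overlapping prefix hits are counted too.
        start = data.find(prefix, start + 1)
    return matches, examined


if __name__ == "__main__":
    rng = random.Random(0)
    # Random bytes stand in for a real file; actual executables will differ.
    data = bytes(rng.getrandbits(8) for _ in range(200_000))
    prefix = bytes.fromhex("5657")
    suffix = bytes.fromhex("b801000000f7d0ffc081c00100000081e889402626")
    matches, examined = naive_jump_scan(data, prefix, suffix)
    print(f"matches={matches} window bytes examined={examined}")
```

Every hit on the two-byte prefix costs a window of up to 255 bytes plus the suffix length, so the shorter and more common the prefix, the more of these windows get scanned.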
My recommendation in this case would be to remove the first bytes of the
strings, including the jump. For example, the string
5657[0-255]b801000000f7d0ffc081c00100000081e889402626 is better written as
b801000000f7d0ffc081c00100000081e889402626. The first two bytes don't make a
lot of difference in the signature and the remaining string is long enough to
be significant. With shorter strings like 505f[0-255]505e81e800764000 this
approach carries a lot more risk, because much less of the string remains to
match on.
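The trimming suggested above can be sketched in a few lines of Python. This helper is hypothetical and not part of YARA; it assumes the string only contains plain hex bytes and [n-m] style jumps (no ?? wildcards or alternatives) and simply keeps the longest fixed fragment.

```python
import re

# Matches jumps such as [0-255] or [4]; other hex-string features are not handled.
JUMP = re.compile(r"\[\d+(?:-\d+)?\]")


def longest_fixed_fragment(hex_pattern: str) -> str:
    """Return the longest jump-free run of hex digits in a hex string."""
    compact = hex_pattern.replace(" ", "")
    fragments = [frag for frag in JUMP.split(compact) if frag]
    return max(fragments, key=len)


if __name__ == "__main__":
    for pattern in (
        "5657[0-255]b801000000f7d0ffc081c00100000081e889402626",
        "505f[0-255]505e81e800764000",
    ):
        fragment = longest_fixed_fragment(pattern)
        # Two hex digits per byte.
        print(f"{pattern} -> {fragment} ({len(fragment) // 2} bytes)")
```

For the first string this leaves 21 bytes to match on, while the second keeps only 8, which is why dropping the prefix is riskier there.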
Original comment by plus...@gmail.com
on 26 Jan 2012 at 9:11
After testing the given rule file with some files, I noticed that it was in
fact taking even longer than I would expect from the reasons discussed above.
Actually, the scan never ended, even with small files. So you were right: YARA
was not simply taking a long time, it was hanging completely. The issue has
been fixed in r136.
Original comment by plus...@gmail.com
on 31 Jan 2012 at 3:27
Original issue reported on code.google.com by
johnmchu...@gmail.com
on 26 Jan 2012 at 7:12