VirusTotal / yara

The pattern matching swiss knife
https://virustotal.github.io/yara/
BSD 3-Clause "New" or "Revised" License

feat: skip bytecode evaluation for some rules without string matches #1927

Closed secDre4mer closed 1 year ago

secDre4mer commented 1 year ago

Optimizes a common case in which a YARA condition has the form "... and 1 of them and ...", i.e. it can only be true if at least one string matches. The PR detects these rules and records in a bitmap whether a string match occurred, so that in most cases condition evaluation for these rules can be skipped entirely.
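To illustrate the idea, here is a minimal, self-contained sketch of the skip logic. It is not the code from this PR: `sketch_rule`, `sketch_scan`, `note_string_match` and `evaluate_rules` are hypothetical names, the bitmap is assumed to hold one bit per rule, and a function pointer stands in for running the rule's bytecode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_RULES 64 /* hypothetical limit, only for this sketch */

/* Hypothetical stand-in for a compiled rule; not YARA's real rule struct. */
typedef struct {
  bool requires_string_match; /* decided at compile time, e.g. "... and 1 of them" */
  bool (*evaluate)(void);     /* stand-in for executing the condition bytecode */
} sketch_rule;

/* Per-scan state: one bit per rule, set when any of its strings matched. */
typedef struct {
  uint8_t matched[(MAX_RULES + 7) / 8];
  unsigned evaluations; /* number of conditions that were actually run */
} sketch_scan;

static void scan_reset(sketch_scan* s) { memset(s, 0, sizeof(*s)); }

/* Called by the string matcher whenever a string of rule `rule_idx` matches. */
static void note_string_match(sketch_scan* s, size_t rule_idx) {
  assert(rule_idx < MAX_RULES);
  s->matched[rule_idx / 8] |= (uint8_t)(1u << (rule_idx % 8));
}

static bool rule_has_match(const sketch_scan* s, size_t rule_idx) {
  assert(rule_idx < MAX_RULES);
  return ((s->matched[rule_idx / 8] >> (rule_idx % 8)) & 1u) != 0;
}

/* Evaluates all rules and returns how many conditions were true. */
static unsigned evaluate_rules(sketch_scan* s, const sketch_rule* rules, size_t n) {
  unsigned hits = 0;
  for (size_t i = 0; i < n; i++) {
    /* The condition cannot be true without a string match: skip the bytecode. */
    if (rules[i].requires_string_match && !rule_has_match(s, i))
      continue;
    s->evaluations++;
    if (rules[i].evaluate())
      hits++;
  }
  return hits;
}

/* Trivial placeholder conditions for experimenting with the sketch. */
static bool always_true(void) { return true; }
static bool always_false(void) { return false; }
```

Rules that are not flagged as requiring a string match are always evaluated, so the skip never changes a scan result; it only avoids work for rules that could not have matched.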

secDre4mer commented 1 year ago

In case it's relevant, some background that led to this PR: our ruleset contains ~20,000 YARA rules, and we noticed that with a ruleset this large, condition evaluation takes up a significant share of CPU time (~50-70% of total YARA scan time). This PR tries to reduce that overhead; in our case, ~90% of all rules fall into the "only evaluate if a string matches" category that the PR introduces. Initial timings look good: condition evaluation time dropped by ~65% in our case.
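As a rough back-of-the-envelope estimate (an assumption, not a measurement): if the rest of the scan is unaffected, a condition evaluation share $f$ of total scan time and a reduction $r$ of that share give a total scan time saving of about

$$
f \cdot r, \qquad f \in [0.5,\ 0.7],\quad r \approx 0.65
\;\Longrightarrow\;
0.5 \cdot 0.65 \approx 0.33 \quad\text{to}\quad 0.7 \cdot 0.65 \approx 0.46 ,
$$

i.e. roughly a third to just under half of total scan time for a ruleset like ours.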

Some additional optimizations could be implemented if considered worthwhile, such as tracking which strings actually have to match for the condition to possibly be true, or tracking the number of string matches.