VirusTotal / yara

The pattern matching swiss knife
https://virustotal.github.io/yara/
BSD 3-Clause "New" or "Revised" License

A new too slow scanning callback #1921

Closed regeciovad closed 1 year ago

regeciovad commented 1 year ago

The goal was to create a deterministic way to detect scanning that is likely to be slow because of low-quality rules. The first version measured the actual scanning speed, but other factors, such as CPU usage, can influence that measurement. In this version, I focused on indicators in the rules themselves.

The first indicator is the use of 0-length atoms, where YARA basically tests the input byte by byte. This problem is partially addressed by the existing warnings about low-quality atoms (the well-known "slowing down scanning" warning). Still, because the heuristics behind these calculations keep changing, it is sometimes hard to conclude that this is the case. However, I did not want to generate a callback when the scanned input is relatively small, because the slowdown is then not that significant. I tested how slow rules behave on inputs of different sizes. The slowdown was more noticeable when the files were bigger than 0.2 MB. For that reason, I generate the callback only for files larger than that.
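The first indicator can be sketched roughly as follows. This is only an illustration of the decision described above, not the code in the PR: the names `StringInfo`, `shortest_atom_length`, and `should_warn_slow_scan` are hypothetical, and reading "0.2 MB" as 0.2 * 1024 * 1024 bytes is an assumption.

```python
from dataclasses import dataclass

# Assumption: "0.2 MB" is interpreted as 0.2 * 1024 * 1024 bytes.
# The exact constant used in the PR is not stated in this discussion.
SLOW_SCAN_MIN_SIZE = int(0.2 * 1024 * 1024)


@dataclass
class StringInfo:
    """Hypothetical summary of one compiled string."""
    identifier: str
    shortest_atom_length: int  # 0 means the input is tested byte by byte


def should_warn_slow_scan(string: StringInfo, input_size: int,
                          min_size: int = SLOW_SCAN_MIN_SIZE) -> bool:
    """Return True only when both conditions from the description hold:
    the string relies on a 0-length atom and the input is large enough
    for the slowdown to matter."""
    return string.shortest_atom_length == 0 and input_size > min_size
```

With this sketch, a string with a 0-length atom triggers the callback on a 1 MB input but not on a 1 KB input, and a string with a 4-byte atom never triggers it through this indicator.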

The second indicator is the number of potential matches. If the count is higher than one million, ERROR_TOO_MANY_MATCHES is returned. However, even a count well below that limit can indicate that something is wrong. I tested some additional factors, but these two turned out to be the simplest and yet the most effective so far.
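A minimal sketch of the second indicator, under stated assumptions: the one-million hard limit comes from the description above, while the soft limit value, the `"TOO_SLOW_SCANNING"` label, and the function name `classify_match_count` are hypothetical placeholders, since the discussion does not state the lower threshold the PR uses.

```python
# From the description: more than one million matches is an error.
HARD_MATCH_LIMIT = 1_000_000

# Hypothetical value: the actual lower threshold is not given here.
SOFT_MATCH_LIMIT = 10_000


def classify_match_count(count: int,
                         soft_limit: int = SOFT_MATCH_LIMIT,
                         hard_limit: int = HARD_MATCH_LIMIT) -> str:
    """Map a potential-match count to an outcome.

    Above the hard limit the scan fails with ERROR_TOO_MANY_MATCHES;
    above the (hypothetical) soft limit a slow-scanning callback is
    reported; otherwise nothing is reported.
    """
    if count > hard_limit:
        return "ERROR_TOO_MANY_MATCHES"
    if count > soft_limit:
        return "TOO_SLOW_SCANNING"  # hypothetical label for the new callback
    return "OK"
```

Keeping the soft limit as a parameter makes it easy to experiment with different thresholds without touching the hard error path.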

Example:

$ cat rule.yar
rule rule_com {
  strings:
    $com = /.{1,2}\.com/
  condition:
    $com
}
$ ./yara rule.yar top-1m.csv
warning: rule "rule_com": scanning with string $com is taking a very long time, it is either too general or very common.
rule_com top-1m.csv

plusvic commented 1 year ago

It looks like the test cases are failing due to a heap overflow detected with --enable-address-sanitizer.

https://github.com/VirusTotal/yara/actions/runs/4927239541/jobs/8803939475?pr=1921

regeciovad commented 1 year ago

I am sorry for the late reply. The PR should be fixed now.