Closed regeciovad closed 1 year ago
It looks like the test cases are failing due to some heap overflow detected with --enable-address-sanitizer
.
https://github.com/VirusTotal/yara/actions/runs/4927239541/jobs/8803939475?pr=1921
I am sorry for the late reply. The PR should be fixed now.
The goal was to create a deterministic way to detect potentially slow scanning due to a lower quality of rules. The first version tested the actual speed. However, other factors, such as CPU usage, could influence this. In this version, I was focusing more on indicators of the rules themselves.
The first indicator is where Yara is using 0-length atoms, basically testing input byte by byte. This problem is partially addressed by existing warnings about the low quality of atoms (aka famous slowing-down scanning). Still, due to the changing nature of heuristics for these calculations, it is sometimes hard to conclude this is the case. However, I did not want to generate a callback if the size of the scanned input is relatively small; thus, the effect of the slowing is not that significant. I tested how the slow rules behave on different sizes of inputs. The slowing was more notable when the files were bigger than 0.2 MB. For that reason, I am generating a callback just for files that are larger than that.
The second indicator is the number of potential matches. If the count is higher than one million, the ERROR_TOO_MANY_MATCHES is returned. However, even the lower bound can indicate that something is wrong. I tested some additional factors, but these two showed up as the simplest yet the most effective so far.
Example: