VirusTotal / yara

The pattern matching swiss knife
https://virustotal.github.io/yara/
BSD 3-Clause "New" or "Revised" License

Syntax improvement for count of string set #1966

Open captainGeech42 opened 11 months ago

captainGeech42 commented 11 months ago

Is your feature request related to a problem? Please describe.

When trying to get the total number of occurrences across a set of strings, there is no obvious syntax for this. Normally, it would be written like this:

strings:
    $s1 = "hello"
    $s2 = "world"
condition:
    (#s1 + #s2) > 5

However, when you want the summed count over a large number of strings, this becomes pretty unwieldy.
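To illustrate how the explicit sum grows, here is a minimal sketch of a complete rule with four strings. The rule name and the extra strings `$s3` and `$s4` are hypothetical and only serve the example; every additional pattern needs its own `#` term in the condition.

```yara
rule count_sum_example
{
    strings:
        $s1 = "hello"
        $s2 = "world"
        $s3 = "foo"    // hypothetical extra pattern
        $s4 = "bar"    // hypothetical extra pattern
    condition:
        // total occurrences across all four patterns must exceed 5;
        // each new string has to be added to this sum by hand
        (#s1 + #s2 + #s3 + #s4) > 5
}
```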

A partial workaround is to use this syntax: for 5 of ($s*): (#), which isn't super intuitive but does work.

Describe the solution you'd like

A cleaner and more obvious syntax for this would be beneficial. An obvious candidate is (#s*) > 5. This would also enable more expressive boolean logic on the count, rather than just >=.

Describe alternatives you've considered

n/a

Additional context

n/a

plusvic commented 11 months ago

Notice that the workaround for 5 of ($s*): (#) relies on an implementation detail and is not guaranteed to work in the future. For more details see this: https://github.com/VirusTotal/yara/discussions/1781#discussioncomment-4261202

Also, as mentioned in the discussion in the link above, (#s*) > 5 can lead to confusion, and be interpreted as "make sure that the number of occurrences for every pattern starting with $s is larger than 5".
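The two possible readings of (#s*) > 5 can be spelled out with syntax YARA already supports. This is a sketch for comparison only: the rule names are hypothetical, and neither rule uses the proposed (#s*) form.

```yara
// Reading 1: the SUM of occurrences over all $s* patterns exceeds 5
rule sum_of_counts
{
    strings:
        $s1 = "hello"
        $s2 = "world"
    condition:
        (#s1 + #s2) > 5
}

// Reading 2: EVERY $s* pattern occurs more than 5 times on its own
rule each_count
{
    strings:
        $s1 = "hello"
        $s2 = "world"
    condition:
        for all of ($s*) : ( # > 5 )
}
```

A file containing "hello" six times and "world" never would satisfy the first rule but not the second, which is why a shorthand covering both readings could be misread.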

It looks like there's some demand for this feature, so we need to find the most appropriate way of expressing this.