Need new syntax "#cond cardinality 3 > 2", which means that the number of distinct matched values of cond that each occur more than 3 times is greater than 2 #2117
1. The current YARA syntax only lets you count the total number of matches of a string (#cond); it cannot inspect what each individual match contained.
For example, with a regular expression for phone numbers, a rule can state how many times a phone number was matched, but it cannot constrain how many different phone numbers were matched.
The following expression means that more than 20 distinct phone numbers each appear more than 10 times, and fewer than 50 distinct phone numbers each appear more than 5 times.
rule example
{
    strings:
        $phone = /\d{11}/
    condition:
        #phone cardinality 10 > 20 and #phone cardinality 5 < 50
}
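To make the intended semantics concrete, here is a minimal sketch in Python of what a "cardinality" term would compute. It is not YARA code: the function name `cardinality` and the sample data are hypothetical, and the strict "more than" comparison is an assumption taken from the wording of the title.

```python
from collections import Counter


def cardinality(matches, min_times):
    """Count the distinct matched values that occur more than min_times times.

    `matches` is the list of texts matched by one string, for example every
    substring that $phone matched. Strict ">" is an assumption.
    """
    freq = Counter(matches)
    return sum(1 for n in freq.values() if n > min_times)


# Hypothetical match data: one number seen 4 times, one twice, one once.
matches = ["13800000001"] * 4 + ["13800000002"] * 2 + ["13800000003"]

print(cardinality(matches, 1))  # distinct numbers seen more than once
print(cardinality(matches, 3))  # distinct numbers seen more than 3 times
```

With this helper, a term such as "#phone cardinality 10 > 20" would correspond to `cardinality(matches, 10) > 20`.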
3. About performance.
Because of the limitations of the data structures available in C, the implementation is relatively complex. Therefore, of the options below, only option C is provided: the frequency calculation, including the optimizations in A and B, is left to the caller of the DLL.
A. Lazy calculation of match frequencies
B. No repeated frequency calculations
C. User calculates the frequencies
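Since the calculation is left to the caller, options A and B could be covered on the caller's side roughly as follows. This is only a sketch under assumptions: the class `MatchFrequencies` and its `builds` counter are hypothetical names for illustration, not part of YARA or of any existing binding.

```python
from collections import Counter


class MatchFrequencies:
    """Caller-side frequency table for the matches of one string."""

    def __init__(self, matches):
        self._matches = list(matches)
        self._freq = None  # option A: nothing is computed until first use
        self.builds = 0    # instrumentation only, to show the table is built once

    def _table(self):
        # Option B: build the table once and reuse it for every later query.
        if self._freq is None:
            self._freq = Counter(self._matches)
            self.builds += 1
        return self._freq

    def cardinality(self, min_times):
        # Distinct values that occur more than min_times times (strict ">" assumed).
        return sum(1 for n in self._table().values() if n > min_times)


mf = MatchFrequencies(["a", "a", "a", "b", "b", "c"])
print(mf.cardinality(1), mf.cardinality(2), mf.builds)
```

A rule that never uses a cardinality term pays nothing, and a rule that uses several terms on the same string shares one table.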
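2. We need a new syntax to solve this:
#cond cardinality {min_times} {op} {count_limit}
Here min_times is the occurrence threshold for a distinct matched value, op is a comparison operator, and count_limit is the bound on the number of distinct values that pass the threshold.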
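For discussion, the shape of the proposed term can be pinned down with a small parser sketch in Python. Everything here is an assumption for illustration: the function names `parse_cardinality` and `holds` are hypothetical, and the operator set is a guess, since the proposal itself only shows ">" and "<".

```python
import operator
import re

# Assumed operator set; the proposal only demonstrates ">" and "<".
_OPS = {
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
    "==": operator.eq, "!=": operator.ne,
}

# "#name cardinality {min_times} {op} {count_limit}"
_TERM = re.compile(r"#(\w+)\s+cardinality\s+(\d+)\s*(<=|>=|==|!=|<|>)\s*(\d+)")


def parse_cardinality(expr):
    """Split one cardinality term into (name, min_times, op, count_limit)."""
    m = _TERM.fullmatch(expr.strip())
    if m is None:
        raise ValueError(f"not a cardinality term: {expr!r}")
    name, min_times, op, count_limit = m.groups()
    return name, int(min_times), op, int(count_limit)


def holds(op, distinct_count, count_limit):
    """Apply the comparison to an already computed number of distinct values."""
    return _OPS[op](distinct_count, count_limit)


print(parse_cardinality("#phone cardinality 10 > 20"))
```

The distinct-value count passed to `holds` would come from the frequency calculation, which this sketch deliberately leaves out.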
4. About implementation.