VirusTotal / yara-python

The Python interface for YARA
http://virustotal.github.io/yara/
Apache License 2.0

Allow metadata to contain a list of values #201

Status: Closed (cccs-rs closed this 2 years ago)

cccs-rs commented 2 years ago

Derivation of: https://github.com/VirusTotal/yara-python/pull/74 😍

This allows Match.meta values to become lists when a rule's meta section contains entries with the same name but different values. It also brings the output of the Python library more in line with the output of the command-line tool.

For example, suppose a match is returned for the following rule:

rule myRule
{
    meta:
        ...
        malware = "BAD THING"
        malware = "REALLY BAD THING"
        ...
}

The corresponding value of Match.meta['malware'] will be ["BAD THING", "REALLY BAD THING"] instead of just "REALLY BAD THING".
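
To illustrate the difference in plain Python, without calling yara-python itself, the sketch below models the rule's meta section as a list of (name, value) pairs. The helper names last_value_wins and collect_all_values are hypothetical and only mimic the two behaviours described above. For simplicity the second helper wraps every value in a list; the pull request may treat names with a single value differently.

```python
from collections import defaultdict

# The meta entries of the example rule, in declaration order.
META_ENTRIES = [
    ("malware", "BAD THING"),
    ("malware", "REALLY BAD THING"),
]


def last_value_wins(entries):
    """Mimic the existing behaviour: a later duplicate overwrites an earlier one."""
    return dict(entries)


def collect_all_values(entries):
    """Mimic the proposed behaviour: every value for a name is kept, in order."""
    grouped = defaultdict(list)
    for name, value in entries:
        grouped[name].append(value)
    return dict(grouped)


print(last_value_wins(META_ENTRIES))     # {'malware': 'REALLY BAD THING'}
print(collect_all_values(META_ENTRIES))  # {'malware': ['BAD THING', 'REALLY BAD THING']}
```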

plusvic commented 2 years ago

But this breaks backward compatibility, right? People who already use Match.meta do not expect to receive an array from Match.meta['malware']. We should stick with the approach followed in #74 and add a new function for those who want to retrieve metadata with multiple values.

cccs-rs commented 2 years ago

> But this breaks backward compatibility, right? People who already use Match.meta do not expect to receive an array from Match.meta['malware']. We should stick with the approach followed in #74 and add a new function for those who want to retrieve metadata with multiple values.

Could a flag in Rules.match() control this behaviour, for example a parameter named overwrite_meta_values? Its default would preserve backward compatibility, and those who want this feature could change it.

I'm not sure a special function that outputs arrays for all metadata is ideal, since a flag could simply change the output from Dict[str, str] to Dict[str, List[str]].

plusvic commented 2 years ago

Yes, an allow_duplicate_metadata argument to Rules.match() that changes that behaviour should be fine. The default value would be False, which means that Match.meta remains a Dict[key, value] dictionary, as it is now. If the value is True, Match.meta will be a Dict[key, [value1, value2, ...]].
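
The semantics agreed on here can be sketched in plain Python as follows. This is not yara-python's implementation: build_meta is a hypothetical stand-in for however Match.meta is assembled, and only the argument name allow_duplicate_metadata and its default of False come from the comment above. The "author" entry is made up for illustration.

```python
def build_meta(entries, allow_duplicate_metadata=False):
    """Hypothetical stand-in for how Match.meta could be assembled.

    entries: iterable of (name, value) pairs in declaration order.
    """
    meta = {}
    for name, value in entries:
        if allow_duplicate_metadata:
            # Opt-in shape: every name maps to a list of all its values.
            meta.setdefault(name, []).append(value)
        else:
            # Default shape: one value per name, the last one wins.
            meta[name] = value
    return meta


entries = [
    ("author", "analyst"),  # illustrative entry, not from the thread
    ("malware", "BAD THING"),
    ("malware", "REALLY BAD THING"),
]

# Default call: existing callers still receive a plain string.
legacy = build_meta(entries)
print(legacy["malware"].upper())

# Opt-in call: every value becomes a list, even for single entries.
opted_in = build_meta(entries, allow_duplicate_metadata=True)
print(opted_in["malware"])
print(opted_in["author"])
```

With the default, code that treats Match.meta['malware'] as a string keeps working; only callers who opt in need to handle lists.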

cccs-rs commented 2 years ago

bump?

cccs-rs commented 2 years ago

bump

cccs-rs commented 2 years ago

bump

cccs-rs commented 2 years ago

bump

cccs-rs commented 2 years ago

bump

cccs-rs commented 2 years ago

bump

cccs-rs commented 2 years ago

bump

plusvic commented 2 years ago

@cccs-rs, adding the same comment every week is not going to speed things up. It's OK to ping the maintainers from time to time, but not at this rate.

plusvic commented 2 years ago

This new feature is also missing a test case.