VirusTotal / yara-python

The Python interface for YARA
http://virustotal.github.io/yara/
Apache License 2.0

Rules.match docs say data argument is a str, but also accepts bytes #209

Closed gebailey closed 2 years ago

gebailey commented 2 years ago

According to https://yara.readthedocs.io/en/latest/yarapython.html#yara.Rules :

The data argument to the match method in the yara.Rules class is documented as accepting a string. When evaluating arbitrary files, I ran into UTF-8 decode() errors on an image file, which obviously isn't text. I removed the call to decode('utf-8') and now pass the raw bytes directly as the data argument to match.

Is this the right way to scan arbitrary binary data? If so, should the documentation be updated to reflect that the data argument can be either a str or bytes parameter? My goal is to scan content stored in a variable that I've read from elsewhere (I'm not scanning a local file).
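For illustration, here is a minimal sketch of why the decode step fails on image data. It uses only the standard library; the helper name can_decode_utf8 is hypothetical, and the sample buffer is just the fixed 8-byte PNG signature rather than a real file.

```python
# The 8-byte signature that starts every PNG file.
# Its first byte, 0x89, is a UTF-8 continuation byte and cannot begin a character.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def can_decode_utf8(data: bytes) -> bool:
    """Return True if data is valid UTF-8 text, False otherwise."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

Binary content like this has no text representation to decode into, so the buffer would have to be handed to the scanner as bytes, unchanged.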

plusvic commented 2 years ago

Yes, you can use either str or bytes with the match function. The issue with the documentation dates from the time when Python 2 was still more widely used than Python 3; in Python 2, str is basically the same thing as bytes. In Python 3 that's not true anymore.

I've fixed the issue in https://github.com/VirusTotal/yara/commit/1fe26dcf78e0c25d87b4d958519e1141ede8753a
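A short sketch of scanning an in-memory buffer, based on the answer above. It assumes yara-python is installed; yara.compile(source=...), Rules.match(data=...) and the rule attribute of each match are part of the yara-python API, while the helper names scan_buffer and demo and the png_header rule are made up for this example.

```python
def scan_buffer(rules, data):
    """Return the names of the rules that match an in-memory buffer.

    `rules` is a compiled yara.Rules object; `data` may be bytes or str,
    and is passed to match() unchanged (no decode step).
    """
    return [match.rule for match in rules.match(data=data)]


def demo():
    """Compile an illustrative rule and scan a buffer held in a variable."""
    import yara  # imported here so the helper above has no hard dependency

    # Hypothetical rule: matches buffers that start with the PNG signature.
    rules = yara.compile(source="""
        rule png_header {
            strings:
                $sig = { 89 50 4E 47 0D 0A 1A 0A }
            condition:
                $sig at 0
        }
    """)
    content = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16  # stand-in for data read from elsewhere
    return scan_buffer(rules, content)
```

Calling demo() compiles the rule and scans the buffer without writing anything to disk, which fits the case of content read from somewhere other than a local file.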