VirusTotal / yara-python

The Python interface for YARA
http://virustotal.github.io/yara/
Apache License 2.0

yara-python is incompatible with memoryview #147

Closed teskje closed 4 years ago

teskje commented 4 years ago

It is currently not possible to pass a memoryview to yara-python for matching rules. Consider this minimal example:

import yara

rules = yara.compile(source="""
rule dummy
{
    condition:
        false
}
""")
data = memoryview(b"asdf")
rules.match(data=data)

This fails:

Traceback (most recent call last):
  File "yara-memoryview.py", line 12, in <module>
    rules.match(data=data)
TypeError: argument 3 must be read-only bytes-like object, not memoryview

In this case, we have a memoryview of a bytes object, so it is actually both read-only and bytes-like, which makes the error message very confusing.
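The read-only claim can be checked from Python without yara-python. The following is a minimal sketch using only built-in memoryview attributes; the variable names are illustrative:

```python
# A memoryview over an immutable bytes object is read-only.
ro_view = memoryview(b"asdf")

# A memoryview over a bytearray is writable, for comparison.
rw_view = memoryview(bytearray(b"asdf"))

print(ro_view.readonly)  # True
print(rw_view.readonly)  # False

# Both views expose plain unsigned bytes (format "B", item size 1),
# so they are bytes-like in the buffer-protocol sense.
print(ro_view.format, ro_view.itemsize)  # B 1
```

So the object passed in the example above really is a read-only, bytes-like buffer, even though the error message says otherwise.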

The root of the issue seems to be that yara-python uses the s# format unit of PyArg_ParseTuple to parse the data argument. According to the documentation:

Some formats require a read-only bytes-like object, and set a pointer instead of a buffer structure. They work by checking that the object’s PyBufferProcs.bf_releasebuffer field is NULL, which disallows mutable objects such as bytearray.

memoryview does define a releasebuffer callback, regardless of mutability, so under the PyArg_ParseTuple heuristic quoted above it is never considered read-only. This is a known issue.

Short of somehow fixing PyArg_ParseTuple's heuristic, the only solution I can see is to drop the read-only requirement for the data buffer, i.e., replace the s# format unit with s*.
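Until the library accepts memoryview directly, a caller-side workaround is to copy the view into a bytes object before matching. This is only a sketch: the helper name as_read_only_bytes is hypothetical and not part of yara-python, and the copy gives up the zero-copy benefit that usually motivates using a memoryview.

```python
def as_read_only_bytes(data):
    """Return data unchanged if it is already bytes, otherwise copy it.

    Hypothetical helper for illustration; not part of yara-python.
    """
    if isinstance(data, bytes):
        return data
    # bytes() copies the contents of any object supporting the buffer protocol.
    return bytes(data)


view = memoryview(b"asdf")
payload = as_read_only_bytes(view)

# With yara-python installed and rules compiled as in the example above,
# the copy can then be matched in place of the view:
#     rules.match(data=payload)
```

The resulting bytes object has no releasebuffer callback, so it should pass the s# check that rejects the memoryview.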