VirusTotal / yara-python

The Python interface for YARA
http://virustotal.github.io/yara/
Apache License 2.0

yara-python cannot scan chinese filename #241

Closed qux-bbb closed 7 months ago

qux-bbb commented 10 months ago

As the linked YARA issue describes, yara-python also cannot scan files with Chinese filenames:
https://github.com/VirusTotal/yara/issues/1487

The script:

import yara

yara_rule_path = 'hello.yar'
rules = yara.compile(filepath=yara_rule_path)

sample_path = '你好.txt'
matches = rules.match(sample_path)
print(sample_path, matches)

The error:

Traceback (most recent call last):
  File "d:/recent/tmp/test.py", line 7, in <module>
    matches = rules.match(sample_path)
yara.Error: could not open file "你好.txt"

plusvic commented 7 months ago

This issue is similar to #245.

This is a well-known issue that won't be solved anytime soon. There is a workaround, though: instead of passing a file path to rules.match, read the file in Python and pass its contents to rules.match through the data argument. This way Python handles the file reading, and it should handle Unicode paths correctly. The problem with passing the file path directly to YARA is that YARA's API doesn't offer Unicode support.
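
A minimal sketch of that workaround, assuming the same hello.yar and sample file as in the report. The helper name match_file_contents is hypothetical and not part of yara-python; the only yara-python call it relies on is rules.match(data=...).

```python
from pathlib import Path


def match_file_contents(rules, sample_path):
    """Read the file in Python and hand its bytes to YARA.

    `rules` is expected to be a compiled yara.Rules object. The helper
    itself is a hypothetical illustration, not a yara-python function.
    """
    # Python opens the file, so the Unicode path never reaches YARA.
    data = Path(sample_path).read_bytes()
    return rules.match(data=data)


# Usage with the script from the report (needs yara-python and hello.yar):
#   import yara
#   rules = yara.compile(filepath='hello.yar')
#   print('你好.txt', match_file_contents(rules, '你好.txt'))
```

Note that this reads the whole file into memory before scanning, so it may not suit very large samples.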