CybercentreCanada / CCCS-Yara

YARA rule metadata specification and validation utility / Spécification et validation pour les règles YARA
MIT License

Validation Rejects Valid Rules with Non-ASCII Characters #18

Closed malvidin closed 4 years ago

malvidin commented 4 years ago

Valid rules that contain non-ASCII characters in metadata are incorrectly rejected by the validator. YARA reads rules in as bytes and processes metadata strings much like the strings defined in a rule's strings section.

There are more issues with escaped values in metadata strings (\x00, \x80-\xFF) than with printable non-ASCII strings.
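To make the distinction concrete, here is a minimal sketch of how a validator might flag the escaped values mentioned above while leaving printable non-ASCII text alone. The helper name `risky_escapes` is hypothetical and not part of CCCS-Yara, and the sketch scans whatever text it is given, so a caller would be expected to pass in metadata values only.

```python
import re

# Matches a backslash followed by either a two-digit hex escape or any
# single character, so that an escaped backslash ("\\") is consumed as a
# pair and is not mistaken for the start of a hex escape.
_ESCAPE = re.compile(r'\\(x[0-9A-Fa-f]{2}|.)')


def risky_escapes(text):
    """Return the \\xHH escapes in text that decode to NUL or to 0x80-0xFF.

    Hypothetical helper for illustration only.
    """
    found = []
    for match in _ESCAPE.finditer(text):
        body = match.group(1)
        if len(body) == 3 and body[0] == 'x':
            value = int(body[1:], 16)
            if value == 0x00 or value >= 0x80:
                found.append(match.group(0))
    return found


def has_non_ascii(text):
    """True if text contains any character outside the ASCII range."""
    return any(ord(ch) > 0x7F for ch in text)
```

Under this sketch, a metadata value containing an emoji would be reported by `has_non_ascii` but not by `risky_escapes`, which matches the suggestion that printable non-ASCII text is the less problematic case.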

I recommend accepting non-ASCII characters in rules, modifying rules that contain values that can be unescaped (\000, \x80-\xFF, etc.), and performing a test compilation of the rules with yara-python. See https://github.com/VirusTotal/yara/issues/1242
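One way the suggested test compilation could be sketched is to run it in a child process, so that a hard crash in the native library (such as a Windows access violation, exit code 0xC0000005) does not take down the validator itself. The helper name `compiles_in_child` and the 30-second timeout are assumptions made for illustration. The only yara-python call used is `yara.compile(source=...)`, and the sketch reports failure if yara-python is not installed.

```python
import subprocess
import sys

# Child script: read the rule source from stdin and try to compile it.
# Running it in a separate interpreter isolates any native crash.
_CHILD = (
    "import sys, yara\n"
    "yara.compile(source=sys.stdin.buffer.read().decode('utf-8'))\n"
)


def compiles_in_child(rule_source, timeout=30):
    """Hypothetical helper: return (ok, detail) for a test compilation."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", _CHILD],
            input=rule_source.encode("utf-8"),
            capture_output=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    if proc.returncode == 0:
        return True, ""
    # A non-zero code covers syntax errors, a missing yara module, and crashes.
    detail = proc.stderr.decode("utf-8", errors="replace").strip()
    return False, detail or "exit code %d" % proc.returncode
```

A validator could call `compiles_in_child(rule_text)` after its own metadata checks and reject the rule only when `ok` is false, using `detail` for the error message.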

>>> yr = yara.compile(source=r'rule test:tag {meta: emoji = "👍" escaped = "\x7E" condition: true}')
>>> print(yr.match(data='')[0].meta)
{'emoji': '👍', 'escaped': '~'}

>>> yr = yara.compile(source=r'rule test:tag {meta: emoji = "👍" escaped = "\x7F" condition: true}')
>>> print(yr.match(data='')[0].meta)
{'emoji': '👍', 'escaped': '\x7f'}

>>> yr = yara.compile(source=r'rule test:tag {meta: emoji = "👍" escaped = "\x80" condition: true}')
>>> print(yr.match(data='')[0].meta)
Process finished with exit code -1073741819 (0xC0000005)

cccs-jp commented 4 years ago

Hello @malvidin ! Thanks for taking the time to report this.

We will start by enabling UTF-8 support and go from there!

Hope you find this library useful.

cccs-gm commented 4 years ago

Resolved in latest release: https://github.com/CybercentreCanada/CCCS-Yara/releases/tag/v1.3