VirusTotal / yara-python

The Python interface for YARA
http://virustotal.github.io/yara/
Apache License 2.0

Windows: cannot scan files with RLO in their filename #176

Open mat-gas opened 3 years ago

On Windows, yara-python cannot scan files whose filename contains an RLO / RTLO (Right-to-Left Override, U+202E) character.

The scan fails with: could not open file "xxxx"

Minimal reproducer (it copies cmd.exe to a temp dir under a filename containing an RLO, then tries to scan it from there):

EDIT: I also tried a filename made of Russian (Cyrillic) characters; it fails too.

EDIT2: this might be related to these too:

EDIT3: this might be fixed by this PR: https://github.com/VirusTotal/yara/pull/1491/files

import tempfile
import shutil
import os
import yara

rule = """
rule cmd_check
{
     strings:
         $cmd = "Windows Command Processor" wide
     condition:
         $cmd
}
 """

source = r'c:\windows\system32\cmd.exe'
new_filename = 'cmd_with_RLO\u202efdp.exe'
new_filename2 = 'испытание'

tempdir = tempfile.mkdtemp()
destination = os.path.join(tempdir, new_filename)
shutil.copy(source, destination)
destination2 = os.path.join(tempdir, new_filename2)
shutil.copy(source, destination2)

print("[*] File {} copied to directory {}".format(source, tempdir))

compiled = yara.compile(sources={'myrule':rule})

print("[*] scanning legit cmd.exe")
try:
    r = compiled.match(source)
    print("  [+] scan successful")
except Exception as e:
    print("  [-] scan failed : {}".format(e))
print("[*] scanning cmd with RLO")
try:
    r = compiled.match(destination)
    print("  [+] scan successful")
except Exception as e:
    print("  [-] scan failed : {}".format(e))

print("[*] retrying with data only")
with open(destination, "rb") as f:  # close the handle before rmtree runs
    data = f.read()
try:
    r = compiled.match(data=data)
    print("  [+] scan successful")
except Exception as e:
    print("  [-] scan failed : {}".format(e))

print("[*] scanning cmd with russian")
try:
    r = compiled.match(destination2)
    print("  [+] scan successful")
except Exception as e:
    print("  [-] scan failed : {}".format(e))

print("[*] retrying with data only")
with open(destination2, "rb") as f:  # close the handle before rmtree runs
    data = f.read()
try:
    r = compiled.match(data=data)
    print("  [+] scan successful")
except Exception as e:
    print("  [-] scan failed : {}".format(e))

shutil.rmtree(tempdir)

Output:

[*] File c:\windows\system32\cmd.exe copied to directory C:\Users\user\AppData\Local\Temp\tmplxvtlrya
[*] scanning legit cmd.exe
  [+] scan successful
[*] scanning cmd with RLO
  [-] scan failed : could not open file "C:\Users\user\AppData\Local\Temp\tmplxvtlrya\cmd_with_RLO‮fdp.exe"
[*] retrying with data only
  [+] scan successful
[*] scanning cmd with russian
  [-] scan failed : could not open file "C:\Users\user\AppData\Local\Temp\tmplxvtlrya\испытание"
[*] retrying with data only
  [+] scan successful
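
Since the data-only scans succeed in the output above, a stopgap is to let Python open the file and hand the bytes to yara. A minimal sketch, assuming the file fits in memory; match_with_fallback is a hypothetical helper name, not part of yara-python, and it only relies on the match(filepath) and match(data=...) calls already used in the reproducer:

```python
from pathlib import Path


def match_with_fallback(rules, path):
    """Scan `path` by name; if that fails, read the bytes and scan those.

    `rules` is any object exposing yara-python's match() interface,
    e.g. the result of yara.compile(). Hypothetical helper, for
    illustration only.
    """
    try:
        return rules.match(path)
    except Exception:
        # Broad catch, mirroring the reproducer; catching yara.Error
        # would be narrower. Python's own file APIs accept Unicode
        # paths, so reading the file ourselves sidesteps the failure.
        return rules.match(data=Path(path).read_bytes())
```

With the reproducer's variables this would be called as match_with_fallback(compiled, destination). The trade-off is that the whole file is loaded into memory instead of being mapped by libyara, which may matter for very large files.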

I didn't investigate much, but the problem might come from here in the yara library:

https://github.com/VirusTotal/yara/blob/e1360f6cbe3d8daf350018661bc6772bd5b726f2/libyara/filemap.c#L280

EDIT: or maybe from here:

https://github.com/VirusTotal/yara-python/blob/master/yara-python.c#L1488

static PyObject* Rules_match(
    PyObject* self,
    PyObject* args,
    PyObject* keywords)
{
....
  char* filepath = NULL;
  Py_buffer data = {0};
....

  if (PyArg_ParseTupleAndKeywords(
        args,
        keywords,
        "|sis*OOOiOOiO",
        kwlist,
        &filepath,
        &pid,

From the Python C API documentation (https://docs.python.org/3/c-api/arg.html):

If you want to accept filesystem paths and convert them to C character strings, it is preferable to use the O& format with PyUnicode_FSConverter() as converter.
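
To illustrate why the "s" format could matter here: it converts a str argument to a UTF-8 encoded C string. If the C side then passes those bytes to an API that interprets them in the ANSI code page, the name no longer matches the file on disk. The sketch below only demonstrates that mismatch in pure Python; that libyara opens the file this way is my assumption, not something I verified, and cp1252 is just an example code page:

```python
# Filename from the reproducer, containing U+202E (Right-to-Left Override).
name = "cmd_with_RLO\u202efdp.exe"

# What the "s" format hands to C: the str encoded as UTF-8.
as_utf8 = name.encode("utf-8")

# What an ANSI-code-page API would make of those bytes.
# cp1252 is an assumed code page; the real one depends on the system.
misread = as_utf8.decode("cp1252", errors="replace")

print(repr(as_utf8))
print(repr(misread))
print("same name:", misread == name)
```

The single RLO character becomes three unrelated characters after the round trip, so a lookup by that name would target a file that does not exist.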
