airbnb / binaryalert

BinaryAlert: Serverless, Real-time & Retroactive Malware Detection.
https://binaryalert.io
Apache License 2.0

yara.Error: could not map file into memory #49

Closed: austinbyers closed this issue 7 years ago

austinbyers commented 7 years ago

Some users are seeing the following error in the analyzer Lambda logs:

could not map file "/tmp/binaryalert_UUID" into memory: Error
Traceback (most recent call last):
  File "/var/task/main.py", line 76, in analyze_lambda_handler
    with binary_info.BinaryInfo(os.environ['S3_BUCKET_NAME'], s3_key, ANALYZER) as binary:
  File "/var/task/binary_info.py", line 57, in __enter__
    self.download_path, original_target_path=self.observed_path)
  File "/var/task/yara_analyzer.py", line 52, in analyze
    return self._rules.match(target_file, externals=self._yara_variables(original_target_path))
yara.Error: could not map file "/tmp/binaryalert_UUID" into memory

I have not been able to reproduce this locally, even with 20,000 YARA rules scanning a 10 GB file. Some theories:

austinbyers commented 7 years ago

I've confirmed that this error can happen even with the maximum Lambda memory allocation (1.5 GB) and with input files of any size. Perhaps the number or size of the YARA rules is to blame?

austinbyers commented 7 years ago

I think I've tracked it down: a recent commit to Neo23x0/signature-base adds a new rule which includes a pe.imphash condition.

The YARA rules compile and load successfully in Lambda, but matching fails with the memory mapping error on most Windows binaries. My best guess is that this is caused by #30 (hash module not yet supported in BinaryAlert).

So the solution for now is to disable all rules which use pe.imphash. I will add a check to enforce this with unit tests since it is so hard to debug.
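
A check along these lines could look roughly like the sketch below. It is not BinaryAlert's actual test code: the function name find_imphash_rule_files and the idea of a plain text search are assumptions made for illustration. A text search is deliberately conservative, so it will also flag pe.imphash when it appears only in a comment or string.

```python
import os
import re

# Simple textual match for the pe.imphash function (an assumption: no YARA parsing).
IMPHASH_PATTERN = re.compile(r'\bpe\.imphash\b')


def find_imphash_rule_files(rules_dir):
    """Return sorted relative paths of .yar/.yara files that mention pe.imphash."""
    offenders = []
    for root, _, filenames in os.walk(rules_dir):
        for name in filenames:
            # Only files with these extensions would be deployed, so only check those.
            if not name.endswith(('.yar', '.yara')):
                continue
            path = os.path.join(root, name)
            with open(path, encoding='utf-8', errors='replace') as rule_file:
                if IMPHASH_PATTERN.search(rule_file.read()):
                    offenders.append(os.path.relpath(path, rules_dir))
    return sorted(offenders)
```

A unit test could then assert that find_imphash_rule_files returns an empty list for the rules directory, so an offending rule fails the build instead of surfacing as a runtime error in Lambda.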

austinbyers commented 7 years ago

An easy way to disable the relevant rules files is to rename rules_file.yar to rules_file.yar.DISABLED. BinaryAlert only includes files ending in .yar or .yara, so these files will be excluded from the next deploy.
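
The rename can also be scripted. The following is a minimal sketch, assuming only the behaviour described above (files are included only if they end in .yar or .yara); the helper names disable_rule_file and is_included are hypothetical and are not part of BinaryAlert.

```python
import os

RULE_EXTENSIONS = ('.yar', '.yara')


def is_included(filename):
    """Mirror the stated rule: only files ending in .yar or .yara are included."""
    return filename.endswith(RULE_EXTENSIONS)


def disable_rule_file(path):
    """Rename e.g. rules_file.yar to rules_file.yar.DISABLED and return the new path."""
    if not is_included(path):
        raise ValueError('not an active YARA rules file: {}'.format(path))
    disabled_path = path + '.DISABLED'
    os.rename(path, disabled_path)
    return disabled_path
```

Appending the suffix rather than replacing the extension keeps the original name visible, so a file can be re-enabled later by removing .DISABLED.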