CybercentreCanada / assemblyline

AssemblyLine 4: File triage and malware analysis
https://cybercentrecanada.github.io/assemblyline4_docs/
MIT License

Python file identified as text/plain #254

Closed kam193 closed 2 months ago

kam193 commented 2 months ago

Describe the bug The following file, do_snapshot.py.zip (password: zippy; potentially dangerous code), was identified as text/plain instead of code/python after being extracted from an archive.

It contains a fairly typical obfuscation method, which I believe should be detected: _ = lambda __ : __import__('zlib').decompress(__import__('base64').b64decode(__[::-1]));exec((_)
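For anyone unfamiliar with the trick, here is a minimal sketch of what that lambda does: it reverses the payload, base64-decodes it, and zlib-decompresses the result. The `pack` and `unpack` helpers are hypothetical names used only for illustration, and the payload is a harmless string rather than the contents of the attached sample.

```python
import base64
import zlib


def pack(source: str) -> bytes:
    """Compress, base64-encode, then reverse: the inverse of the lambda in the sample."""
    return base64.b64encode(zlib.compress(source.encode()))[::-1]


def unpack(blob: bytes) -> str:
    """Apply the same steps as the lambda, but return the source instead of exec()-ing it."""
    return zlib.decompress(base64.b64decode(blob[::-1])).decode()


# Round-trip a harmless snippet to show that the two steps are symmetric.
blob = pack("print('hello')")
recovered = unpack(blob)
```

Returning the decoded text instead of passing it to `exec` lets the next layer be inspected safely.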

To Reproduce Steps to reproduce the behavior:

  1. Upload the file
  2. Observe wrong type

Expected behavior File identified as code/python
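As a rough illustration of the kind of heuristic that could flag this sample as Python, here is a sketch that looks for a lambda whose body starts with `__import__(...)` together with an `exec(` call. The patterns and the function name are my own assumptions and not the actual identification rules used by Assemblyline.

```python
import re

# Hypothetical indicators: a lambda that immediately calls __import__('...'),
# and an exec( call somewhere in the same text.
LAMBDA_IMPORT = re.compile(r"lambda\s+\w+\s*:\s*__import__\(\s*['\"]\w+['\"]\s*\)")
EXEC_CALL = re.compile(r"\bexec\s*\(")


def looks_like_obfuscated_python(text: str) -> bool:
    """Return True only when both indicators are present in the text."""
    return bool(LAMBDA_IMPORT.search(text)) and bool(EXEC_CALL.search(text))


# The (truncated) line quoted in this issue.
sample = (
    "_ = lambda __ : __import__('zlib').decompress("
    "__import__('base64').b64decode(__[::-1]));exec((_)"
)
is_python = looks_like_obfuscated_python(sample)
```

Requiring both indicators keeps ordinary prose that merely mentions "lambda" or "exec" from matching.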

Additional context

BTW, I was recently wondering whether it would be wise to fall back on the file extension in cases like this (when the normal file identification process doesn't produce a meaningful result, but the extension gives a hint). I don't think we would lose anything if files identified as text/plain were assigned a type based on the extension (maybe behind a config flag?). I'm not sure, though, whether the extension is preserved, e.g. on resubmit.
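To make the idea concrete, here is a minimal sketch of such a fallback. The function name, the `enabled` flag, and the extension mapping are all hypothetical and are not part of the existing Assemblyline code or configuration.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical mapping from extension to type; only Python is shown here.
EXTENSION_HINTS = {
    ".py": "code/python",
    ".pyw": "code/python",
}


def apply_extension_fallback(
    identified_type: str, filename: Optional[str], enabled: bool = True
) -> str:
    """Refine a text/plain result using the file extension, if one is known.

    Any other identified type is returned untouched, so the fallback can
    never override a meaningful identification result.
    """
    if not enabled or identified_type != "text/plain" or not filename:
        return identified_type
    suffix = PurePosixPath(filename).suffix.lower()
    return EXTENSION_HINTS.get(suffix, identified_type)


fallback_type = apply_extension_fallback("text/plain", "do_snapshot.py")
```

If the filename is missing (for example, if it is not preserved on resubmit), the function simply returns the original type.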

gdesmar commented 2 months ago

That last patch should fix the lambda detection. It should be part of the next release! See you for the next tricky Python identification evasion! 😃

kam193 commented 2 months ago

Thanks again, see you next time :D

cccs-rs commented 2 months ago

The patch should be included in the 4.5.0.44 release.