[Fall 2021] Step 3: Compare Pysa coverage with CodeQL coverage

r0rshark commented 3 years ago

CodeQL is a static analysis tool currently developed by GitHub; it also supports Python. We should make sure Pysa covers all the vulnerabilities and models defined in CodeQL. Example: Server Side Template Injection, https://github.com/github/securitylab/issues/93 . We can also try running both tools on some sample projects to understand whether there are things CodeQL is able to find but Pysa isn't.
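
One way to make the sample-project comparison concrete is to normalize each tool's findings to a common shape and diff them. This is only a minimal sketch, assuming the results of both tools have already been converted to `(path, line, cwe)` tuples by some other step; the `Finding` type, the `only_in` helper, and the sample data are hypothetical and not part of either tool.

```python
from typing import List, NamedTuple, Set


class Finding(NamedTuple):
    """Hypothetical normalized finding; neither tool emits this shape directly."""

    path: str
    line: int
    cwe: str


def only_in(first: Set[Finding], second: Set[Finding]) -> List[Finding]:
    """Return findings present in `first` but missing from `second`, sorted."""
    return sorted(first - second)


# Made-up sample data, for illustration only.
codeql_findings = {
    Finding("app/views.py", 12, "CWE-078"),
    Finding("app/archive.py", 40, "CWE-022"),
}
pysa_findings = {
    Finding("app/views.py", 12, "CWE-078"),
}

missed_by_pysa = only_in(codeql_findings, pysa_findings)
for finding in missed_by_pysa:
    print(f"{finding.cwe} {finding.path}:{finding.line}")
```

Exact line matching is a simplification: the two tools may report the same flow at different lines (source vs. sink), so a real comparison would probably need a looser match, for example on file and category only.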

abishekvashok commented 3 years ago

I can see CodeQL has support for the following:

CVE-2018-1281 (Bind To All Interfaces) (Not detected) Possible way to tackle: Annotate bind() as a sink for user-controlled data (though user-controlled data would almost never reach it)
CWE-020-ExternalAPIs (External APIs are marked as sinks and data from them as taint sources) (not detected) (Do we need such verbosity?)
CWE-020 (Regex serialisation, not necessarily a vulnerability, but it can be) (Not detected) Possible way to tackle: Annotate pattern.regexpMatch and other regex functions as taint sinks for user-controlled data
CWE-022 (File system read, write, and tar slip) (Tar slip not detected) Possible way to tackle: Add models for tar slip? I don't know how Pysa detects vulnerabilities such as stored XSS. If it does detect them, the same model can be used
CWE-078 (Command Line Injection) (Detected)
CWE-079 (Plain XSS and XSS for jinja2) (XSS for jinja2 not detected) Possible way to tackle: Tricky, in the sense that it only occurs when a user-controlled parameter flows to a template called via an environment that has autoescape turned off (the default is off).
CWE-089 (SQL Injection) (Detected) Possible way to improve: Add coverage for aiopg and cx_Oracle (closed because we thought we needed stubs; we can have another go by defining them as site packages). Checked: we get typing.Any for aiopg, and cx_Oracle is written against CPython.
CWE-094 (Code Injection) (Detected)
CWE-209 (Stack Trace or Exception exposure) (Detected)
CWE-215 (Running flask app in debug mode) (Can Pysa detect this? I am not sure..)
CWE-295 (Missing host key validation) (Not detected) Possible way to tackle: Solvable for the paramiko library by marking AutoAddPolicy as a source and set_missing_host_key_policy as a sink. (We have support for paramiko but no coverage for this vulnerability)
CWE-312 (Putting unhashed values in cookies) (Not detected) Possible way to tackle: One way would be to prevent user-controlled parameters from reaching cookies and add popular hashing functions as features. But the question is: is the impact high enough to justify adding it?
CWE-326 (Weak encryption keys) (Pysa can't; an open question is how to make this surface as an issue if we use the implicit approach)
CWE-327 (Insecure protocols) (Pysa can't, but can it be made to with implicit sources?)
CWE-377 (Insecure temporary files) (Pysa can't)
CWE-502 (Unsafe deserialisations) (Detected)
CWE-601 (URL redirection) (Detected for most popular libraries)
CWE-732 (Weak file permissions) (Pysa can't now, but coding implicit sources would fix this)
CWE-798 (Hardcoded credentials) (Pysa can, with implicit sinks for common APIs)
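
For the tar slip item under CWE-022, a small fixture could help check whether either tool flags the pattern. The sketch below is an assumption about what such a fixture might look like, not code taken from CodeQL or Pysa; `build_archive`, `escapes`, and `risky_members` are hypothetical names, and `/tmp/extract` is an arbitrary destination that is never written to.

```python
import io
import os
import tarfile


def build_archive(names):
    """Build an in-memory tar archive with one tiny file per name."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name in names:
            data = b"x"
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return buffer


def escapes(dest, member_name):
    """Return True if member_name, joined onto dest, resolves outside dest."""
    root = os.path.realpath(dest)
    target = os.path.realpath(os.path.join(root, member_name))
    return os.path.commonpath([root, target]) != root


def risky_members(archive_file, dest):
    """List member names that would escape dest (the tar slip pattern)."""
    with tarfile.open(fileobj=archive_file, mode="r") as archive:
        return [m.name for m in archive.getmembers() if escapes(dest, m.name)]


# One benign member and one that climbs out of the destination directory.
archive_file = build_archive(["ok.txt", "../evil.txt"])
flagged = risky_members(archive_file, "/tmp/extract")
print(flagged)
```

The risky flow a model would need to capture is an untrusted archive reaching an extraction call without a check along these lines; nothing is extracted here, the sketch only inspects member names.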

abishekvashok commented 3 years ago

@r0rshark this is my preliminary analysis after checking the Python security folder of CodeQL. Curious to know what you think; maybe we can create an issue for each vulnerability we can possibly detect with Pysa.

r0rshark commented 3 years ago

Nice work! I'll spend some time tomorrow going through the list and defining what we can do :)

MLH-Fellowship / pyre-check

[Fall 2021] Step 3: Compare Pysa coverage with CodeQL coverage #43