facebook / pyre-check

Performant type-checking for python.
https://pyre-check.org/
MIT License
6.81k stars 434 forks source link

Pyre returns different results with the same input for sqlite3 (all SQL injections found) and mysql.connector (just 1 injection found out of 3) #411

Closed maximmasiutin closed 3 years ago

maximmasiutin commented 3 years ago

I have created a taint sink file for mysql.connector (mysql_connector_sinks.pysa) and a sample application (unsafe-mysql.py), with 3 subsequent lines, each of them sends an SQL-injectable query to the server. However, pyre only detects the last string as vulnerable, and if I comment it, it detects none. However, if I send the same strings to sqlite3 (using sqlite3_sinks.pisa that ships with pyre-check), it detects all three. I have attached all the files to reproduce the issue. How can I debug pyre to figure out why I obtain different results with the same input for sqlite3 and mysql.connector? Can you help? Or is there a Pyre forum or user discussion group where I may ask this question?

mysql-connector-sinks.zip

I have the following files in the "taint_models_path" directory:

collection_propagation.pysa
filesystem_other_sinks.pysa
filesystem_sinks.pysa
format_string_sinks.pysa
general.pysa
http_server.pysa
input.pysa
logging_sinks.pysa
mysql_connector_sinks.pysa
protocols.pysa
rce_sinks.pysa
requests_api_sinks.pysa
sanitizers.pysa
skipped_overrides.pysa
sqlite3_sinks.pysa
taint.config
xss_sinks.pysa

P.S. Fixed in https://github.com/facebook/pyre-check/pull/414

onionymous commented 3 years ago

Hello @maximmasiutin! The reason you can't detect issues with your taint models for mysql.connector is because your project does not have type information for the mysql module. Our documentation at https://pyre-check.org/docs/pysa-false-negatives#missing-type-information describes how to debug these false negatives. Adding reveal_taint() and reveal_type() calls to your code makes this clear.

For unsafe-sqlite.py, this is the output:

2021-04-19 11:06:42,980 [PID 16629] WARNING unsafe-sqlite:5:0-5:11: Revealed type for db: sqlite3.dbapi2.Connection
2021-04-19 11:06:42,980 [PID 16629] WARNING unsafe-sqlite:7:0-7:11: Revealed type for cur: sqlite3.dbapi2.Cursor
2021-04-19 11:06:42,982 [PID 16629] WARNING unsafe-sqlite:10:0-10:12: Revealed forward taint for `s1`: Origin(unsafe-sqlite:8:7-8:46) -> UserControlled -> FlowDetails(SimpleFeature: [(SimpleVia "string_concat_lhs"), (SimpleVia "string_concat_rhs"), Tito], ComplexFeature: [], TraceLength: 0, TitoPosition: [9:5-9:38, 9:34-9:38], LeafName: [LeafName(input)], FirstIndex: [], FirstField: [])
2021-04-19 11:06:42,982 [PID 16629] WARNING
2021-04-19 11:06:42,982 [PID 16629] WARNING unsafe-sqlite:13:0-13:12: Revealed forward taint for `s2`: Origin(unsafe-sqlite:8:7-8:46) -> UserControlled -> FlowDetails(SimpleFeature: [Obscure], ComplexFeature: [], TraceLength: 0, TitoPosition: [], LeafName: [LeafName(input)], FirstIndex: [], FirstField: [])
2021-04-19 11:06:42,983 [PID 16629] WARNING
2021-04-19 11:06:42,983 [PID 16629] WARNING unsafe-sqlite:16:0-16:12: Revealed forward taint for `s3`: Origin(unsafe-sqlite:8:7-8:46) -> UserControlled -> FlowDetails(SimpleFeature: [(SimpleVia "string_concat_lhs"), (SimpleVia "string_concat_rhs"), Tito], ComplexFeature: [], TraceLength: 0, TitoPosition: [15:5-15:46, 15:42-15:46], LeafName: [LeafName(input)], FirstIndex: [], FirstField: [])
2021-04-19 11:06:42,983 [PID 16629] WARNING
2021-04-19 11:06:42,983 [PID 16629] WARNING unsafe-sqlite:19:0-19:12: Revealed forward taint for `s4`: Origin(unsafe-sqlite:8:7-8:46) -> UserControlled -> FlowDetails(SimpleFeature: [FormatString], ComplexFeature: [], TraceLength: 0, TitoPosition: [], LeafName: [LeafName(input)], FirstIndex: [], FirstField: [])
2021-04-19 11:06:42,983 [PID 16629] WARNING
2021-04-19 11:06:42,984 [PID 16629] WARNING unsafe-sqlite:19:0-19:16: Revealed backward taint for `s4`: Origin(unsafe-sqlite:20:12-20:14) -> SQL -> FlowDetails(SimpleFeature: [], ComplexFeature: [], TraceLength: 0, TitoPosition: [], LeafName: [LeafName(sqlite3.dbapi2.Cursor.execute)], FirstIndex: [], FirstField: [])
2021-04-19 11:06:42,984 [PID 16629] WARNING
2021-04-19 11:06:42,984 [PID 16629] WARNING unsafe-sqlite:16:0-16:16: Revealed backward taint for `s3`: Origin(unsafe-sqlite:17:12-17:14) -> SQL -> FlowDetails(SimpleFeature: [], ComplexFeature: [], TraceLength: 0, TitoPosition: [], LeafName: [LeafName(sqlite3.dbapi2.Cursor.execute)], FirstIndex: [], FirstField: [])
2021-04-19 11:06:42,985 [PID 16629] WARNING
2021-04-19 11:06:42,985 [PID 16629] WARNING unsafe-sqlite:13:0-13:16: Revealed backward taint for `s2`: Origin(unsafe-sqlite:14:12-14:14) -> SQL -> FlowDetails(SimpleFeature: [], ComplexFeature: [], TraceLength: 0, TitoPosition: [], LeafName: [LeafName(sqlite3.dbapi2.Cursor.execute)], FirstIndex: [], FirstField: [])
2021-04-19 11:06:42,985 [PID 16629] WARNING
2021-04-19 11:06:42,985 [PID 16629] WARNING unsafe-sqlite:10:0-10:16: Revealed backward taint for `s1`: Origin(unsafe-sqlite:11:11-11:13) -> SQL -> FlowDetails(SimpleFeature: [], ComplexFeature: [], TraceLength: 0, TitoPosition: [], LeafName: [LeafName(sqlite3.dbapi2.Connection.execute)], FirstIndex: [], FirstField: [])

We can see that cur is correctly typed as sqlite3.dbapi2.Cursor, so cur.execute() is properly recognized as a taint sink.

However, for unsafe-mysql.py:

2021-04-19 10:54:24,997 [PID 13981] WARNING unsafe-mysql:4:0-4:11: Revealed type for db: typing.Any
2021-04-19 10:54:24,998 [PID 13981] WARNING unsafe-mysql:6:0-6:11: Revealed type for cur: typing.Any
2021-04-19 10:54:24,998 [PID 13981] WARNING unsafe-mysql:9:0-9:12: Revealed forward taint for `s1`: Origin(unsafe-mysql:7:7-7:46) -> UserControlled -> FlowDetails(SimpleFeature: [Obscure], ComplexFeature: [], TraceLength: 0, TitoPosition: [], LeafName: [LeafName(input)], FirstIndex: [], FirstField: [])
2021-04-19 10:54:24,998 [PID 13981] WARNING
2021-04-19 10:54:24,998 [PID 13981] WARNING unsafe-mysql:12:0-12:12: Revealed forward taint for `s2`: Origin(unsafe-mysql:7:7-7:46) -> UserControlled -> FlowDetails(SimpleFeature: [(SimpleVia "string_concat_lhs"), (SimpleVia "string_concat_rhs"), Tito], ComplexFeature: [], TraceLength: 0, TitoPosition: [11:5-11:46, 11:42-11:46], LeafName: [LeafName(input)], FirstIndex: [], FirstField: [])
2021-04-19 10:54:24,998 [PID 13981] WARNING
2021-04-19 10:54:24,999 [PID 13981] WARNING unsafe-mysql:15:0-15:12: Revealed forward taint for `s3`: Origin(unsafe-mysql:7:7-7:46) -> UserControlled -> FlowDetails(SimpleFeature: [FormatString], ComplexFeature: [], TraceLength: 0, TitoPosition: [], LeafName: [LeafName(input)], FirstIndex: [], FirstField: [])
2021-04-19 10:54:24,999 [PID 13981] WARNING
2021-04-19 10:54:24,999 [PID 13981] WARNING unsafe-mysql:15:0-15:16: Revealed backward taint for `s3`:
2021-04-19 10:54:24,999 [PID 13981] WARNING
2021-04-19 10:54:24,999 [PID 13981] WARNING unsafe-mysql:12:0-12:16: Revealed backward taint for `s2`:
2021-04-19 10:54:24,999 [PID 13981] WARNING
2021-04-19 10:54:24,999 [PID 13981] WARNING unsafe-mysql:9:0-9:16: Revealed backward taint for `s1`:
2021-04-19 10:54:24,999 [PID 13981] WARNING

Here cur is of type typing.Any. Pysa won't understand that cur.execute() refers to the sink mysql.connector.cursor.MySQLCursor.execute. This is why the backward taint is not being propagated to the strings (e.g. s1).

Pysa should not be detecting any flows of [UserControlled] -> [SQL] for your mysql sinks. As for why the 1 issue in unsafe-mysql.py is detected, it's not actually a flow of [UserControlled] -> [SQL], and is in fact UserControlled going to a different sink type (StringMayBeSQL). We have implicit sinks for raw strings that look like SQL, using a regex-based rule, and we have a different rule detecting user-controlled input going into these strings. Currently we detect format strings, but not older percent-sign formatting and string concatenation, so that's why only one result showed up for [UserControlled] -> [StringMayBeSQL] rule. image

To resolve this, you will most likely need to add type stubs for the mysql library to your project. You may even consider sending a PR to python/typeshed adding type stubs (from https://github.com/python/typeshed/issues/146a it looks like they don't have any stubs for mysql).

onionymous commented 3 years ago

Closing since the question has been answered, but feel free to reopen if you have any other concerns or questions!

maximmasiutin commented 3 years ago

Thank you very much for your great and timely reply! You've made my day! I have been able to find vulnerabilities with Pyre in simple programs like https://github.com/fportantier/vulpy that have straightforward execution flow, but not yet with Django projects like the following: https://github.com/anxolerd/dvpwa https://github.com/nVisium/django.nV https://github.com/andresriancho/django-moth https://github.com/red-and-black/DjangoGoat https://github.com/Contrast-Security-OSS/DjanGoat

maximmasiutin commented 3 years ago

I have created the .pysa and .pyi files for the mysql.connector library, so now Pyre is able to find all three vulnerabilities.

Please take a look into https://github.com/facebook/pyre-check/pull/414