Avoid possible SQL injection by refactoring string-based query construction

hussein-awala commented 1 year ago

Body

Some of our queries are string-based and are passed directly to SQLAlchemy's session.execute(). To avoid SQL injection, we can take advantage of SQLAlchemy by rewriting these queries with bind-parameter syntax or the select API.
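As a rough illustration of the two rewrites mentioned above, here is a minimal sketch against an in-memory SQLite database. The `users` table and the `hostile_name` value are hypothetical and are not taken from the Airflow codebase. The sketch only contrasts an interpolated string with a bound parameter and with the select API.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine, select, text

# Hypothetical schema, used only for this illustration.
engine = create_engine("sqlite:///:memory:")
metadata = MetaData()
users = Table(
    "users",
    metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String, nullable=False),
)
metadata.create_all(engine)

# A value that changes the meaning of the query when it is pasted into the SQL string.
hostile_name = "x' OR '1'='1"

with engine.begin() as conn:
    conn.execute(users.insert(), [{"name": "alice"}, {"name": "bob"}])

    # Unsafe: the value becomes part of the SQL text itself.
    unsafe_sql = f"SELECT name FROM users WHERE name = '{hostile_name}'"
    unsafe_rows = conn.execute(text(unsafe_sql)).fetchall()

    # Option 1: keep the textual query but pass the value as a bind parameter.
    bound_rows = conn.execute(
        text("SELECT name FROM users WHERE name = :name"),
        {"name": hostile_name},
    ).fetchall()

    # Option 2: build the query with the select API, which binds values itself.
    select_rows = conn.execute(
        select(users.c.name).where(users.c.name == hostile_name)
    ).fetchall()

print(len(unsafe_rows), len(bound_rows), len(select_rows))
```

In this sketch the interpolated query matches every row, while both rewritten forms treat the value as plain data and match nothing.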

[ ] Airflow Core - migration
[ ] Airflow Core - utils
[ ] Airflow Providers - Amazon
[ ] Airflow Providers - Apache.Cassandra
[ ] Airflow Providers - Apache.Hive
[ ] Airflow Providers - common.sql
[ ] Airflow Providers - Databricks
[ ] Airflow Providers - Google
[ ] Airflow Providers - MySQL
[ ] Airflow Providers - Oracle
[ ] Airflow Providers - Postgres
[ ] Airflow Providers - SalesForce

Committer

[X] I acknowledge that I am a maintainer/committer of the Apache Airflow project.

Taragolis commented 1 year ago

I'm curious how many of these could actually be classified as SQL injection, meaning there is a public API through which they can be triggered without changing the code. For example, in Postgres I could only find places where the values cannot be passed through server-side binding because of limitations in Postgres or DBAPI v2 (and sometimes both): parameters can be bound only in a limited set of positions, so dynamic queries cannot use server-side binding. That is simply how Postgres works. In all of these places, however, the values have to be provided as arguments to the various operators.

Postgres is a nice example of a case where we could do something: psycopg2.sql, or server-side binding in psycopg (formally v3). For other databases, however, it might be hardly possible. My personal worst example is MySQL, because we use three different libraries simultaneously (mysql-connector-python, mysqlclient, and PyMySQL), and AFAIK (maybe I'm wrong) none of them provides such an interface.
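To make the limitation described above concrete, here is a minimal sketch of one common workaround when a driver can bind values but not identifiers: bind the value, and check the dynamic identifier against an allow-list before placing it in the SQL text. It uses the standard-library sqlite3 module as a stand-in, not psycopg2.sql or any of the MySQL drivers. The `users` table, `ALLOWED_SORT_COLUMNS`, and `fetch_sorted` are hypothetical names.

```python
import sqlite3

# Hypothetical allow-list of columns a caller may sort by.
ALLOWED_SORT_COLUMNS = {"id", "name"}


def fetch_sorted(conn, min_id, sort_column):
    """Bind the value; validate the identifier, which cannot be bound."""
    if sort_column not in ALLOWED_SORT_COLUMNS:
        raise ValueError(f"unsupported sort column: {sort_column!r}")
    # The identifier is interpolated only after the allow-list check;
    # the value goes through a normal placeholder.
    query = f'SELECT id, name FROM users WHERE id >= ? ORDER BY "{sort_column}"'
    return conn.execute(query, (min_id,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("bob",), ("alice",)])

rows = fetch_sorted(conn, 1, "name")

try:
    fetch_sorted(conn, 1, "name; DROP TABLE users")
    rejected = False
except ValueError:
    rejected = True
```

An allow-list only fits cases where the set of valid identifiers is known in advance. Whether that applies to a given provider would need to be checked case by case.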

potiuk commented 1 year ago

Yeah. We should look in detail at each case.

apache / airflow

Avoid possible SQL injection by refactoring string-based query construction #34252
