encode / databases

Async database support for Python. 🗄
https://www.encode.io/databases/
BSD 3-Clause "New" or "Revised" License
3.8k stars 260 forks source link

string queries and ORDER BY clause #442

Open rsurgiewicz opened 2 years ago

rsurgiewicz commented 2 years ago

Hi there,

fist of all: big kudos for the module! Guys you've done an amazing piece of work here!

I have a problem or possibly a bug. postgres: When using parameterized text based query order by clause parameters are not working (are ignored)

My simplified query: SELECT * FROM "sourceTable" WHERE 1=1 ORDER BY :orderBy DESC LIMIT :topn

then I use: await database.fetch_all(query=QRY, values = {"orderBy": "startTime","topn":1}

and debug says: DEBUG:databases:Query: SELECT * FROM "sourceTable" WHERE 1=1 ORDER BY $1 DESC LIMIT $2; Args: ('startTime', 1) but results are not ordered at all.

One can provide a non-existing column and query is still execuded, e.g., DEBUG:databases:Query: SELECT * FROM "rfb_active_tests" WHERE 1=1 ORDER BY $1 LIMIT $2; Args: ('who_ate_the_cookies_from_the_cookie_jar ', 1) and results are the same.

I've tried several approaches including ASC or DESC in the orderBy parameter itself or by providing a Tuple, but apparently none worked.

Unfortunately I have to use text SQL queries. Is it a bug or am I doing something wrong?

How shall I pass the order by parameters (REST param) to avoid e.g SQLInjections then?

Thanks

aminalaee commented 2 years ago

Hi,

Can you provide a minimal example where we can see what is the current output and what is the expected output?

It will be much easier and faster to help.

rsurgiewicz commented 2 years ago

thank you for quick reply:

here is data table: id startTime avgResult
zX2Rr0Vc1E 2021-12-09 13:49:48.141000 6428932
DSN54vThdq 2021-12-09 13:54:00.265000 7088159
GGUVUITrgU 2021-12-09 14:11:46.921000 4661406
r3p8RVu3SB 2021-12-10 11:27:43.446000 21975032.5

python:

        QRY = """
                SELECT id FROM sourcetable WHERE 1=1 ORDER BY :orderBy DESC LIMIT :topN;
                """

        QRY_PARAM_DICT = {
            "orderBy": '"startTime"',
            "topN": 1
        }
        output = await database.fetch_all(query=QRY, values = QRY_PARAM_DICT)

debug: DEBUG:databases:Query: SELECT id FROM sourcetable WHERE 1=1 ORDER BY $1 DESC LIMIT $2; Args: ('"startTime"', 1)

Result: id = zX2Rr0Vc1E id startTime avgResult
zX2Rr0Vc1E 2021-12-09 13:49:48.141000 6428932

Expected result: id = r3p8RVu3SB

id startTime avgResult
r3p8RVu3SB 2021-12-10 11:27:43.446000 21975032.5
aminalaee commented 2 years ago

Sorry I forgot to ask, which database driver are you using?

rsurgiewicz commented 2 years ago

psycopg2 (psycopg2-binary 2.9.3 to be more precise)

aminalaee commented 2 years ago

I'm not sure if it's really related to this issue, but Databases is supposed to be used with an async db driver like asyncpg or aiopg for PostgreSQL.

collerek commented 2 years ago

It might be related to #139. I know that in some cases when you pass parameters outside of columns scope (like where clause or order clause) databases passes all values to columns instead of splitting them by name etc.

aminalaee commented 2 years ago

I just tested that fix but that doesn't solve the issue.

rsurgiewicz commented 2 years ago

Did you try to replicate the issue?

Meanwhile, I've tried with asyncpg driver, but I observe same misbehavior.

aminalaee commented 2 years ago

yes, that fails on other drivers too, I've tried sqlite.

rsurgiewicz commented 2 years ago

Ok, I see.

Looking on the bright side; at least it's replicable.

If there's a sql-injection-safe workaround that could be used, please let me know. thx

aminalaee commented 2 years ago

I think this is the behaviour of bindparams in SQLAlchemy, which is correct. Databases uses bindparams to set query parameters. An example from SQLAlchemy:

stmt = text("SELECT id, name FROM user WHERE name=:name "
            "AND timestamp=:timestamp")

stmt = stmt.bindparams(name='jack',
            timestamp=datetime.datetime(2012, 10, 8, 15, 12, 5))

The bindparams does not set columns, it sets column properties. I thought about passing literal columns instead, something like this:

from sqlalchemy.sql import literal_column

result = await db.fetch_all("SELECT * FROM foo ORDER BY :c", values={"c": literal_column('bar')})

But the aiosqliue doesn't handle that, and that's I think expected.

As a workaround you might be able to get SQLAlchemy query expressions to do this, I know it's not always the case:

from sqlalchemy import select

stmt = select(X).order_by(x)
result = await db.fetch_all(stmt)

What do you think?

rsurgiewicz commented 2 years ago

Thanks @aminalaee,

I've tried the sqlalchemy select approach.

Even though it produces a list of databases.backends.postgres.Record objects, but when I try to read the data from it or convert it using either dict(onerecord) or {**onerecord} I get KeyError e.g.,

File .... site-packages/databases/backends/postgres.py", line 139, in __getitem__
    idx, datatype = self._column_map[key]
KeyError: 'id'

(id is column name in query)

---- EDIT ----- Oh I see, with sqlalchemy than I need to implement the ORM based approach. Unfortunately this is not applicable here.