dbcli / pgspecial

Python implementation of postgres meta commands (backslash commands)
BSD 3-Clause "New" or "Revised" License
74 stars 54 forks source link

Generate sql commands using `psycopg.sql` #134

Open dvarrazzo opened 1 year ago

dvarrazzo commented 1 year ago

Hello,

my eye stumbled on #133 and it made me notice that commands are generated using string composition. This has a series of problems: the missing quotes one, but, for instance, this command wouldn't handle a schema or sequence name containing a double quote.

If you think it's ok, I can propose a MR to replace all the string composition using the psycopg.sql objects, which are designed to overcome this problem, or can assist someone to prepare one (my times might be long, I'm kinda busy at the moment)

The type of changes needed would be similar to these **Untested:** ```diff diff --git a/pgspecial/dbcommands.py b/pgspecial/dbcommands.py index 904faac..c52a2ba 100644 --- a/pgspecial/dbcommands.py +++ b/pgspecial/dbcommands.py @@ -889,18 +889,20 @@ def describe_table_details(cur, pattern, verbose): def describe_one_table_details(cur, schema_name, relation_name, oid, verbose): if verbose and cur.connection.info.server_version >= 80200: - suffix = """pg_catalog.array_to_string(c.reloptions || array(select - 'toast.' || x from pg_catalog.unnest(tc.reloptions) x), ', ')""" + suffix = SQL("""pg_catalog.array_to_string(c.reloptions || array(select + 'toast.' || x from pg_catalog.unnest(tc.reloptions) x), ', ')""") else: - suffix = "''" + suffix = Literal('') if cur.connection.info.server_version >= 120000: - relhasoids = "false as relhasoids" + relhasoids = SQL("false as relhasoids") else: - relhasoids = "c.relhasoids" + relhasoids = SQL("c.relhasoids") + + params = {"suffix": suffix, "relhasoid": relhasoid, "oid": Literal(oid)} if cur.connection.info.server_version >= 100000: - sql = f"""SELECT c.relchecks, c.relkind, c.relhasindex, + sql = SQL("""SELECT c.relchecks, c.relkind, c.relhasindex, c.relhasrules, c.relhastriggers, {relhasoids}, {suffix}, c.reltablespace, @@ -911,10 +913,10 @@ def describe_one_table_details(cur, schema_name, relation_name, oid, verbose): c.relispartition FROM pg_catalog.pg_class c LEFT JOIN pg_catalog.pg_class tc ON (c.reltoastrelid = tc.oid) - WHERE c.oid = '{oid}'""" + WHERE c.oid = {oid}""") elif cur.connection.info.server_version > 90000: - sql = f"""SELECT c.relchecks, c.relkind, c.relhasindex, + sql = SQL("""SELECT c.relchecks, c.relkind, c.relhasindex, c.relhasrules, c.relhastriggers, c.relhasoids, {suffix}, c.reltablespace, @@ -925,10 +927,10 @@ def describe_one_table_details(cur, schema_name, relation_name, oid, verbose): false as relispartition FROM pg_catalog.pg_class c LEFT JOIN pg_catalog.pg_class tc ON (c.reltoastrelid = tc.oid) - WHERE c.oid = '{oid}'""" + WHERE c.oid = {oid}""") elif cur.connection.info.server_version >= 80400: - sql = f"""SELECT c.relchecks, + sql = SQL("""SELECT c.relchecks, c.relkind, c.relhasindex, c.relhasrules, @@ -941,10 +943,10 @@ def describe_one_table_details(cur, schema_name, relation_name, oid, verbose): false as relispartition FROM pg_catalog.pg_class c LEFT JOIN pg_catalog.pg_class tc ON (c.reltoastrelid = tc.oid) - WHERE c.oid = '{oid}'""" + WHERE c.oid = {oid}""") else: - sql = f"""SELECT c.relchecks, + sql = SQL("""SELECT c.relchecks, c.relkind, c.relhasindex, c.relhasrules, @@ -957,7 +959,11 @@ def describe_one_table_details(cur, schema_name, relation_name, oid, verbose): false as relispartition FROM pg_catalog.pg_class c LEFT JOIN pg_catalog.pg_class tc ON (c.reltoastrelid = tc.oid) - WHERE c.oid = '{oid}'""" + WHERE c.oid = {oid}""") + + # Calling `as_string()` here only because the query is passed to + # `log.debug()` too. `cur.excute()` can accept the SQL object directly. + sql = sql.format(**params).as_string(cur) # Create a namedtuple called tableinfo and match what's in describe.c @@ -971,8 +977,10 @@ def describe_one_table_details(cur, schema_name, relation_name, oid, verbose): # If it's a seq, fetch it's value and store it for later. if tableinfo.relkind == "S": # Do stuff here. - sql = f"""SELECT * FROM "{schema_name}"."{relation_name}""" - log.debug(sql) + sql = SQL("""SELECT * FROM {schema}.{rel}""").format( + schema=Identifier(schema_name), rel=Identifier(relation_name)) + + log.debug(sql.as_string(cur)) cur.execute(sql) if not (cur.rowcount > 0): return None, None, None, "Something went wrong." ```

Notice that using the SQL, Identifier, Literal objects etc. there is no more the need to first compose a query as string, with placeholders, and then execute it with params (i.e.. cur.execute(sql % sql_params, params)): SQL.format() can take SQL snippets, identifiers, or literal values and will use the right quotes to compose a query (cur.execute(sql.format(**all_params)))

A problem I see here is with testing: what is test coverage, in terms of supporting all database versions and calling all the commands? If it's low it's dangerous. Maybe it's worth dropping support for old Postgres versions? Psycopg doesn't (officially) support PostgreSQL < 10. Certain things work, but it's untested.

amjith commented 1 year ago

Thank you! This is a good catch. I'll get started with this migration.

Regarding our coverage, it looks like we're only testing this against Postgres 10. https://github.com/dbcli/pgspecial/blob/main/.github/workflows/ci.yml#L18

We have a best effort approach with older versions.

j-bennet commented 1 year ago

@dvarrazzo @amjith

We're using postgres:10 image in CI, and in fact, the official Postgres support for it ended in November 2022 (2 months ago), so we should switch to 11 soon.

I'm going to open a PR to add more PostgreSQL versions into the CI build matrix. I think it makes sense to at least test with oldest and newest officially supported by Postgres (11 and 15), but we can add more.