scttnlsn / sql-critic

Capture and analyze SQL queries made during a run of your app's test suite
0 stars 0 forks source link

Seq scan analyzer #2

Closed scttnlsn closed 1 year ago

scttnlsn commented 1 year ago

Does an EXPLAIN on the query to determine if there are any potential sequential scans.

Scratch code I was playing around with for PostgreSQL:

import psycopg2
import json
import re

conn = psycopg2.connect("postgresql://postgres@localhost:5432/test_schema")

cur = conn.cursor()

with open("schema.sql") as f:
    sql = f.read()

# cur.execute(
#     """
#     DROP SCHEMA public CASCADE;
#     CREATE SCHEMA public;
#     GRANT ALL ON SCHEMA public TO postgres;
#     GRANT ALL ON SCHEMA public TO public;
#     """
# )

cur.execute(sql)

# Postgres might not use an index when there's not much (no) data
cur.execute("SET enable_seqscan = OFF;")

# FIXME: what if a different schema is being used?
cur.execute("SET search_path TO public;")

query = """
SELECT * FROM foo;
"""

# find the number of parameters in the query
r = re.compile(r"(\$\d+)")
n = len(r.findall(query))

# `unknown` type for each parameter
args = ", ".join(["unknown"] * n)

cur.execute("SET plan_cache_mode = force_generic_plan;")
cur.execute(f"PREPARE stmt({args}) AS {query}")
cur.execute(f"EXPLAIN (FORMAT JSON) EXECUTE stmt(NULL, NULL, NULL)")

res, = cur.fetchone()
print(json.dumps(res))

cur.execute("DEALLOCATE stmt;")

conn.rollback()
conn.close()
scttnlsn commented 1 year ago

Probably makes sense to just connect to the testing database after the tests run. Since this is run on a specific SHA we'll probably need to store the results somewhere for a later comparison.

scttnlsn commented 1 year ago

Got the basics of this going in 8f86043e0ab8901d051f81fb8528e7c3b377446b (among other commits)