Closed kaaveland closed 6 months ago
I think we can probably do all of the hints we have right now, if we accept some false positives and we make some of the hints optional (eg. we can look for SET [LOCAL] lock_timout
, we can figure out that ALTER TABLE
takes AccessExclusiveLock
, that CREATE INDEX
takes ShareLock
, and that some statements ran after and we can let the user ignore f. ex the lock timeout thing via parameters or configuration. I think it's worth taking a stab at this. It would make it really easy to run the hints in a CI/CD pipeline, or a commit-hook, which may make eugene
much easier to start using.
Been taking a stab at working with the parse trees from pg_query, and I think it's very clunky. The grammar is huge and because of the structure, there's so much if let
going on. I think the best way forward here is to translate the parse tree to a smaller grammar that contains only the things we care about, then make rules for that.
Here's a small example for accessing the type of an alter table ... alter column .. set type
kind of statement, which is incomplete:
fn alters_table_lints(node: &pg_query::NodeRef) -> Vec<String> {
if let pg_query::NodeRef::AlterTableStmt(stmt) = node {
let next = &stmt.cmds;
next.iter()
.filter_map(|child| {
let node_ref = child.node.as_ref()?;
is_alters_type_cmd(&node_ref.to_ref())
})
.collect()
} else {
Vec::new()
}
}
fn is_alters_type_cmd(node: &pg_query::NodeRef) -> Option<String> {
if let pg_query::NodeRef::AlterTableCmd(cmd) = node {
let def_node = cmd.def.as_ref()?.node.as_ref()?;
if let pg_query::NodeRef::ColumnDef(def) = def_node.to_ref() {
if def.type_name.is_some() {
Some(cmd.name.to_string())
} else { None }
} else {
None
}
} else {
None
}
}
This is huge, and perhaps not the right place to start: https://www.postgresql.org/docs/current/sql-altertable.html
I have a 130 line match now, and only barely touching the surface of alter table ...
with column focus.
Like, what goes on the end of set default
or add constraint check
can be anything and it's really a lot of work to do anything like what eugene trace
can do here.
The mvp here is to:
Currently looks feasible to add useful stuff here, although it will be more prone to false positives and there's stuff that we're bound to miss that eugene trace
can pick up. Because of the difference in information that we receive here, I'm not sure we should be reusing the Hint
struct and the associated information, although I think it is probably a good idea to reuse the identifiers and names/headlines of the lints/hints as much as we can.
With #36 we're introducing a dependency to pg_query.rs, which gives us the full postgres SQL parser. It would be interesting to see if we can write some lint rules that do not depend on running the postgres server for some of the hints we currently provide. Here's a list of things we may be able to warn about using the parse tree from pg_query:
NOT VALID
:NOT NULL
CONCURRENTLY
It may also be that we can do some manual inspection of the parse tree to figure out that some statements take dangerous locks, then warn about that. However, many of these will give us false positives that we can't statically verify now, so we probably want to support some sort of comment syntax to let
eugene
ignore the statement, eg.