Codd is a simple-to-use CLI tool that applies plain postgres SQL migrations atomically with strong and automatic cross-environment schema equality checks.
Allow some level of customization of schema equality checks #167
From https://github.com/mzabani/codd/issues/139, and from my own experience as well, it's clear users will want to customise schema equality checks and sometimes even what codd writes to disk for representations. What kinds of customization users may want is really hard to know in advance, so we need some flexible mechanism. In https://github.com/mzabani/codd/issues/139#issuecomment-1657174969 I had what might be a fruitful idea to address this issue:
"To ignore column order, for instance, it'd be something like codd [up|verify-schema] --ignore "schemas/*/tables/*/cols/*:.order". The part before the colon references one or more files and the part after it is a jq-esque filter to choose what to ignore. Like you suggested, this option could be used multiple times (the comma separation might be tricky for this one)."
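To make the "part before the colon / part after the colon" idea concrete, here is a minimal Python sketch of how such an argument could be split and matched against on-disk representation paths. The function names and the segment-by-segment matching rule are assumptions made for illustration; nothing here is codd's actual behaviour.

```python
from fnmatch import fnmatchcase


def parse_rule(rule: str) -> tuple[str, str]:
    """Split '<path-glob>:<filter>' at the FIRST colon (hypothetical syntax).

    Splitting at the first colon lets the filter itself contain colons,
    which jq-style object literals such as {key: .key} do.
    """
    glob, sep, filt = rule.partition(":")
    if not sep or not glob or not filt:
        raise ValueError(f"expected '<path-glob>:<filter>', got {rule!r}")
    return glob, filt


def path_matches(glob: str, path: str) -> bool:
    """Match one path segment at a time so that '*' never crosses a '/'."""
    glob_parts, path_parts = glob.split("/"), path.split("/")
    return len(glob_parts) == len(path_parts) and all(
        fnmatchcase(part, pattern) for pattern, part in zip(glob_parts, path_parts)
    )


glob, filt = parse_rule("schemas/*/tables/*/cols/*:.order")
```

With this rule, `path_matches(glob, "schemas/public/tables/employee/cols/name")` holds, while a path under `constraints/` does not match.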
Let's put this idea to the test with concrete use cases, supposing codd has two environment variables, CODD_DB_SCHEMA_TRANSFORM and CODD_SCHEMA_COMPARISON_TRANSFORM, each representing a set of jq-esque transformations that codd will apply. The DB ones will be applied to representations extracted from the database in every operation, including codd add/write-schema/up/verify-schema. I expect (and hope) this will be the most used option for a lot of things.
And if users want to keep representations more "full" on disk but be more lax when comparing them, they can use CODD_SCHEMA_COMPARISON_TRANSFORM, which is applied only when comparing expected and actual schemas, and is applied to both of them.
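The difference between the two variables is only where in the pipeline each transformation runs. A rough Python sketch, assuming a schema is a mapping from file path to a JSON object and that transformations are plain functions (all names below are hypothetical):

```python
from typing import Callable

Schema = dict[str, dict]                 # file path -> JSON representation (assumed shape)
Transform = Callable[[Schema], Schema]


def schema_to_write(extracted: Schema, db_transform: Transform) -> Schema:
    """What would land on disk: only the DB transform is applied."""
    return db_transform(extracted)


def schemas_match(extracted: Schema, on_disk: Schema,
                  db_transform: Transform, comparison_transform: Transform) -> bool:
    """The comparison transform is applied to BOTH sides, and only when comparing."""
    actual = comparison_transform(db_transform(extracted))
    expected = comparison_transform(on_disk)
    return actual == expected


def identity(schema: Schema) -> Schema:
    return schema


def drop_order(schema: Schema) -> Schema:
    """Example comparison transform: ignore the 'order' attribute everywhere."""
    return {path: {k: v for k, v in rep.items() if k != "order"}
            for path, rep in schema.items()}


on_disk = {"schemas/public/tables/t/cols/a": {"type": "int4", "order": 1}}
from_db = {"schemas/public/tables/t/cols/a": {"type": "int4", "order": 2}}

strict = schemas_match(from_db, on_disk, identity, identity)
lax = schemas_match(from_db, on_disk, identity, drop_order)
```

Here `strict` is false because the column order differs, while `lax` is true, and the file on disk still keeps its `order` attribute.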
How would we for instance emulate CODD_SCHEMA_ALGO="strict-collations"?
For the default ignore option in codd, we'll need to add a new is_range_ctor: bool field to the representation of functions, so that we can differentiate range constructors from every other function.
The transformation in codd would then be:
CODD_DB_SCHEMA_TRANSFORM=schemas/*/routines/*:if (.is_range_ctor) then del(.owner,.privileges) * { privileges: .privileges | to_entries | map({key:.key,value:.value | map([.[0],""])}) | from_entries } else . end
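For readers less used to jq, this is roughly what that filter does, written as a Python function. It assumes `privileges` is an object mapping a role name to a list of two-element arrays; that shape is inferred from the filter itself, not taken from codd's actual representation, and the function name is made up.

```python
def relax_range_ctor(routine: dict) -> dict:
    """Python rendering of the jq filter above (assumed representation shape)."""
    if not routine.get("is_range_ctor"):
        return routine  # the `else .` branch: every other function is left untouched

    # del(.owner, .privileges)
    relaxed = {k: v for k, v in routine.items() if k not in ("owner", "privileges")}

    # Re-add privileges, keeping the first element of each entry and
    # blanking the second one: map([.[0], ""])
    relaxed["privileges"] = {
        role: [[entry[0], ""] for entry in entries]
        for role, entries in routine["privileges"].items()
    }
    return relaxed


ctor = {
    "is_range_ctor": True,
    "name": "int4range",
    "owner": "postgres",
    "privileges": {"app": [["EXECUTE", "postgres"]]},
}
relaxed_ctor = relax_range_ctor(ctor)
```

`relaxed_ctor` no longer has an `owner` key, and its `privileges` become `{"app": [["EXECUTE", ""]]}`.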
Array types and row types: could codd write these to disk and ignore entire files by default?
I think it wouldn't be unreasonable to treat a pure null value as a file to be deleted. And in that case, the default ignore option for array types and row types would be something like (assuming attributes exist to identify these kinds of types):
CODD_DB_SCHEMA_TRANSFORM=schemas/*/types/*:if (.is_array_type or .is_row_type) then null else . end
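A small Python sketch of the "null means delete the file" convention, assuming representations are held as a mapping from file path to JSON object. The helper names are hypothetical, and `is_array_type` / `is_row_type` are the assumed attributes mentioned above.

```python
from typing import Callable, Optional


def apply_to_files(files: dict[str, dict],
                   transform: Callable[[dict], Optional[dict]]) -> dict[str, dict]:
    """Apply a transform to every file; a None (null) result drops the file."""
    result = {}
    for path, rep in files.items():
        new_rep = transform(rep)
        if new_rep is not None:
            result[path] = new_rep
    return result


def ignore_array_and_row_types(rep: dict) -> Optional[dict]:
    """Equivalent of: if (.is_array_type or .is_row_type) then null else . end"""
    if rep.get("is_array_type") or rep.get("is_row_type"):
        return None
    return rep


types = {
    "schemas/public/types/mood": {"is_array_type": False, "is_row_type": False},
    "schemas/public/types/_mood": {"is_array_type": True, "is_row_type": False},
    "schemas/public/types/employee": {"is_array_type": False, "is_row_type": True},
}
kept = apply_to_files(types, ignore_array_and_row_types)
```

Only `schemas/public/types/mood` survives in `kept`; the array type and the row type are treated as files that do not exist.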
Problem:
https://github.com/jqlang/jq/issues/2829 mentions there is no documented or stable API in libjq. We might want to consider JMESPath or some other JSON query language. I found no query language libraries on Hackage, so linking against an existing library might be the way to go.
I couldn't find a C library for JMESPath, but there's a Rust one. It apparently doesn't expose a C API, but there's an article on Haskell<->Rust FFI suggesting we should be able to do this. I suspect it would mean another compiler/toolchain when compiling codd.