Open rumbin opened 1 year ago
Hello
Thanks for addressing this issue!
According to the dbt docs, state:modified takes into account all types of state change, including config changes. Unfortunately, as dbt states in the caveats for the state:modified selector, it does not play well with variables. This would have to be checked with dbt-core, but I assume the same caveat applies to target (since it is a variable behind the scenes).
Therefore, this seems to be the expected behavior.
Summary
The `external_location` config setting is respected during state comparison when running dbt with the `--state ... -s state:modified` option. This is troublesome in CI scenarios where we compare a development branch to the `manifest.json` of the production target. As a consequence, dbt considers every model that uses the `external_location` setting in a target-specific manner to be modified.
Expected behavior
The model should not be considered changed if only the dynamically set `external_location` differs.
We use many other target-specific settings, e.g. custom schemas or databases, and all of them play nicely with `state:modified`. So, the observed behavior is unexpected in my eyes.
How to reproduce
Use two targets, `dev` and `prod`.
dbt compile --target prod
dbt ls --state target -s state:modified --target dev
Then this model will be shown as modified.
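To see which config keys actually differ between the two compiled states, one option is to diff the two manifests directly. The following is only a diagnostic sketch, not how dbt itself implements `state:modified`. It assumes that `manifest.json` has a top-level `nodes` mapping and that each node carries its rendered settings in a `config` dictionary, including `external_location`. The function names, node id, and bucket paths are hypothetical.

```python
import json
from pathlib import Path


def load_nodes(manifest_path):
    """Return the "nodes" mapping of a manifest.json file (assumed layout)."""
    with Path(manifest_path).open(encoding="utf-8") as fh:
        return json.load(fh).get("nodes", {})


def differing_config_keys(old_nodes, new_nodes):
    """Map each node id present in both manifests to the config keys that differ."""
    result = {}
    for node_id, new_node in new_nodes.items():
        old_node = old_nodes.get(node_id)
        if old_node is None:
            continue  # node only exists in the new manifest, nothing to compare
        old_cfg = old_node.get("config", {})
        new_cfg = new_node.get("config", {})
        keys = sorted(
            key
            for key in old_cfg.keys() | new_cfg.keys()
            if old_cfg.get(key) != new_cfg.get(key)
        )
        if keys:
            result[node_id] = keys
    return result


# Hypothetical in-memory stand-ins for the prod and dev manifests.
prod_nodes = {
    "model.my_project.pii_customers": {
        "config": {
            "materialized": "table",
            "external_location": "s3://example-pii-bucket/prod/pii_customers/",
        }
    }
}
dev_nodes = {
    "model.my_project.pii_customers": {
        "config": {
            "materialized": "table",
            "external_location": "s3://example-pii-bucket/dev/ci_1234/pii_customers/",
        }
    }
}

print(differing_config_keys(prod_nodes, dev_nodes))
# {'model.my_project.pii_customers': ['external_location']}
```

With real files, `load_nodes()` can be pointed at a copy of the prod manifest and at the manifest produced for the dev target. If `external_location` is the only key reported for a model, the target-specific path is the likely reason the model is flagged.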
Use case
For our PII data, we use the `external_location` setting to ensure that this data is written to a different S3 bucket, so we can apply a bucket-specific access control policy.
A typical model config then looks as follows:
And the corresponding `get_pii_external_location_path()` is defined as:
So we always use the same designated PII bucket but tweak the paths depending on `target.name` and an environment variable, `DBT_SCHEMA_NAME`, so concurrent CI runs don't clash.
If we now compile the project for `--target prod` and then compare what would be run in `--target dev`, we get all the PII models detected as changed: