dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
https://getdbt.com
Apache License 2.0
9.99k stars 1.63k forks source link

[Feature] Add `target_path` to `global_flags` #9575

Open dbeatty10 opened 9 months ago

dbeatty10 commented 9 months ago

Is this your first time submitting a feature request?

Describe the feature

From @graciegoheen here:

Like looking at this page for target-path, I don't see anything for dbt retry - is that expected @ChenyuLInx ?

This is explained by@p.target_path not being included within the flags for dbt retry.

The easiest way to "solve" this would be adding @p.target_path to the dbt retry command here.

But the proposed way to implement this would be to add it to the global_flags and remove it from each of the sub-commands.

Current status

Currently, @p.target_path is included for all sub-commands except for:

image

Describe alternatives you've considered

A couple alternatives:

Who will this benefit?

If dbt retry produces any artifacts to the target directory, then this flag would make that location configurable.

Are you interested in contributing this feature?

No response

Anything else?

This would resolve https://github.com/dbt-labs/dbt-core/issues/8948 also.

dbeatty10 commented 9 months ago

[!TIP] All the << EOF business below is the start of a "here document" -- just a handy way to write dbt model files on the fly for reprex purposes.

dbt retry

Since dbt retry writes files to the target_path directory, it should be configurable via DBT_TARGET_PATH or --target-path just like nearly all other dbt sub-commands.

Acceptance criteria 1

Users should be able to set the DBT_TARGET_PATH environment variable and have it work with subsequent retries:

export DBT_TARGET_PATH=artifacts
cat << EOF > models/my_model.sql
select 1/0 as id
EOF
dbt run -s my_model
cat << EOF > models/my_model.sql
select 1 as id
EOF
dbt retry

Instead, they will get this error:

01:10:28  Encountered an error:
Runtime Error
  Could not find previous run in 'target' target directory

Acceptance criteria 2

Put another way...

Suppose you start with some bad SQL, run it, and then fix it:

dbt clean
rm -rf target
rm -rf artifacts
cat << EOF > models/my_model.sql
select 1/0 as id
EOF
dbt run -s my_model --target-path target
mkdir -p artifacts
mv target/run_results.json artifacts/
rm -rf target
cat << EOF > models/my_model.sql
select 1 as id
EOF

This command works, and it writes output files to the target directory:

dbt retry --state artifacts

We should make it so this command writes output files into the my_target_path directory:

dbt retry --state artifacts --target-path my_target_path

Something to consider

As shown in the logic here:

When these are the same directory, then the newest run_results.json file will overwrite the pre-existing one.

Maybe this is partially why there was the following warning message in 1.7.4 and earlier? Or maybe that warning was there for different reasons?

Warning: The state and target directories are the same: 'target'. This could lead to missing changes due to overwritten state including non-idempotent retries.

This warning was added in https://github.com/dbt-labs/dbt-core/pull/8638. Then the warning disappeared following https://github.com/dbt-labs/dbt-core/pull/9328 (which resolved https://github.com/dbt-labs/dbt-core/issues/9327).

Separate acceptance criteria

In the case of successive executions of dbt retry, we should make sure that it is okay that run_results.json is overwritten.

We should probably move this separate acceptance criteria to its own issue, but leaving it here for sake of expedience.

See below for an example of successive executions of dbt retry. I'm not sure or if it is a sufficient example to show that it's okay for run_results.json to be overwritten each time.

cat << EOF > models/my_model.sql
select 1/0 as id
EOF
dbt run -s my_model
dbt retry
cat << EOF > models/my_model.sql
select 1 as id
EOF
dbt retry
dbeatty10 commented 9 months ago

Decision

@aranke and I discussed this over a video call today and decided to add this to the global_flags and remove it from each of the sub-commands.

Here are the reasons: