Open kokorin opened 1 year ago
@kokorin This is not aligned with our plans for checkpoint. We believe that it should focus on the validation hooks vs adding all dbt commands since you can do this via another step in the GH Action. Is there a reason you would not be able to do that?
Thank you for your reply. Originally we used dbt-checkpoint to validate changed/added models and all upstream models in a pre-push hook. But our project has recently grown significantly, and we now have 600+ dbt nodes, so we stopped running dbt at pre-push. Instead we validate the whole project in CI.
We also validate the whole project in CI, since we are on GitLab + Postgres. In CI we run something like:
pre-commit run --all-files
With some help I could open a PR for this, if you would be open to it?
Would it be sufficient to create a file dbt_checkpoint/dbt_build.py
with something like the code below? (I'm pretty sure I need to tweak it further, for example for the case when a seed or snapshot is changed.)
import argparse
import os
import time
from typing import Any, Dict, List, Optional, Sequence

from dbt_checkpoint.utils import (
    add_config_args,
    add_dbt_cmd_args,
    add_dbt_cmd_model_args,
    add_filenames_args,
    extend_dbt_project_dir_flag,
    get_config_file,
    get_flags,
    paths_to_dbt_models,
    run_dbt_cmd,
)


def prepare_cmd(
    paths: Sequence[str],
    global_flags: Optional[Sequence[str]] = None,
    cmd_flags: Optional[Sequence[str]] = None,
    prefix: str = "",
    postfix: str = "",
    models: Optional[Sequence[str]] = None,
    config: Dict[str, Any] = {},
) -> List[str]:
    global_flags = get_flags(global_flags)
    cmd_flags = get_flags(cmd_flags)
    if models:
        dbt_models = models
    else:
        dbt_models = paths_to_dbt_models(paths, prefix, postfix)
    dbt_project_dir = config.get("dbt-project-dir")
    cmd = ["dbt", *global_flags, "build", "-m", *dbt_models, *cmd_flags]
    return extend_dbt_project_dir_flag(cmd, cmd_flags, dbt_project_dir)


def main(argv: Optional[Sequence[str]] = None) -> int:
    parser = argparse.ArgumentParser()
    add_filenames_args(parser)
    add_dbt_cmd_args(parser)
    add_dbt_cmd_model_args(parser)
    add_config_args(parser)
    args = parser.parse_args(argv)
    config = get_config_file(args.config)
    cmd = prepare_cmd(
        args.filenames,
        args.global_flags,
        args.cmd_flags,
        args.model_prefix,
        args.model_postfix,
        args.models,
        config,
    )
    return run_dbt_cmd(cmd)


if __name__ == "__main__":
    exit(main())
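On the seed/snapshot tweak mentioned above: here is a minimal, standalone sketch of how changed files could be mapped to dbt node names. It is not part of dbt-checkpoint, and the helper names (node_names_for_file, build_command) are hypothetical. It assumes that a model or seed is named after its file, and that a snapshot is named by its {% snapshot ... %} block rather than by its file name.

```python
import re
from pathlib import Path
from typing import List, Mapping

# Matches the opening tag of a snapshot block, e.g. "{% snapshot orders_snapshot %}".
# "{% endsnapshot %}" does not match, because "snapshot" must directly follow "{%".
SNAPSHOT_BLOCK = re.compile(r"\{%-?\s*snapshot\s+(\w+)\s*-?%\}")


def node_names_for_file(path: str, contents: str = "") -> List[str]:
    """Return the dbt node names a changed file is assumed to define (hypothetical helper)."""
    file_path = Path(path)
    if file_path.suffix == ".csv":
        # Assumption: a seed is named after its CSV file.
        return [file_path.stem]
    if file_path.suffix == ".sql":
        # Assumption: snapshot names come from the block tag, not the file name.
        snapshot_names = SNAPSHOT_BLOCK.findall(contents)
        # Otherwise treat the file as a model named after the file.
        return snapshot_names or [file_path.stem]
    # Anything else (YAML, Markdown, ...) is ignored in this sketch.
    return []


def build_command(changed_files: Mapping[str, str]) -> List[str]:
    """Assemble a `dbt build --select ...` command for the changed files."""
    names: List[str] = []
    for path, contents in changed_files.items():
        for name in node_names_for_file(path, contents):
            if name not in names:  # keep order, drop duplicates
                names.append(name)
    return ["dbt", "build", "--select", *names]
```

This only builds the argument list; running it, and handling prefixes/postfixes and the project directory, would still go through the existing helpers in the proposed hook.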
I now see that this was never merged, even though there was a PR for it: https://github.com/dbt-checkpoint/dbt-checkpoint/pull/152. Would you be willing to reconsider, @noel?
Describe the feature you'd like
In our dbt project we use the snapshot feature. On top of the snapshots we build models (views) that contain monthly/weekly data. Snapshots can be built with the dbt snapshot and dbt build commands. Currently it is not possible to create snapshots with pre-hooks, so every user has to run the dbt build command manually before their first commit.
Additional context
I think it makes sense to implement pre-hooks for both dbt snapshot and dbt build.