Open b-per opened 1 year ago
Thanks @b-per! There's been previous discussion about this:
The most recent comment (from February) has the exact same request & use case, and I think I buy it. When you have multiple developers working on a single "feature" over the course of a few weeks, you may want them to:
- … `dev_feature_xyz`)
- … `target.schema`, so it doesn't clobber their concurrent work on other features (e.g. `dev_jerco_feature_xyz`)
dbt <> git interaction. dbt doesn't install git as a Python package dependency; it just uses the `git` available in the OS, and shells out to it inside a subprocess. This can get pretty gross. Right now, all `git` interactions are limited to `dbt deps`, and quite unrelated to all other dbt functionality. But if we were to start running `git` commands as part of resolving Jinja context methods... There is `pygit2`, which might be a better / lighter-weight way to do something as simple as "tell me the current branch name"?
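To make the "shell out to `git` in a subprocess" approach concrete, here is a minimal sketch of what resolving the branch name could look like. The function name `current_git_branch` is hypothetical and not part of dbt; the `git symbolic-ref --short HEAD` command is the same one used in the justfile elsewhere in this thread. It returns an empty string when git is missing, the directory is not a repository, or HEAD is detached.

```python
import subprocess


def current_git_branch(repo_dir: str = ".") -> str:
    """Return the checked-out branch name, or "" if it cannot be determined."""
    try:
        result = subprocess.run(
            ["git", "symbolic-ref", "--short", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        # git is not installed, or repo_dir does not exist.
        return ""
    if result.returncode != 0:
        # Not a git repository, or HEAD is detached.
        return ""
    return result.stdout.strip()
```

Swallowing every failure into an empty string is one possible design choice; it keeps a Jinja context lookup from raising when dbt runs outside a repository.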
Partial parsing. I think this could have some wacky interactions with partial parsing. If you change your git branch, and you use the `git_branch` variable in your custom `generate_schema_name` macro (which is resolved at parse time), in order to re-resolve all those schema configs, either:
- … `--no-partial-parse`)
- … `generate_schema_macro` depends on this variable, that the variable has been modified, and then triggers a full re-parse accordingly
I am a bit stuck on the partial parsing side of things but the rest (feature and testing) should be OK.
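One way to picture the "detect that the branch changed, then force a full re-parse" option is a small wrapper outside dbt that remembers the last branch it saw. This is only a sketch under assumptions: `parse_flags_for_branch` and the state file are hypothetical, and nothing here reflects how dbt's partial parser works internally. The only real dbt piece is the `--no-partial-parse` flag mentioned above.

```python
from pathlib import Path


def parse_flags_for_branch(branch: str, state_file: Path) -> list[str]:
    """Return extra dbt CLI flags, forcing a full parse when the branch changed.

    The last-seen branch is stored in state_file (a hypothetical location).
    An unknown previous branch is treated as a change, to stay on the safe side.
    """
    previous = state_file.read_text().strip() if state_file.exists() else None
    state_file.write_text(branch)
    if previous != branch:
        return ["--no-partial-parse"]
    return []
```

A wrapper script could append the returned flags to its `dbt run` invocation, so partial parsing stays enabled as long as the branch has not changed.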
I have also added a `git_sha` variable for the latest sha based on `git log`. I think that it could be useful in `query_comment`, etc.
Thinking about it now, would we want to add this info in `run_results.json` and/or `manifest.json` as well? Would we want to have the branch information in the Metadata API for example?
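As an illustration of how a `git_sha` value might feed a query comment, the sketch below reads the latest commit with `git log` and embeds it in a small JSON payload. Both function names and the payload keys are hypothetical; dbt's actual `query_comment` is configured in the project, not built by Python code like this.

```python
import json
import subprocess


def latest_git_sha(repo_dir: str = ".") -> str:
    """Return the full sha of the latest commit, or "" if unavailable."""
    try:
        result = subprocess.run(
            ["git", "log", "-1", "--format=%H"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        # git is not installed, or repo_dir does not exist.
        return ""
    return result.stdout.strip() if result.returncode == 0 else ""


def build_query_comment(sha: str) -> str:
    """Serialize a hypothetical query-comment payload that carries the sha."""
    return json.dumps({"app": "dbt", "git_sha": sha})
```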
I'm commenting here since I was tagged on the PR. First off, thanks for opening this discussion and a corresponding PR, @b-per!
The big question I have from reading this issue is:
Why is saving the Git branch in an `env_var` (similar to https://stackoverflow.com/a/10915331) not a viable alternative?
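For reference, the `env_var` alternative could be sketched as a launcher that exports the branch before dbt starts. The variable name `DBT_GIT_BRANCH` and the helper `env_with_branch` are assumptions made for illustration; a macro would then read the value with `{{ env_var('DBT_GIT_BRANCH', '') }}`.

```python
import os


def env_with_branch(branch: str, base_env=None) -> dict:
    """Return a copy of the environment with the branch exported for dbt.

    DBT_GIT_BRANCH is a hypothetical variable name chosen for this sketch.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["DBT_GIT_BRANCH"] = branch
    return env
```

A wrapper would pass the result as `env=` to `subprocess.run(["dbt", "run"], ...)`. The drawback raised in this thread still applies: the wrapper has to be used, or the variable re-exported, every time the branch changes.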
A few other considerations from an engineer's perspective (more for @jtcohen6):
- … `dbt-core` and be responsible for vendoring and distributing it?
- … `dbt-core` and `git` closer together, or do we want to keep them more independent?
And one more thing: can we experiment with using Dulwich in the PR? From the homepage:
Dulwich is a Python implementation of the Git file formats and protocols, which does not depend on Git itself.
All functionality is available in pure Python. Optional C extensions can be built for improved performance.
This might be a way to mitigate the `dbt <> git interaction` called out above, but will probably be much slower (and maybe that's ok?).
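To show what "not depending on Git itself" can mean for the narrow case of a branch name, here is a pure-Python sketch that reads `.git/HEAD` directly. It does not use Dulwich or pygit2, and `branch_from_head_file` is a hypothetical helper. It relies on HEAD containing `ref: refs/heads/<branch>` when a branch is checked out, and it assumes `.git` is a directory, so it returns an empty string for worktrees and submodules where `.git` is a file.

```python
from pathlib import Path


def branch_from_head_file(repo_dir: str = ".") -> str:
    """Read the branch name from .git/HEAD without invoking git.

    Returns "" when there is no readable .git/HEAD or HEAD is detached.
    """
    head = Path(repo_dir) / ".git" / "HEAD"
    try:
        content = head.read_text().strip()
    except OSError:
        # No repository here, or .git is a file rather than a directory.
        return ""
    prefix = "ref: refs/heads/"
    if content.startswith(prefix):
        return content[len(prefix):]
    # Detached HEAD: the file holds a commit sha instead of a ref.
    return ""
```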
I'm excited to see where this discussion goes and what solution we come up with!
My personal takes:
- … `pygit2` seems to have a very lightweight requirements list on its own so I don't think that this would create a dependency mess
Hi there, I'm looking for the exact feature described here. The goal is to generate models in schemas created based on the current git branch name, without setting the variable manually after switching to another branch. @b-per Did you guys figure out anything on this topic? I would be grateful for any clue or info about the status of this enhancement.
Hi, I'm here from searching for "dbt run set target to current git branch".
Looks like there isn't a native way to do this interactively. In GitHub Actions CI/CD the pull step is going to select a branch, and that will be accessible as a variable for dbt commands, so yeah, my use case is also the REPL development workflow.
Here's how I solve it with the just.systems task runner:
I have a just environment that computes the current git branch name, which also matches the names of my targets.
# ./justfile
set positional-arguments := true
git_branch := `git symbolic-ref --short HEAD`
# select dev branch if not on main or stable (ie feature branches, etc)
branch := if (git_branch) == "main" { "main" } else if (git_branch) == "stable" { "stable" } else { "dev" }
current_git_commit := `git rev-parse HEAD | cut -c 1-8`
# just --list
_default:
    @just --list --unsorted
# dbt run --target <git branch name is inserted here> <the rest of your command>
run *args:
    dbt run --target {{branch}} "$@"
The `just` commands insert the target flag and branch name ahead of my dbt run commands.
# currently 'dev' branch
just run
# dbt run --target dev "$@"
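For readers who don't use `just`, the same branch-to-target mapping can be sketched in Python. This mirrors the justfile's rule (`main` and `stable` map to themselves, everything else to `dev`); the function names are hypothetical and the target names are the ones from this comment, not dbt defaults.

```python
def target_for_branch(git_branch: str) -> str:
    """Map a branch to a dbt target: main and stable keep their name, the rest use dev."""
    return git_branch if git_branch in ("main", "stable") else "dev"


def dbt_run_command(git_branch: str, *args: str) -> list[str]:
    """Build the argument list for `dbt run` with the target inserted first."""
    return ["dbt", "run", "--target", target_for_branch(git_branch), *args]
```

The resulting list can be handed to `subprocess.run`, with the branch name obtained however the wrapper prefers.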
Is this your first time submitting a feature request?
Describe the feature
I would like to be able to access the current git branch name from the dbt context in order to be able to run some Jinja code depending on its value.
If we had `current_branch` available as a Jinja variable, we could potentially generate models in schemas/databases (by adding some logic in `generate_schema_name()`) that depend on this branch name.
This could be useful for longer-lived branches where multiple developers work on the same feature.
We would need to handle the case where dbt is run on code that is not a git repo/branch and maybe return an empty value then.
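The kind of logic that could sit behind a branch-aware `generate_schema_name()` might look like the sketch below, written in Python rather than Jinja for clarity. The naming scheme (`<target schema>_<sanitized branch>`) and the function `schema_for_branch` are assumptions for illustration. An empty branch, as in the non-git case described above, falls back to the plain target schema.

```python
import re


def schema_for_branch(target_schema: str, branch: str) -> str:
    """Derive a schema name from the target schema and the current branch.

    Falls back to target_schema when the branch is empty (e.g. not a git repo).
    """
    if not branch:
        return target_schema
    # Keep only characters that are safe in an unquoted schema name.
    suffix = re.sub(r"[^a-z0-9_]+", "_", branch.lower()).strip("_")
    return f"{target_schema}_{suffix}" if suffix else target_schema
```

For example, `schema_for_branch("dev", "feature_xyz")` yields `dev_feature_xyz`, matching the shared-schema naming mentioned in the discussion.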
Describe alternatives you've considered
People could potentially define a `var` in `dbt_project.yml` to hard-code what branch they are on. This variable would then have the same value as the current branch, but the drawbacks are that:
- … `var` is defined as each branch would define its own value
Who will this benefit?
Are you interested in contributing this feature?
Yes!
Anything else?
No response