Closed gouline closed 2 years ago
Hey @gouline - kind of funny we didn't already have an issue for this one :)
We've kicked around the idea of providing a public API for dbt in Python a couple of times now. I'm happy for us to add a public method like dbt.main.execute(command='run', profiles_dir=PROFILES_DIR, ...) which implements logic similar to handle_and_check.
I'd still like to provide a rich Python-based interface for running dbt projects in the future. I think that would entail model selection, configuration, execution, and so on. In this case though, I think a top-level method like execute (or similar) gets us moving in the right direction.
Thanks for raising this!
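One way to read the proposed execute(command='run', profiles_dir=PROFILES_DIR, ...) signature is as a thin layer that turns keyword arguments into the argument list that handle_and_check already accepts. The following is only a sketch of that conversion: build_argv is a hypothetical helper, not part of dbt, and the rule that maps an option name to a flag (underscores become dashes) is an assumption.

```python
from typing import Any, List


def build_argv(command: str, **options: Any) -> List[str]:
    """Translate keyword options into a CLI-style argument list.

    Hypothetical helper for illustration; not part of dbt.
    """
    # Splitting allows multi-word commands such as "docs generate".
    argv = command.split()
    for name, value in options.items():
        flag = "--" + name.replace("_", "-")
        if value is None or value is False:
            continue  # option not set, emit nothing
        if value is True:
            argv.append(flag)  # boolean switch with no value
        else:
            argv.extend([flag, str(value)])
    return argv


print(build_argv("run", profiles_dir="/path/to/profiles"))
```

The resulting list could then be handed to whatever entry point accepts raw arguments, which keeps the keyword-based API separate from the argument parser.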
Hello, I'd like to help out on this issue. Any pointers on how this should be implemented?
Should it just pass on the args to the parser?
This one is pretty old. Is there any progress on this?
Is there something I/we could help with?
sys.exit() raises a SystemExit exception, so we can use a try-except statement to handle this situation.
Here is an example:
https://github.com/apache/airflow/blob/8505d2f0a4524313e3eff7a4f16b9a9439c7a79f/airflow/cli/commands/config_command.py#L40-L44
https://github.com/apache/airflow/blob/8505d2f0a4524313e3eff7a4f16b9a9439c7a79f/tests/cli/commands/test_config_command.py#L60-L80
To capture stdout/stderr, we can use the contextlib.redirect_stdout / contextlib.redirect_stderr context managers.
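import contextlib, io
import dbt.main
with contextlib.redirect_stdout(io.StringIO()) as temp_stdout:
    dbt.main.main(argv)  # __import__('dbt.main') returns the top-level dbt package, so call the module's main() explicitly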
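The two ideas above, catching SystemExit and redirecting stdout, can be combined into one wrapper. This is a minimal sketch that does not import dbt: call_cli_entrypoint is a hypothetical helper, and fake_main is a stand-in for any CLI-style main() that prints and then calls sys.exit().

```python
import contextlib
import io
import sys
from typing import Callable, List, Tuple


def call_cli_entrypoint(
    entrypoint: Callable[[List[str]], None], argv: List[str]
) -> Tuple[int, str]:
    """Run a CLI-style function and return (exit_code, captured_stdout)."""
    buffer = io.StringIO()
    exit_code = 0
    try:
        with contextlib.redirect_stdout(buffer):
            entrypoint(argv)
    except SystemExit as exc:
        # sys.exit(code) raises SystemExit; code may be None, an int, or a message.
        if exc.code is None:
            exit_code = 0
        elif isinstance(exc.code, int):
            exit_code = exc.code
        else:
            exit_code = 1
    return exit_code, buffer.getvalue()


def fake_main(argv: List[str]) -> None:
    """Stand-in for a CLI main() that prints and then exits (not dbt code)."""
    print("running " + " ".join(argv))
    sys.exit(2)


code, output = call_cli_entrypoint(fake_main, ["run"])
print(code, repr(output))
```

Note that redirect_stdout only captures text written through sys.stdout while the block is active; output from loggers that hold a reference to the original stream, or from child processes, would not be captured this way.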
+1 Is there any progress on this?
I suspect there won't be much progress on this now that dbt Cloud's a thing.
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.
Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest; add a comment to notify the maintainers.
+1
This is a fairly old issue. We are finally making progress in this direction, providing a programmatic API into dbt Core: https://github.com/dbt-labs/dbt-core/issues/5527
The ability to invoke dbt as a module (instead of as a CLI script) isn't explicitly in scope for that initiative, but the big idea, providing a more sensible "main" method as the entry point to Core execution, certainly is.
Any updates on this? It seems like it would be easy to implement, given that all of dbt-core is already written in Python.
@manugarri Check out:
This issue has better Google results than the actual feature's docs, so I'm going to leave a pointer here: https://docs.getdbt.com/reference/programmatic-invocations
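For readers landing here, a minimal sketch of what that page describes, assuming dbt-core 1.5 or later, where dbtRunner is importable from dbt.cli.main and invoke() returns a result with success and exception attributes. The run_dbt wrapper and its runner parameter are hypothetical additions, included so the function can be exercised without dbt installed.

```python
from typing import Any, List, Optional


def run_dbt(cli_args: List[str], runner: Optional[Any] = None) -> Any:
    """Invoke dbt in-process and raise if the invocation did not succeed.

    Hypothetical wrapper; only dbtRunner and invoke() come from dbt itself.
    """
    if runner is None:
        # Imported lazily so this module loads even where dbt-core is absent.
        from dbt.cli.main import dbtRunner

        runner = dbtRunner()
    result = runner.invoke(cli_args)
    if not result.success:
        # result.exception is expected to be None when the command ran
        # but some nodes failed.
        raise RuntimeError(f"dbt {' '.join(cli_args)} failed: {result.exception}")
    return result
```

Calling run_dbt(["run"]) followed by run_dbt(["docs", "generate"]) would cover the two-command case from the original request, with failures surfacing as exceptions rather than a process exit.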
Describe the feature
It would be nice to have a straightforward way of running dbt as a Python module.
To give some context, I use my own build tool https://github.com/gouline/molot that's written in Python and provides basic support for targets, dependencies and arguments. Something half way between Make and Gradle that helps with CI configuration. Things that can be done in Python (e.g. boto3, snowflake-connector-python) are done natively, everything else it just calls shell commands in a subprocess like a Makefile would.
So I'd like a way to invoke multiple dbt commands (e.g. run and docs generate) in one target by just importing dbt.main and calling a function multiple times. The way things are now, main() calls sys.exit(), which exits my outer build as well, and handle_and_check() requires raw command-line arguments passed manually as a list. Ideally, there should be a function of the form dbt.main.execute(command='run', profiles_dir=PROFILES_DIR, ...) that would execute a command and throw any exceptions for the caller to handle.
Describe alternatives you've considered
My current workaround is to invoke the dbt command as a subprocess, but that seems like a backward way of doing it, considering that both are Python applications.
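For comparison, the subprocess workaround might look roughly like the sketch below. run_command is a hypothetical helper; the Python interpreter is used as the child process only so the example runs without dbt installed.

```python
import subprocess
import sys
from typing import List


def run_command(args: List[str]) -> str:
    """Run a command in a subprocess and return its stdout.

    check=True raises subprocess.CalledProcessError on a non-zero exit code.
    """
    completed = subprocess.run(args, capture_output=True, text=True, check=True)
    return completed.stdout


# With dbt on the PATH this could be run_command(["dbt", "run"]); the
# interpreter is used here as a placeholder child process.
print(run_command([sys.executable, "-c", "print('ok')"]))
```

This isolates the outer build from sys.exit() in the child, at the cost of a new process per command and errors that arrive as exit codes and text rather than Python exceptions.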
Additional context
Closest thing I could find on the issue tracker was https://github.com/fishtown-analytics/dbt/issues/1488, but that sounds more complex than what I'm proposing.
Who will this benefit?
This would benefit anyone who orchestrates their builds with Python scripts.