Open keraion opened 3 months ago
Nothing has changed with regard to exceptions in dbt, so I'm guessing that something changed in sqlfluff. Did sqlfluff not previously run dbt in a multi-processing context? Last time somebody reported a sqlfluff multi-processing related issue, it was just loading the code, not attempting to execute dbt commands.
I'm not sure this is "new" per se, but rather newly discovered when hitting dbt exceptions within multiprocessing. The pre-commit hooks for sqlfluff default to running in multi-process mode, which has been seeing more usage. You can see the __reduce__ pattern on CPython's JSONDecodeError.
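To make the pattern concrete, here is a minimal sketch of why an exception whose __init__ takes several required arguments fails to round-trip through pickle, and how a __reduce__ in the style of JSONDecodeError avoids that. The class names and the node/msg arguments are hypothetical and are not taken from dbt's code.

```python
import pickle


class BrokenError(Exception):
    """Hypothetical exception: two required args, one formatted message in self.args."""

    def __init__(self, node, msg):
        self.node = node
        self.msg = msg
        super().__init__(f"{msg} in {node}")


class FixedError(BrokenError):
    """Same exception, but it tells pickle which arguments rebuild it."""

    def __reduce__(self):
        # Mirrors the JSONDecodeError approach: return the class and the
        # original constructor arguments instead of self.args.
        return (self.__class__, (self.node, self.msg))


def round_trips(exc):
    """Return True if exc survives pickle.dumps followed by pickle.loads."""
    try:
        pickle.loads(pickle.dumps(exc))
        return True
    except TypeError:
        # Default unpickling calls cls(*exc.args), which is missing `msg`.
        return False


if __name__ == "__main__":
    print(round_trips(BrokenError("model_a", "boom")))
    print(round_trips(FixedError("model_a", "boom")))
```

Note that the failure happens on load, not on dump, so in a multiprocessing pool it would surface in the parent process while it reads the worker's result.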
Hey @keraion, thank you for taking the time to diagnose and open this issue.
dbt does not officially support parallel execution, and it would be quite a large undertaking to do so. We'd like to get there gradually, and it sounds like implementing __reduce__ on our exceptions could be a step along the way, but we still wouldn't guarantee safe multiprocessing support at that point.
I'm going to tag this as help_wanted to indicate this isn't something the maintainer team is prioritizing but would be open to an external contribution towards.
Is this a new bug in dbt-core?
Current Behavior
While debugging sqlfluff/sqlfluff#6037, dbt appears to hang if a dbt exception is raised. The exception apparently cannot be pickled, which prevents further execution.
Expected Behavior
The exceptions should implement __reduce__ to allow pickling and prevent hanging.
Steps To Reproduce
For these reproduction steps I'm using dbt-duckdb, but this applies to all adapters.
Using the example models, make the first model raise a compilation error:
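Call dbt run from a Python multiprocessing context.

    import multiprocessing as mp
    from dbt.cli.main import cli

    def run_dbt():
        # Build a click context for `dbt run` and invoke it in this process.
        ctx = cli.make_context(cli.name, ["run"])
        cli.invoke(ctx)

    if __name__ == "__main__":
        with mp.Pool() as pool:
            pool.apply(run_dbt)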
Environment
Which database adapter are you using with dbt?
other (mention it in "Additional Context")
Additional Context
As noted above, using dbt-duckdb.
The main entry point for this error will most likely be the sqlfluff-templater-dbt
In sqlfluff, monkeypatching __reduce__ prevents the process from hanging.
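A monkeypatch along these lines might look like the sketch below. It is an illustration under assumptions, not the patch sqlfluff actually applies: patch_reduce, _rebuild, and HypotheticalCompilationError are made-up names, and the approach rebuilds the exception without calling __init__, so it only works when the exception's attributes are themselves picklable.

```python
import pickle


def _rebuild(cls, args, state):
    """Recreate an exception without calling its __init__ (hypothetical helper)."""
    exc = cls.__new__(cls)
    exc.args = args
    exc.__dict__.update(state)
    return exc


def patch_reduce(exc_cls):
    """Attach a generic __reduce__ to exc_cls from the outside (monkeypatch)."""

    def __reduce__(self):
        # Ship the class, the original args tuple, and a copy of the
        # instance attributes; _rebuild reassembles them on load.
        return (_rebuild, (type(self), self.args, dict(self.__dict__)))

    exc_cls.__reduce__ = __reduce__
    return exc_cls


class HypotheticalCompilationError(Exception):
    """Stand-in for a third-party exception we cannot edit directly."""

    def __init__(self, node, msg):
        self.node = node
        self.msg = msg
        super().__init__(f"{msg} in {node}")


# Apply the patch once at import time, before any worker processes start.
patch_reduce(HypotheticalCompilationError)

if __name__ == "__main__":
    original = HypotheticalCompilationError("model_a", "boom")
    restored = pickle.loads(pickle.dumps(original))
    print(type(restored).__name__, restored.node, str(restored))
```

Bypassing __init__ keeps the patch independent of each exception's constructor signature, at the cost of skipping any side effects the constructor would normally perform.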