databricks / dbt-databricks

A dbt adapter for Databricks.
https://databricks.com
Apache License 2.0

Refactor python config handling #830

Closed (benc-db closed 1 month ago)

benc-db commented 1 month ago

Description

Refactors the Python submission components so they are better factored. There is still more cleanup to do, but my goal was to consolidate config manipulation and submission into the three core ways we actually submit (Workflow, Job/Run, and Command), so that one day those can become the submission helpers, with compute no longer named in the submission method. For now we have to keep the top-level submission helpers so that we don't break any existing users.
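To illustrate the shape described above, here is a minimal sketch, not the code in this PR. It assumes three core submitters, one per submission path, and a thin top-level helper that existing users keep calling and that only delegates. Every class name, the `build_helper` function, and the `legacy_method_*` keys are hypothetical placeholders, not dbt-databricks identifiers.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type


class Submitter(ABC):
    """One of the core submission paths (hypothetical interface)."""

    @abstractmethod
    def submit(self, compiled_code: str) -> str:
        """Submit the code and return an identifier for the run."""


class CommandSubmitter(Submitter):
    def submit(self, compiled_code: str) -> str:
        # A real implementation would call a command execution API here.
        return f"command:{len(compiled_code)}"


class JobRunSubmitter(Submitter):
    def submit(self, compiled_code: str) -> str:
        # A real implementation would create a one-time job run here.
        return f"job_run:{len(compiled_code)}"


class WorkflowSubmitter(Submitter):
    def submit(self, compiled_code: str) -> str:
        # A real implementation would create or update a workflow here.
        return f"workflow:{len(compiled_code)}"


class TopLevelHelper:
    """Kept so existing users are not broken; it only delegates."""

    def __init__(self, submitter: Submitter) -> None:
        self._submitter = submitter

    def submit(self, compiled_code: str) -> str:
        return self._submitter.submit(compiled_code)


# Placeholder keys: existing, compute-named methods map onto the core paths.
LEGACY_METHODS: Dict[str, Type[Submitter]] = {
    "legacy_method_a": CommandSubmitter,
    "legacy_method_b": JobRunSubmitter,
    "legacy_method_c": WorkflowSubmitter,
}


def build_helper(submission_method: str) -> TopLevelHelper:
    """Resolve a legacy method name to a helper wrapping a core submitter."""
    try:
        submitter_cls = LEGACY_METHODS[submission_method]
    except KeyError:
        raise ValueError(f"Unknown submission method: {submission_method}")
    return TopLevelHelper(submitter_cls())
```

With this layout, config manipulation lives in the three submitters, and the top-level helpers could later be dropped without touching them.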

Discussed with the original author, Kyle, renaming 'workflow_job_config' to 'python_job_config', so that users are not confused when they use this config without the workflow submission method.

benc-db commented 1 month ago

Note: the unit tests need significant reworking, but I wanted to get the core code in front of people while I work on that.

benc-db commented 1 month ago

Moved to pydantic for config verification/extraction, and added factory methods (create) to simplify the init methods. I don't like having complex logic in inits, because it makes unit testing with mocks waaay harder, but adding the factory methods makes it much cleaner to build the object graphs. I really wish I could pass the dependencies into the helpers, but we don't own the instantiation (it happens inside dbt). Will move on to adding/updating unit tests now.
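A rough sketch of the factory-method idea, under stated assumptions: the init only stores its collaborators, while a `create` classmethod validates the raw config and builds the object graph. To keep the example standard-library only, a frozen dataclass stands in for the pydantic model. `PythonJobConfig`, its fields, `ApiClient`, `RealApiClient`, and `Submitter` are all hypothetical names, not the classes in this PR.

```python
from dataclasses import dataclass
from typing import Any, Dict, Protocol


@dataclass(frozen=True)
class PythonJobConfig:
    """Validated config; a stand-in for a pydantic model (hypothetical fields)."""

    name: str
    timeout_seconds: int = 0

    def __post_init__(self) -> None:
        if not self.name:
            raise ValueError("name is required")
        if self.timeout_seconds < 0:
            raise ValueError("timeout_seconds must be >= 0")

    @classmethod
    def from_raw(cls, raw: Dict[str, Any]) -> "PythonJobConfig":
        # Extract and coerce only the keys we care about from the raw dict.
        return cls(
            name=str(raw.get("name", "")),
            timeout_seconds=int(raw.get("timeout_seconds", 0)),
        )


class ApiClient(Protocol):
    def run(self, config: PythonJobConfig, code: str) -> str: ...


class RealApiClient:
    def run(self, config: PythonJobConfig, code: str) -> str:
        # A real client would make a network call here.
        return f"submitted:{config.name}"


class Submitter:
    def __init__(self, client: ApiClient, config: PythonJobConfig) -> None:
        # No logic in init: tests can pass a fake client directly.
        self.client = client
        self.config = config

    @classmethod
    def create(cls, raw_config: Dict[str, Any]) -> "Submitter":
        # The factory owns validation and wiring of real dependencies.
        return cls(RealApiClient(), PythonJobConfig.from_raw(raw_config))

    def submit(self, code: str) -> str:
        return self.client.run(self.config, code)
```

In this layout a caller that cannot inject dependencies, which is the situation with instantiation inside dbt, would call `Submitter.create(raw_config)`, while a unit test would construct `Submitter(fake_client, config)` directly and skip the wiring.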

@jackyhu-db @eric-wang-1990 @kdazzle apologies for the complex refactor, but python_submission.py was already a mess, and Kyle's feature added enough new functionality to push that module beyond the limits of what I can mentally process.