Closed sungchun12 closed 1 year ago
@sungchun12 Thanks for opening the issue! :)
For the initial rollout of "model governance" constructs in v1.5, I opted for more strict & opinionated across the board. We can choose to relax the constraints (hah), as folks present legitimate use cases — make it more flexible & configurable — but we can't go in reverse. I want to maintain a reasonably high standard for any proposed "relaxation," knowing that such flexibility & configurability also moves more of the complexity onto end users.
In this case, I do think you should be writing your ::
/ cast
statements within your model SQL!
One of my strongly held beliefs about dbt materializations is that there should be a clean separation between business/transformation logic (model contents), which returns a dataset, and the operational logic to actualize (create/replace/update/merge) that dataset as an object in the database. You should be able to preview (show
) a model, see the returned dataset, and trust that it is exactly what you'd see when dbt actually builds that model.
Adding implicit type coercion leads to a delta between the schema of the preview result, and the model actually being built. When that coercion is unsuccessful, it also makes those errors trickier to debug, because they'll only be encountered while materializing the model, not while previewing the compiled query.
I'm resistant to this for the same reason I'm resistant to defining a default
value within a model contract, to take the place of any null
values in the column — this should really be a coalesce
in the model SQL! (I'm more sympathetic to cases where an incremental model's schema has evolved.)
@jtcohen6 Super thoughtful! I didn't think of the developer experience in how previewing a coerced model via yaml would lead to more unnecessary friction. I wouldn't be happy weaving in changes across two files to get the job done. Explicit is better than implicit. Thanks! Glad we talked through it!
Is this your first time submitting a feature request?
Describe the feature
Problem: I want to coerce data types when defining model contracts to avoid writing out cast statements tediously.
However, I don't have an elegant mechanism to do that without writing cast statements.
Proposed Solution: Add a config to enable/disable strict mode (current behavior) similar to this example: https://docs.pydantic.dev/latest/blog/pydantic-v2/#strict-mode
Describe alternatives you've considered
Writing cast statements explicitly in my SQL.
Who will this benefit?
Model Contract users who want to be lazy like me when defining data types.
Are you interested in contributing this feature?
Not for now! That may change when I start my new job.
Anything else?
No response