Closed: emirkmo closed this 3 months ago
I did some digging, and the config approach was only recently added, in the form of the config map. Is there a good reason not to open it up further?
Hi @emirkmo,
Thanks for your suggestion.
"I assume the lack of an id field comes from the dbt way of working, and from the fact that the name of the model is supposed to be a pseudo-id?"
In fact, the specification follows OpenAPI / JSON Schema conventions, so there is no id field here. The key of each model or field is its actual technical name. You could use the title field for the display name.
If you want to use a custom field in the specification, you can do so, but Data Contract CLI would ignore it. For that, the current way, as you identified, is to use the config map.
I am not sure how we can proceed here. Do you have any specific suggestions?
My suggestion would be simply to open up Field and Model to specification extensions, like some other objects already are.
That would already handle everything.
(Re: id, we considered using title as the id, but it goes against the ideal of being clear and explicit. We are happy to add it as a "Specification extension". The CLI is already a modular library that is easy to extend :) )
OK, I updated the specification accordingly.
I am wondering if the Model and Field objects could be opened up to also support extra fields via specification extensions. This would also mean allowing extras (extra = "allow") in the Pydantic config for the Model object in the CLI. (In fact, Field is already open in the CLI, just not in the README of the spec?)
Is there a reason why the Model object can only be extended in a non-standard way, via the config field? Is it to be compatible with dbt? The standard "Specification extension" mechanism could be used instead of, or in addition to, the flexible config field.
Reason: Allow additional metadata about a model that may not fit into the "config" sub-field. This also avoids further boilerplate and nesting for fields that may be quite valuable within the model object itself.
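To make the difference concrete, here is a minimal sketch in plain Python dicts (mirroring the YAML) of the two shapes discussed above: custom metadata nested under config versus the same metadata as a top-level extension field. The set of known keys and the helper split_extensions are hypothetical and only for illustration; this is not how Data Contract CLI parses models.

```python
from typing import Any, Dict, Tuple

# Assumed, partial list of keys a Model object defines (illustrative only).
KNOWN_MODEL_KEYS = {"description", "type", "title", "fields", "config"}


def split_extensions(model: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate the known keys of a model from any extra (extension) keys."""
    known = {k: v for k, v in model.items() if k in KNOWN_MODEL_KEYS}
    extras = {k: v for k, v in model.items() if k not in KNOWN_MODEL_KEYS}
    return known, extras


# Current way: custom metadata has to be nested under the config map.
nested_model = {"type": "table", "config": {"id": "m-001"}}

# Proposed way: the same metadata as a top-level extension field.
flat_model = {"type": "table", "id": "m-001"}

known, extras = split_extensions(flat_model)
```

With the flat shape, the extra id key ends up in extras and a tool that allows extension fields could keep it instead of rejecting or dropping it; with the nested shape there are no extras at all, because everything custom lives inside config.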
Example use case:
One that we use, for example, is an id field (for both models and fields). Having an explicit id makes it possible to rename models unambiguously. A rename is recognized automatically because the id has not changed, as opposed to a drop and re-create, where the id would change.
I assume the lack of an id field comes from the dbt way of working, and from the fact that the name of the model is supposed to be a pseudo-id? But we really want to assign unique ids to models (tables) so that intentions are clear without a human in the loop, and we don't have the luxury of re-creating tables corresponding to many terabytes of data just for a rename.
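The rename-versus-recreate reasoning above can be sketched roughly as follows, assuming each contract version is reduced to a mapping from model id to model name. The function classify_changes and the example ids are hypothetical; this only illustrates the idea and is not part of the specification or the CLI.

```python
from typing import Dict, Tuple


def classify_changes(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, dict]:
    """Compare two {model_id: model_name} mappings.

    Same id with a different name counts as a rename; an id that disappears
    is a drop; an id that appears is a create.
    """
    renamed: Dict[str, Tuple[str, str]] = {
        i: (old[i], new[i]) for i in old.keys() & new.keys() if old[i] != new[i]
    }
    dropped = {i: old[i] for i in old.keys() - new.keys()}
    created = {i: new[i] for i in new.keys() - old.keys()}
    return {"renamed": renamed, "dropped": dropped, "created": created}


before = {"m-001": "orders"}

# Same id, new name: an unambiguous rename.
rename = classify_changes(before, {"m-001": "customer_orders"})

# Different id, new name: a drop followed by a create.
recreate = classify_changes(before, {"m-002": "customer_orders"})
```

Without ids, both cases look identical (the name orders is gone and customer_orders exists), which is why a tool would need a human to say which one was intended.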