datacontract / datacontract-specification

The Data Contract Specification Repository
https://datacontract.com/
MIT License

Deprecate schema #21

Open simonharrer opened 8 months ago

simonharrer commented 8 months ago

Status Quo

There is the top-level element schema that references a technical schema like SQL, BigQuery, Avro, JsonSchema, etc.

Motivation

We introduced the top-level element models that encodes an abstract, high-level version of the data model. Mapping models to the technical schema, and back, is being implemented in tooling like the Data Contract CLI (cli.datacontract.com).

Proposal

Mark schema as deprecated for the next version and remove it in the version after that.

Alternatives

  1. Add the schema to a server object, putting the physical/technical schema where the data contract becomes real. But to me, the server object is more about connecting than about describing everything about the server and its data.
  2. Keep schema and see what users will do with it.
  3. Feel free to add here

tarys commented 7 months ago

Hi, as promised in our previous conversation, here are some thoughts on this topic:

  1. In a sense, models could be seen as an "anti-corruption layer" or "adapter" on top of the non-abstract schema.
  2. As a higher-order abstraction, it needs additional tooling to translate it into an actual schema (Snowflake, Kafka, BigQuery, etc.). datacontract-cli seems like a reasonable candidate for such functionality.
  3. To make the model -> schema mapping efficient and easy for contributors to implement, it makes sense to use a "pluggable" architecture (e.g. providers in Terraform).
  4. Given all of the above, the models section needs its own clear and stable specification, because it in fact becomes a DSL within the existing overall Data Contract specification.

pixie79 commented 4 months ago

I would not have both models and schema; it will lead to contradictions between the two quite quickly if people are not careful. Either that, or the validation tooling needs to ensure that each entity in one matches the other. I think import / export tooling is a much better idea.
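
As a rough illustration of the validation option, the following Python sketch compares the fields declared in models against those in a technical schema and reports every mismatch. The function name `find_mismatches` and both input shapes (a simplified `models` dictionary and one JSON-Schema-like dictionary per model) are assumptions made for this example, not part of the specification or of any existing tool.

```python
from typing import Dict, List


def find_mismatches(models: Dict[str, dict], schemas: Dict[str, dict]) -> List[str]:
    """Return human-readable differences between models and per-model schemas."""
    problems = []
    for name in sorted(set(models) | set(schemas)):
        if name not in schemas:
            problems.append(f"{name}: missing from schema")
            continue
        if name not in models:
            problems.append(f"{name}: missing from models")
            continue
        model_fields = set(models[name]["fields"])
        schema_fields = set(schemas[name].get("properties", {}))
        # Report fields that exist on only one side.
        for field in sorted(model_fields - schema_fields):
            problems.append(f"{name}.{field}: in models but not in schema")
        for field in sorted(schema_fields - model_fields):
            problems.append(f"{name}.{field}: in schema but not in models")
    return problems


models = {"orders": {"fields": {"order_id": {"type": "string"}, "total": {"type": "integer"}}}}
schemas = {"orders": {"properties": {"order_id": {"type": "string"}, "amount": {"type": "integer"}}}}

for problem in find_mismatches(models, schemas):
    print(problem)
```

A check like this only covers field names; keeping types, constraints, and nesting aligned would need considerably more logic, which is part of why generating one side from the other is attractive.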

emirkmo commented 3 days ago

Would it be possible to remove schema while retaining the flexibility via the tooling, by allowing for further customization of Server types?

Import/Export tooling can replicate the flexibility of schema while keeping the clarity of logical data models, by customizing how the Model is applied based on the Server. However, with Schema now gone (and I support @pixie79 in not having both), there's no longer a way to explicitly contract on the resulting schema if the Server type isn't supported by the specification and the datacontract-cli.

See: https://github.com/datacontract/datacontract-cli/issues/416