sodadata / soda-core

:zap: Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
https://go.soda.io/core-docs
Apache License 2.0

Contract add quoting for identifiers #2108

Open tombaeyens opened 4 months ago

tombaeyens commented 4 months ago

Soda Core currently deals with case sensitivity in a confusing way. By default, the Soda Core engine tries to be neutral: if you don't quote identifiers in the YAML, the generated queries are not quoted either. But when no quotes are provided, some databases fold identifiers to uppercase by default (Snowflake) and others fold them to lowercase (Postgres). That is confusing.
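To make the folding behavior concrete, here is a minimal sketch that models how an identifier is resolved when it is or is not quoted. The folding rules are the default behavior of Snowflake and Postgres; `UNQUOTED_FOLDING` and `resolve_identifier` are hypothetical names for illustration and are not part of Soda Core.

```python
# Illustrative model only: these names are hypothetical, not Soda Core APIs.
# Default folding applied to unquoted identifiers by each database.
UNQUOTED_FOLDING = {
    "snowflake": str.upper,  # unquoted identifiers resolve as uppercase
    "postgres": str.lower,   # unquoted identifiers resolve as lowercase
}


def resolve_identifier(name: str, dialect: str) -> str:
    """Return the identifier the database would actually look up."""
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        # Quoted: case is preserved; a doubled quote is an escaped quote.
        return name[1:-1].replace('""', '"')
    # Unquoted: the dialect's default folding applies.
    return UNQUOTED_FOLDING[dialect](name)


for dialect in ("snowflake", "postgres"):
    print(
        dialect,
        resolve_identifier("CustomerId", dialect),
        resolve_identifier('"CustomerId"', dialect),
    )
```

The same unquoted `CustomerId` resolves to two different names depending on the data source, while the quoted form resolves identically on both.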

The original goal was to be developer-friendly and produce easy-to-read queries, and not requiring quotes led to more readable queries. But that goal is less important than making it work.

Also in contracts, we want the exact case as is in the warehouse / data source. Contracts can only have true value if they represent the metadata precisely.

So to achieve that, we need to always quote identifiers and fully qualify the names in the queries we generate.
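A rough sketch of what always quoting and fully qualifying could look like when generating a query. The helper names `quote_identifier` and `qualified_name`, and the example database, schema, and table names, are assumptions for illustration, not the actual Soda Core implementation. Double quotes are the standard SQL identifier quote; the `quote_char` parameter is there because some data sources differ (MySQL, for example, uses backticks by default).

```python
# Hypothetical helpers for illustration; not the Soda Core implementation.
def quote_identifier(name: str, quote_char: str = '"') -> str:
    """Quote one identifier, escaping embedded quote characters by doubling."""
    escaped = name.replace(quote_char, quote_char * 2)
    return f"{quote_char}{escaped}{quote_char}"


def qualified_name(*parts: str, quote_char: str = '"') -> str:
    """Join e.g. database, schema, and table into one fully qualified name."""
    return ".".join(quote_identifier(part, quote_char) for part in parts if part)


# Example names are made up for this sketch.
table = qualified_name("Analytics", "Public", "Customer Orders")
column = quote_identifier("Order Id")
query = f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
print(query)
```

Because every part is quoted, the generated query refers to the exact case stored in the data source, regardless of that data source's default folding.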

@see https://github.com/sodadata/docs/issues/812

Test contracts that require database-specific quoting, such as case-sensitive table and column names. Column names with spaces should also be tested.
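As a starting point for such a test, here is a small sketch that runs a quoted query against a table and column whose names contain spaces. It uses the `sqlite3` module only because it ships with Python. SQLite matches identifiers case-insensitively, so this only exercises the spaces case; the case-sensitive cases would need a real Postgres or Snowflake connection. `count_missing` and the table and column names are hypothetical.

```python
import sqlite3


def quote_identifier(name: str) -> str:
    """Quote an identifier with double quotes, doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def count_missing(conn: sqlite3.Connection, table: str, column: str) -> int:
    """Count NULL values in a column, quoting both identifiers."""
    sql = (
        f"SELECT COUNT(*) FROM {quote_identifier(table)} "
        f"WHERE {quote_identifier(column)} IS NULL"
    )
    return conn.execute(sql).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "Customer Orders" ("Order Id" INTEGER, "Customer Name" TEXT)'
)
conn.executemany(
    'INSERT INTO "Customer Orders" VALUES (?, ?)',
    [(1, "Ada"), (2, None), (3, None)],
)

missing = count_missing(conn, "Customer Orders", "Customer Name")
print(missing)
```

Without the quoting, the generated `SELECT` would be a syntax error for these names, which is the failure mode the test is meant to catch.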

tools-soda commented 4 months ago

SAS-3765