sodadata / soda-sql

Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html
https://docs.soda.io/
Apache License 2.0

Soda SQL and Soda Spark have become Soda Core

Soda SQL and Soda Spark are deprecated in favor of Soda Core and the Soda Checks Language. If you are new to Soda, start with Soda Core!


Data testing, monitoring, and profiling for SQL-accessible data.




✔ [Install Soda SQL](/docs/soda-sql/installation.md) from the command line
✔ Access comprehensive [Soda SQL documentation](/docs/soda-sql/overview.md)
✔ Compatible with [Snowflake, Amazon Redshift, BigQuery](/docs/soda-sql/installation.md#compatibility), and more
✔ [Write tests](/docs/soda-sql/tests.md) in a YAML file
✔ [Run programmatic scans](/docs/soda-sql/programmatic_scan.md) to test data quality
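
To make the idea of a YAML-defined test more concrete, here is a minimal Python sketch of what metrics such as `row_count`, `missing_count`, and `duplicate_count` measure, and how a test like `row_count > 0` reduces to a boolean check over those metrics. It does not use or reproduce Soda SQL: the helpers `compute_metrics` and `run_tests` and the sample rows are hypothetical, and the definition of `duplicate_count` used here (distinct values that occur more than once) is an assumption. The column names are borrowed from the example scan YAML file in this README.

```python
from collections import Counter


def compute_metrics(rows, column):
    """Compute a few column metrics over a list of dict rows (hypothetical helper)."""
    values = [row.get(column) for row in rows]
    present = [value for value in values if value is not None]
    counts = Counter(present)
    return {
        "row_count": len(rows),
        "missing_count": len(values) - len(present),
        # Assumption: count distinct values that occur more than once.
        "duplicate_count": sum(1 for n in counts.values() if n > 1),
    }


def run_tests(metrics):
    """Evaluate a fixed set of test expressions against computed metrics."""
    return {
        "row_count > 0": metrics["row_count"] > 0,
        "missing_count == 0": metrics["missing_count"] == 0,
        "duplicate_count == 0": metrics["duplicate_count"] == 0,
    }


# Illustrative sample data, not taken from any real dataset.
rows = [
    {"bus_no": "1001", "school_year": "2019-2020"},
    {"bus_no": "1002", "school_year": None},
    {"bus_no": "1001", "school_year": "2019-2020"},
]

for column in ("bus_no", "school_year"):
    results = run_tests(compute_metrics(rows, column))
    for expression, passed in results.items():
        print(f"{column}: {expression} -> {'passed' if passed else 'failed'}")
```

In this sample, `bus_no` fails `duplicate_count == 0` because the value `1001` appears twice, and `school_year` fails `missing_count == 0` because one row has no value.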

#### Example scan YAML file (Deprecated)

```yaml
table_name: breakdowns
metrics:
  - row_count
  - missing_count
  - missing_percentage
  ...
# Validates that a table has rows
tests:
  - row_count > 0

# Tests that numbers in the column are entered in a valid format as whole numbers
columns:
  incident_number:
    valid_format: number_whole
    tests:
      - invalid_percentage == 0

# Tests that no values in the column are missing
  school_year:
    tests:
      - missing_count == 0

# Tests for duplicates in a column
  bus_no:
    tests:
      - duplicate_count == 0

# Compares row count between datasets
sql_metric:
  sql: |
    SELECT COUNT(*) as other_row_count
    FROM other_table
  tests:
    - row_count == other_row_count
```

## Install (Deprecated)

* [Install Soda SQL](/docs/soda-sql/installation.md)
* [Quick start for Soda SQL](/docs/soda/quick-start-soda-sql.md)

## Contributors ✨

Thanks goes to these wonderful people! ([emoji key](https://allcontributors.org/docs/en/emoji-key))

* Vijay Kiran 💻
* abhishek khare 💻
* Jelte Hoekstra 💻 📖
* Cor 💻 📖
* Milan Aleksić 🚇
* Ayoub Fakir 💻
* Alex Tonkonozhenko 💻
* Todd de Quincey 💻
* Antonin Jousson 💻
* Jonas 🚇
* cwouter 💻
* Janet R 📖
* Bastien Boutonnet 💻
* Tom Baeyens 💻
* AlessandroLollo 💻
* mmigdiso 💻
* ericmuijs 💻
* Lieven Govaerts 💻
* Milan Lukac 💻
* Sebastián Villarroel 💻
* Benjamin Berriot 💻
* Alexey Minakov 💻

This project followed the [all-contributors](https://github.com/all-contributors/all-contributors) specification.

### Open Telemetry Tracking (Deprecated)

Soda SQL collects statistical usage and performance information via the [Open Telemetry framework](https://opentelemetry.io). This information helps the Soda Core developer team proactively track performance issues and understand how users interact with the tool.

The collected information is strictly limited to usage and performance and does not contain Personal Identifying Information (PII). It is used for internal purposes only. Soda keeps the data in its raw form for a maximum of five years. If some information needs to be kept for longer, it will only be kept in aggregated form.

Read more about the information Soda tracks, and learn how to opt-out of tracking by consulting the [Anonymous usage statistics](/docs/soda-sql/global-configuration.md) documentation.