frictionlessdata / data-quality-spec

A spec for reporting errors in data quality.
MIT License

Data Quality Spec

A simple spec that describes data quality errors common to tabular data files.

Why?

There are many commonly recognised errors that can result from working with tabular data; these are often encountered in the open data world when working with CSV files.

Tools like GoodTables seek to identify such errors and return helpful information to the user so that she can fix her data files.

Such tools also go beyond checking for basic structural problems in tabular data files, and extend to detecting "schema" problems: essentially, issues with the consistency of the data itself.

The spec herein extracts the errors detected by GoodTables from its codebase, tidies them a little based on what we've learned, and works towards making them available in a form that can be reused outside of their original home.

What

spec.json contains an errors object in which errors are keyed by code (as a string), where each error has the following properties:

Take a look for yourself at spec.json.
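
As a rough illustration of the structure described above, here is a minimal Python sketch that reads the errors object and looks an error up by its code. It only assumes what is stated here: a top-level errors object keyed by string codes. The helper names (load_errors, lookup_error), the sample codes, and the sample property are hypothetical placeholders and are not taken from spec.json.

```python
import json


def load_errors(spec_text):
    """Parse spec JSON text and return its errors mapping, keyed by code."""
    spec = json.loads(spec_text)
    errors = spec["errors"]
    # Codes are keyed as strings, so normalise every key to str.
    return {str(code): error for code, error in errors.items()}


def lookup_error(errors, code):
    """Return the error entry for a code, or None if the code is unknown."""
    return errors.get(str(code))


# Hypothetical stand-in for the contents of spec.json; the codes and the
# property name below are placeholders, not values from the real file.
SAMPLE_SPEC = """
{
  "errors": {
    "example-code-1": {"placeholder-property": "placeholder value"},
    "example-code-2": {"placeholder-property": "another value"}
  }
}
"""

errors = load_errors(SAMPLE_SPEC)
for code in sorted(errors):
    print(code, errors[code])
```

To try this against the real file, replace SAMPLE_SPEC with the text of spec.json, for example by reading it with open("spec.json").read().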