FRosner / drunken-data-quality

Spark package for checking data quality
Apache License 2.0

Log4j reporter #69

Closed FRosner closed 8 years ago

FRosner commented 8 years ago

Format

One line is logged per constraint. Each log message contains a JSON document with the following fields, which are common to all constraint results:

| Name | Key | Explanation |
| --- | --- | --- |
| Check UUID | `id` | ID of the check this constraint belongs to. This can be used later to group all constraints belonging to one check. |
| Check Time | `time` | Timestamp when the check was reported (`dow mon dd hh:mm:ss zzz yyyy`, see `java.util.Date`). |
| Check Name | `name` | Name of the data frame that is checked (or its display name, if specified). |
| Number of Rows | `rowsTotal` | Number of rows in the checked data frame. |
| Constraint Type | `constraint` | Type of constraint (e.g. number of rows, primary key, functional dependency). |
| Constraint Status | `status` | Whether the constraint check was a success or a failure. |
| Message | `message` | Message for this constraint (text). |

Depending on the specific constraint, additional fields will be added.
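
To illustrate how a consumer of these log lines might use the `id` field to group constraint results by check, here is a minimal Python sketch. It is not part of the package. The sample lines, the values of `constraint` and `status`, the `INFO reporter -` prefix, and the helper names `group_by_check` and `failed_check_ids` are all assumptions made up for illustration; the actual values and the log4j layout may differ.

```python
import json
from collections import defaultdict

# Hypothetical sample log lines. All field values are invented for
# illustration; only the keys come from the field list above.
SAMPLE_LINES = [
    '{"id": "a1", "time": "Thu Jun 09 10:15:30 CEST 2016", "name": "customers", '
    '"rowsTotal": 100, "constraint": "number of rows", "status": "success", '
    '"message": "row count ok"}',
    # Assumed log4j layout prefix in front of the JSON document.
    'INFO reporter - {"id": "a1", "time": "Thu Jun 09 10:15:30 CEST 2016", '
    '"name": "customers", "rowsTotal": 100, "constraint": "primary key", '
    '"status": "failure", "message": "duplicate keys found"}',
    '{"id": "b2", "time": "Thu Jun 09 10:16:02 CEST 2016", "name": "orders", '
    '"rowsTotal": 42, "constraint": "number of rows", "status": "success", '
    '"message": "row count ok"}',
]


def group_by_check(lines):
    """Parse one JSON document per line and group the results by check id."""
    grouped = defaultdict(list)
    for line in lines:
        # Skip any layout prefix by starting at the first opening brace.
        document = json.loads(line[line.index("{"):])
        grouped[document["id"]].append(document)
    return dict(grouped)


def failed_check_ids(grouped, failure_value="failure"):
    """Return the sorted ids of checks with at least one failed constraint.

    The string used for a failed status is an assumption; pass the real
    value as failure_value.
    """
    return sorted(
        check_id
        for check_id, results in grouped.items()
        if any(result["status"] == failure_value for result in results)
    )


if __name__ == "__main__":
    checks = group_by_check(SAMPLE_LINES)
    for check_id, results in checks.items():
        print(check_id, results[0]["name"], len(results), "constraint(s)")
    print("checks with failures:", failed_check_ids(checks))
```

Constraint-specific additional fields would simply show up as extra keys in each parsed dictionary, so the grouping does not need to know about them.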