FRosner / drunken-data-quality

Spark package for checking data quality
Apache License 2.0

Arbitrary checks with lambdas #126

Open FRosner opened 7 years ago

FRosner commented 7 years ago

Description

It would be nice to let users define arbitrary checks whose results are also included in the reports. This could be achieved by accepting an arbitrary block of code and wrapping it in a try/catch internally. The check should return the resulting Try so that the caller can keep working with the value later. If the Try contains an exception, the check is marked as failed.

Usage Sketch

// Proposed API (not implemented yet): run the block and return its result as a Try
val tryRdd = Check.custom("file exists", {
  spark.textFile("myfile.txt")
})
tryRdd.map(_.count) // do something with the read RDD
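To make the idea concrete, here is a minimal sketch of how such a custom check could wrap a block in a Try and record the outcome. It is only an illustration under assumptions: `CustomCheck`, `CustomCheckResult`, and `report` are hypothetical names, not part of the existing library, and results go into a plain in-memory buffer rather than the real reporters. It uses no Spark, so it can be run on its own.

```scala
import scala.collection.mutable.ListBuffer
import scala.util.{Failure, Success, Try}

// Hypothetical result record; the library's real constraint/report types are not used here.
final case class CustomCheckResult(description: String, passed: Boolean, message: String)

// Hypothetical stand-in for the proposed Check.custom, not the library's actual API.
object CustomCheck {
  // Simple in-memory store for outcomes (not thread-safe; fine for a sketch).
  private val results = ListBuffer.empty[CustomCheckResult]

  // `body` is passed by name, so it is evaluated inside Try.
  // Try only captures non-fatal exceptions; fatal errors still propagate.
  def custom[T](description: String, body: => T): Try[T] = {
    val outcome = Try(body)
    val result = outcome match {
      case Success(_) => CustomCheckResult(description, passed = true, "ok")
      case Failure(e) => CustomCheckResult(description, passed = false, e.toString)
    }
    results += result
    outcome // returned so the caller can keep working with the value
  }

  // Everything recorded so far, in the order the checks were run.
  def report: List[CustomCheckResult] = results.toList
}

object Demo {
  def main(args: Array[String]): Unit = {
    val parsed = CustomCheck.custom("input is a number", { "42".toInt })
    val broken = CustomCheck.custom("input is a number", { "abc".toInt })
    println(parsed.map(_ + 1)) // continue working with the successful value
    println(broken.isFailure)
    CustomCheck.report.foreach(println)
  }
}
```

One design point to keep in mind: Try evaluates the block eagerly, but Spark transformations are lazy. A block that only builds an RDD may therefore succeed even when the file is missing, so the block would probably need to trigger an action (for example a count) for the failure to surface inside the check.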

Questions