awslabs / deequ

Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Apache License 2.0
3.32k stars 539 forks

Adding the custom constraints #488

Open DivyangPatelIITD opened 1 year ago

DivyangPatelIITD commented 1 year ago

Hi, I want to know whether we can add custom constraints. For example, every department_id in the student table should match at least one department_id in the department table (a foreign key constraint). I am also interested in other complex constraints that require joining multiple tables and then checking a condition.

zeotuan commented 7 months ago

This might not be possible yet. One thing that immediately comes to mind is to perform a left join and then run a completeness check on the right-side column, since a null value there marks a row with no match. Hope this is helpful.
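
A minimal sketch of that left-join workaround, assuming Spark and Deequ are on the classpath. The table contents, the `matched_department_id` alias, and the `foreignKeyCheck` helper are hypothetical names made up for illustration; only `VerificationSuite`, `Check`, `CheckLevel`, and `isComplete` come from Deequ itself.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import com.amazon.deequ.{VerificationResult, VerificationSuite}
import com.amazon.deequ.checks.{Check, CheckLevel}

object ForeignKeyCheckSketch {

  // Hypothetical helper: left-join students to the distinct department keys,
  // then require the right-side key to be non-null on every joined row.
  def foreignKeyCheck(students: DataFrame, departments: DataFrame): VerificationResult = {
    // Keep only the key and rename it so it cannot clash with student columns.
    val deptKeys = departments
      .select(departments("department_id").as("matched_department_id"))
      .distinct()

    // Students without a matching department get a null matched_department_id.
    val joined = students.join(
      deptKeys,
      students("department_id") === deptKeys("matched_department_id"),
      "left")

    VerificationSuite()
      .onData(joined)
      .addCheck(
        Check(CheckLevel.Error, "student.department_id references department")
          .isComplete("matched_department_id"))
      .run()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("fk-check-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data: student 3 points at a department that does not exist.
    val departments = Seq((10, "Physics"), (20, "Maths"))
      .toDF("department_id", "name")
    val students = Seq((1, "Asha", 10), (2, "Ben", 20), (3, "Chen", 99))
      .toDF("student_id", "name", "department_id")

    val result = foreignKeyCheck(students, departments)
    println(result.status)

    spark.stop()
  }
}
```

With this sample data the check should fail, because student 3 has no matching department. Note that a student whose own department_id is null would also be reported, since a null key never matches in the join; filter those rows out first if nullable foreign keys are allowed.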