awslabs / python-deequ

Python API for Deequ
Apache License 2.0

Store incorrect data with a column where all the reasons for the failure #83

Open brunettig opened 2 years ago

brunettig commented 2 years ago

I need to run data quality tests, so I am using Amazon's python-deequ. I can get the data quality success/failure status, but next I want to get all the rows that failed, with a column that logs every reason for the failure, and store them in another DataFrame or Hive table. Thanks

dwalter469 commented 7 months ago

This is a key feature of the system I'm building: the ability to create a Data Audit table with a key to the rows that failed and a description of each failure. This enhancement would be a huge plus for PyDeequ.
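
To make the request concrete, here is a minimal plain-Python sketch of the output shape described in this thread: each failing row is kept, with an extra column listing every rule it broke. This is not PyDeequ's API. The rule names, the `audit_failures` helper, and the `failure_reasons` column are all hypothetical, and the rows are made-up sample data. The same idea would presumably translate to column expressions on a Spark DataFrame.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]

# Hypothetical row-level rules: name -> predicate that returns True when the row passes.
RULES: Dict[str, Callable[[Row], bool]] = {
    "id is not null": lambda r: r.get("id") is not None,
    # A missing amount is treated as a failure of this rule as well.
    "amount is non-negative": lambda r: r.get("amount") is not None and r["amount"] >= 0,
}


def audit_failures(rows: List[Row], rules: Dict[str, Callable[[Row], bool]]) -> List[Row]:
    """Return only the failing rows, each with a 'failure_reasons' column."""
    audit: List[Row] = []
    for row in rows:
        # Collect the name of every rule this row violates, in rule order.
        reasons = [name for name, passes in rules.items() if not passes(row)]
        if reasons:
            audit.append({**row, "failure_reasons": "; ".join(reasons)})
    return audit


# Made-up sample data for illustration only.
rows = [
    {"id": 1, "amount": 10.0},
    {"id": None, "amount": 5.0},
    {"id": 3, "amount": -2.0},
    {"id": None, "amount": -1.0},
]

failed = audit_failures(rows, RULES)
for record in failed:
    print(record)
```

The first sample row passes both rules and is dropped. The other three appear in `failed`, and the last one carries both reasons in its `failure_reasons` value. Such a list could then be written to a separate audit table.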