awslabs / python-deequ

Python API for Deequ
Apache License 2.0
669 stars, 131 forks

Please add RowLevelSchemaValidator support #204

Open iWantToKeepAnon opened 1 month ago

iWantToKeepAnon commented 1 month ago

Is your feature request related to a problem? Please describe.

Knowing which rows pass and which fail is important. The "RowLevelSchemaValidator" has been in Deequ for years; are there plans to create Python bindings for it?

Describe the solution you'd like

Access to RowLevelSchemaValidator from Python.

Describe alternatives you've considered

Using Spark directly to find nulls, out-of-bounds numbers, strings that are too long or too short, etc. duplicates work that Deequ already does. Right now all we can tell is pass/fail for an entire DataFrame. More granular, row-level information is needed.
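To make the ask concrete, here is a rough plain-Python sketch of the kind of output I mean: rows split into a valid set and an invalid set instead of a single pass/fail flag. This is not Deequ or PyDeequ API. `ColumnRule` and `split_rows` are hypothetical names made up for illustration, and the rules only cover the cases mentioned above (nulls, numeric bounds, string length).

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ColumnRule:
    """Hypothetical per-column rule; not part of Deequ or PyDeequ."""
    name: str
    nullable: bool = True
    min_value: Optional[float] = None   # inclusive numeric lower bound
    max_value: Optional[float] = None   # inclusive numeric upper bound
    min_length: Optional[int] = None    # inclusive string length lower bound
    max_length: Optional[int] = None    # inclusive string length upper bound

    def accepts(self, value: Any) -> bool:
        # A null is acceptable only if the column is declared nullable.
        if value is None:
            return self.nullable
        if self.min_value is not None and value < self.min_value:
            return False
        if self.max_value is not None and value > self.max_value:
            return False
        if self.min_length is not None and len(value) < self.min_length:
            return False
        if self.max_length is not None and len(value) > self.max_length:
            return False
        return True


def split_rows(rows, rules):
    """Return (valid_rows, invalid_rows) rather than one pass/fail result."""
    valid, invalid = [], []
    for row in rows:
        ok = all(rule.accepts(row.get(rule.name)) for rule in rules)
        (valid if ok else invalid).append(row)
    return valid, invalid


rules = [
    ColumnRule("id", nullable=False, min_value=1),
    ColumnRule("name", nullable=False, max_length=5),
]
rows = [
    {"id": 1, "name": "alice"},        # satisfies both rules
    {"id": None, "name": "bob"},       # null in a non-nullable column
    {"id": 3, "name": "charlotte"},    # name longer than 5 characters
]
valid_rows, invalid_rows = split_rows(rows, rules)
print(len(valid_rows), len(invalid_rows))
```

Having both sets back means the bad rows can be quarantined or inspected while the good rows continue through the pipeline, which is the part that currently has to be hand-written in Spark.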

Additional context

Being able to run validations like the ones in the Scala unit tests would be wonderful: https://github.com/awslabs/deequ/blob/49e970ce9a8bda5e779602d2981379b65c12ba30/src/test/scala/com/amazon/deequ/schema/RowLevelSchemaValidatorTest.scala