awslabs / python-deequ

Python API for Deequ
Apache License 2.0

[Pydeequ 1.0.1] pydeequ.checks.isContainedIn does not accept lambda assertion #88

Open BrunoFavie opened 2 years ago

BrunoFavie commented 2 years ago

Describe the bug: When running Pydeequ 1.0.1, the tests generated by ConstraintSuggestionRunner include tests that use the isContainedIn() function, and these fail during execution.

The cause is that the suggested tests include a lambda assertion, which the Python function does not accept: it takes 3 positional arguments, but the suggested tests pass 5.

To Reproduce Steps to reproduce the behavior:

  1. Generate a test statement using a dataset that is incomplete, so that the result includes a suggestion for a test using isContainedIn() with a lambda.
  2. Execute the test.
  3. Check the output for the error:
    • TypeError: isContainedIn() takes 3 positional arguments but 5 were given
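The error in step 3 can be reproduced without Spark or Pydeequ. The sketch below is only an illustration: `StandInCheck` is a hypothetical stand-in class, not the Pydeequ code, and the column name, values, lambda and hint are made up. It assumes a method that accepts only a column and a list of allowed values, and calls it the way a suggested test does.

```python
class StandInCheck:
    """Hypothetical stand-in for a check class; not the Pydeequ implementation."""

    def isContainedIn(self, column, allowed_values):
        # Only the column and the allowed values are accepted (plus self).
        self.constraint = (column, list(allowed_values))
        return self


def run_suggested_call():
    """Call the method the way a suggested test does and return the error text."""
    check = StandInCheck()
    try:
        # Four explicit arguments plus self make five positional arguments.
        check.isContainedIn("status", ["A", "B"], lambda x: x >= 0.9, "hint text")
    except TypeError as exc:
        return str(exc)
    return None


message = run_suggested_call()
print(message)
```

The printed message ends with "takes 3 positional arguments but 5 were given"; newer Python versions prefix the method name with the class name.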

This issue has been reported before: https://github.com/awslabs/python-deequ/issues/65

The cause is that the implementation of isContainedIn was changed in https://github.com/awslabs/python-deequ/commit/30375bb8645728a539b7b2f6d2d85f89266ac047#diff-783716851e9837b9753e643de1f15e031f79bed4ef27e07ce67eeddc5a3fb2ee, but the ConstraintSuggestionRunner was not updated to match the new implementation.

It is unclear to me which side is wrong. Either the suggested test is valid and the isContainedIn function needs to be extended, or the change was made for a reason and the ConstraintSuggestionRunner should be adjusted to leave out the broken tests.

A previously made pull request does show how to revert the change: https://github.com/awslabs/python-deequ/pull/58
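Until this is resolved, one possible workaround on the user side is to skip the affected suggestions before executing them. This is a rough sketch, not a fix for the library: `split_suggestions` is a hypothetical helper, the sample entries are made up, and it assumes each suggestion is a dict whose `code_for_constraint` string contains the generated Python call.

```python
def split_suggestions(suggestions):
    """Separate suggestions into usable ones and ones likely to fail.

    A suggestion is skipped when its generated code calls isContainedIn
    with a lambda, which is the combination reported in this issue.
    """
    usable, skipped = [], []
    for suggestion in suggestions:
        code = suggestion.get("code_for_constraint", "")
        if ".isContainedIn(" in code and "lambda" in code:
            skipped.append(suggestion)
        else:
            usable.append(suggestion)
    return usable, skipped


# Made-up sample data, shaped like the assumption described above.
sample = [
    {"column_name": "id", "code_for_constraint": '.isComplete("id")'},
    {
        "column_name": "status",
        "code_for_constraint": (
            '.isContainedIn("status", ["A", "B"], '
            'lambda x: x >= 0.9, "It should be above 0.9!")'
        ),
    },
]

usable, skipped = split_suggestions(sample)
print(len(usable), "usable,", len(skipped), "skipped")
```

A plain substring match is crude and would also skip an isContainedIn suggestion that merely mentions "lambda" in a string, so treat it as a stopgap.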

jagdeepkalsi commented 2 years ago

+1

This one is pretty important.

SarbajitNandy commented 2 years ago

I looked at the Check class. The `isContainedIn()` function takes only two arguments: the 1st is the value, the 2nd is the list of values. It doesn't support any assertions.
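For comparison, here is a sketch of what a signature that accepts the suggested call could look like. It is purely hypothetical: `ExtendedStandInCheck` is a stand-in class, and the optional `assertion` and `hint` parameter names are assumptions made for illustration, not the Pydeequ API.

```python
class ExtendedStandInCheck:
    """Hypothetical stand-in; shows optional assertion and hint parameters."""

    def __init__(self):
        self.constraints = []

    def isContainedIn(self, column, allowed_values, assertion=None, hint=None):
        # Record the constraint; assertion and hint stay None when omitted,
        # so the existing two-argument calls keep working.
        self.constraints.append(
            {
                "column": column,
                "allowed_values": list(allowed_values),
                "assertion": assertion,
                "hint": hint,
            }
        )
        return self


check = ExtendedStandInCheck()
# Existing two-argument form.
check.isContainedIn("status", ["A", "B"])
# Form used by the suggested tests, with a lambda assertion and a hint.
check.isContainedIn("status", ["A", "B"], lambda x: x >= 0.9, "hint text")
print(len(check.constraints))
```

Making the extra parameters optional keeps the two-argument call valid, so neither existing user code nor the suggested tests would need to change.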

didopimentel commented 1 year ago

+1 to this. The suggestion seems to have a bug where code_for_constraint is not valid.