awslabs / python-deequ

Python API for Deequ
Apache License 2.0

Can't execute the assertion: Error while sending a command.! #55

Closed MarsSu0618 closed 4 months ago

MarsSu0618 commented 3 years ago

Hi, everyone. I have run into a problem with a check. My code is as follows:

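from pydeequ.checks import Check, CheckLevel
from pydeequ.verification import VerificationSuite, VerificationResult

def verify_size_threshold(spark, dataframe, min_threshold):
    # Warning-level check that the row count is at least min_threshold
    check = Check(spark, CheckLevel.Warning, "Review Size Check")

    checkResult = VerificationSuite(spark) \
        .onData(dataframe) \
        .addCheck(
            check.hasSize(lambda x: x >= min_threshold)) \
        .run()

    # Convert the verification result into a Spark DataFrame of constraint outcomes
    checkResult_df = VerificationResult.checkResultsAsDataFrame(spark, checkResult)

    return checkResult_df, check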

I get a Failure message even though my DataFrame size is larger than min_threshold. The message is as follows:

"constraint_status":"Failure","constraint_message":"Can't execute the assertion: Error while sending a command.!"

Hope someone can answer.
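
The message reads as if the assertion could not be executed at all, not as if it ran and returned false, so it may help to separate the two cases when inspecting the results. Below is a minimal sketch, not part of PyDeequ: `split_failures` and `EXECUTION_ERROR_PREFIX` are hypothetical names, and the rows are assumed to be plain dicts with the `constraint_status` and `constraint_message` keys shown above (for example from `[r.asDict() for r in checkResult_df.collect()]`).

```python
# Hypothetical helper, not a PyDeequ API: it only inspects result rows.
# Assumption: each row is a dict with "constraint_status" and
# "constraint_message" keys, as in the output quoted above.
EXECUTION_ERROR_PREFIX = "Can't execute the assertion"


def split_failures(rows):
    """Split failed rows into (execution_errors, real_failures).

    execution_errors: the assertion itself could not be run.
    real_failures: the assertion ran and the constraint was not met.
    Rows whose status is not "Failure" are ignored.
    """
    execution_errors, real_failures = [], []
    for row in rows:
        if row.get("constraint_status") != "Failure":
            continue
        message = row.get("constraint_message") or ""
        if message.startswith(EXECUTION_ERROR_PREFIX):
            execution_errors.append(row)
        else:
            real_failures.append(row)
    return execution_errors, real_failures
```

If `execution_errors` is non-empty, the data may well satisfy the constraint and the failure is more likely to come from the call into the JVM than from the DataFrame.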

ml6cz commented 3 years ago

I had something similar, although I didn't get that message specifically. I had issues with anything that uses a lambda assertion, as seen here: https://github.com/awslabs/python-deequ/issues/54.

I did have one successful run using a Glue Job to run constraints that use lambda assertions, but currently I am getting failures (I'm not sure whether there is an error on my part or whether something within PyDeequ was being updated). Let me know if you get this to work in Glue though!

gkirubhakaran commented 2 years ago

I get the exact same error for the same check:

checkResult = VerificationSuite(spark) \
        .onData(dataframe) \
        .addCheck(
            check.hasSize(lambda x: x >= 1)) \
        .run()

However, it occurs only on certain DataFrames and on certain occasions. Is there a fix for this? Has anyone found a workaround?
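
Since the failure is described as intermittent, one stopgap some people might try is re-running the verification when the result contains this specific message. The following is only a sketch under assumptions and is not confirmed anywhere in this thread to resolve the problem: `run_with_retry` is a hypothetical helper, and `run_check` is assumed to be a zero-argument callable that runs the suite and returns the result rows as dicts with a `constraint_message` key.

```python
import time


def run_with_retry(run_check, max_attempts=3, delay_seconds=1.0):
    """Re-run run_check() while any row reports an execution error.

    Hypothetical workaround sketch: it does not address the underlying
    cause, it only repeats the run. Returns the rows of the last attempt.
    """
    rows = []
    for attempt in range(1, max_attempts + 1):
        rows = run_check()
        has_execution_error = any(
            (row.get("constraint_message") or "").startswith(
                "Can't execute the assertion"
            )
            for row in rows
        )
        if not has_execution_error:
            break  # every assertion was actually evaluated
        if attempt < max_attempts:
            time.sleep(delay_seconds)  # brief pause before the next attempt
    return rows
```

A genuine constraint failure is returned as-is on the first attempt; only the "Can't execute the assertion" case triggers another run.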

ruypeter commented 2 years ago

In my case the problem appears with the "hasUniqueness" check, but only in one case. I have many hasUniqueness checks that work 100% of the time, but one of them fails with this error...

LorenzoCarta commented 2 years ago

We have the same problem reported by @ruypeter. The error is completely random, but it appears only for Uniqueness checks.

chenliu0831 commented 4 months ago

This will be resolved in the next release, which will include https://github.com/awslabs/python-deequ/issues/169.