awslabs / deequ

Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Apache License 2.0
3.18k stars, 519 forks

Compliance calculation result #523

Open vaishnavibv13 opened 7 months ago

vaishnavibv13 commented 7 months ago

Describe the bug
In the case class AnalysisBasedConstraint, the message should be var errorMessage = s"Value: $assertOn met the constraint requirement!", because the value of $assertOn is the number of rows that met the requirement.

To Reproduce
Steps to reproduce the behavior:

  1. Go to package com.amazon.deequ.constraints
  2. Click on AnalysisBasedConstraint
  3. Scroll down to pickValueAndAssert
  4. See the error in the else branch: var errorMessage = s"Value: $assertOn does not meet the constraint requirement!"

Expected behavior
var errorMessage = s"Value: $assertOn met the constraint requirement!"

Additional context
This value specifies the number of rows that met the requirement.

Sat30 commented 4 months ago

There is no bug here. The assertion we pass in carries a threshold value. An assertion looks like this:

    val assertion: Double => Boolean = {
      _ >= 0.6
    }

We pass this assertion to the constraint, and the check then works like this:

    assertOn = (number of rows that passed the constraint) / (total number of rows)
    if (assertion(assertOn)) print "Success"
    else print "Value: $assertOn does not meet the constraint requirement!"
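
To make the flow above concrete, here is a minimal, self-contained sketch in plain Scala. It is not Deequ's actual code: ComplianceSketch and check are hypothetical names used only for illustration, and the row counts are made up. It only shows that assertOn is a ratio between 0 and 1, not a row count, and that the message is produced only when the ratio fails the assertion.

```scala
// Hypothetical stand-in for the logic described above; NOT Deequ's real implementation.
object ComplianceSketch {
  // The assertion from the comment: at least 60% of the rows must comply.
  val assertion: Double => Boolean = _ >= 0.6

  // Returns Right("Success") when the assertion holds, otherwise Left(error message).
  def check(passedRows: Long, totalRows: Long): Either[String, String] = {
    require(totalRows > 0, "totalRows must be positive")
    // assertOn is the fraction of compliant rows, not an absolute count.
    val assertOn = passedRows.toDouble / totalRows
    if (assertion(assertOn)) Right("Success")
    else Left(s"Value: $assertOn does not meet the constraint requirement!")
  }

  def main(args: Array[String]): Unit = {
    println(check(8, 10)) // ratio 0.8 satisfies the assertion
    println(check(5, 10)) // ratio 0.5 fails the assertion
  }
}
```

With 5 compliant rows out of 10, the value in the message is 0.5, the ratio that failed the threshold, which is why the "does not meet" wording fits the else branch.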