Nike-Inc / spark-expectations

A Python Library to support running data quality rules while the spark job is running⚡
https://engineering.nike.com/spark-expectations
Apache License 2.0

[FEATURE] Add a range expression to Aggregation data quality expectation #81

Closed vigneshwarrvenkat closed 2 months ago

vigneshwarrvenkat commented 3 months ago

Is your feature request related to a problem? Please describe. In the Aggregation Data Quality expectation, we can add an aggregation expression as a quality check, e.g. sum(quantity) > 100. However, this quality rule cannot currently take a range expression.

Describe the solution you'd like This feature request is to enhance the Aggregation Data Quality expectation to accept a range expression, e.g. sum(quantity) between 200 and 10000.

Describe alternatives you've considered We initially had a design based on the <> pattern, e.g. 100 < sum(quantity) > 2000. But this is not a valid SQL expression. We wanted users to have a consistent pattern for framing rules, so we have come up with a design that uses a valid SQL expression instead.
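
To make the requested rule shape concrete, here is a minimal sketch of how a range expectation written as a plain SQL expression could be evaluated. It is only an illustration and not the spark-expectations API: it uses Python's built-in sqlite3 as a stand-in for Spark SQL, and the `orders` table and the `check_agg_rule` helper are hypothetical names introduced for this example.

```python
import sqlite3


def check_agg_rule(quantities, expectation):
    """Evaluate an aggregate expectation against a list of quantities.

    Hypothetical helper: loads the values into an in-memory table named
    `orders` (an assumed name) and evaluates the expectation as a SQL
    expression. sqlite3 stands in for Spark SQL here.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (quantity INTEGER)")
        conn.executemany(
            "INSERT INTO orders VALUES (?)", [(q,) for q in quantities]
        )
        # The expectation is an ordinary SQL boolean expression, so a range
        # check needs no special syntax beyond BETWEEN.
        (result,) = conn.execute(f"SELECT {expectation} FROM orders").fetchone()
        return bool(result)
    finally:
        conn.close()


# The range rule from the feature request, written as valid SQL.
range_rule = "sum(quantity) BETWEEN 200 AND 10000"

in_range = check_agg_rule([150, 250, 400], range_rule)  # sum is 800
too_small = check_agg_rule([10, 20, 30], range_rule)    # sum is 60
```

Because BETWEEN is inclusive on both ends, a total of exactly 200 or 10000 would pass the rule in this sketch.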
