open-metadata / OpenMetadata

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.
https://open-metadata.org
Apache License 2.0

RowCount / SumCheck Tests based on Historical Averages / St.Devs. #10532

Closed DovileKr closed 1 year ago

DovileKr commented 1 year ago

Is your feature request related to a problem? Please describe. Current Table Tests are somewhat limited in their context and ability to provide value. For example:

Describe the solution you'd like I am not sure how this would be implemented, but ideally table metrics would track daily changes in RowCount or CheckSum (for pre-selected columns) and derive historical averages from them. The user could then choose a threshold (for example +/- 20%, or ideally a standard deviation band with a confidence level), and the result would be applied as dynamic Min/Max values for RowCounts or Column Sums.

This would be a very powerful data validation test: even before loading, a data pipeline could check with OMD whether its value is within the expected range.
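
To make the request concrete, here is a rough sketch of the kind of check I have in mind. It assumes the daily row counts (or column sums) are already available as a plain list of numbers and that they are roughly normally distributed. The function names `stddev_bounds`, `percent_bounds`, and `within_expected_range` are hypothetical and are not part of any OpenMetadata API.

```python
from statistics import NormalDist, mean, stdev


def stddev_bounds(history, confidence=0.95):
    """Derive dynamic Min/Max from the historical mean and sample St.Dev.

    Assumes the metric is roughly normally distributed (an assumption of
    this sketch, not something the platform guarantees).
    """
    if len(history) < 2:
        raise ValueError("need at least two historical values")
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation
    # Two-sided z-score for the requested confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mu - z * sigma, mu + z * sigma


def percent_bounds(history, tolerance=0.20):
    """Derive dynamic Min/Max as a -/+ tolerance band around the mean."""
    if not history:
        raise ValueError("need at least one historical value")
    mu = mean(history)
    return mu * (1 - tolerance), mu * (1 + tolerance)


def within_expected_range(value, bounds):
    """Return True when the new value falls inside the (min, max) bounds."""
    low, high = bounds
    return low <= value <= high


# Hypothetical daily row counts for one table.
daily_row_counts = [1000, 1020, 980, 1010, 990]

bounds = stddev_bounds(daily_row_counts, confidence=0.95)
print(within_expected_range(1005, bounds))  # a typical load
print(within_expected_range(1200, bounds))  # a suspicious load
```

The standard deviation variant tightens or widens automatically with how noisy the table is, while the percentage variant is easier to explain to pipeline owners. Either could back the pre-load check described above.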


TeddyCr commented 1 year ago

Hey @DovileKr, thank you for suggesting this feature. This is actually already on the roadmap, but for Collate SaaS/OnPrem (not the open source version), as part of our anomaly detection feature. I'll mark this issue as "won't fix" for now.