awslabs / deequ

Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Apache License 2.0

Using metrics repository from MySQL #402

Status: Closed (asktushar closed this issue 2 years ago)

asktushar commented 2 years ago

Hi

I understand that we can create a metrics repository backed by the filesystem like this:

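from pydeequ.repository import FileSystemMetricsRepository

# `spark` is an existing SparkSession; helper_metrics_file returns a path for the metrics file
metrics_file = FileSystemMetricsRepository.helper_metrics_file(spark, 'metrics.json')
repository = FileSystemMetricsRepository(spark, metrics_file)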

For anomaly detection, however, I need a repository backed by MySQL, since that is where my metric history is stored. I couldn't find a way to create a repository from MySQL, or from JSON data (which I can fetch from my database). Is there existing functionality for this?

Thanks, Tushar
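
One possible workaround, sketched here under stated assumptions rather than as a built-in Deequ feature: export the JSON entries from the database into a single JSON file, then point a filesystem-backed repository at that file. The table name `deequ_metrics`, the column name `result_json`, the helper `export_metrics_to_file`, and the shape of the sample entry are all hypothetical. SQLite stands in for MySQL only so that the snippet is self-contained; any DB-API 2.0 connection should work the same way.

```python
import json
import os
import sqlite3
import tempfile


def export_metrics_to_file(conn, path, table="deequ_metrics", column="result_json"):
    """Write JSON metric entries (one per row) to `path` as a single JSON array.

    `conn` is any DB-API 2.0 connection. The table and column names are
    hypothetical placeholders for wherever the metric history is stored.
    """
    cur = conn.cursor()
    # Identifiers cannot be bound as query parameters, so `table` and
    # `column` must come from trusted code, never from user input.
    cur.execute(f"SELECT {column} FROM {table}")
    entries = [json.loads(row[0]) for row in cur.fetchall()]
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(entries, fh)
    return len(entries)


# Demo: an in-memory SQLite database stands in for the MySQL metrics store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE deequ_metrics (result_json TEXT)")

# Illustrative entry only; real entries must be exactly what Deequ serialized.
sample = {"resultKey": {"dataSetDate": 1, "tags": {}},
          "analyzerContext": {"metricMap": []}}
conn.execute("INSERT INTO deequ_metrics VALUES (?)", (json.dumps(sample),))

out_path = os.path.join(tempfile.mkdtemp(), "metrics.json")
count = export_metrics_to_file(conn, out_path)
```

The resulting path could then be passed to `FileSystemMetricsRepository(spark, out_path)`, assuming the stored entries were originally written by a Deequ repository, so that they already match the format it expects, and that the path is reachable by Spark.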