elementary-data / elementary

The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
https://www.elementary-data.com/
Apache License 2.0

Partition by dimension in metrics_anomaly_score #1729

Open nescobar opened 4 weeks ago

nescobar commented 4 weeks ago

Describe the bug
In metrics_anomaly_score.sql, the window over metric_value is not partitioned by dimension when dimensions are configured. This skews the anomaly score, because training_avg is then computed from the metric values of ALL dimension values rather than from each dimension value's own history.

To Reproduce
In metrics_anomaly_score.sql, the window that computes training_avg does not include the dimension in its partition clause:

avg(metric_value) over (partition by metric_name, full_table_name, column_name order by bucket_start asc rows between unbounded preceding and current row) as training_avg

Expected behavior
When dimensions are used, the average of metric_value should also be partitioned by dimension_value:

avg(metric_value) over (partition by metric_name, full_table_name, column_name, dimension_value order by bucket_start asc rows between unbounded preceding and current row) as training_avg
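
To illustrate the difference, here is a minimal sketch that runs both window expressions against an in-memory SQLite table. The table name `metrics`, the sample rows, and the helper `training_avgs` are hypothetical and only serve the illustration; they are not taken from the elementary package, and SQLite stands in for whatever warehouse is actually used (window functions need SQLite 3.25 or later).

```python
import sqlite3

# Hypothetical sample data: one metric, two dimension values on very different scales.
# Columns: metric_name, full_table_name, column_name, dimension_value, bucket_start, metric_value
ROWS = [
    ("row_count", "db.schema.orders", "id", "US", "2024-01-01", 100.0),
    ("row_count", "db.schema.orders", "id", "US", "2024-01-02", 110.0),
    ("row_count", "db.schema.orders", "id", "FR", "2024-01-01", 10.0),
    ("row_count", "db.schema.orders", "id", "FR", "2024-01-02", 12.0),
]

# Same window expression as in the issue; only the partition columns vary.
QUERY = """
select
    dimension_value,
    bucket_start,
    avg(metric_value) over (
        partition by {partition_cols}
        order by bucket_start asc
        rows between unbounded preceding and current row
    ) as training_avg
from metrics
order by dimension_value, bucket_start
"""


def training_avgs(partition_cols):
    """Return (dimension_value, bucket_start, training_avg) rows for the given partition columns."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table metrics ("
            "metric_name text, full_table_name text, column_name text, "
            "dimension_value text, bucket_start text, metric_value real)"
        )
        conn.executemany("insert into metrics values (?, ?, ?, ?, ?, ?)", ROWS)
        return conn.execute(QUERY.format(partition_cols=partition_cols)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Current behavior: US and FR values are averaged together.
    print(training_avgs("metric_name, full_table_name, column_name"))
    # Proposed behavior: each dimension value keeps its own running average.
    print(training_avgs("metric_name, full_table_name, column_name, dimension_value"))
```

With dimension_value in the partition, the FR rows get running averages of 10.0 and 11.0 and the US rows get 100.0 and 105.0. Without it, all four rows share one window, so the running average drifts toward the combined mean of 58.0. Because two rows then share each bucket_start, the order in which they enter the window is not even well defined.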