sdv-dev / SDMetrics

Metrics to evaluate quality and efficacy of synthetic datasets.
https://docs.sdv.dev/sdmetrics
MIT License
210 stars, 45 forks

Overall property score should be the average across all breakdowns #415

Closed npatki closed 1 year ago

npatki commented 1 year ago

Environment Details

Error Description

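In the current development version, the overall property score is not computed correctly. It should equal the average of the scores across all breakdowns, but the reported value differs from that average.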

Steps to reproduce

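from sdmetrics import load_demo
from sdmetrics.reports.multi_table import QualityReport

# Load the built-in multi-table demo data and its metadata
real_data, synthetic_data, metadata = load_demo(modality='multi_table')

# Generate the report so that property scores and details are available
report = QualityReport()
report.generate(real_data, synthetic_data, metadata)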

Observe that the Column Shapes property score is reported as 79.68%. However, the mean of the individual scores in the Column Shapes details is 79.23%, and this is the value we expect the report to show.

all_shapes = report.get_details('Column Shapes')
print(all_shapes['Score'].agg('mean'))
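
One way a gap like this can arise is if the scores are averaged per table first and the table averages are then averaged again. This is an assumption for illustration only; the report above does not state the cause. The sketch below uses hypothetical table names and scores (not values from the demo dataset) to show that the two calculations differ when tables have different numbers of columns.

```python
from statistics import mean

# Hypothetical per-column scores grouped by table.
# Illustrative values only, not taken from the demo dataset.
scores_by_table = {
    'table_a': [0.90, 0.80, 0.70, 0.60],
    'table_b': [0.95],
}

# Average across all breakdowns: every column counts equally.
all_scores = [s for scores in scores_by_table.values() for s in scores]
mean_over_breakdowns = mean(all_scores)

# Average of per-table averages: every table counts equally,
# so a column in a small table carries more weight.
mean_of_table_means = mean(mean(scores) for scores in scores_by_table.values())

print(f'Mean over all breakdowns: {mean_over_breakdowns:.4f}')
print(f'Mean of per-table means:  {mean_of_table_means:.4f}')
```

With these made-up numbers the two results differ (roughly 0.79 versus 0.85), which is the same kind of mismatch as the 79.68% versus 79.23% described above.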