sdv-dev / SDMetrics

Metrics to evaluate quality and efficacy of synthetic datasets.
https://docs.sdv.dev/sdmetrics
MIT License

Incorrect score interpretation for the Detection Metrics #399

Closed npatki closed 1 year ago

npatki commented 1 year ago

Currently, the docs for the detection metrics (for both single table and sequential) state that the final score is 1 - avg(ROC AUC). They also provide an interpretation of the score, assuming it's 1 - avg(ROC AUC):

[image: score interpretation as shown in the docs]

However, in the codebase, we see a slightly different formula used for computing the score. It's more accurate to say that:

score = 1 - [MAX(0.5, roc auc)*2 - 1]

Which means:
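
As a rough illustration, the formula above can be evaluated at a few ROC AUC values. This is a minimal sketch, not the SDMetrics implementation: the function name `detection_score` is hypothetical, and it takes a single ROC AUC value, since how values are averaged across folds is not specified here.

```python
def detection_score(roc_auc: float) -> float:
    """Hypothetical helper: apply score = 1 - [max(0.5, roc_auc) * 2 - 1]."""
    # ROC AUC values below 0.5 are clipped to 0.5 before rescaling.
    return 1 - (max(0.5, roc_auc) * 2 - 1)


# Evaluate the formula at a few representative ROC AUC values.
for auc in (0.3, 0.5, 0.75, 1.0):
    print(f"ROC AUC = {auc:.2f} -> score = {detection_score(auc):.2f}")
```

Under this formula, any ROC AUC at or below 0.5 maps to a score of 1, a ROC AUC of 0.75 maps to 0.5, and a ROC AUC of 1 maps to 0. By contrast, 1 - ROC AUC would give 0.5 at a ROC AUC of 0.5.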

npatki commented 1 year ago

This has now been updated in the page linked above. (May require a page refresh depending on your cache settings!)