sdv-dev / SDGym

Benchmarking synthetic data generation methods.

Add the ability to compute the diagnostic score in a benchmarking run #311

Closed npatki closed 3 months ago

npatki commented 3 months ago

Problem Description

The SDGym benchmark script currently lets users run a quality report for each dataset/synthesizer pair, but it does not let them run the diagnostic report.

The diagnostic report (as implemented in SDMetrics) checks the overall validity of the synthetic data against the real data. For the health of a synthesizer, it is important to verify that the synthetic data it produces always receives a diagnostic score of 1.0.

Expected behavior

Add a parameter called compute_diagnostic_score to the benchmarking script.

Additional context

When the diagnostic report is run, the time it takes to compute it should be included in the overall Evaluate Time.
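
To make the request concrete, here is a minimal sketch of how an optional diagnostic step could be folded into a single timed evaluation phase. It is not SDGym's actual implementation: the function `evaluate_synthetic_data`, the callables it accepts, and the result keys (`Quality_Score`, `Diagnostic_Score`, `Evaluate_Time`) are all hypothetical names chosen for illustration. Only the `compute_diagnostic_score` parameter name comes from this issue. The stand-in report functions sleep instead of calling SDMetrics.

```python
import time
from typing import Callable, Dict


def evaluate_synthetic_data(
    compute_quality: Callable[[], float],
    compute_diagnostic: Callable[[], float],
    compute_quality_score: bool = True,
    compute_diagnostic_score: bool = True,
) -> Dict[str, float]:
    """Hypothetical evaluation step: run the enabled reports under one timer."""
    results: Dict[str, float] = {}
    start = time.perf_counter()

    if compute_quality_score:
        results["Quality_Score"] = compute_quality()

    # The diagnostic report runs inside the same timed region, so its cost
    # is counted in the overall evaluate time, as the issue requests.
    if compute_diagnostic_score:
        results["Diagnostic_Score"] = compute_diagnostic()

    results["Evaluate_Time"] = time.perf_counter() - start
    return results


# Stand-ins for the real reports (assumptions, not SDMetrics calls).
def fake_quality_report() -> float:
    time.sleep(0.01)
    return 0.85


def fake_diagnostic_report() -> float:
    time.sleep(0.02)
    return 1.0


with_diagnostic = evaluate_synthetic_data(fake_quality_report, fake_diagnostic_report)
without_diagnostic = evaluate_synthetic_data(
    fake_quality_report, fake_diagnostic_report, compute_diagnostic_score=False
)
```

With the flag enabled, the result contains a diagnostic score and the recorded evaluate time covers both reports. With it disabled, the diagnostic report is never called and no diagnostic entry appears in the result.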