sdv-dev / SDMetrics

Metrics to evaluate quality and efficacy of synthetic datasets.
https://docs.sdv.dev/sdmetrics
MIT License
201 stars, 45 forks

Update get_cardinality_plot to show missing values as subtitle #586

Closed: gsheni closed this issue 2 months ago

gsheni commented 2 months ago

Problem Description

Expected behavior

Current plot: example with NaN foreign keys

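import numpy as np
import pandas as pd
from sdmetrics.visualization import get_cardinality_plot
# Foreign key column with three missing (NaN) values in both datasets
real_data['child']['parent_id'] = pd.Series([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, np.nan, np.nan, np.nan])
synthetic_data['child']['parent_id'] = pd.Series([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, np.nan, np.nan, np.nan])
get_cardinality_plot(real_data, synthetic_data, child_table_name='child', parent_table_name='parent',
                     child_foreign_key='parent_id', metadata=metadata)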

[Figure fig1: current cardinality plot for the NaN foreign key example]
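
For illustration, here is a minimal sketch of how the missing-value counts for a subtitle might be computed from the foreign key columns in the example. The helper names `count_missing` and `missing_values_subtitle`, and the subtitle wording, are hypothetical; they are not part of SDMetrics and may not match what the library renders.

```python
import math


def count_missing(values):
    """Count entries that are None or NaN (hypothetical helper)."""
    missing = 0
    for value in values:
        if value is None or (isinstance(value, float) and math.isnan(value)):
            missing += 1
    return missing


def missing_values_subtitle(real_keys, synthetic_keys):
    """Build a subtitle string; the wording here is an assumption."""
    real_missing = count_missing(real_keys)
    synthetic_missing = count_missing(synthetic_keys)
    return f"Missing foreign keys: real={real_missing}, synthetic={synthetic_missing}"


# Same values as the parent_id columns in the example above
nan = float("nan")
real_parent_ids = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, nan, nan, nan]
synthetic_parent_ids = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, nan, nan, nan]

subtitle = missing_values_subtitle(real_parent_ids, synthetic_parent_ids)
print(subtitle)
```

When the keys are already in a pandas Series, `series.isna().sum()` gives the same count without the explicit loop.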