opendp / smartnoise-sdk

Tools and service for differentially private processing of tabular and relational data
MIT License
254 stars 68 forks

MST/AIM output can include None values #591

Open madprogramer opened 9 months ago

madprogramer commented 9 months ago

Consider a dataset dat with only 3 binary variables (encoded using the LabelTransformer):

A B C
1 0 1
0 0 1
1 1 1
1 1 1

When I call synth.fit_sample(dat, transformer=tableTransformer), I get no errors from the MST or AIM synthesizers. However, on inspecting the output, it looks something like this:

A B C
1 1 1
1 1 None
None None 1
0 0 1

I don't suppose there is an easy fix for this, besides increasing the sample size or epsilon. But at the very least, I think fit_sample should emit a warning when None values are present in the output.
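
In the meantime, a check along these lines could be run on the sampled output by hand. This is only a minimal sketch of the kind of warning I have in mind, not anything smartnoise-sdk provides: count_missing and warn_on_missing are hypothetical helper names, and the sketch assumes the sample can be iterated as plain row tuples and that missing entries show up as None (or as float NaN).

```python
import math
import warnings


def count_missing(rows, columns):
    """Count None/NaN entries per column in an iterable of row tuples."""
    counts = {name: 0 for name in columns}
    for row in rows:
        for name, value in zip(columns, row):
            # Treat None and float NaN as missing (an assumption about the output).
            if value is None or (isinstance(value, float) and math.isnan(value)):
                counts[name] += 1
    return counts


def warn_on_missing(rows, columns):
    """Emit a UserWarning if any column has missing values; return the counts."""
    counts = count_missing(rows, columns)
    affected = {name: n for name, n in counts.items() if n > 0}
    if affected:
        warnings.warn(
            f"Synthetic sample contains missing values: {affected}",
            UserWarning,
        )
    return counts


# The sample reported above, written out as plain tuples.
sample = [(1, 1, 1), (1, 1, None), (None, None, 1), (0, 0, 1)]
counts = warn_on_missing(sample, ["A", "B", "C"])
```

If the sample is a pandas DataFrame, the rows could be passed as sample.itertuples(index=False, name=None) with list(sample.columns) as the column names.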