Closed: LinoMMV closed this 6 months ago
Hi @LinoMMV
For "The least populated class in y has only 1 member, which is too few. The minimum number of groups for any class cannot be less than 2", I think your generated data contains too few samples of the minority class.
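A quick way to check this is to count the samples per class in the target column before running the classification evaluation. The snippet below is only a minimal sketch using the standard library: the helper `rare_classes` and the toy label list `y` are hypothetical and not part of the repository, and it assumes the error comes from a stratified split, which needs at least 2 samples in every class.

```python
from collections import Counter


def rare_classes(labels, min_count=2):
    """Return {label: count} for every class with fewer than min_count samples."""
    counts = Counter(labels)
    return {label: n for label, n in counts.items() if n < min_count}


# Toy target column (illustrative only): class 1 appears a single time.
y = [0, 0, 0, 1, 0, 2, 2]

rare = rare_classes(y)
print(rare)  # classes that would break a stratified split
```

If the returned dictionary is not empty, one option is to generate more samples so that every class appears at least twice; another is to drop or merge the rare classes before the evaluation.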
For "Input contains NaN, infinity or a value too large for dtype('float64')", did you check whether your data contains NaN? I am actually not sure whether the pairwise correlation can be calculated on a dataset with missing values; all my experiments were conducted on datasets without missing values. It seems the Support2 dataset contains quite a lot of missing values.
So what I mean is that our model can synthesize data with missing values, but our evaluation pipeline may not have been tested on datasets containing missing values.
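To see which columns are affected before running the privacy metrics, something like the following could help. It is a rough sketch with the standard library only: the helper `non_finite_columns`, the row layout (a list of dicts), and the column names are assumptions for illustration and do not come from the repository or from Support2.

```python
import math
from collections import Counter


def non_finite_columns(rows):
    """Count missing (None) or non-finite (NaN, inf) values per column."""
    counts = Counter()
    for row in rows:
        for column, value in row.items():
            if value is None:
                counts[column] += 1
            elif isinstance(value, float) and not math.isfinite(value):
                counts[column] += 1
    return dict(counts)


# Toy rows (illustrative only) with one NaN, one None and one inf.
rows = [
    {"x1": 62.0, "x2": 11.0, "group": "a"},
    {"x1": float("nan"), "x2": None, "group": "b"},
    {"x1": 48.0, "x2": float("inf"), "group": "a"},
]

bad = non_finite_columns(rows)
print(bad)  # columns that contain missing or non-finite values
```

Columns reported here would need to be imputed or dropped (or the affected rows removed) before computing the correlation, since that step appears to reject NaN and infinite values.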
Thank you for the response. I suppose I'll stick with the statistical evaluation then.
I was trying to generate data for the Support2 dataset, which worked fine, but I am consistently getting a "The least populated class in y has only 1 member, which is too few. The minimum number of groups for any class cannot be less than 2" error when running the classification evaluation. The issue can be traced back to line 77 in evaluation.py, presumably the stratification.
I am also getting an "Input contains NaN, infinity or a value too large for dtype('float64')" error with the privacy metrics, specifically when calculating the pairwise correlation in line 197.
I had no issues with the example datasets. I also tried a few variations of the column settings, including not using any categorical columns.