Team-TUD / CTAB-GAN-Plus-DP


question about CTAB-GAN+ #1

Closed chaotianshinaida closed 10 months ago

chaotianshinaida commented 11 months ago

I ran your code on a regression problem, like this:

synthesizer = CTABGAN(raw_csv_path = real_path, test_ratio = 0.20, categorical_columns = ['bedrooms', "floors", 'waterfront', 'view', 'condition', 'grade','zipcode'],
log_columns = [], mixed_columns= {"sqft_basement":[0.0], "yr_renovated":[0.0]}, general_columns= ["bathrooms", "sqft_living", "sqft_above", "yr_built", "long", "sqft_living15"], non_categorical_columns= [], integer_columns = [], problem_type= {"Regression": "price"})

But the code reports an error:

Traceback (most recent call last):
  File "D:\biancheng\Anaconda\anaconda\envs\ctabganplus\lib\site-packages\sklearn\utils_param_validation.py", line 214, in wrapper
    return func(*args, **kwargs)
  File "D:\biancheng\Anaconda\anaconda\envs\ctabganplus\lib\site-packages\sklearn\model_selection_split.py", line 2670, in train_test_split
    train, test = next(cv.split(X=arrays[0], y=stratify))
  File "D:\biancheng\Anaconda\anaconda\envs\ctabganplus\lib\site-packages\sklearn\model_selection_split.py", line 1746, in split
    for train, test in self._iter_indices(X, y, groups):
  File "D:\biancheng\Anaconda\anaconda\envs\ctabganplus\lib\site-packages\sklearn\model_selection_split.py", line 2147, in _iter_indices
    raise ValueError(
ValueError: The least populated class in y has only 1 member, which is too few. The minimum number of groups for any class cannot be less than 2.

It still seems to be treating this as a classification problem rather than a regression problem: the error says that a class in y has only one member. I also searched the model code and couldn't find the part that handles regression, even though the paper says that CTAB-GAN+ can deal with regression problems.
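For anyone hitting the same traceback: the `y=stratify` frame shows that `train_test_split` was called with a `stratify` argument, and stratifying on a continuous target such as `price` fails because almost every value forms its own one-member "class". Below is a minimal sketch that reproduces the error on synthetic data and shows a split that stratifies only for classification. The `split` helper, its `problem_type` argument, and the random data are hypothetical and for illustration only; this is not the CTAB-GAN+ implementation.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption: not the King County housing dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y_price = rng.uniform(100_000, 900_000, size=100)  # continuous target, values effectively unique


def split(X, y, problem_type, test_ratio=0.20, seed=42):
    """Hypothetical helper: stratify on y only for classification targets."""
    stratify = y if problem_type == "Classification" else None
    return train_test_split(
        X, y, test_size=test_ratio, stratify=stratify, random_state=seed
    )


# Stratifying on a continuous target reproduces the ValueError from the traceback.
try:
    split(X, y_price, "Classification")
    stratified_error = None
except ValueError as exc:
    stratified_error = str(exc)

# A plain (non-stratified) random split works for a regression target.
X_train, X_test, y_train, y_test = split(X, y_price, "Regression")
```

If the failing code path always stratifies on the target column, that would explain why a regression target is rejected, but this thread does not confirm where the call is made.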

zhao-zilong commented 11 months ago

Hi @chaotianshinaida

Is this the code you tried? https://github.com/Team-TUD/CTAB-GAN-Plus/blob/main/Experiment_Script_king.ipynb

If you don't need differential privacy, please try the code in https://github.com/Team-TUD/CTAB-GAN-Plus repo

Best,

Zilong

chaotianshinaida commented 11 months ago

Yes, let me try again.