Closed chaotianshinaida closed 10 months ago
Hi @chaotianshinaida
Is this the code you tried? https://github.com/Team-TUD/CTAB-GAN-Plus/blob/main/Experiment_Script_king.ipynb
If you don't need differential privacy, please try the code in https://github.com/Team-TUD/CTAB-GAN-Plus repo
Best,
Zilong
Yes, let me try again.
I ran your code on a regression problem, like this:
synthesizer = CTABGAN(raw_csv_path = real_path,
                      test_ratio = 0.20,
                      categorical_columns = ['bedrooms', "floors", 'waterfront', 'view', 'condition', 'grade', 'zipcode'],
                      log_columns = [],
                      mixed_columns = {"sqft_basement": [0.0], "yr_renovated": [0.0]},
                      general_columns = ["bathrooms", "sqft_living", "sqft_above", "yr_built", "long", "sqft_living15"],
                      non_categorical_columns = [],
                      integer_columns = [],
                      problem_type = {"Regression": "price"})
But the code reports an error:
Traceback (most recent call last):
  File "D:\biancheng\Anaconda\anaconda\envs\ctabganplus\lib\site-packages\sklearn\utils\_param_validation.py", line 214, in wrapper
    return func(*args, **kwargs)
  File "D:\biancheng\Anaconda\anaconda\envs\ctabganplus\lib\site-packages\sklearn\model_selection\_split.py", line 2670, in train_test_split
    train, test = next(cv.split(X=arrays[0], y=stratify))
  File "D:\biancheng\Anaconda\anaconda\envs\ctabganplus\lib\site-packages\sklearn\model_selection\_split.py", line 1746, in split
    for train, test in self._iter_indices(X, y, groups):
  File "D:\biancheng\Anaconda\anaconda\envs\ctabganplus\lib\site-packages\sklearn\model_selection\_split.py", line 2147, in _iter_indices
    raise ValueError(
ValueError: The least populated class in y has only 1 member, which is too few. The minimum number of groups for any class cannot be less than 2.
It still seems to be treating this as a classification problem rather than a regression problem: the error says that the least populated class in y has only one member. I also searched the model code and couldn't find the part that handles regression, even though the paper says CTAB-GAN+ can deal with regression problems.
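One possible reading of the traceback, sketched below with the standard library only: `train_test_split` is being called with a `stratify` argument, and a stratified split treats every distinct target value as a class. With a continuous target such as `price`, most values occur only once, which would trigger exactly this `ValueError`. The helper names (`least_populated_class_size`, `plain_train_test_split`) and the sample prices are hypothetical and are not part of CTAB-GAN+ or scikit-learn; the split function only imitates an unstratified shuffle split for illustration.

```python
import random
from collections import Counter


def least_populated_class_size(y):
    """Size of the smallest group when each distinct value of y is treated as a class."""
    return min(Counter(y).values())


def plain_train_test_split(rows, test_ratio=0.2, seed=42):
    """Shuffle and split without stratification (hypothetical helper)."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    n_test = int(round(len(rows) * test_ratio))
    test_idx = set(indices[:n_test])
    train = [row for i, row in enumerate(rows) if i not in test_idx]
    test = [row for i, row in enumerate(rows) if i in test_idx]
    return train, test


# Made-up continuous regression target: every price is distinct.
prices = [221900.0, 538000.0, 180000.0, 604000.0, 510000.0,
          1225000.0, 257500.0, 291850.0, 229500.0, 323000.0]

# A stratified split needs at least 2 members per class; here the smallest class has 1.
smallest = least_populated_class_size(prices)

# An unstratified split has no such requirement.
train, test = plain_train_test_split(prices, test_ratio=0.20)

print("smallest class size:", smallest)
print("train rows:", len(train), "test rows:", len(test))
```

If this reading is right, a regression run would need the split to happen without stratifying on the target column; whether and where the repository does that is something the maintainers would have to confirm.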