aapatel09 / handson-unsupervised-learning

Code for Hands-on Unsupervised Learning Using Python (O'Reilly Media)

09_semisupervised - LightGBMError: Do not support special JSON characters in feature name. #8

Closed theresadoyon closed 4 years ago

theresadoyon commented 4 years ago

Just trying to execute the code as is and I'm getting the following error:

trainingScores = []
cvScores = []
predictionsBasedOnKFolds = pd.DataFrame(data=[], index=y_train.index, \
                                        columns=['prediction'])

for train_index, cv_index in k_fold.split(np.zeros(len(X_train)), \
                                          y_train.ravel()):
    X_train_fold, X_cv_fold = X_train.iloc[train_index,:], \
        X_train.iloc[cv_index,:]
    y_train_fold, y_cv_fold = y_train.iloc[train_index], \
        y_train.iloc[cv_index]

    lgb_train = lgb.Dataset(X_train_fold, y_train_fold)
    lgb_eval = lgb.Dataset(X_cv_fold, y_cv_fold, reference=lgb_train)
    gbm = lgb.train(params_lightGB, lgb_train, num_boost_round=2000,
                    valid_sets=lgb_eval, early_stopping_rounds=200)

    loglossTraining = log_loss(y_train_fold, gbm.predict(X_train_fold, \
                                num_iteration=gbm.best_iteration))
    trainingScores.append(loglossTraining)

    predictionsBasedOnKFolds.loc[X_cv_fold.index,'prediction'] = \
        gbm.predict(X_cv_fold, num_iteration=gbm.best_iteration)
    loglossCV = log_loss(y_cv_fold, \
        predictionsBasedOnKFolds.loc[X_cv_fold.index,'prediction'])
    cvScores.append(loglossCV)

    print('Training Log Loss: ', loglossTraining)
    print('CV Log Loss: ', loglossCV)

loglossLightGBMGradientBoosting = log_loss(y_train, \
    predictionsBasedOnKFolds.loc[:,'prediction'])
print('LightGBM Gradient Boosting Log Loss: ', \
    loglossLightGBMGradientBoosting)


LightGBMError                             Traceback (most recent call last)
in
     14     lgb_eval = lgb.Dataset(X_cv_fold, y_cv_fold, reference=lgb_train)
     15     gbm = lgb.train(params_lightGB, lgb_train, num_boost_round=2000,
---> 16                    valid_sets=lgb_eval, early_stopping_rounds=200)
     17
     18     loglossTraining = log_loss(y_train_fold, gbm.predict(X_train_fold, \

~/.local/lib/python3.6/site-packages/lightgbm/engine.py in train(params, train_set, num_boost_round, valid_sets, valid_names, fobj, feval, init_model, feature_name, categorical_feature, early_stopping_rounds, evals_result, verbose_eval, learning_rates, keep_training_booster, callbacks)
    226     # construct booster
    227     try:
--> 228         booster = Booster(params=params, train_set=train_set)
    229         if is_valid_contain_train:
    230             booster.set_train_data_name(train_data_name)

~/.local/lib/python3.6/site-packages/lightgbm/basic.py in __init__(self, params, train_set, model_file, model_str, silent)
   1712             self.handle = ctypes.c_void_p()
   1713             _safe_call(_LIB.LGBM_BoosterCreate(
-> 1714                 train_set.construct().handle,
   1715                 c_str(params_str),
   1716                 ctypes.byref(self.handle)))

~/.local/lib/python3.6/site-packages/lightgbm/basic.py in construct(self)
   1083                                 init_score=self.init_score, predictor=self._predictor,
   1084                                 silent=self.silent, feature_name=self.feature_name,
-> 1085                                 categorical_feature=self.categorical_feature, params=self.params)
   1086             if self.free_raw_data:
   1087                 self.data = None

~/.local/lib/python3.6/site-packages/lightgbm/basic.py in _lazy_init(self, data, label, reference, weight, group, init_score, predictor, silent, feature_name, categorical_feature, params)
    913             raise TypeError('Wrong predictor type {}'.format(type(predictor).__name__))
    914         # set feature names
--> 915         return self.set_feature_name(feature_name)
    916
    917     def __init_from_np2d(self, mat, params_str, ref_dataset):

~/.local/lib/python3.6/site-packages/lightgbm/basic.py in set_feature_name(self, feature_name)
   1366                 self.handle,
   1367                 c_array(ctypes.c_char_p, c_feature_name),
-> 1368                 ctypes.c_int(len(feature_name))))
   1369         return self
   1370

~/.local/lib/python3.6/site-packages/lightgbm/basic.py in _safe_call(ret)
     43     """
     44     if ret != 0:
---> 45         raise LightGBMError(decode_string(_LIB.LGBM_GetLastError()))
     46
     47

LightGBMError: Do not support special JSON characters in feature name.

aapatel09 commented 4 years ago

Hi, could you please install LightGBM with pip in your notebook using the following command, !pip install lightgbm, and try again? Please let me know if you are still getting the issue.
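
For readers who hit the same message after reinstalling: the error text itself points at the column names of the DataFrame passed to lgb.Dataset. The sketch below is one possible workaround, not something confirmed in this thread. It assumes the offending names contain characters such as quotes, colons, brackets, or braces, and conservatively replaces everything other than letters, digits, and underscores. The helper name sanitize_feature_names is hypothetical and is not part of LightGBM or the book's code.

```python
import re


def sanitize_feature_names(names):
    """Return feature names restricted to letters, digits, and underscores.

    Hypothetical helper (not part of LightGBM). Names are converted to
    strings, every run of other characters becomes a single underscore,
    and a numeric suffix is appended if two cleaned names collide.
    """
    used = set()
    result = []
    for name in names:
        # Conservative choice: keep only [A-Za-z0-9_].
        base = re.sub(r'[^A-Za-z0-9_]+', '_', str(name))
        candidate = base
        suffix = 1
        # Keep names unique after cleaning.
        while candidate in used:
            candidate = '{}_{}'.format(base, suffix)
            suffix += 1
        used.add(candidate)
        result.append(candidate)
    return result
```

Under these assumptions, the names would be cleaned once before the k-fold loop, for example with X_train.columns = sanitize_feature_names(X_train.columns), so that every fold's lgb.Dataset sees the same cleaned names.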