ModelOriented / forester

Trees are all you need
https://modeloriented.github.io/forester/
GNU General Public License v3.0

titanic_imputed data: label must be in [0,1] for logistic regression #48

Closed pbiecek closed 1 year ago

pbiecek commented 1 year ago

I get the following error. Running

library(DALEX)
library(forester)
output1 <- train(data = titanic_imputed,
                 y = 'survived',
                 bayes_iter = 0,
                 verbose = TRUE,
                 random_iter = 5)

results in

Error in xgb.iter.update(bst$handle, dtrain, iteration - 1, obj) : 
  [14:39:04] amalgamation/../src/objective/regression_obj.cu:138: label must be in [0,1] for logistic regression
Stack trace:
  [bt] (0) 1   xgboost.so                          0x00000001153eeff4 dmlc::LogMessageFatal::~LogMessageFatal() + 116
  [bt] (1) 2   xgboost.so                          0x000000011550ccb4 xgboost::obj::RegLossObj<xgboost::obj::LogisticClassification>::GetGradient(xgboost::HostDeviceVector<float> const&, xgboost::MetaInfo const&, int, xgboost::HostDeviceVector<xgboost::detail::GradientPairInternal<float> >*) + 660
  [bt] (2) 3   xgboost.so                          0x00000001154c5514 xgboost::LearnerImpl::UpdateOneIter(int, std::__1::shared_ptr<xgboost::DMatrix>) + 788
  [bt] (3) 4   xgboost.so                          0x0000000115488f2c XGBoosterUpdateOneIter + 140
  [bt] (4) 5   xgboost.so                          0x00000001153eb8c3 XGBoosterUpdateOneIter_R + 67
  [bt] (5) 6   libR.dylib                          0x000000010b4a5f52 R_doDotCall + 1458
  [bt
In addition: Warning message:
In storage.mode(data) <- "double" : NAs introduced by coercion

and here is the traceback:

> traceback()
5: xgb.iter.update(bst$handle, dtrain, iteration - 1, obj)
4: xgb.train(params, dtrain, nrounds, watchlist, verbose = verbose, 
       print_every_n = print_every_n, early_stopping_rounds = early_stopping_rounds, 
       maximize = maximize, save_period = save_period, save_name = save_name, 
       xgb_model = xgb_model, callbacks = callbacks, ...)
3: xgboost::xgboost(data$xgboost_data, as.vector(data$ranger_data[[y]] - 
       1), objective = "binary:logistic", nrounds = 20, verbose = 0)
2: train_models(train_data, y, engine, type)
1: train(data = titanic_imputed, y = "survived", bayes_iter = 0, 
       verbose = TRUE, random_iter = 5)

Please let me know whether I can use the titanic_imputed data with forester.
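
One plausible reading of the traceback: frame 3 subtracts 1 from the target before passing it to xgboost with objective = "binary:logistic". That shift suits a factor encoded as 1/2, but if `survived` is already stored as numeric 0/1 (an assumption about titanic_imputed here), the labels become -1/0 and fall outside [0,1]. A minimal sketch in base R, using a toy vector in place of the real column and a hypothetical helper `to_binary_label` that is not part of forester:

```r
# Hypothetical helper (not a forester function): map any two-valued target
# to integer labels 0/1, regardless of how it was originally encoded.
to_binary_label <- function(y) {
  f <- as.factor(y)
  stopifnot(nlevels(f) == 2)   # only binary targets are supported
  as.integer(f) - 1L           # factor codes 1/2 become 0/1
}

# Toy stand-in for a numeric 0/1 target such as `survived`
survived <- c(0, 1, 1, 0, 1)

# What subtracting 1 directly from a 0/1 numeric target produces
bad_label <- survived - 1

# Labels after going through a factor first
good_label <- to_binary_label(survived)

print(range(bad_label))
print(range(good_label))
```

Under that assumption, converting the target to a factor before calling train might also have avoided the error, but the thread does not confirm what the actual fix was.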

grudzienAda commented 1 year ago

Yes, it works now. Thanks.