ModelOriented / forester

Trees are all you need
https://modeloriented.github.io/forester/
GNU General Public License v3.0

train.R function does not work with engine = "xgboost" #97

Closed: jmanacup closed this issue 1 year ago

jmanacup commented 1 year ago

This is related to #96, where specifying only xgboost as the engine throws an error.

I believe this is because in train_models.R,

else if (engine[i] == 'xgboost') {
  if (type == 'binary_clf') {
    if (any(data$ranger_data[[y]] == 2)) {
      data$ranger_data[[y]] = data$ranger_data[[y]] - 1
    }
    xgboost_model <-
      xgboost::xgboost(data$xgboost_data,
                       as.vector(data$ranger_data[[y]]),
                       objective = 'binary:logistic',
                       nrounds = 20,
                       verbose = 0,
                       eval_metric = 'auc')
  } else if (type == 'regression') {
    xgboost_model <-
      xgboost::xgboost(data$xgboost_data,
                       as.vector(data$ranger_data[[y]]),
                       nrounds = 20,
                       verbose = 0)
  }
}

it uses as.vector(data$ranger_data[[y]]), which is NULL when ranger is not one of the engines passed in the engine argument. My suggestion is to either add another parameter to the train_model function or put the target in the list returned by prepare_data for train.
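To illustrate the second option, here is a minimal sketch in base R. It does not use forester's actual code: prepare_data_sketch and the target field are hypothetical names I made up for the example, and I am only assuming that prepare_data builds the engine-specific entries for the requested engines. The idea is to store the target once, independently of any engine, so the xgboost branch never has to read from ranger_data.

```r
# Hypothetical stand-in for prepare_data(): builds engine-specific data only
# for the requested engines, and always stores the target separately.
prepare_data_sketch <- function(df, y, engine) {
  out <- list(target = df[[y]])  # assumed new field, not part of forester
  if ('ranger' %in% engine) {
    out$ranger_data <- df
  }
  if ('xgboost' %in% engine) {
    # Numeric matrix of predictors only (target column dropped)
    out$xgboost_data <- as.matrix(df[, setdiff(names(df), y), drop = FALSE])
  }
  out
}

df <- data.frame(x1    = c(0.1, 0.4, 0.9, 0.3),
                 x2    = c(1, 0, 1, 0),
                 label = c(1, 2, 2, 1))

prepared <- prepare_data_sketch(df, 'label', engine = 'xgboost')

# The failure mode described above: ranger_data was never built,
# so indexing into it yields NULL.
ranger_target_missing <- is.null(prepared$ranger_data[['label']])

# Engine-independent target, recoded from {1, 2} to {0, 1} as in the
# binary_clf branch.
labels <- as.vector(prepared$target)
if (any(labels == 2)) {
  labels <- labels - 1
}
```

With something like this, the xgboost branch could be given labels in place of as.vector(data$ranger_data[[y]]), and the recoding would no longer modify ranger_data as a side effect.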

HubertR21 commented 1 year ago

Your instincts are correct, and the issue was already addressed in the development of the next version of the package!