Closed RickPack closed 1 year ago
Try converting the dataset with as.data.frame(economics) before passing it to the train function. I think the function does not accept tibbles.
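A minimal sketch of that workaround, assuming the economics dataset from ggplot2 as used in the reproducible example. The name econ_df is only illustrative, and the date column is dropped with base R so that dplyr is not needed. The train() call is left as a comment because it needs forester installed, and, as noted in this thread, the xgboost engine may still raise a separate error after the conversion.

```r
library(ggplot2)  # provides the economics dataset, which is stored as a tibble

# Drop the date column with base R, then convert the result to a plain data.frame.
keep_cols <- setdiff(names(economics), "date")
econ_df <- as.data.frame(economics[, keep_cols])

# After conversion the object should carry only the "data.frame" class.
print(class(econ_df))

# Base data.frame subsetting of a single column returns a plain vector.
print(is.numeric(econ_df[, 1]))

# Assumed call, mirroring the arguments of the reproducible example:
# test_model <- forester::train(
#   econ_df,
#   type   = "regression",
#   y      = "uempmed",
#   engine = c("ranger", "xgboost", "decision_tree", "lightgbm")
# )
```

If the conversion is the fix, the check-data report should still print as before, but the "Location 1 doesn't exist" error should no longer follow it.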
@jmanacup , thank you, that solved the "location 1 doesn't exist" error. Now I see an XGBoost error.
label must be provided when data is a matrix
I will open another issue.
I cannot provide the dataset that originally caused this error, so I am using the economics dataset from the ggplot2 package instead:
Reproducible example
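library(forester)
library(ggplot2)  # provides the economics dataset
library(dplyr)    # needed for %>% and select()

# Drop the date column and train a regression model on uempmed
test_model <- train(
  economics %>% select(-date),
  type = 'regression',
  y = 'uempmed',
  engine = c('ranger', 'xgboost', 'decision_tree', 'lightgbm')
)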
✔ Type guessed as: regression
-------------------- CHECK DATA REPORT --------------------
The dataset has 574 observations and 5 columns, which names are: pce; pop; psavert; uempmed; unemploy;
With the target value described by a column uempmed.
✔ No static columns.
✔ No duplicate columns.
✔ No target values are missing.
✔ No predictor values are missing.
✔ No issues with dimensionality.
✖ Strongly correlated, by Spearman rank, pairs of numerical values are:
pce - pop: 0.99; pce - psavert: -0.79; pop - psavert: -0.84;
✖ These obserwation migth be outliers due to their numerical columns values: 514 515 516 517 518 520 521 522 523 524 525 527 528 529 530 531 ;
✖ Target data is not evenly distributed with quantile bins: 0.24 0.45 0.06 0.26
✔ Columns names suggest that none of them are IDs.
✔ Columns data suggest that none of them are IDs.
-------------------- CHECK DATA REPORT END --------------------
Error in df[, i]:
! Can't subset columns past the end.
ℹ Location 1 doesn't exist.
ℹ There are only 0 columns.
Run rlang::last_error() to see where the error occurred.