@Szmajasz hi Szymon, regarding lines 66 to 126 in `make_catboost.R`:
```r
  # Creating validation set in ratio 4:1
  splited_data <- split_data(data, target, type)
  data <- splited_data[[1]]
  data_val <- splited_data[[2]]
}
```
We use Bayesian Optimization to find the optimal set of hyperparameters. However, I noticed that because we split the data into `data` and `data_val`, after finding the optimal hyperparameters we should retrain the model on the original `data_train`, otherwise we lose the validation rows from the final fit. My idea is to combine the two structures `cat_data` and `cat_data_val` from your code, but I am not sure whether that would work correctly. Combining `cat_data` and `cat_data_val` would be much cleaner than creating a new variable.
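To illustrate what I mean, here is a minimal sketch in plain R (using a toy data frame and `rbind` instead of the actual catboost pool objects, so the variable names and split logic here are illustrative, not the package's real code):

```r
# Toy data standing in for the training set passed to make_catboost.R
data <- data.frame(x = 1:10, y = rep(c("a", "b"), 5))

# Split in ratio 4:1, mimicking the validation split in the quoted snippet
set.seed(42)
train_idx  <- sample(nrow(data), size = 0.8 * nrow(data))
data_train <- data[train_idx, ]   # used for fitting during the HP search
data_val   <- data[-train_idx, ]  # used for scoring candidate HPs

# ... Bayesian Optimization over data_train / data_val happens here ...

# After the best hyperparameters are found, recombine both pieces so the
# final model is trained on every available row:
full_data <- rbind(data_train, data_val)
stopifnot(nrow(full_data) == nrow(data))
```

Whether the same `rbind`-style merge applies to `cat_data` and `cat_data_val` depends on how the catboost pools are built; if they are already pool objects, the merge would have to happen on the underlying data frames before the pools are created.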
Thank you. We have made major changes to the forester package. The previous version of the package is available on the old branch. It will not be supported, we encourage you to use the new one.