phisanti closed this issue 2 years ago
Two things you need to keep in mind.
Usually, you want to fit on the complete data set and then predict on new data. Do you still want to extract these models? I have to admit that extracting these models is not straightforward but it is possible.
In the single-crit case, this is covered by the AutoTuner. Unfortunately, we cannot provide an AutoTuner for multi-crit optimization.
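For the single-crit case, the AutoTuner pattern mentioned above might look roughly like this. This is only a sketch: the exact `auto_tuner()` signature varies slightly across mlr3tuning versions, and the `sonar` task and `rpart` learner are purely illustrative.

```r
library(mlr3)
library(mlr3tuning)

# learner with one tunable hyperparameter
learner <- lrn("classif.rpart", cp = to_tune(0.001, 0.1))

# the AutoTuner tunes via inner resampling, then refits the best
# configuration on the complete training data
at <- auto_tuner(
  tuner      = tnr("random_search"),
  learner    = learner,
  resampling = rsmp("cv", folds = 3),
  measure    = msr("classif.ce"),
  term_evals = 10
)

at$train(tsk("sonar"))           # tune + final fit on the full task
# at$predict_newdata(new_data)   # predict on unseen data
```

Because an AutoTuner behaves like an ordinary Learner, it can also be passed to `resample()` to get the nested cross-validation described below.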
My goal is to manually implement a kind of "nested cross-validation" with custom sampling. I have built a function that generates "combinatorial purged k-folds" (see this article) to run within the inner loop, followed by an out-of-sample test. I found that I could extract the model via:
```r
library(magrittr)  # provides the %$% exposition pipe

resamples_res <- instance$archive$resample_result(n)
# score each fold and pick the index with the highest balanced accuracy
min_res <- resamples_res$score(msr("classif.bacc")) %$% which.max(classif.bacc)
model <- resamples_res$learners[[min_res]]
# refit the selected learner on the full task
model$train(resamples_res$task)
```
However, I still see a huge disparity between the CV accuracy and the out-of-sample test, which makes me think my implementation might not match the description of tune_nested() in the mlr3 book.
Are you doing single-crit or multi-crit optimization?
Currently, I run a for loop with multi-crit optimization, but to select the best model I use just one metric.
I am unsure what you want to achieve. Your code selects the model with the highest accuracy from a single ResampleResult. What value do you choose for n? In your first example, a holdout validation is used, i.e. there is only one model in the ResampleResult.
When you want to select the best hyperparameter configuration based on one metric from a multi-crit archive, you do it like this:

```r
as.data.table(instance$archive)[which.min(classif.ce), ]
```

The resulting data.table row contains a resample result, and you can extract the model from it.
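As a sketch of that extraction step: assuming the archive stores resample results (which typically requires `store_models = TRUE` when creating the instance, and whose column layout can differ between mlr3tuning versions), the best model could be pulled out and used for prediction roughly like this. `task` and `new_data` are placeholders for the objects in this thread.

```r
library(mlr3)
library(data.table)

# pick the archive row with the lowest classification error
best_row <- as.data.table(instance$archive)[which.min(classif.ce), ]

# the row holds the ResampleResult for that configuration
rr <- best_row$resample_result[[1]]

# clone one of its fitted learners, refit on the complete task,
# and predict on a new dataset
learner <- rr$learners[[1]]$clone()
learner$train(task)
prediction <- learner$predict_newdata(new_data)
```

Refitting on the complete task mirrors the usual advice above: the per-fold models are intermediate artifacts, and the final model should see all of the training data before predicting on new observations.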
Please reopen if you have further questions.
Following this example:
Is there a way to extract the best model from the instance and predict directly on a new dataset?