Open sibipx opened 3 months ago
Hi! Are there any thoughts on this? I see it is tagged as a question, so should I use a different platform to ask it? Any suggestions are welcome! Thanks!
Not yet. @david-cortes is working on refactoring the R interface; I'm wondering whether it's reproducible there. Will test it.
The new interface hasn't changed anything in the logic for how early stopping works.
The sample code doesn't work anymore on the current master branch, but I gave it a try with a much smaller example and didn't observe any difference: both the built-in metric and the custom metric stop at the same iteration.
@sibipx Are you able to reproduce the issue with the current master branch? Note that the tidyverse dependencies you are using do not currently work with the latest XGBoost so you'll need to modify the code snippet.
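For anyone retrying this on master, here is a rough sketch of how the snippet below might be adapted to the refactored interface. The custom_metric argument name (replacing feval) is my assumption based on the new signature, so check ?xgb.cv on your build; dm and logloss_m_obj are the objects defined in the snippet below.

set.seed(123)
cv <- xgb.cv(params = list(booster = "gbtree",
                           eta = 0.1,
                           objective = "multi:softprob",
                           num_class = 3),
             data = dm,
             nrounds = 10000,
             nfold = 5,
             custom_metric = logloss_m_obj,  # assumed new name for feval
             showsd = FALSE,
             early_stopping_rounds = 25,
             maximize = FALSE,
             verbose = 1)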
library(ModelMetrics)
library(xgboost)
data(iris)
# xgb.DMatrix expects a numeric matrix, not a data.frame
dm <- xgb.DMatrix(as.matrix(iris[, -5]), label = as.numeric(iris$Species) - 1)
logloss_m_obj <- function(preds, dtrain) {
  labels <- getinfo(dtrain, "label")
  labels <- factor(labels)
  n_classes <- length(levels(labels))  # was undefined in the original snippet
  # reshape the flat prediction vector into an n_samples x n_classes matrix
  preds <- matrix(preds, ncol = n_classes, byrow = TRUE)
  m_logloss <- ModelMetrics::mlogLoss(labels, preds)
  # m_logloss <- yardstick::mn_log_loss_vec(labels, preds)  # gives the same results
  return(list(metric = "m_logloss", value = m_logloss))
}
set.seed(123)
cv <- xgb.cv(params = list(booster = "gbtree",
                           eta = 0.1,
                           objective = "multi:softprob",
                           num_class = 3),
             data = dm,
             nrounds = 10000,  # set this large and rely on early stopping
             nfold = 5,
             feval = logloss_m_obj,  # custom metric, passed via the documented argument
             showsd = FALSE,
             early_stopping_rounds = 25,
             maximize = FALSE,
             verbose = 1)
set.seed(123)
cv2 <- xgb.cv(params = list(booster = "gbtree",
                            eta = 0.1,
                            objective = "multi:softprob",
                            eval_metric = "mlogloss",
                            num_class = 3),
              data = dm,
              nrounds = 10000,  # set this large and rely on early stopping
              nfold = 5,
              showsd = FALSE,
              early_stopping_rounds = 25,
              maximize = FALSE,
              verbose = 1)
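To compare the two runs beyond the chosen stopping point, the per-round evaluation logs can be inspected directly. A small sketch, assuming the usual test_<metric>_mean column naming in cv$evaluation_log (the exact suffix of the custom column depends on the name the custom function returns):

# iterations chosen by early stopping in each run
cv$best_iteration
cv2$best_iteration

# per-round test-fold means; if the custom metric really computes mlogloss,
# these two columns should track each other closely round by round
head(cv$evaluation_log$test_m_logloss_mean)
head(cv2$evaluation_log$test_mlogloss_mean)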
Hi! I am cross-validating a multinomial classification model using early stopping and I observe that when I provide a custom eval_metric it stops earlier.
See the dummy example above on iris, in which the custom metric is mlogloss. When I calculate the mlogloss through my custom function, it stops earlier (mean cv$best_iteration is 6 over 100 runs). When I use eval_metric = "mlogloss" it stops later (mean cv$best_iteration is 86 over 100 runs). There is no real difference in test performance on the iris dataset (if I train with the "winning" nrounds), but there is a serious difference on my own dataset.
Of course, ultimately this is not what I really want to do. I want to fit a model using the multinomial objective but cross-validate using binary log loss for my class of interest. In that case I get exactly the same behaviour: when I provide a custom eval_metric, it stops very early.
Any suggestions or explanations of what is happening are welcome. Thanks!
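For that binary use case, a custom metric would extract the predicted probability of the class of interest from the multi:softprob output and score it against a binarised label. A minimal sketch, assuming 3 classes and class 1 (0-indexed) as the class of interest; the function name and the clipping constant are illustrative, not part of the original report:

# binary log loss for one class of interest, computed from multi:softprob output
binary_logloss_class1 <- function(preds, dtrain) {
  labels <- getinfo(dtrain, "label")
  preds <- matrix(preds, ncol = 3, byrow = TRUE)   # 3 classes assumed
  p <- pmin(pmax(preds[, 2], 1e-15), 1 - 1e-15)    # P(class 1), clipped away from 0/1
  y <- as.numeric(labels == 1)                     # binarise: class 1 vs rest
  ll <- -mean(y * log(p) + (1 - y) * log(1 - p))
  return(list(metric = "binary_logloss_c1", value = ll))
}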
Session info: