Open GerardYu opened 1 year ago
So I discovered that the error disappears if I use eval_metric = "mape"
instead of eval_metric = "mae".
This error means that all of the outputs from the tuning function have the same value. This causes singularity issues when trying to train a Gaussian Process.
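One way to check for this before calling bayesOpt() is to evaluate the scoring function at a handful of parameter values and confirm that the scores are finite and not all identical. The sketch below uses two made-up scoring functions (score_ok and score_flat) and a hypothetical helper (scores_vary); none of these names come from the package.

```r
# Hypothetical scoring functions, for illustration only.
score_ok   <- function(x) list(Score = -(x - 2)^2)  # depends on x
score_flat <- function(x) list(Score = 0)           # ignores x entirely

# Hypothetical helper: TRUE if the scores are all finite and not all equal.
scores_vary <- function(FUN, xs) {
  scores <- vapply(xs, function(x) FUN(x)$Score, numeric(1))
  all(is.finite(scores)) && length(unique(scores)) > 1
}

xs <- c(0, 1, 3, 5)
print(scores_vary(score_ok, xs))    # scores differ across xs
print(scores_vary(score_flat, xs))  # every score is the same
```

If the check fails for your own scoring function, the problem is likely in how Score is computed rather than in the optimizer.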
I am running a binary classification problem and getting a similar issue. From my understanding the error is coming from the zeroOneScale function:
# Scale a vector between 0-1
zeroOneScale <- function(vec) {
  r <- max(vec) - min(vec)
  # If the scoring function returned the same results
  # this results in the function a vector of 1s.
  if(r==0) stop("Results from FUN have 0 variance, cannot build GP.")
  vec <- (vec - min(vec))/r
  return(vec)
}
I think this is trying to scale my binary data, which I do not want to happen.
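For what it's worth, the error message refers to "Results from FUN", which suggests the vector being scaled holds the Score values returned by the scoring function, not the training labels. Here is a small sketch of how the quoted helper behaves on two made-up score vectors (the numbers are hypothetical, and the helper is copied here only so the snippet runs on its own):

```r
# Copy of the helper quoted above.
zeroOneScale <- function(vec) {
  r <- max(vec) - min(vec)
  if (r == 0) stop("Results from FUN have 0 variance, cannot build GP.")
  (vec - min(vec)) / r
}

# Hypothetical Score values from four evaluations of FUN.
varying_scores  <- c(0.71, 0.74, 0.69, 0.80)
constant_scores <- rep(0.50, 4)

# Scores that differ are rescaled so the smallest is 0 and the largest is 1.
scaled <- zeroOneScale(varying_scores)
print(scaled)

# Identical scores trigger the stop(); capture the message instead of failing.
res <- tryCatch(zeroOneScale(constant_scores),
                error = function(e) conditionMessage(e))
print(res)
```

So a binary label on its own should not trigger this; what matters is whether every evaluation of the scoring function returns the same Score.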
Similar to OP, this problem goes away when I use 'auc' as my evaluation metric, but my dataset is highly skewed and I want to test with different metrics to see if this affects the tuning.
Here's my code for reference:
obj_func <- function(eta, max_depth, min_child_weight, subsample, lambda, alpha, nfolds) {
  dtrain <- xgb.DMatrix(data = as.matrix(train_data[, -3]), label = train_data$hidden_hypoxemia, missing = NA)

  param <- list(
    eta = eta,
    max_depth = max_depth,
    min_child_weight = min_child_weight,
    subsample = subsample,
    lambda = lambda,
    alpha = alpha,
    # Tree model
    booster = "gbtree",
    objective = "binary:logistic",
    eval_metric = "logloss"
  )

  xgbcv <- xgb.cv(params = param,
                  data = dtrain,
                  nround = 50,
                  nfold = nfolds,
                  prediction = TRUE,
                  early_stopping_rounds = 10,
                  verbose = 2,
                  maximize = TRUE,
                  stratified = TRUE
  )

  lst <- list(
    # First argument must be named as "Score"
    # Function finds maxima so inverting the output
    Score = suppressWarnings(min(xgbcv$evaluation_log$test_auc_mean)),
    # Get number of trees for the best performing model
    nrounds = xgbcv$best_iteration
  )

  return(lst)
}

param_bounds <- list(eta = c(0.001, 0.15),
                     max_depth = c(1L, 10L),
                     min_child_weight = c(1, 50),
                     subsample = c(0.1, 1),
                     lambda = c(1, 10),
                     alpha = c(1,10),
                     nfold = c(3L, 10L))

bayes_out <- bayesOpt(FUN = obj_func, bounds = param_bounds, initPoints = length(param_bounds) + 2, iters.n = 3)
Any help with this would be greatly appreciated.
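One thing that may be worth checking in the code above: the parameters set eval_metric = "logloss", but Score is read from test_auc_mean. As far as I know, xgb.cv() names the evaluation_log columns after the metric in use, so with log loss I would expect a test_logloss_mean column and no test_auc_mean column. If that is the case here, the lookup returns NULL, min(NULL) is Inf, and suppressWarnings() hides the warning, so every call would return the same non-finite Score. The snippet below reproduces that mechanism in base R with a made-up stand-in for the evaluation log (the values are invented and no xgboost call is involved):

```r
# Stand-in for xgbcv$evaluation_log when eval_metric = "logloss".
# The numbers are made up for illustration.
evaluation_log <- data.frame(
  iter               = 1:3,
  train_logloss_mean = c(0.60, 0.52, 0.47),
  test_logloss_mean  = c(0.62, 0.55, 0.51)
)

# There is no AUC column, so `$` returns NULL.
missing_col <- evaluation_log$test_auc_mean
print(is.null(missing_col))

# min(NULL) is Inf; suppressWarnings() hides the only hint that
# something went wrong.
score_as_posted <- suppressWarnings(min(missing_col))
print(score_as_posted)

# Reading the column that matches the metric instead. The sign is flipped
# on the assumption that the optimizer maximizes Score while log loss
# should be minimized.
score_matching <- -min(evaluation_log$test_logloss_mean)
print(score_matching)
```

If this is what is happening, printing names(xgbcv$evaluation_log) inside the scoring function should confirm which columns are actually available. It would also explain why switching to 'auc' makes the error go away.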
Hi, I got the above error when using the bayesOpt() function. I've checked that both the data and labels are valid.
Here's the full output:
And here's the code that I've run: