ja-thomas / autoxgboost

autoxgboost - Automatic tuning and fitting of xgboost

chol.default() error #59

Open ck37 opened 5 years ago

ck37 commented 5 years ago

Hello,

Thanks for making autoxgboost; it seems like a great effort. I wanted to report an error I just received while testing it out:

Error in chol.default(R) : 
  the leading minor of order 2 is not positive definite

This occurred after a fair number of iterations had run:

[mbo] 406: eta=0.0605; gamma=0.26; max_depth=3; colsample_bytree=0.627; colsample_bylevel=0.695; lambda=803; alpha=0.0483; subsample=0.553 : y = 6.75 : 2.0 secs : infill_cb
[mbo] 407: eta=0.0235; gamma=5.63; max_depth=15; colsample_bytree=0.502; colsample_bylevel=0.841; lambda=741; alpha=0.000985; subsample=0.567 : y = 6.75 : 17.7 secs : infill_cb

Here is the code that produced the error:

library(mlr)         # makeRegrTask, makeLearner
library(mlrMBO)      # makeMBOControl, setMBOControlTermination
library(autoxgboost)

# train_x / train_y: the training features and outcome (not shown).
mlr_data = data.frame(y = train_y, train_x)
task = makeRegrTask(data = mlr_data, target = "y")

ctrl = makeMBOControl()
ctrl = setMBOControlTermination(ctrl, iters = 3000L)

system.time({
  (res = autoxgboost(task, control = ctrl, tune.threshold = FALSE,
                     time.budget = 3600L * 3,  # 3 hours
                     early.stopping.fraction = 0.4))
})
res

This was on Windows using R 3.5.1 and xgboost 0.71.2. My dataset has 4,861 observations and 104 variables, and the outcome is on a 1-9 scale. A few of the outcome values are rare, so I suspect that may have caused the error.

Thanks, Chris

ja-thomas commented 5 years ago

Hi, this error occurs in the training of the Gaussian process surrogate model. It can happen when a large number of evaluated configurations yield the same result: the GP's covariance matrix then becomes numerically singular, so its Cholesky factorization (the chol() call in the error) fails.
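A minimal illustration (not from this thread) of the underlying numerical failure: a covariance matrix built from effectively identical observations is rank-deficient, and chol() raises exactly this error.

K = matrix(1, nrow = 3, ncol = 3)  # degenerate covariance: all points look identical
chol(K)
# Error in chol.default(K): the leading minor of order 2 is not positive definite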

Can you

1) Check how much the evaluated y's vary in the optimization trace?
2) Try to rerun autoxgboost with mbo.learner = makeLearner("regr.randomForest", predict.type = "se")?

A sketch of both steps is below.
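A hedged sketch of both suggestions. It assumes a run that completes (for the aborted run, the y values can be read straight off the printed [mbo] trace) and that the result exposes the mlrMBO optimization path as res$optim.result$opt.path; that slot name is an assumption and may differ by version. getOptPathY() is from ParamHelpers.

library(ParamHelpers)

# 1) Inspect how much the evaluated y's vary across the optimization trace.
#    res$optim.result$opt.path is assumed; check str(res) for the actual slot.
ys = getOptPathY(res$optim.result$opt.path)
summary(ys)
sum(duplicated(ys))  # many duplicates point to a degenerate GP fit

# 2) Swap the Gaussian process surrogate for a random forest, which
#    tolerates duplicated y values.
rf_surrogate = makeLearner("regr.randomForest", predict.type = "se")
res = autoxgboost(task, control = ctrl, tune.threshold = FALSE,
                  mbo.learner = rf_surrogate)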