Open · ahmaurya opened this issue 8 years ago
I'm guessing this happens during the "Bayesian Optimization" stage but not during the "Random Sampling" stage. Every time the algorithm thinks a point in your hyperparameter space maximizes the utility function, it will keep choosing that same point in the next round. My implementation removes duplicated points so that the Gaussian process can run normally. A quick fix would be to run one more round of random sampling when this issue occurs (see the sketch below), but that would not be very elegant.
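A minimal sketch of that fallback, assuming a `history` data frame of previously evaluated points and a named `bounds` list (both names are hypothetical, not from the package): if the proposed point duplicates a past one, draw a uniform random point from the box instead.

```r
# Sketch: fall back to random sampling when the acquisition maximizer
# proposes an already-evaluated point.
# `history`: data.frame of past points; `bounds`: named list of c(lower, upper).
propose_or_random <- function(next_point, history, bounds) {
  already_tried <- any(apply(history, 1, function(row) {
    all(abs(as.numeric(row) - as.numeric(next_point)) < 1e-8)
  }))
  if (already_tried) {
    # Replace the duplicate with a single uniform draw from the bounds.
    next_point <- sapply(bounds, function(b) runif(1, b[1], b[2]))
  }
  next_point
}
```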
It would be great if you could share any solution for this. For now, you might try the Matern 5/2 kernel or a more aggressive parameter for your utility function. Good luck on Kaggle.
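If I recall the rBayesianOptimization interface correctly, both of those knobs can be set directly in the call; the objective below is a toy stand-in for a real training function.

```r
library(rBayesianOptimization)

# Toy objective: the list(Score = ..., Pred = ...) return shape is
# what the package expects from the function being optimized.
toy_fun <- function(x, y) {
  list(Score = -((x - 2)^2 + (y + 1)^2), Pred = 0)
}

opt <- BayesianOptimization(
  toy_fun,
  bounds = list(x = c(-5, 5), y = c(-5, 5)),
  init_points = 10,
  n_iter = 20,
  acq = "ucb",
  kappa = 5,                               # more exploration than the default 2.576
  kernel = list(type = "matern", nu = 5/2) # Matern 5/2 instead of the default kernel
)
```

A larger `kappa` makes the UCB acquisition favor uncertain regions, which by itself reduces how often the same point is re-proposed.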
Perhaps you could delete from Mat_optim the parameter combinations that are already in DT_history, just before the line: argmax <- as.numeric(Mat_optim[which.min(Negetive_Utility), DT_bounds[, Parameter], with = FALSE]) i.e. before taking the minimum. Another solution would be to randomize the rounding of integer parameters when this happens: instead of rounding to the nearest integer, randomly round with ceiling or floor. Both ideas are sketched below.
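A hedged sketch of both ideas, using data.table as the package does. Mat_optim, DT_history, and DT_bounds stand for the package-internal objects quoted above; I'm assuming Mat_optim is a data.table and Negetive_Utility a vector aligned with its rows, as the quoted line suggests.

```r
library(data.table)

# (1) Anti-join: drop candidate rows whose parameter combination
# already appears in the evaluation history, before taking the argmax.
param_cols <- DT_bounds[, Parameter]
dup_idx <- Mat_optim[DT_history, on = param_cols, which = TRUE, nomatch = 0]
if (length(dup_idx) > 0) {
  Mat_optim <- Mat_optim[-dup_idx]
  Negetive_Utility <- Negetive_Utility[-dup_idx]  # keep the utility vector aligned
}

# (2) Randomized rounding for integer parameters: round up or down with
# probability given by the fractional part, so near-identical proposals
# don't always collapse to the same integer.
random_round <- function(x) {
  floor(x) + rbinom(length(x), size = 1, prob = x - floor(x))
}
```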
I have the same problem. Which package did you all switch to?
Hi Yachen,
I used the Bayesian Optimization package for optimizing the hyperparameters in a Kaggle contest. I noticed that the same hyperparameters can be sampled repeatedly, which is a waste of compute on really large datasets. Perhaps it would be worthwhile to check whether a set of hyperparameters has already been sampled and tried before evaluating it? Thanks!
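Until the package handles this itself, one user-side workaround is to memoize the objective so that a duplicated proposal returns the cached score instead of re-training. A minimal sketch (the wrapper and the cache key scheme are my own, not part of the package):

```r
# Memoize an expensive objective: duplicated hyperparameter proposals
# hit the cache instead of triggering another training run.
make_cached_fun <- function(expensive_fun) {
  cache <- new.env(parent = emptyenv())
  function(...) {
    # Build a string key from the (rounded) hyperparameter values.
    key <- paste(signif(unlist(list(...)), 12), collapse = "_")
    if (!is.null(cache[[key]])) return(cache[[key]])
    result <- expensive_fun(...)
    cache[[key]] <- result
    result
  }
}

# Usage: wrap your training function once, then pass the wrapper
# to the optimizer as usual.
# cached_fun <- make_cached_fun(my_train_and_score)
```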