AnotherSamWilson / ParBayesianOptimization

Parallelizable Bayesian Optimization in R

Non-reproducible issue #7

Closed fck1984 closed 4 years ago

fck1984 commented 4 years ago

Hi Sam, nice work! I really appreciate it. But I have a reproducibility problem: I ran my code twice with the same seed, and the outputs are slightly different. Is the output always expected to differ between runs, or did I do something wrong? The code is as follows:

lgb_cv_bayes <- function(feature_fraction, lambda_l1, lambda_l2) {
  cv <- lgb.cv(
    nrounds = 12000,
    learning_rate = 0.1,
    verbose = -1,
    feature_fraction = feature_fraction,
    lambda_l1 = lambda_l1,
    lambda_l2 = lambda_l2,
    objective = "regression",
    early_stopping_rounds = 10,
    metric = "mse",
    max_depth = -1,
    num_leaves = 15,
    force_col_wise = TRUE,
    folds = folds,
    data = dtrain,
    reset_data = TRUE,
    seed = 1000
  )
  return(list(Score = -min(unlist(cv$record_evals$valid$l2$eval)^2 + unlist(cv$record_evals$valid$l2$eval_err)^2)))
}
set.seed(1000)
OPT_Res <- bayesOpt(FUN=lgb_cv_bayes, 
                    bounds = list(feature_fraction = c(0.1, 1),
                                  lambda_l1=c(0,600),
                                  lambda_l2=c(0,600)),
                    initPoints = 4, 
                    iters.n = 5,
                    kappa = 1.96, 
                    acq = "ucb", 
                    eps = 0.0,
                    verbose = 1)

rm(.Random.seed)

and the output:

   Epoch Iteration feature_fraction lambda_l1 lambda_l2 gpUtility acqOptimum
1:     0         1        0.7311136  251.4830 468.33184        NA      FALSE
2:     0         2        0.8401650  367.7844 385.12763        NA      FALSE
3:     0         3        0.1161490  123.6188 157.45649        NA      FALSE
4:     0         4        0.4483687  463.6152  84.24306        NA      FALSE
5:     1         5        0.8842224    0.0000 600.00000 0.5508117       TRUE
6:     2         6        1.0000000    0.0000   0.00000 0.3758503       TRUE
7:     3         7        1.0000000  600.0000 600.00000 0.4037062       TRUE
8:     4         8        0.1000000  600.0000 322.75344 0.3586758       TRUE
9:     5         9        1.0000000    0.0000 465.22698 0.3226101       TRUE
   inBounds Elapsed      Score errorMessage
1:     TRUE 159.439 -0.9793565           NA
2:     TRUE 270.492 -0.9793492           NA
3:     TRUE  62.071 -0.9794464           NA
4:     TRUE 139.777 -0.9800693           NA
5:     TRUE 154.130 -0.9787648           NA
6:     TRUE  71.743 -0.9830193           NA
7:     TRUE 283.040 -0.9801484           NA
8:     TRUE  76.871 -0.9818072           NA
9:     TRUE 116.147 -0.9791738           NA

   Epoch Iteration feature_fraction lambda_l1 lambda_l2 gpUtility acqOptimum
1:     0         1        0.7311136  251.4830 468.33184        NA      FALSE
2:     0         2        0.8401650  367.7844 385.12763        NA      FALSE
3:     0         3        0.1161490  123.6188 157.45649        NA      FALSE
4:     0         4        0.4483687  463.6152  84.24306        NA      FALSE
5:     1         5        0.8842242    0.0000 600.00000 0.5507994       TRUE
6:     2         6        1.0000000    0.0000   0.00000 0.3758331       TRUE
7:     3         7        1.0000000  600.0000 600.00000 0.4037421       TRUE
8:     4         8        0.1000000  600.0000 322.75015 0.3587458       TRUE
9:     5         9        1.0000000    0.0000 465.21572 0.3226112       TRUE
   inBounds Elapsed      Score errorMessage
1:     TRUE 156.162 -0.9793565           NA
2:     TRUE 272.751 -0.9793492           NA
3:     TRUE  59.700 -0.9794464           NA
4:     TRUE 145.370 -0.9800692           NA
5:     TRUE 161.409 -0.9787650           NA
6:     TRUE  73.514 -0.9830193           NA
7:     TRUE 284.599 -0.9801488           NA
8:     TRUE  75.613 -0.9818066           NA
9:     TRUE 118.860 -0.9792093           NA
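
In the two tables above, row 4 shows the same printed inputs in both runs but scores of -0.9800693 and -0.9800692, so part of the difference may come from the scoring function rather than from the optimizer. A minimal sketch for checking this in isolation, assuming a hypothetical helper `check_repeatable` (not part of ParBayesianOptimization) and a toy scoring function standing in for `lgb_cv_bayes`:

```r
# Hypothetical helper: call a scoring function n times with the same
# parameters and report whether every returned Score is exactly equal.
check_repeatable <- function(FUN, params, n = 3) {
  scores <- vapply(
    seq_len(n),
    function(i) do.call(FUN, params)$Score,
    numeric(1)
  )
  list(scores = scores, repeatable = all(scores == scores[1]))
}

# Toy stand-in for lgb_cv_bayes (an assumption for illustration only);
# it is deterministic by construction.
toy_score <- function(feature_fraction, lambda_l1, lambda_l2) {
  list(Score = -(feature_fraction - 0.5)^2 - 1e-6 * (lambda_l1 + lambda_l2))
}

res <- check_repeatable(
  toy_score,
  list(feature_fraction = 0.4483687, lambda_l1 = 463.6152, lambda_l2 = 84.24306)
)
res$repeatable
```

Passing `lgb_cv_bayes` instead of `toy_score` would show whether repeated calls with fixed parameters already disagree before `bayesOpt` is involved.
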

AnotherSamWilson commented 4 years ago

Hmmm I don't get any issues when setting the seed like this. Just to be sure, you are setting the seed the same way before each run, correct?
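
As a small illustration of why the seed has to be set before each run, here is a sketch using only base R's random number generator; the variable names are arbitrary and nothing here is specific to ParBayesianOptimization:

```r
# Re-seeding before each draw reproduces the same random numbers.
set.seed(1000)
first_run <- runif(3)

set.seed(1000)
second_run <- runif(3)

# Drawing again without re-seeding continues the stream instead.
third_run <- runif(3)

identical(first_run, second_run)  # same seed, same draws
identical(first_run, third_run)   # no re-seed, different draws
```

The first comparison is TRUE and the second is FALSE, which is the behaviour the question above is checking for.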

fck1984 commented 4 years ago

Yes, before every run.

AnotherSamWilson commented 4 years ago

Can you show the output from sessionInfo()? Also, can you confirm that the following returns TRUE:

library(ParBayesianOptimization)

sf <- function(x) 100 - x^2
FUN <- function(x) {
  return(list(Score = sf(x)))
}
bounds = list(
  x = c(-2,2)
)

set.seed(1991)
optObj <- bayesOpt(
  FUN
  , bounds
  , initPoints = 4
  , iters.n = 2
  , verbose = 0
)

set.seed(1991)
optObj2 <- bayesOpt(
  FUN
  , bounds
  , initPoints = 4
  , iters.n = 2
  , verbose = 0
)

identical(optObj$scoreSummary,optObj2$scoreSummary)

fck1984 commented 4 years ago

identical(optObj$scoreSummary,optObj2$scoreSummary)
[1] FALSE

fck1984 commented 4 years ago

The output seems identical to me:

> optObj$scoreSummary
   Epoch Iteration          x gpUtility acqOptimum inBounds Elapsed    Score errorMessage
1:     0         1  0.6633560        NA      FALSE     TRUE   0.001 99.55996           NA
2:     0         2  1.3760146        NA      FALSE     TRUE   0.005 98.10658           NA
3:     0         3 -0.9862286        NA      FALSE     TRUE   0.000 99.02735           NA
4:     0         4 -1.4074789        NA      FALSE     TRUE   0.000 98.01900           NA
5:     1         5  0.6520079 0.6474753       TRUE     TRUE   0.000 99.57489           NA
6:     2         6  0.2285763 0.6273672       TRUE     TRUE   0.000 99.94775           NA

> optObj2$scoreSummary
   Epoch Iteration          x gpUtility acqOptimum inBounds Elapsed    Score errorMessage
1:     0         1  0.6633560        NA      FALSE     TRUE   0.000 99.55996           NA
2:     0         2  1.3760146        NA      FALSE     TRUE   0.001 98.10658           NA
3:     0         3 -0.9862286        NA      FALSE     TRUE   0.000 99.02735           NA
4:     0         4 -1.4074789        NA      FALSE     TRUE   0.000 98.01900           NA
5:     1         5  0.6520079 0.6474753       TRUE     TRUE   0.000 99.57489           NA
6:     2         6  0.2285763 0.6273672       TRUE     TRUE   0.000 99.94775           NA
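
In the two printouts above, the only visible difference is the Elapsed column (0.001 and 0.005 in the first run against 0.000 and 0.001 in the second), and a timing column alone is enough for `identical()` to return FALSE. A minimal sketch of comparing everything except the timing, using plain data frames built from the first three printed rows as stand-ins for the real `scoreSummary` objects:

```r
# Stand-in data copied from the first three rows of the printouts above.
run1 <- data.frame(
  Iteration = 1:3,
  x         = c(0.6633560, 1.3760146, -0.9862286),
  Elapsed   = c(0.001, 0.005, 0.000),
  Score     = c(99.55996, 98.10658, 99.02735)
)
run2 <- run1
run2$Elapsed <- c(0.000, 0.001, 0.000)  # wall-clock timings differ per run

# Comparing the full tables fails because of Elapsed alone.
full_match <- identical(run1, run2)

# Drop the timing column before comparing.
keep <- setdiff(names(run1), "Elapsed")
match_without_timing <- identical(run1[keep], run2[keep])

full_match
match_without_timing
```

For the real objects, which print like data.table tables, something along the lines of `all.equal(optObj$scoreSummary[, !"Elapsed"], optObj2$scoreSummary[, !"Elapsed"])` should do the same job, assuming `scoreSummary` is a data.table.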