mlr-org / mlr

Machine Learning in R
https://mlr.mlr-org.com

Random Forest - Tuning allows Replace to be tuned, but cannot be set using setHyperPars #675

Closed myloginid closed 8 years ago

myloginid commented 8 years ago

Hi, a small bug in the RF implementation: the RF "replace" parameter can be tuned using makeParamSet, but when we set it using setHyperPars it gives the error below.

Code -

ps = makeParamSet(
  makeDiscreteParam(id = "ntree", values = c(500, 600, 750)),
  makeDiscreteParam(id = "se.method", values = c("bootstrap", "jackknife", "noisy.bootstrap")),
  makeDiscreteParam(id = "replace", values = c(0, 1)),
  makeDiscreteParam(id = "nodesize", values = c(30, 60, 90))
)
rdesc = makeResampleDesc("CV", iters = ccvv)
ctrl = makeTuneControlRandom(maxit = mit)
res = tuneParams(lrn, task = trainTask, resampling = rdesc, par.set = ps, control = ctrl)
lrn = setHyperPars(lrn, par.vals = res$x)

Error and Traceback -

Starting parallelization in mode=socket with cpus=3.
Exporting objects to slaves for mode socket: SQWK,SQWKfun
[Tune] Started tuning learner regr.randomForest.preproc for parameter set:
              Type len Def                              Constr Req Tunable Trafo
ntree     discrete   -   -                         500,600,750   -    TRUE     -
se.method discrete   -   - bootstrap,jackknife,noisy.bootstrap   -    TRUE     -
replace   discrete   -   -                                 0,1   -    TRUE     -
nodesize  discrete   -   -                            30,60,90   -    TRUE     -
With control class: TuneControlRandom
Imputation value: -0
Exporting objects to slaves for mode socket: .mlr.slave.options
Mapping in parallel: mode = socket; cpus = 3; elements = 5.
[Tune] Result: ntree=750; se.method=noisy.bootstrap; replace=1; nodesize=90 : SQWK.test.mean= NA
Error in setHyperPars2.Learner(learner$next.learner, par.vals = par.vals[i]) :
  1 is not feasible for parameter 'replace'!

traceback()
7: stop(msg)
6: setHyperPars2.Learner(learner$next.learner, par.vals = par.vals[i])
5: setHyperPars2(learner$next.learner, par.vals = par.vals[i])
4: setHyperPars2.BaseWrapper(learner, insert(par.vals, args))
3: setHyperPars2(learner, insert(par.vals, args))
2: setHyperPars(lrn, par.vals = res$x) at #47
1: mlr.randomForest(train, test)

Thanks, Manish

larskotthoff commented 8 years ago

replace is a Boolean parameter, so you can only set it to TRUE and FALSE, not 0 and 1.
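
As a minimal sketch of the difference (assuming mlr and randomForest are installed, and using a plain regr.randomForest learner rather than the wrapped learner from the report), a logical value passes the feasibility check while a numeric 1 is rejected:

```r
library(mlr)

lrn = makeLearner("regr.randomForest")

# A logical value is feasible for the Boolean parameter 'replace'.
lrn.ok = setHyperPars(lrn, replace = TRUE)
print(getHyperPars(lrn.ok)$replace)

# A numeric 1 is not; capture the error message instead of stopping.
msg = tryCatch(
  {
    setHyperPars(lrn, replace = 1)
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

The names lrn.ok and msg are only illustrative; the tryCatch wrapper is there so the snippet runs to the end and shows the feasibility error as a string.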

berndbischl commented 8 years ago

A few comments to help you

1) First of all, the tuning itself does not work with the wrong "replace" parameter. I get this:

lrn = makeLearner("regr.randomForest")

ps = makeParamSet(
  makeDiscreteParam(id = "ntree", values = c ( 500, 600, 750 )),
  # makeLogicalParam(id = "replace"),
  makeDiscreteParam(id = "replace", values = c(0, 1)),
  makeDiscreteParam(id = "nodesize", values = c ( 30,60,90 ))
)
ctrl = makeTuneControlRandom(maxit = 3L)
res = tuneParams(lrn, task = bh.task, resampling = hout, par.set = ps, control = ctrl)
lrn = setHyperPars(lrn, par.vals = res$x)
[Tune-y] 3: mse.test.mean=18.8; time: 0.0 min; memory: 84Mb use, 147Mb max
[Tune] Result: ntree=750; replace=TRUE; nodesize=60 : mse.test.mean=18.8
> source("test.R")
Loading mlr
[Tune] Started tuning learner regr.randomForest for parameter set:
             Type len Def      Constr Req Tunable Trafo
ntree    discrete   -   - 500,600,750   -    TRUE     -
replace  discrete   -   -         0,1   -    TRUE     -
nodesize discrete   -   -    30,60,90   -    TRUE     -
With control class: TuneControlRandom
Imputation value: Inf
[Tune-x] Setting hyperpars failed: Error in setHyperPars2.Learner(learner, insert(par.vals, args)) : 
  0 is not feasible for parameter 'replace'!

[Tune-x] 1: ntree=750; replace=0; nodesize=30
[Tune-y] 1: mse.test.mean=  NA; time: 0.0 min; memory: 84Mb use, 147Mb max
[Tune-x] Setting hyperpars failed: Error in setHyperPars2.Learner(learner, insert(par.vals, args)) : 
  1 is not feasible for parameter 'replace'!

[Tune-x] 2: ntree=600; replace=1; nodesize=60
[Tune-y] 2: mse.test.mean=  NA; time: 0.0 min; memory: 84Mb use, 147Mb max
[Tune-x] Setting hyperpars failed: Error in setHyperPars2.Learner(learner, insert(par.vals, args)) : 
  0 is not feasible for parameter 'replace'!

[Tune-x] 3: ntree=750; replace=0; nodesize=30
[Tune-y] 3: mse.test.mean=  NA; time: 0.0 min; memory: 84Mb use, 147Mb max
[Tune] Result: ntree=600; replace=1; nodesize=60 : mse.test.mean=  NA
Error in setHyperPars2.Learner(learner, insert(par.vals, args)) : 
  1 is not feasible for parameter 'replace'!

2) From your code I cannot guess what you want to do exactly, but are you really sure that "se.method" can influence your results? Do you really use the SE estimation at all?

3) For random search to work, you do not need to discretize everything. You do know this, right?

4) This then runs:

lrn = makeLearner("regr.randomForest")

ps = makeParamSet(
  makeDiscreteParam(id = "ntree", values = c ( 500, 600, 750 )),
  makeLogicalParam(id = "replace"),
  makeDiscreteParam(id = "nodesize", values = c ( 30,60,90 ))
)
ctrl = makeTuneControlRandom(maxit = 3L)
res = tuneParams(lrn, task = bh.task, resampling = hout, par.set = ps, control = ctrl)
lrn = setHyperPars(lrn, par.vals = res$x)

5) (Note how I provided a fully reproducible example....)

I will close this here, as this is not a bug. But feel free to ask if you need more info.
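
Regarding point 3, here is a rough sketch of what a non-discretized search space could look like. The integer bounds are an assumption for illustration (they simply span the discrete values used above); everything else mirrors the example in point 4:

```r
library(mlr)

lrn = makeLearner("regr.randomForest")

# Integer ranges instead of discretized values; the bounds are illustrative
# and simply span the discrete values used earlier in this thread.
ps = makeParamSet(
  makeIntegerParam(id = "ntree", lower = 500L, upper = 750L),
  makeLogicalParam(id = "replace"),
  makeIntegerParam(id = "nodesize", lower = 30L, upper = 90L)
)

ctrl = makeTuneControlRandom(maxit = 3L)
res = tuneParams(lrn, task = bh.task, resampling = hout, par.set = ps, control = ctrl)

# res$x holds values of the declared types, so it can be passed on directly.
lrn = setHyperPars(lrn, par.vals = res$x)
print(getHyperPars(lrn))
```

Random search then samples ntree and nodesize uniformly from the integer ranges and replace from TRUE/FALSE, so the result can be passed to setHyperPars without any conversion.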

myloginid commented 8 years ago

Hey..

1 - I understood the mistake of using a discrete param instead of a logical param for replace, which is a logical value.
makeDiscreteParam(id = "replace", values = c(0, 1)), --- Wrong
makeLogicalParam(id = "replace"), --- Correct

2 - "For random search to work, you do not need to discretize everything. You do know this, right?" -- I did not know this. I was using either ranges or fixed values for trees. Now I know that I can keep it empty too.

3 - se.method - I only know that this does resampling. However, the randomForest documentation does not list it as a parameter. I didn't check the mlr code base to see how it works. I simply kept it to see if it gave me a better model with a different resampling. (I am guilty of ignorance.)

Thanks for the help..