mlr-org / mlr

Machine Learning in R
https://mlr.mlr-org.com

Unable to makeModelMultiplexer with makeRemoveConstantFeaturesWrapper #2544

Closed idavydov closed 5 years ago

idavydov commented 5 years ago

I'm trying to use makeModelMultiplexer together with makeRemoveConstantFeaturesWrapper, but I get unexpected errors. I tried two different approaches:

library(mlr)
#> Loading required package: ParamHelpers

# first approach
base.learners = list(
  makeLearner('classif.randomForestSRC')
)
lrn <- makeModelMultiplexer(base.learners)
lrn <- makeRemoveConstantFeaturesWrapper(lrn)

ps <- makeModelMultiplexerParamSet(lrn,
  makeDiscreteParam('ntree', values=c(100, 500))
)
#> Error in makeModelMultiplexerParamSet(lrn, makeDiscreteParam("ntree", : Assertion on 'multiplexer' failed: Must inherit from class 'ModelMultiplexer', but has classes 'RemoveConstantFeaturesWrapper','PreprocWrapper','BaseWrapper','Learner'.

# second approach
base.learners = list(
  makeRemoveConstantFeaturesWrapper(makeLearner('classif.randomForestSRC'))
)
lrn <- makeModelMultiplexer(base.learners)
ps <- makeModelMultiplexerParamSet(lrn,
  makeDiscreteParam('ntree', values=c(100, 500))
)
#> Error in makeModelMultiplexerParamSet(lrn, makeDiscreteParam("ntree", : No param of id 'ntree' in base learner 'classif.randomForestSRC.preproc'!

# this works (but no wrapper!)
base.learners = list(
  makeLearner('classif.randomForestSRC')
)
lrn <- makeModelMultiplexer(base.learners)
ps <- makeModelMultiplexerParamSet(lrn,
  makeDiscreteParam('ntree', values=c(100, 500))
)

ctrl <- makeTuneControlGrid()

r <- makeResampleDesc("CV", iters = 3)

tuneParams(lrn, task=sonar.task, resampling=r,
           par.set = ps, control = ctrl)
#> [Tune] Started tuning learner ModelMultiplexer for parameter set:
#>                                   Type len Def                  Constr Req
#> selected.learner              discrete   -   - classif.randomForestSRC   -
#> classif.randomForestSRC.ntree discrete   -   -                 100,500   Y
#>                               Tunable Trafo
#> selected.learner                 TRUE     -
#> classif.randomForestSRC.ntree    TRUE     -
#> With control class: TuneControlGrid
#> Imputation value: 1
#> [Tune-x] 1: selected.learner=classif.rand...; ntree=100
#> [Tune-y] 1: mmce.test.mean=0.1873706; time: 0.0 min
#> [Tune-x] 2: selected.learner=classif.rand...; ntree=500
#> [Tune-y] 2: mmce.test.mean=0.1680469; time: 0.0 min
#> [Tune] Result: selected.learner=classif.rand...; classif.randomForestSRC.ntree=500 : mmce.test.mean=0.1680469
#> Tune result:
#> Op. pars: selected.learner=classif.rand...; classif.randomForestSRC.ntree=500
#> mmce.test.mean=0.1680469

sessionInfo()
#> R version 3.5.1 (2018-07-02)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: CentOS Linux 7 (Core)
#> 
#> Matrix products: default
#> BLAS/LAPACK: /pstore/apps/OpenBLAS/0.2.13-GCC-4.8.4-LAPACK-3.5.0/lib/libopenblas_prescottp-r0.2.13.so
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] mlr_2.13          ParamHelpers_1.12
#> 
#> loaded via a namespace (and not attached):
#>  [1] parallelMap_1.3       Rcpp_1.0.0            compiler_3.5.1       
#>  [4] pillar_1.3.1          highr_0.7             plyr_1.8.4           
#>  [7] bindr_0.1.1           tools_3.5.1           digest_0.6.18        
#> [10] lattice_0.20-38       evaluate_0.12         tibble_2.0.1         
#> [13] gtable_0.2.0          checkmate_1.9.1       pkgconfig_2.0.2      
#> [16] rlang_0.3.1           Matrix_1.2-15         fastmatch_1.1-0      
#> [19] parallel_3.5.1        yaml_2.2.0            xfun_0.4             
#> [22] bindrcpp_0.2.2        stringr_1.3.1         dplyr_0.7.8          
#> [25] knitr_1.21            randomForestSRC_2.8.0 rprojroot_1.3-2      
#> [28] grid_3.5.1            tidyselect_0.2.5      glue_1.3.0           
#> [31] data.table_1.12.0     R6_2.3.0              XML_3.98-1.13        
#> [34] survival_2.42-6       rmarkdown_1.10        ggplot2_3.1.0        
#> [37] purrr_0.3.0           magrittr_1.5          splines_3.5.1        
#> [40] backports_1.1.3       scales_1.0.0          BBmisc_1.11          
#> [43] htmltools_0.3.6       assertthat_0.2.0      colorspace_1.4-0     
#> [46] stringi_1.2.4         lazyeval_0.2.1        munsell_0.5.0        
#> [49] crayon_1.3.4

Created on 2019-02-11 by the reprex package (v0.2.0).

Bug report

mb706 commented 5 years ago

The problem seems to be makeModelMultiplexerParamSet, which also has some other problems. There is no magic happening inside it; in theory you could construct the ParamSet manually (but you would have to take care to set the right requires conditions). Another solution is to construct the ParamSet before adding any more wrappers to the multiplexer:

lrn <- makeModelMultiplexer(base.learners)
ps <- makeModelMultiplexerParamSet(lrn,
  makeDiscreteParam('ntree', values=c(100, 500))
)
lrn <- makeRemoveConstantFeaturesWrapper(lrn)
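
For the manual route mentioned above, a minimal sketch of what such a ParamSet might look like follows. It uses only ParamHelpers. The parameter names (selected.learner and the classif.randomForestSRC. prefix) are taken from the tuning output in the original report; the exact form of the requires condition is an assumption about what makeModelMultiplexerParamSet generates, not something confirmed in this thread.

```r
library(ParamHelpers)

# Id of the base learner inside the multiplexer (as shown in the tuning output).
learner.id <- "classif.randomForestSRC"

ps <- makeParamSet(
  # Selects which base learner is active.
  makeDiscreteParam("selected.learner", values = learner.id),
  # Base-learner parameter: prefixed with the learner id and only active
  # when that learner is selected (assumed form of the requires condition).
  makeDiscreteParam(
    paste(learner.id, "ntree", sep = "."),
    values = c(100, 500),
    requires = substitute(selected.learner == id, list(id = learner.id))
  )
)

print(ps)
```

If this matches what the helper produces, ps could presumably be passed as par.set to tuneParams together with the wrapped multiplexer. That step has not been verified against mlr 2.13 here.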