mlr-org / mlr3viz

Visualizations for mlr3
https://mlr3viz.mlr-org.com
GNU Lesser General Public License v3.0

wrong order of x-axis of autoplot when used in mlr3fselect #126

Closed: ayueme closed this issue 2 months ago

ayueme commented 10 months ago

Dear developers: I'm working through the article "Recursive Feature Elimination on the Sonar Data Set". Here's the code:

library(mlr3verse)
library(mlr3fselect)
library(ggplot2)
library(data.table)

task = tsk("sonar")
# RFE: drop one feature per iteration until a single feature remains
optimizer = fs("rfe", n_features = 1, feature_number = 1)

learner = lrn("classif.gbm", distribution = "bernoulli", predict_type = "prob")

instance = fsi(
  task = task,
  learner = learner,
  resampling = rsmp("cv", folds = 6),
  measures = msr("classif.auc"),
  terminator = trm("none"),
  store_models = TRUE)

lgr::get_logger("mlr3")$set_threshold("warn")
lgr::get_logger("bbotk")$set_threshold("warn")

set.seed(123)
optimizer$optimize(instance)

library(viridisLite)
library(mlr3misc)

# one archive row per RFE iteration; n = number of features used in that iteration
data = as.data.table(instance$archive)
data[, n := map_int(importance, length)]

ggplot(data, aes(x = n, y = classif.auc)) +
  geom_line(
    color = viridis(1, begin = 0.5),
    linewidth = 1) +
  geom_point(
    fill = viridis(1, begin = 0.5),
    shape = 21,
    size = 3,
    stroke = 0.5,
    alpha = 0.8) +
  xlab("Number of Features") +
  scale_x_reverse() +
  theme_minimal()

Here's the resulting plot: [image: Snipaste_2023-10-31_17-52-54]

But when I used autoplot(instance, type = "performance"), I got the following plot: [image: Snipaste_2023-10-31_17-53-18]

The order of the x-axis is exactly the opposite. Is something wrong?

my session info:

R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22621)

Matrix products: default

locale:
[1] LC_COLLATE=Chinese (Simplified)_China.utf8 
[2] LC_CTYPE=Chinese (Simplified)_China.utf8   
[3] LC_MONETARY=Chinese (Simplified)_China.utf8
[4] LC_NUMERIC=C                               
[5] LC_TIME=Chinese (Simplified)_China.utf8    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
[1] mlr3misc_0.12.0    viridisLite_0.4.2  data.table_1.14.8 
[4] ggplot2_3.4.2      mlr3fselect_0.11.0 mlr3verse_0.2.8   
[7] mlr3_0.16.1       

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.10             paradox_0.11.1         
 [3] lattice_0.20-45         listenv_0.9.0          
 [5] palmerpenguins_0.1.1    digest_0.6.31          
 [7] utf8_1.2.3              parallelly_1.36.0      
 [9] R6_2.5.1                spacefillr_0.3.2       
[11] backports_1.4.1         mlr3measures_0.5.0     
[13] pillar_1.9.0            rlang_1.1.0            
[15] uuid_1.1-0              rstudioapi_0.14        
[17] Matrix_1.5-1            checkmate_2.2.0        
[19] labeling_0.4.2          splines_4.2.2          
[21] mlr3pipelines_0.4.2     mlr3hyperband_0.4.4    
[23] munsell_0.5.0           compiler_4.2.2         
[25] pkgconfig_2.0.3         gbm_2.1.8.1            
[27] globals_0.16.2          mlr3tuning_0.18.0.9000 
[29] tidyselect_1.2.0        gridExtra_2.3          
[31] tibble_3.2.1            mlr3data_0.6.1         
[33] lgr_0.4.4               mlr3cluster_0.1.6      
[35] mlr3tuningspaces_0.3.3  codetools_0.2-18       
[37] clusterCrit_1.2.8       fansi_1.0.4            
[39] future_1.32.0           crayon_1.5.2           
[41] dplyr_1.1.2             withr_2.5.0            
[43] grid_4.2.2              gtable_0.3.3           
[45] lifecycle_1.0.3         magrittr_2.0.3         
[47] scales_1.2.1            mlr3learners_0.5.6     
[49] future.apply_1.11.0     cli_3.6.0              
[51] farver_2.1.1            mlr3viz_0.6.1          
[53] viridis_0.6.3           mlr3filters_0.7.0      
[55] bbotk_0.7.2             generics_0.1.3         
[57] vctrs_0.6.3             tools_4.2.2            
[59] glue_1.6.2              parallel_4.2.2         
[61] survival_3.4-0          clue_0.3-64            
[63] colorspace_2.1-0        cluster_2.1.4          
[65] mlr3extralearners_0.7.0 mlr3mbo_0.2.1

bblodfon commented 2 months ago

Hi @ayueme,

Sorry for the late answer!

The order of the x-axis is correct; the two plots just show different quantities on it. Your first plot shows the actual number of features. In autoplot, the x-axis shows the batch, and in this case each batch corresponds to one RFE iteration. Since you initialized with fs("rfe", n_features = 1, feature_number = 1) and sonar has 60 features in total, each batch removes one feature until only 1 remains. So batch 0 is all features, batch 1 is 59 features, and so on.
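
To make the mapping between the two x-axes concrete, here is a minimal base-R sketch. It does not use mlr3fselect or mlr3viz and is not how autoplot is implemented; it only reproduces the arithmetic described above. The names n_total, n_final, step and mapping are made up for this illustration, and the iteration counter starts at 0 only to follow the counting used in this reply (the batch numbers stored in the archive may start at a different value).

```r
# Illustration only: relate RFE iterations to the number of remaining features.
n_total = 60  # the sonar task has 60 features
n_final = 1   # mirrors n_features = 1
step    = 1   # mirrors feature_number = 1 (one feature dropped per iteration)

# Feature counts visited by RFE, from the full set down to the final subset
n_features = seq(from = n_total, to = n_final, by = -step)

# One row per iteration; counting from 0 is an assumption (see above)
mapping = data.frame(
  iteration  = seq_along(n_features) - 1L,
  n_features = n_features
)

head(mapping, 3)
tail(mapping, 3)
```

Reading the table from top to bottom corresponds to moving left to right in the autoplot figure, while the manual ggplot shows the same points indexed by the n_features column (with scale_x_reverse() so that larger feature sets appear on the left).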