Initially, we thought it's a threshold issue, but that's not always the case as we get close to 100%.
possible culprit
We measure the significance of the improvement in error rate or correlation to decide the optimality. It ignores the bounded (truncated?) nature of the performance criteria (e.g. correlation is b/w [-1,1]).
possible solution
Adjust the measure based on a truncated distribution, (or transform current measures to take them into an unbounded space?)
many times users have complained that the chosen keepX is too strict. Here's an example for
tune.spca
for component 2:Created on 2020-10-02 by the reprex package (v0.3.0)
Session info
``` r devtools::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.0.2 (2020-06-22) #> os macOS Catalina 10.15 #> system x86_64, darwin17.0 #> ui X11 #> language (EN) #> collate en_AU.UTF-8 #> ctype en_AU.UTF-8 #> tz Australia/Melbourne #> date 2020-10-02 #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date lib source #> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.0) #> backports 1.1.10 2020-09-15 [1] CRAN (R 4.0.2) #> BiocParallel 1.23.2 2020-07-06 [1] Bioconductor #> callr 3.4.4 2020-09-07 [1] CRAN (R 4.0.2) #> cli 2.0.2 2020-02-28 [1] CRAN (R 4.0.0) #> colorspace 1.4-1 2019-03-18 [1] CRAN (R 4.0.0) #> corpcor 1.6.9 2017-04-01 [1] CRAN (R 4.0.0) #> crayon 1.3.4 2017-09-16 [1] CRAN (R 4.0.0) #> curl 4.3 2019-12-02 [1] CRAN (R 4.0.0) #> desc 1.2.0 2018-05-01 [1] CRAN (R 4.0.0) #> devtools 2.3.2 2020-09-18 [1] CRAN (R 4.0.2) #> digest 0.6.25 2020-02-23 [1] CRAN (R 4.0.0) #> dplyr 1.0.2 2020-08-18 [1] CRAN (R 4.0.2) #> ellipse 0.4.2 2020-05-27 [1] CRAN (R 4.0.2) #> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.0) #> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.0) #> fansi 0.4.1 2020-01-08 [1] CRAN (R 4.0.0) #> farver 2.0.3 2020-01-16 [1] CRAN (R 4.0.0) #> fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.2) #> generics 0.0.2 2018-11-29 [1] CRAN (R 4.0.0) #> ggplot2 * 3.3.2 2020-06-19 [1] CRAN (R 4.0.2) #> ggrepel 0.8.2 2020-03-08 [1] CRAN (R 4.0.0) #> glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.2) #> gridExtra 2.3 2017-09-09 [1] CRAN (R 4.0.0) #> gtable 0.3.0 2019-03-25 [1] CRAN (R 4.0.0) #> highr 0.8 2019-03-20 [1] CRAN (R 4.0.0) #> htmltools 0.5.0 2020-06-16 [1] CRAN (R 4.0.2) #> httr 1.4.2 2020-07-20 [1] CRAN (R 4.0.2) #> igraph 1.2.5 2020-03-19 [1] CRAN (R 4.0.0) #> knitr 1.30 2020-09-22 [1] CRAN (R 4.0.2) #> labeling 0.3 2014-08-23 [1] CRAN (R 4.0.0) #> lattice * 0.20-41 2020-04-02 [1] CRAN (R 4.0.2) #> lifecycle 0.2.0 2020-03-06 [1] CRAN (R 4.0.0) #> magrittr 1.5 2014-11-22 [1] CRAN (R 4.0.2) #> MASS * 7.3-53 2020-09-09 [1] CRAN (R 4.0.2) #> Matrix 1.2-18 2019-11-27 [1] CRAN (R 4.0.2) #> matrixStats 0.57.0 2020-09-25 [1] CRAN (R 4.0.2) #> memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.0) #> mime 0.9 2020-02-04 [1] CRAN (R 4.0.0) #> mixOmics * 6.13.67 2017-02-06 [1] Bioconductor (R 4.0.2) #> munsell 0.5.0 2018-06-12 [1] CRAN (R 4.0.0) #> pillar 1.4.6 2020-07-10 [1] CRAN (R 4.0.2) #> pkgbuild 1.1.0 2020-07-13 [1] CRAN (R 4.0.2) #> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.0.0) #> pkgload 1.1.0 2020-05-29 [1] CRAN (R 4.0.2) #> plyr 1.8.6 2020-03-03 [1] CRAN (R 4.0.0) #> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.0) #> processx 3.4.4 2020-09-03 [1] CRAN (R 4.0.2) #> ps 1.3.4 2020-08-11 [1] CRAN (R 4.0.2) #> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.0.0) #> R6 2.4.1 2019-11-12 [1] CRAN (R 4.0.0) #> rARPACK 0.11-0 2016-03-10 [1] CRAN (R 4.0.0) #> RColorBrewer 1.1-2 2014-12-07 [1] CRAN (R 4.0.0) #> Rcpp 1.0.5 2020-07-06 [1] CRAN (R 4.0.2) #> remotes 2.2.0 2020-07-21 [1] CRAN (R 4.0.2) #> reshape2 1.4.4 2020-04-09 [1] CRAN (R 4.0.0) #> rlang 0.4.7 2020-07-09 [1] CRAN (R 4.0.2) #> rmarkdown 2.4 2020-09-30 [1] CRAN (R 4.0.2) #> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 4.0.0) #> RSpectra 0.16-0 2019-12-01 [1] CRAN (R 4.0.0) #> scales 1.1.1 2020-05-11 [1] CRAN (R 4.0.0) #> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.0) #> stringi 1.5.3 2020-09-09 [1] CRAN (R 4.0.2) #> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.0) #> testthat 2.3.2 2020-03-02 [1] CRAN (R 4.0.0) #> tibble 3.0.3 2020-07-10 [1] CRAN (R 4.0.2) #> tidyr 1.1.2 2020-08-27 [1] CRAN (R 4.0.2) #> tidyselect 1.1.0 2020-05-11 [1] CRAN (R 4.0.0) #> usethis 1.6.3 2020-09-17 [1] CRAN (R 4.0.2) #> vctrs 0.3.4 2020-08-29 [1] CRAN (R 4.0.2) #> withr 2.3.0 2020-09-22 [1] CRAN (R 4.0.2) #> xfun 0.18 2020-09-29 [1] CRAN (R 4.0.2) #> xml2 1.3.2 2020-04-23 [1] CRAN (R 4.0.0) #> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.0) #> #> [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library ```Initially, we thought it's a threshold issue, but that's not always the case as we get close to 100%.
possible culprit
We measure the significance of the improvement in error rate or correlation to decide the optimality. It ignores the bounded (truncated?) nature of the performance criteria (e.g. correlation is b/w [-1,1]).
possible solution
Adjust the measure based on a truncated distribution, (or transform current measures to take them into an unbounded space?)