Bioconductor / MatrixGenerics

S4 Generic Summary Statistic Functions that Operate on Matrix-Like Objects
https://bioconductor.org/packages/MatrixGenerics
12 stars 1 forks source link

Mitigate non-backward comptible changes in matrixStats 1.0.0 [PR against RELEASE_3_17] #32

Closed hpages closed 1 year ago

hpages commented 1 year ago

Modify calls to matrixStats functions to mitigate non-backward comptible changes in matrixStats 1.0.0. More precisely:

PeteHaitch commented 1 year ago

Thanks, Hervé. I'll install this branch and get the RELEASE_3_17 version of DelayedMatrixStats working against it. I'll aim to get this done today/tomorrow and then turn my attention to the devel branch.

hpages commented 1 year ago

Sounds good. Hopefully you don't need to touch DelayedMatrixStats at all in RELEASE_3_17. That's what I'm trying to achieve with this PR.

hpages commented 1 year ago

Unless you have direct calls to matrixStats:: functions of course. Unfortunately it seems that you have some (e.g. https://github.com/PeteHaitch/DelayedMatrixStats/blob/166420114b974ca628ca900994717785ffbbdf99/R/colCollapse.R#L44). This makes you directly vulnerable to the breaking change in matrixStats 1.0.0 and whatever this PR is doing cannot protect you.

PeteHaitch commented 1 year ago

Is this the behaviour I should be seeing with this PR and RELEASE_3_17?

x <- matrix(
  -5:6, 
  nrow = 3,
  ncol = 4, 
  dimnames = list(paste0("R", 1:3), paste0("C", 1:4)))

matrixStats::colAlls(x)
#>    C1    C2    C3    C4 
#>  TRUE FALSE  TRUE  TRUE
MatrixGenerics::colAlls(x)
#> [1]  TRUE FALSE  TRUE  TRUE
Session info ``` r sessioninfo::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.3.0 (2023-04-21) #> os Ubuntu 22.04.2 LTS #> system x86_64, linux-gnu #> ui X11 #> language en_AU:en #> collate en_AU.UTF-8 #> ctype en_AU.UTF-8 #> tz Australia/Melbourne #> date 2023-06-06 #> pandoc 2.19.2 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown) #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date (UTC) lib source #> cli 3.6.1 2023-03-23 [3] RSPM (R 4.2.0) #> digest 0.6.31 2022-12-11 [3] RSPM (R 4.2.0) #> evaluate 0.21 2023-05-05 [3] RSPM (R 4.2.0) #> fastmap 1.1.1 2023-02-24 [3] RSPM (R 4.2.0) #> fs 1.6.2 2023-04-25 [3] RSPM (R 4.2.0) #> glue 1.6.2 2022-02-24 [3] RSPM (R 4.2.0) #> htmltools 0.5.5 2023-03-23 [3] RSPM (R 4.2.0) #> knitr 1.43 2023-05-25 [3] RSPM (R 4.2.0) #> lifecycle 1.0.3 2022-10-07 [3] RSPM (R 4.2.0) #> magrittr 2.0.3 2022-03-30 [3] RSPM (R 4.2.0) #> MatrixGenerics 1.12.1 2023-06-05 [1] Github (Bioconductor/MatrixGenerics@18c91de) #> matrixStats 1.0.0 2023-06-02 [3] RSPM (R 4.2.0) #> purrr 1.0.1 2023-01-10 [3] RSPM (R 4.2.0) #> R.cache 0.16.0 2022-07-21 [3] RSPM (R 4.2.0) #> R.methodsS3 1.8.2 2022-06-13 [3] RSPM (R 4.2.0) #> R.oo 1.25.0 2022-06-12 [3] RSPM (R 4.2.0) #> R.utils 2.12.2 2022-11-11 [3] RSPM (R 4.2.0) #> reprex 2.0.2 2022-08-17 [3] RSPM (R 4.2.0) #> rlang 1.1.1 2023-04-28 [3] RSPM (R 4.2.0) #> rmarkdown 2.22 2023-06-01 [3] RSPM (R 4.2.0) #> rstudioapi 0.14 2022-08-22 [3] RSPM (R 4.2.0) #> sessioninfo 1.2.2 2021-12-06 [3] RSPM (R 4.2.0) #> styler 1.10.1 2023-06-05 [1] CRAN (R 4.3.0) #> vctrs 0.6.2 2023-04-19 [3] RSPM (R 4.2.0) #> withr 2.5.0 2022-03-03 [3] RSPM (R 4.2.0) #> xfun 0.39 2023-04-20 [3] RSPM (R 4.2.0) #> yaml 2.3.7 2023-01-23 [3] RSPM (R 4.2.0) #> #> [1] /home/peter/R/x86_64-pc-linux-gnu-library/4.3 #> [2] /usr/local/lib/R/site-library #> [3] /usr/lib/R/site-library #> [4] /usr/lib/R/library #> #> ────────────────────────────────────────────────────────────────────────────── ```
hpages commented 1 year ago

Yes, that's the behavior of MatrixGenerics::colAlls(x) since the BioC 3.17 release (and in earlier versions of BioC) so the PR preserves it. The idea is to avoid disruption in BioC 3.17 and to allow it in BioC 3.18.

PeteHaitch commented 1 year ago

Thanks for confirmation.

I think that means I will need to modify the unit tests in RELEASE_3_17 accordingly. Perhaps what I actually want is to test the behaviour against MatrixGenerics itself rather than matrixStats (which is how it's setup for historical reasons).

hpages commented 1 year ago

Absolutely. One of the benefits of having MatrixGenerics is that it allows us to write code that is isolated from what happens to matrixStats. We have full control over the colAlls() generic and methods defined in MatrixGenerics so we can choose how we want them to behave. Also our code is more robust and easier to maintain if we use things like MatrixGenerics::colAlls() (or just colAlls() if we import MatrixGenerics) rather than direct calls to matrixStats::colAlls(). Including in our unit tests.

PeteHaitch commented 1 year ago

Alright, I've got PR for RELEASE_3_17 branch of DelayedMatrixStats ready to go (https://github.com/PeteHaitch/DelayedMatrixStats/pull/93).

hpages commented 1 year ago

Note that sparseMatrixStats unit tests fail for the same reason in release (RELEASE_3_17), even with forthcoming MatrixGenerics v1.12.1, because FUN(sp_mat) is compared with matrixStats::FUN(mat) instead of FUN(mat):

> test_check("sparseMatrixStats")
[ FAIL 558 | WARN 106 | SKIP 22 | PASS 1195 ]
...
══ Failed tests ════════════════════════════════════════════════════════════════
── Failure (test-functions.R:74:5): colSums works ──────────────────────────────
colSums2(sp_mat) not equal to matrixStats::colSums2(mat).
names for current but not for target
── Failure (test-functions.R:75:5): colSums works ──────────────────────────────
colSums2(sp_mat, na.rm = TRUE) not equal to matrixStats::colSums2(mat, na.rm = TRUE).
names for current but not for target
── Failure (test-functions.R:81:5): rowSums works ──────────────────────────────
rowSums2(sp_mat2) not equal to matrixStats::colSums2(mat).
names for current but not for target
...

sessionInfo():

R version 4.3.0 Patched (2023-05-31 r84481)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 23.04

Matrix products: default
BLAS:   /home/hpages/R/R-4.3.r84481/lib/libRblas.so 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.11.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: America/Los_Angeles
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] sparseMatrixStats_1.12.0 testthat_3.1.8           MatrixGenerics_1.12.1   
[4] matrixStats_1.0.0       

loaded via a namespace (and not attached):
 [1] desc_1.4.2      R6_2.5.1        Matrix_1.5-4.1  lattice_0.21-8 
 [5] magrittr_2.0.3  lifecycle_1.0.3 cli_3.6.1       grid_4.3.0     
 [9] withr_2.5.0     compiler_4.3.0  rprojroot_2.0.3 tools_4.3.0    
[13] brio_1.1.3      Rcpp_1.0.10     rlang_1.1.1    
hpages commented 1 year ago

@PeteHaitch Is the PR good to go or do you need more time?

PeteHaitch commented 1 year ago

Good to go from my end

PeteHaitch commented 1 year ago

I've now merged PR for RELEASE_3_17 branch of DelayedMatrixStats (https://github.com/PeteHaitch/DelayedMatrixStats/pull/93) and pushed to BioC.