MarioniLab / scran

Clone of the Bioconductor repository for the scran package.
https://bioconductor.org/packages/devel/bioc/html/scran.html

`scoreMarkers()`: Parallelisation changes `rank.*` statistics #111

Closed PeteHaitch closed 1 year ago

PeteHaitch commented 1 year ago

It looks like the results aren't being correctly combined when parallelisation is used with scoreMarkers(). findMarkers() doesn't seem to have the same issue.

suppressPackageStartupMessages(library(scran))
suppressPackageStartupMessages(library(scuttle))
sce <- mockSCE()
sce <- logNormCounts(sce)
# k=4 clusters
kout <- kmeans(t(logcounts(sce)), centers=4) 

# scoreMarkers() without/with parallelisation
out <- scoreMarkers(sce, groups=kout$cluster)
out2 <- scoreMarkers(sce, groups=kout$cluster, BPPARAM=BiocParallel::MulticoreParam(2))
# findMarkers() without/with parallelisation
m <- findMarkers(sce, groups = kout$cluster)
m2 <- findMarkers(sce, groups = kout$cluster, BPPARAM=BiocParallel::MulticoreParam(2))

# At most k-1=3 genes can have a rank of 1, but this does not hold when
# parallelisation is used with scoreMarkers()
sum(out[[1]]$rank.AUC == 1)
#> [1] 3
sum(out2[[1]]$rank.AUC == 1)
#> [1] 6
sum(m[[1]]$Top == 1)
#> [1] 3
sum(m2[[1]]$Top == 1)
#> [1] 3

Created on 2023-07-17 with reprex v2.0.2
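
The "at most k-1" bound used in the reprex can be illustrated without scran. The sketch below assumes that `rank.AUC` is a min-rank, i.e. the smallest rank of a gene across the pairwise comparisons involving the cluster of interest. The effect sizes are simulated and the names `effects`, `ranks` and `min_rank` are hypothetical; this is not the package's implementation.

``` r
set.seed(1)
n_genes <- 100
n_other <- 3  # k - 1 pairwise comparisons for one cluster when k = 4

# Simulated effect sizes: one row per gene, one column per pairwise comparison.
effects <- matrix(rnorm(n_genes * n_other), nrow = n_genes)

# Rank genes within each comparison (1 = largest effect, ties broken by order).
ranks <- apply(-effects, 2, rank, ties.method = "first")

# Min-rank: the best rank each gene achieves in any comparison.
min_rank <- apply(ranks, 1, min)

# Each comparison has exactly one rank-1 gene, so at most n_other genes
# can have a min-rank of 1.
n_top <- sum(min_rank == 1)
stopifnot(n_top >= 1, n_top <= n_other)
```

Under this assumption, a count of 6 for a single cluster with k=4 cannot come from a correctly combined min-rank.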

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.1 (2023-06-16)
#>  os       Ubuntu 22.04.2 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language en_AU:en
#>  collate  en_AU.UTF-8
#>  ctype    en_AU.UTF-8
#>  tz       Australia/Melbourne
#>  date     2023-07-17
#>  pandoc   3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package              * version   date (UTC) lib source
#>  beachmat               2.16.0    2023-04-25 [3] Bioconductor
#>  Biobase              * 2.60.0    2023-04-25 [3] Bioconductor
#>  BiocGenerics         * 0.46.0    2023-04-25 [3] Bioconductor
#>  BiocNeighbors          1.18.0    2023-04-25 [3] Bioconductor
#>  BiocParallel           1.34.2    2023-05-22 [1] Bioconductor
#>  BiocSingular           1.16.0    2023-04-25 [3] Bioconductor
#>  bitops                 1.0-7     2021-04-24 [3] RSPM (R 4.2.0)
#>  bluster                1.10.0    2023-04-25 [3] Bioconductor
#>  cli                    3.6.1     2023-03-23 [3] RSPM (R 4.2.0)
#>  cluster                2.1.4     2022-08-22 [3] RSPM (R 4.2.0)
#>  codetools              0.2-19    2023-02-01 [3] RSPM (R 4.2.0)
#>  crayon                 1.5.2     2022-09-29 [3] RSPM (R 4.2.0)
#>  DelayedArray           0.26.3    2023-05-22 [1] Bioconductor
#>  DelayedMatrixStats     1.22.1    2023-06-09 [1] Bioconductor
#>  digest                 0.6.33    2023-07-07 [3] RSPM (R 4.2.0)
#>  dqrng                  0.3.0     2021-05-01 [3] CRAN (R 4.1.1)
#>  edgeR                  3.42.4    2023-05-31 [3] Bioconductor
#>  evaluate               0.21      2023-05-05 [3] RSPM (R 4.2.0)
#>  fastmap                1.1.1     2023-02-24 [3] RSPM (R 4.2.0)
#>  fs                     1.6.2     2023-04-25 [3] RSPM (R 4.2.0)
#>  GenomeInfoDb         * 1.36.1    2023-06-21 [3] Bioconductor
#>  GenomeInfoDbData       1.2.10               [3] Bioconductor
#>  GenomicRanges        * 1.52.0    2023-04-25 [3] Bioconductor
#>  glue                   1.6.2     2022-02-24 [3] RSPM (R 4.2.0)
#>  htmltools              0.5.5     2023-03-23 [3] RSPM (R 4.2.0)
#>  igraph                 1.5.0     2023-06-16 [1] CRAN (R 4.3.0)
#>  IRanges              * 2.34.1    2023-06-22 [3] Bioconductor
#>  irlba                  2.3.5.1   2022-10-03 [3] RSPM (R 4.2.0)
#>  knitr                  1.43      2023-05-25 [3] RSPM (R 4.2.0)
#>  lattice                0.21-8    2023-04-05 [3] RSPM (R 4.2.0)
#>  lifecycle              1.0.3     2022-10-07 [3] RSPM (R 4.2.0)
#>  limma                  3.56.2    2023-06-04 [1] Bioconductor
#>  locfit                 1.5-9.8   2023-06-11 [3] RSPM (R 4.2.0)
#>  magrittr               2.0.3     2022-03-30 [3] RSPM (R 4.2.0)
#>  Matrix                 1.6-0     2023-07-08 [3] RSPM (R 4.2.0)
#>  MatrixGenerics       * 1.12.2    2023-06-09 [1] Bioconductor
#>  matrixStats          * 1.0.0     2023-06-02 [3] RSPM (R 4.2.0)
#>  metapod                1.8.0     2023-04-25 [3] Bioconductor
#>  pkgconfig              2.0.3     2019-09-22 [3] CRAN (R 4.0.1)
#>  purrr                  1.0.1     2023-01-10 [3] RSPM (R 4.2.0)
#>  R.cache                0.16.0    2022-07-21 [3] RSPM (R 4.2.0)
#>  R.methodsS3            1.8.2     2022-06-13 [3] RSPM (R 4.2.0)
#>  R.oo                   1.25.0    2022-06-12 [3] RSPM (R 4.2.0)
#>  R.utils                2.12.2    2022-11-11 [3] RSPM (R 4.2.0)
#>  Rcpp                   1.0.11    2023-07-06 [3] RSPM (R 4.2.0)
#>  RCurl                  1.98-1.12 2023-03-27 [3] RSPM (R 4.2.0)
#>  reprex                 2.0.2     2022-08-17 [3] RSPM (R 4.2.0)
#>  rlang                  1.1.1     2023-04-28 [3] RSPM (R 4.2.0)
#>  rmarkdown              2.23      2023-07-01 [3] RSPM (R 4.2.0)
#>  rstudioapi             0.15.0    2023-07-07 [3] RSPM (R 4.2.0)
#>  rsvd                   1.0.5     2021-04-16 [3] RSPM (R 4.2.0)
#>  S4Arrays               1.0.4     2023-05-14 [1] Bioconductor
#>  S4Vectors            * 0.38.1    2023-05-02 [3] Bioconductor
#>  ScaledMatrix           1.8.1     2023-05-03 [1] Bioconductor
#>  scran                * 1.28.1    2023-05-02 [1] Bioconductor
#>  scuttle              * 1.10.1    2023-05-02 [1] Bioconductor
#>  sessioninfo            1.2.2     2021-12-06 [3] RSPM (R 4.2.0)
#>  SingleCellExperiment * 1.22.0    2023-04-25 [3] Bioconductor
#>  sparseMatrixStats      1.12.0    2023-04-25 [3] Bioconductor
#>  statmod                1.5.0     2023-01-06 [3] RSPM (R 4.2.0)
#>  styler                 1.10.1    2023-06-05 [1] CRAN (R 4.3.0)
#>  SummarizedExperiment * 1.30.2    2023-06-06 [3] Bioconductor
#>  vctrs                  0.6.3     2023-06-14 [3] RSPM (R 4.2.0)
#>  withr                  2.5.0     2022-03-03 [3] RSPM (R 4.2.0)
#>  xfun                   0.39      2023-04-20 [3] RSPM (R 4.2.0)
#>  XVector                0.40.0    2023-04-25 [3] Bioconductor
#>  yaml                   2.3.7     2023-01-23 [3] RSPM (R 4.2.0)
#>  zlibbioc               1.46.0    2023-04-25 [3] Bioconductor
#>
#>  [1] /home/peter/R/x86_64-pc-linux-gnu-library/4.3
#>  [2] /usr/local/lib/R/site-library
#>  [3] /usr/lib/R/site-library
#>  [4] /usr/lib/R/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```

LTLA commented 1 year ago

Oops. Should be fixed by f842307b30a5f43a7874bd99174584349d0533c4 in 1.29.1 and 1.28.2.
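
To check whether an installed scran falls in one of the fixed series, a version comparison along these lines may help. This is a minimal sketch: `has_rank_fix()` is a hypothetical helper, not part of scran, and it assumes that any version after 1.29.1 retains the fix.

``` r
# Hypothetical helper: TRUE if version `v` is at or after the fix
# reported in this thread (1.28.2 in release, 1.29.1 in devel).
has_rank_fix <- function(v) {
  v <- package_version(v)
  if (v >= "1.29.0") {
    v >= "1.29.1"  # devel series and anything later
  } else {
    v >= "1.28.2"  # release series
  }
}

has_rank_fix("1.28.1")  # version from the session info above, predates the fix
has_rank_fix("1.28.2")

# With scran installed:
# has_rank_fix(as.character(packageVersion("scran")))
```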

PeteHaitch commented 1 year ago

Thanks!