OSCA-source / OSCA.basic

Basics of the OSCA book

'rand > 0.9 is not TRUE' error in BioC 3.15 and 3.16 #4

Closed. hpages closed this issue 1 month ago.

hpages commented 2 years ago

https://bioconductor.org/checkResults/3.15/books-LATEST/OSCA.basic/nebbiolo1-buildsrc.html
https://bioconductor.org/checkResults/3.16/books-LATEST/OSCA.basic/nebbiolo2-buildsrc.html

Using this issue to share progress on this and discuss a fix.

Code from the OSCA.basic book that leads to this error:

# Required packages: scRNAseq, scater, org.Mm.eg.db, scran, GSEABase, AUCell, bluster.

# Data loading

library(scRNAseq)
sce.zeisel <- ZeiselBrainData()

library(scater)
sce.zeisel <- aggregateAcrossFeatures(sce.zeisel,
    id=sub("_loc[0-9]+$", "", rownames(sce.zeisel)))

library(org.Mm.eg.db)
rowData(sce.zeisel)$Ensembl <- mapIds(org.Mm.eg.db,
    keys=rownames(sce.zeisel), keytype="SYMBOL", column="ENSEMBL")

# Quality control

stats <- perCellQCMetrics(sce.zeisel, subsets=list(
    Mt=rowData(sce.zeisel)$featureType=="mito"))
qc <- quickPerCellQC(stats, percent_subsets=c("altexps_ERCC_percent",
    "subsets_Mt_percent"))
sce.zeisel <- sce.zeisel[,!qc$discard]

# Normalization

library(scran)
set.seed(1000)
clusters <- quickCluster(sce.zeisel)
sce.zeisel <- computeSumFactors(sce.zeisel, cluster=clusters)
sce.zeisel <- logNormCounts(sce.zeisel)

# Assigning cell labels from gene sets

library(scran)
wilcox.z <- pairwiseWilcox(sce.zeisel, sce.zeisel$level1class, 
    lfc=1, direction="up")
markers.z <- getTopMarkers(wilcox.z$statistics, wilcox.z$pairs,
    pairwise=FALSE, n=50)
lengths(markers.z)

library(scRNAseq)
sce.tasic <- TasicBrainData()

library(GSEABase)
all.sets <- lapply(names(markers.z), function(x) {
    GeneSet(markers.z[[x]], setName=x)
})
all.sets <- GeneSetCollection(all.sets)

library(AUCell)
rankings <- AUCell_buildRankings(counts(sce.tasic),
    plotStats=FALSE, verbose=FALSE)
cell.aucs <- AUCell_calcAUC(all.sets, rankings)
results <- t(assay(cell.aucs))
head(results)

new.labels <- colnames(results)[max.col(results)]

library(bluster)
rand <- pairwiseRand(new.labels, sce.tasic$broad_type, mode="index")
rand
# [1] 0.02957151

stopifnot(rand > 0.9)

Runs in about 1 min on my laptop (Ubuntu 22.04 LTS, 16 GB of RAM).

The new rand value (about 0.03) is a drastic drop from the original one, which the book expects to be above 0.9!
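To give a feel for what a value this close to zero means, here is a minimal base-R sketch of the standard adjusted Rand index (Hubert-Arabie form). It assumes this is what `pairwiseRand(..., mode="index")` reports with its default settings; `adjusted_rand()` and the toy label vectors are made up for illustration and are not part of bluster or the book.

```r
# Adjusted Rand index (Hubert-Arabie form) from two label vectors, base R only.
# Hypothetical helper for illustration; not the bluster implementation.
adjusted_rand <- function(x, y) {
    stopifnot(length(x) == length(y))
    tab <- table(x, y)
    n <- sum(tab)
    sum.ij <- sum(choose(tab, 2))             # pairs grouped together in both labelings
    sum.a <- sum(choose(rowSums(tab), 2))     # pairs grouped together in x
    sum.b <- sum(choose(colSums(tab), 2))     # pairs grouped together in y
    expected <- sum.a * sum.b / choose(n, 2)  # agreement expected by chance
    (sum.ij - expected) / ((sum.a + sum.b) / 2 - expected)
}

truth <- rep(c("A", "B", "C"), each = 4)
same  <- rep(c("x", "y", "z"), each = 4)   # same partition, different label names
mixed <- rep(c("x", "y", "z"), times = 4)  # labels cycle through every true group

adjusted_rand(truth, same)   # identical partitions give 1
adjusted_rand(truth, mixed)  # scrambled labels give a value near 0 (slightly negative here)
```

On this scale, going from above 0.9 to about 0.03 means the assigned labels went from agreeing almost perfectly with `broad_type` to agreeing barely better than chance.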

hpages commented 2 years ago

sessionInfo():

R version 4.2.0 Patched (2022-05-04 r82318)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04 LTS

Matrix products: default
BLAS:   /home/hpages/R/R-4.2.r82318/lib/libRblas.so
LAPACK: /home/hpages/R/R-4.2.r82318/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB              LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] bluster_1.7.0               AUCell_1.19.0              
 [3] GSEABase_1.59.0             graph_1.75.0               
 [5] annotate_1.75.0             XML_3.99-0.9               
 [7] scran_1.25.0                org.Mm.eg.db_3.15.0        
 [9] AnnotationDbi_1.59.0        scater_1.25.1              
[11] ggplot2_3.3.6               scuttle_1.7.0              
[13] scRNAseq_2.11.0             SingleCellExperiment_1.19.0
[15] SummarizedExperiment_1.27.1 Biobase_2.57.0             
[17] GenomicRanges_1.49.0        GenomeInfoDb_1.33.3        
[19] IRanges_2.31.0              S4Vectors_0.35.0           
[21] BiocGenerics_0.43.0         MatrixGenerics_1.9.0       
[23] matrixStats_0.62.0         

loaded via a namespace (and not attached):
  [1] AnnotationHub_3.5.0           BiocFileCache_2.5.0          
  [3] igraph_1.3.1                  lazyeval_0.2.2               
  [5] BiocParallel_1.31.3           digest_0.6.29                
  [7] ensembldb_2.21.1              htmltools_0.5.2              
  [9] viridis_0.6.2                 fansi_1.0.3                  
 [11] magrittr_2.0.3                memoise_2.0.1                
 [13] ScaledMatrix_1.5.0            cluster_2.1.3                
 [15] limma_3.53.0                  Biostrings_2.65.0            
 [17] R.utils_2.11.0                prettyunits_1.1.1            
 [19] colorspace_2.0-3              blob_1.2.3                   
 [21] rappdirs_0.3.3                ggrepel_0.9.1                
 [23] dplyr_1.0.9                   crayon_1.5.1                 
 [25] RCurl_1.98-1.6                glue_1.6.2                   
 [27] gtable_0.3.0                  zlibbioc_1.43.0              
 [29] XVector_0.37.0                DelayedArray_0.23.1          
 [31] BiocSingular_1.13.0           scales_1.2.0                 
 [33] DBI_1.1.2                     edgeR_3.39.1                 
 [35] Rcpp_1.0.8.3                  viridisLite_0.4.0            
 [37] xtable_1.8-4                  progress_1.2.2               
 [39] dqrng_0.3.0                   bit_4.0.4                    
 [41] rsvd_1.0.5                    metapod_1.5.0                
 [43] httr_1.4.3                    ellipsis_0.3.2               
 [45] pkgconfig_2.0.3               R.methodsS3_1.8.1            
 [47] dbplyr_2.1.1                  locfit_1.5-9.5               
 [49] utf8_1.2.2                    tidyselect_1.1.2             
 [51] rlang_1.0.2                   later_1.3.0                  
 [53] munsell_0.5.0                 BiocVersion_3.16.0           
 [55] tools_4.2.0                   cachem_1.0.6                 
 [57] cli_3.3.0                     generics_0.1.2               
 [59] RSQLite_2.2.14                ExperimentHub_2.5.0          
 [61] stringr_1.4.0                 fastmap_1.1.0                
 [63] yaml_2.3.5                    bit64_4.0.5                  
 [65] purrr_0.3.4                   KEGGREST_1.37.0              
 [67] AnnotationFilter_1.21.0       sparseMatrixStats_1.9.0      
 [69] mime_0.12                     R.oo_1.24.0                  
 [71] xml2_1.3.3                    biomaRt_2.53.1               
 [73] compiler_4.2.0                beeswarm_0.4.0               
 [75] filelock_1.0.2                curl_4.3.2                   
 [77] png_0.1-7                     interactiveDisplayBase_1.35.0
 [79] tibble_3.1.7                  statmod_1.4.36               
 [81] stringi_1.7.6                 GenomicFeatures_1.49.1       
 [83] lattice_0.20-45               ProtGenerics_1.29.0          
 [85] Matrix_1.4-1                  vctrs_0.4.1                  
 [87] pillar_1.7.0                  lifecycle_1.0.1              
 [89] BiocManager_1.30.17           BiocNeighbors_1.15.0         
 [91] data.table_1.14.2             bitops_1.0-7                 
 [93] irlba_2.3.5                   httpuv_1.6.5                 
 [95] rtracklayer_1.57.0            R6_2.5.1                     
 [97] BiocIO_1.7.1                  promises_1.2.0.1             
 [99] gridExtra_2.3                 vipor_0.4.5                  
[101] assertthat_0.2.1              rjson_0.2.21                 
[103] withr_2.5.0                   GenomicAlignments_1.33.0     
[105] Rsamtools_2.13.1              GenomeInfoDbData_1.2.8       
[107] parallel_4.2.0                hms_1.1.1                    
[109] grid_4.2.0                    beachmat_2.13.0              
[111] DelayedMatrixStats_1.19.0     shiny_1.7.1                  
[113] ggbeeswarm_0.6.0              restfulr_0.0.13              

hpages commented 2 years ago

And the culprit is... a revamping of the AUCell::AUCell_buildRankings() generic and methods between AUCell 1.17.0 and 1.18.0! What's scary is that the function now seems to produce completely different results. Taking a closer look now...

hpages commented 2 years ago

Found it! AUCell::AUCell_buildRankings() now ranks genes from lowest to highest expression instead of from highest to lowest expression. See https://github.com/aertslab/AUCell/issues/27
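To illustrate why flipping the ranking direction is enough to wreck the label assignment, here is a small base-R sketch on a made-up cell. It does not call AUCell; the gene names, the `top_fraction()` helper and the "top 5 of 100 ranks" window are assumptions chosen as a simplified stand-in for AUC-style scoring over the top-ranked genes.

```r
# Toy expression vector for a single cell: 100 genes, of which the 5 marker
# genes are by construction the most highly expressed.
set.seed(1)
genes <- paste0("gene", 1:100)
expr <- setNames(rpois(100, lambda = 5), genes)
markers <- genes[1:5]
expr[markers] <- 100 + 1:5

# Rank 1 = most highly expressed gene (the ordering the analysis relies on).
rank.desc <- rank(-expr, ties.method = "first")
# Rank 1 = least expressed gene (the inverted ordering described above).
rank.asc <- rank(expr, ties.method = "first")

# Fraction of a gene set that falls within the top 'top.n' ranks.
# Hypothetical helper, a crude proxy for scoring a set against a ranking.
top_fraction <- function(ranks, set, top.n = 5) {
    mean(ranks[set] <= top.n)
}

top_fraction(rank.desc, markers)  # 1: every marker is in the top window
top_fraction(rank.asc, markers)   # 0: the markers end up at the bottom
```

With the inverted ordering, a cell's true marker set scores worst instead of best, which is consistent with the near-zero rand value reported above.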

hpages commented 2 years ago

@LTLA @vjcitn They say they've fixed AUCell. Let's check tomorrow's build report for OSCA.basic :crossed_fingers:

hpages commented 2 years ago

Still broken :disappointed: Now it's because of this other regression introduced in the latest AUCell.

hpages commented 2 years ago

@PeteHaitch @lgeistlinger @Alanocallaghan As mentioned on Slack, this issue is the last thing preventing the OSCA sub-books from being all green on the build reports.

I've tried one more time to convince the AUCell developers to avoid the kind of breaking change that they've introduced in the latest version of their package. But that's it. My 2-week interim of maintaining the OSCA book ends here :wink:

I hope you guys can take it from there.

Thanks again for volunteering, and let me know here or on Slack if you have any questions.

H.

P.S.: Also please don't forget to update the maintainer in the DESCRIPTION files of the sub-books. Thanks again!

hpages commented 1 month ago

This seems to have been addressed for a while. Closing now...