OSCA-source / OSCA.advanced

Advanced steps of the OSCA book
11 stars 10 forks source link

Cell cycle chapter broken in BioC 3.18 #13

Closed PeteHaitch closed 10 months ago

PeteHaitch commented 10 months ago

@lgeistlinger I think this has been broken for a while: https://www.bioconductor.org/checkResults/3.18/books-LATEST/OSCA.advanced/nebbiolo2-buildsrc.html. I can reproduce the error on my Ubuntu laptop running BioC 3.18 (session info below). Stepping through inst/book/cell-cycle.Rmd, I find a few differences in the output to what is currently published to https://bioconductor.org/books/3.18/OSCA.advanced/cell-cycle-assignment.html. Not all differences results in errors.

Hope you can decide on a fix and that my suggestions are helpful. I'm away at a conference from today and until the week after next.


Difference 1

This does not result in an error in the build.

Part A

Code chunk

https://github.com/OSCA-source/OSCA.advanced/blob/9c6309bd315d4e35d0b7e0d6390ba46508107df5/inst/book/cell-cycle.Rmd#L137-L148

Current published output

##      
##        1  2  3  4
##   G1  66  6 15  1
##   G2M  2 61  2 13
##   S   10  2  7  0

Output on my machine

       1  2  3  4
  G1  66  5 13  0
  G2M  1 64  3 14
  S   11  0  8  0

Hypothesis

Cells per cluster are identical, so I suspect changes to SingleR are responsible for this different output.

Summary

Small changes in the SingleR output are not affecting the summary in the text, so it's probably fine just to use the new results. You might want to add a sanity check that the code results continue to match the text.

Part B

Code chunk

https://github.com/OSCA-source/OSCA.advanced/blob/9c6309bd315d4e35d0b7e0d6390ba46508107df5/inst/book/cell-cycle.Rmd#L183-L186

Current published output

## 
##  Pearson's Chi-squared test
## 
## data:  tab[, 1:2]
## X-squared = 110, df = 2, p-value <2e-16

Output on my machine


    Pearson's Chi-squared test

data:  tab[, 1:2]
X-squared = 124, df = 2, p-value <2e-16

Hypothesis

This is a consequence of Part A.


Difference 2

This is the error that causes the build to fail.

Code chunk

https://github.com/OSCA-source/OSCA.advanced/blob/9c6309bd315d4e35d0b7e0d6390ba46508107df5/inst/book/cell-cycle.Rmd#L224-L233

Current published output

The results of this code chunk are hidden, but the key point is that it used to be true that rand > 0.6.

Output on my machine

> stopifnot(rand > 0.6)
Error: rand > 0.6 is not TRUE
> rand
[1] 0.5792

Hypothesis

cyclone-assigned cell cycle phases appear identical, based on:

https://github.com/OSCA-source/OSCA.advanced/blob/9c6309bd315d4e35d0b7e0d6390ba46508107df5/inst/book/cell-cycle.Rmd#L221

giving identical results on my machine to the published version, so I again suspect changes to SingleR are responsible for this different output.

Summary

Small changes in the SingleR output are giving slightly different Rand indices, but this doesn't really affect the summary in the main text, so it's probably fine just to use the new results. You will need to modify the existing sanity check to allow for the new value of rand.


Difference 3

This does not result in an error in the build.

Code chunk

https://github.com/OSCA-source/OSCA.advanced/blob/9c6309bd315d4e35d0b7e0d6390ba46508107df5/inst/book/cell-cycle.Rmd#L394-L421

Current published output

image

Output on my machine

image

Hypothesis

Although a seed for random number generation is set in this code chunk and prior chunks that it depends on, the vagaries of (potentially stochastic) and numerical precision of PCA/cPCA/t-SNE and/or small changes to code underlying these functions may have lead to the small differences in the t-SNE plots (beyond the simple mirroring

Summary

Small changes in code underlying the PCA/cPCA/t-SNE have resulted in changes to the t-SNE, but this doesn't really affect the summary in the main text, so it's probably fine just to use the new results.


Session info

R version 4.3.1 (2023-06-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8    
 [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8   
 [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C       

time zone: Australia/Melbourne
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] scPCA_1.15.1                BiocSingular_1.17.1        
 [3] ResidualMatrix_1.11.0       batchelor_1.17.2           
 [5] bluster_1.11.4              SingleR_2.3.7              
 [7] org.Mm.eg.db_3.18.0         ensembldb_2.25.1           
 [9] AnnotationFilter_1.25.0     GenomicFeatures_1.53.3     
[11] AnnotationDbi_1.63.2        Biobase_2.61.0             
[13] GenomeInfoDb_1.37.7         BiocGenerics_0.47.1        
[15] scRNAseq_2.15.0             scran_1.29.3               
[17] scater_1.29.4               ggplot2_3.4.4              
[19] scuttle_1.11.3              SummarizedExperiment_1.31.1
[21] SingleCellExperiment_1.23.0 IRanges_2.35.3             
[23] GenomicRanges_1.53.3        S4Vectors_0.39.3           
[25] BiocStyle_2.29.2            rebook_1.11.1              

loaded via a namespace (and not attached):
  [1] later_1.3.1                   BiocIO_1.11.0                
  [3] bitops_1.0-7                  filelock_1.0.2               
  [5] tibble_3.2.1                  CodeDepends_0.6.5            
  [7] graph_1.79.4                  XML_3.99-0.14                
  [9] lifecycle_1.0.3               Rdpack_2.5                   
 [11] edgeR_3.99.6                  globals_0.16.2               
 [13] lattice_0.22-4                magrittr_2.0.3               
 [15] limma_3.57.11                 sass_0.4.7                   
 [17] rmarkdown_2.25                jquerylib_0.1.4              
 [19] yaml_2.3.7                    metapod_1.9.0                
 [21] httpuv_1.6.12                 cowplot_1.1.1                
 [23] DBI_1.1.3                     RColorBrewer_1.1-3           
 [25] abind_1.4-5                   zlibbioc_1.47.0              
 [27] Rtsne_0.16                    purrr_1.0.2                  
 [29] RCurl_1.98-1.12               rappdirs_0.3.3               
 [31] GenomeInfoDbData_1.2.11       ggrepel_0.9.4                
 [33] irlba_2.3.5.1                 listenv_0.9.0                
 [35] pheatmap_1.0.12               RSpectra_0.16-1              
 [37] parallelly_1.36.0             dqrng_0.3.1                  
 [39] DelayedMatrixStats_1.23.9     codetools_0.2-19             
 [41] DelayedArray_0.27.10          xml2_1.3.5                   
 [43] tidyselect_1.2.0              farver_2.1.1                 
 [45] ScaledMatrix_1.9.1            viridis_0.6.4                
 [47] matrixStats_1.0.0             BiocFileCache_2.9.1          
 [49] GenomicAlignments_1.37.0      jsonlite_1.8.7               
 [51] BiocNeighbors_1.19.0          ellipsis_0.3.2               
 [53] tools_4.3.1                   progress_1.2.2               
 [55] Rcpp_1.0.11                   glue_1.6.2                   
 [57] gridExtra_2.3                 SparseArray_1.1.12           
 [59] xfun_0.40                     MatrixGenerics_1.13.2        
 [61] dplyr_1.1.3                   withr_2.5.1                  
 [63] BiocManager_1.30.22           fastmap_1.1.1                
 [65] fansi_1.0.5                   digest_0.6.33                
 [67] rsvd_1.0.5                    R6_2.5.1                     
 [69] mime_0.12                     colorspace_2.1-0             
 [71] biomaRt_2.57.1                RSQLite_2.3.1                
 [73] utf8_1.2.4                    generics_0.1.3               
 [75] data.table_1.14.8             rtracklayer_1.61.2           
 [77] prettyunits_1.2.0             httr_1.4.7                   
 [79] S4Arrays_1.1.6                pkgconfig_2.0.3              
 [81] gtable_0.3.4                  blob_1.2.4                   
 [83] XVector_0.41.2                htmltools_0.5.6.1            
 [85] bookdown_0.36                 ProtGenerics_1.33.1          
 [87] scales_1.2.1                  png_0.1-8                    
 [89] knitr_1.44                    rstudioapi_0.15.0            
 [91] rjson_0.2.21                  curl_5.1.0                   
 [93] cachem_1.0.8                  stringr_1.5.0                
 [95] BiocVersion_3.18.0            parallel_4.3.1               
 [97] vipor_0.4.5                   restfulr_0.0.15              
 [99] pillar_1.9.0                  grid_4.3.1                   
[101] vctrs_0.6.4                   promises_1.2.1               
[103] origami_1.0.7                 dbplyr_2.3.4                 
[105] beachmat_2.17.17              xtable_1.8-4                 
[107] cluster_2.1.4                 beeswarm_0.4.0               
[109] evaluate_0.22                 cli_3.6.1                    
[111] locfit_1.5-9.8                compiler_4.3.1               
[113] Rsamtools_2.17.0              rlang_1.1.1                  
[115] crayon_1.5.2                  future.apply_1.11.0          
[117] labeling_0.4.3                ggbeeswarm_0.7.2             
[119] stringi_1.7.12                viridisLite_0.4.2            
[121] BiocParallel_1.35.4           assertthat_0.2.1             
[123] munsell_0.5.0                 Biostrings_2.69.2            
[125] coop_0.6-3                    lazyeval_0.2.2               
[127] Matrix_1.6-1.1                dir.expiry_1.9.0             
[129] ExperimentHub_2.9.1           hms_1.1.3                    
[131] future_1.33.0                 sparseMatrixStats_1.13.4     
[133] bit64_4.0.5                   KEGGREST_1.41.4              
[135] statmod_1.5.0                 shiny_1.7.5.1                
[137] interactiveDisplayBase_1.39.0 AnnotationHub_3.9.2          
[139] kernlab_0.9-32                rbibutils_2.2.15             
[141] igraph_1.5.1                  memoise_2.0.1                
[143] bslib_0.5.1                   bit_4.0.5     
lgeistlinger commented 10 months ago

Thanks for the comprehensive report @PeteHaitch. I look into that.

lgeistlinger commented 10 months ago

fixed in 64f389c.