OSCA-source / OSCA.workflows

Workflows for the OSCA book.

Build failing in BioC 3.15 #1

Closed by PeteHaitch 2 years ago

PeteHaitch commented 2 years ago

@LTLA Any idea what might be going on here? https://www.bioconductor.org/checkResults/3.15/books-LATEST/OSCA.workflows/nebbiolo1-buildsrc.html The muraro-pancreas.Rmd builds fine for me as a standalone doc, so I'm guessing it's tied up in the bookdown/rebook stuff that I'm still trying to wrap my head around. Any pointers appreciated - I'm trying to make sure OSCA.workflows builds properly in release and devel before the forthcoming release deadline.

alanocallaghan commented 2 years ago

I think that may also be why the OSCA.basic build is failing in 3.15. Or at least it's an odd coincidence that it fails on the same file. It builds fine for me locally as well: http://bioconductor.org/checkResults/release/books-LATEST/OSCA.basic/nebbiolo1-buildsrc.html

PeteHaitch commented 2 years ago

I'm going to bump the version number of OSCA.workflows in 3.15 to trigger a new build in the hope it fixes itself.

If that doesn't work then I may need to ask @hpages for advice or to poke around on the build machine.

PeteHaitch commented 2 years ago

Still need to push to BioC's git server (https://github.com/OSCA-source/OSCA/issues/3#issuecomment-1275423528), but after some small changes it's now building in release (3.15) and devel (3.16) for me locally on macOS.

PeteHaitch commented 2 years ago

Now pushed both release and devel to BioC's git server. Crossing my fingers that it works and makes its way through the build system.

PeteHaitch commented 2 years ago

Partial success with those recent changes:

The logs for the most recent build on BioC 3.15 include:

# Quitting from lines 23-33 (muraro-pancreas.Rmd) 
# Error in joinTwoTables(a = alreadyUsed, b = tab, mysql = mysql) : 
#   Table(s) gene_info can not be joined with genes!
# Calls: <Anonymous> ... joinQueryOnColumns2 -> joinQueryOnTables2 -> joinTwoTables
# 
# Execution halted

Tracing those lines, the error occurs during retrieval/loading of an EnsDb object from AnnotationHub, and I think the joinTwoTables() function is from the ensembldb package.

The following boils down the issue to a reprex, which I've tested on an up-to-date BioC 3.15 on macOS (Intel) and Linux (CentOS). In both cases it succeeds, so this seems specific to the BioC build machine. Perhaps it's an issue with a corrupted AnnotationHub resource AH73881 on the builder?

@hpages are you able to please investigate?

macOS (Intel)

suppressPackageStartupMessages(library(scRNAseq))
sce.muraro <- MuraroPancreasData()
#> snapshotDate(): 2022-04-26
#> see ?scRNAseq and browseVignettes('scRNAseq') for documentation
#> loading from cache
#> see ?scRNAseq and browseVignettes('scRNAseq') for documentation
#> loading from cache

suppressPackageStartupMessages(library(AnnotationHub))
edb <- AnnotationHub()[["AH73881"]]
#> snapshotDate(): 2022-04-25
#> loading from cache
#> require("ensembldb")
gene.symb <- sub("__chr.*$", "", rownames(sce.muraro))
gene.ids <- mapIds(edb, keys=gene.symb, 
                   keytype="SYMBOL", column="GENEID")
#> Warning: Unable to map 2110 of 19059 requested IDs.

# Removing duplicated genes or genes without Ensembl IDs.
keep <- !is.na(gene.ids) & !duplicated(gene.ids)
sce.muraro <- sce.muraro[keep,]
rownames(sce.muraro) <- gene.ids[keep]

Created on 2022-10-13 with reprex v2.0.2

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.1 (2022-06-23)
#>  os       macOS Big Sur ... 10.16
#>  system   x86_64, darwin17.0
#>  ui       X11
#>  language (EN)
#>  collate  en_AU.UTF-8
#>  ctype    en_AU.UTF-8
#>  tz       Australia/Melbourne
#>  date     2022-10-13
#>  pandoc   2.18 @ /Applications/RStudio.app/Contents/MacOS/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  ! package                * version   date (UTC) lib source
#>  P AnnotationDbi          * 1.58.0    2022-04-26 [?] Bioconductor
#>  P AnnotationFilter       * 1.20.0    2022-04-26 [?] Bioconductor
#>  P AnnotationHub          * 3.4.0     2022-04-26 [?] Bioconductor
#>  P assertthat               0.2.1     2019-03-21 [?] CRAN (R 4.2.0)
#>  P Biobase                * 2.56.0    2022-04-26 [?] Bioconductor
#>  P BiocFileCache          * 2.4.0     2022-04-26 [?] Bioconductor
#>  P BiocGenerics           * 0.42.0    2022-04-26 [?] Bioconductor
#>  P BiocIO                   1.6.0     2022-04-26 [?] Bioconductor
#>  P BiocManager              1.30.18   2022-05-18 [?] CRAN (R 4.2.0)
#>  P BiocParallel             1.30.4    2022-10-11 [?] Bioconductor
#>  P BiocVersion              3.15.2    2022-03-29 [?] Bioconductor
#>  P biomaRt                  2.52.0    2022-04-26 [?] Bioconductor
#>  P Biostrings               2.64.1    2022-08-18 [?] Bioconductor
#>  P bit                      4.0.4     2020-08-04 [?] CRAN (R 4.2.0)
#>  P bit64                    4.0.5     2020-08-30 [?] CRAN (R 4.2.0)
#>  P bitops                   1.0-7     2021-04-24 [?] CRAN (R 4.2.0)
#>  P blob                     1.2.3     2022-04-10 [?] CRAN (R 4.2.0)
#>  P cachem                   1.0.6     2021-08-19 [?] CRAN (R 4.2.0)
#>  P cli                      3.4.1     2022-09-23 [?] CRAN (R 4.2.0)
#>  P codetools                0.2-18    2020-11-04 [3] CRAN (R 4.2.1)
#>  P crayon                   1.5.2     2022-09-29 [?] CRAN (R 4.2.0)
#>  P curl                     4.3.3     2022-10-06 [?] CRAN (R 4.2.0)
#>  P DBI                      1.1.3     2022-06-18 [?] CRAN (R 4.2.0)
#>  P dbplyr                 * 2.2.1     2022-06-27 [?] CRAN (R 4.2.0)
#>  P DelayedArray             0.22.0    2022-04-26 [?] Bioconductor
#>  P digest                   0.6.29    2021-12-01 [?] CRAN (R 4.2.0)
#>  P dplyr                    1.0.10    2022-09-01 [?] CRAN (R 4.2.0)
#>  P ellipsis                 0.3.2     2021-04-29 [?] CRAN (R 4.2.0)
#>  P ensembldb              * 2.20.2    2022-06-21 [?] Bioconductor
#>  P evaluate                 0.17      2022-10-07 [?] CRAN (R 4.2.0)
#>  P ExperimentHub            2.4.0     2022-04-26 [?] Bioconductor
#>  P fansi                    1.0.3     2022-03-24 [?] CRAN (R 4.2.0)
#>  P fastmap                  1.1.0     2021-01-25 [?] CRAN (R 4.2.0)
#>  P filelock                 1.0.2     2018-10-05 [?] CRAN (R 4.2.0)
#>  P fs                       1.5.2     2021-12-08 [?] CRAN (R 4.2.0)
#>  P generics                 0.1.3     2022-07-05 [?] CRAN (R 4.2.0)
#>  P GenomeInfoDb           * 1.32.4    2022-09-06 [?] Bioconductor
#>  P GenomeInfoDbData         1.2.8     2022-10-11 [?] Bioconductor
#>  P GenomicAlignments        1.32.1    2022-08-02 [?] Bioconductor
#>  P GenomicFeatures        * 1.48.4    2022-09-20 [?] Bioconductor
#>  P GenomicRanges          * 1.48.0    2022-04-26 [?] Bioconductor
#>  P glue                     1.6.2     2022-02-24 [?] CRAN (R 4.2.0)
#>  P highr                    0.9       2021-04-16 [?] CRAN (R 4.2.0)
#>  P hms                      1.1.2     2022-08-19 [?] CRAN (R 4.2.0)
#>  P htmltools                0.5.3     2022-07-18 [?] CRAN (R 4.2.0)
#>  P httpuv                   1.6.6     2022-09-08 [?] CRAN (R 4.2.0)
#>  P httr                     1.4.4     2022-08-17 [?] CRAN (R 4.2.0)
#>  P interactiveDisplayBase   1.34.0    2022-04-26 [?] Bioconductor
#>  P IRanges                * 2.30.1    2022-08-18 [?] Bioconductor
#>  P KEGGREST                 1.36.3    2022-07-14 [?] Bioconductor
#>  P knitr                    1.40      2022-08-24 [?] CRAN (R 4.2.0)
#>  P later                    1.3.0     2021-08-18 [?] CRAN (R 4.2.0)
#>  P lattice                  0.20-45   2021-09-22 [3] CRAN (R 4.2.1)
#>  P lazyeval                 0.2.2     2019-03-15 [?] CRAN (R 4.2.0)
#>  P lifecycle                1.0.3     2022-10-07 [?] CRAN (R 4.2.0)
#>  P magrittr                 2.0.3     2022-03-30 [?] CRAN (R 4.2.0)
#>  P Matrix                   1.5-1     2022-09-13 [3] CRAN (R 4.2.0)
#>  P MatrixGenerics         * 1.8.1     2022-06-30 [?] Bioconductor
#>  P matrixStats            * 0.62.0    2022-04-19 [?] CRAN (R 4.2.0)
#>  P memoise                  2.0.1     2021-11-26 [?] CRAN (R 4.2.0)
#>  P mime                     0.12      2021-09-28 [?] CRAN (R 4.2.0)
#>  P pillar                   1.8.1     2022-08-19 [?] CRAN (R 4.2.0)
#>  P pkgconfig                2.0.3     2019-09-22 [?] CRAN (R 4.2.0)
#>  P png                      0.1-7     2013-12-03 [?] CRAN (R 4.2.0)
#>  P prettyunits              1.1.1     2020-01-24 [?] CRAN (R 4.2.0)
#>  P progress                 1.2.2     2019-05-16 [?] CRAN (R 4.2.0)
#>  P promises                 1.2.0.1   2021-02-11 [?] CRAN (R 4.2.0)
#>  P ProtGenerics             1.28.0    2022-04-26 [?] Bioconductor
#>  P purrr                    0.3.5     2022-10-06 [?] CRAN (R 4.2.0)
#>    R.cache                  0.16.0    2022-07-21 [3] CRAN (R 4.2.0)
#>  P R.methodsS3              1.8.2     2022-06-13 [?] CRAN (R 4.2.0)
#>  P R.oo                     1.25.0    2022-06-12 [?] CRAN (R 4.2.0)
#>  P R.utils                  2.12.0    2022-06-28 [?] CRAN (R 4.2.0)
#>  P R6                       2.5.1     2021-08-19 [?] CRAN (R 4.2.0)
#>  P rappdirs                 0.3.3     2021-01-31 [?] CRAN (R 4.2.0)
#>  P Rcpp                     1.0.9     2022-07-08 [?] CRAN (R 4.2.0)
#>  P RCurl                    1.98-1.9  2022-10-03 [?] CRAN (R 4.2.0)
#>  P reprex                   2.0.2     2022-08-17 [?] CRAN (R 4.2.0)
#>  P restfulr                 0.0.15    2022-06-16 [?] CRAN (R 4.2.0)
#>  P rjson                    0.2.21    2022-01-09 [?] CRAN (R 4.2.0)
#>  P rlang                    1.0.6     2022-09-24 [?] CRAN (R 4.2.0)
#>  P rmarkdown                2.17      2022-10-07 [?] CRAN (R 4.2.0)
#>  P Rsamtools                2.12.0    2022-04-26 [?] Bioconductor
#>  P RSQLite                  2.2.18    2022-10-04 [?] CRAN (R 4.2.0)
#>  P rstudioapi               0.14      2022-08-22 [?] CRAN (R 4.2.0)
#>  P rtracklayer              1.56.1    2022-06-30 [?] Bioconductor
#>  P S4Vectors              * 0.34.0    2022-04-26 [?] Bioconductor
#>  P scRNAseq               * 2.10.0    2022-04-28 [?] Bioconductor
#>  P sessioninfo              1.2.2     2021-12-06 [?] CRAN (R 4.2.0)
#>  P shiny                    1.7.2     2022-07-19 [?] CRAN (R 4.2.0)
#>  P SingleCellExperiment   * 1.18.1    2022-10-02 [?] Bioconductor
#>  P stringi                  1.7.8     2022-07-11 [?] CRAN (R 4.2.0)
#>  P stringr                  1.4.1     2022-08-20 [?] CRAN (R 4.2.0)
#>    styler                   1.7.0     2022-03-13 [3] CRAN (R 4.2.0)
#>  P SummarizedExperiment   * 1.26.1    2022-05-01 [?] Bioconductor
#>  P tibble                   3.1.8     2022-07-22 [?] CRAN (R 4.2.0)
#>  P tidyselect               1.2.0     2022-10-10 [?] CRAN (R 4.2.1)
#>  P utf8                     1.2.2     2021-07-24 [?] CRAN (R 4.2.0)
#>  P vctrs                    0.4.2     2022-09-29 [?] CRAN (R 4.2.0)
#>  P withr                    2.5.0     2022-03-03 [?] CRAN (R 4.2.0)
#>  P xfun                     0.33      2022-09-12 [?] CRAN (R 4.2.0)
#>  P XML                      3.99-0.11 2022-10-03 [?] CRAN (R 4.2.0)
#>  P xml2                     1.3.3     2021-11-30 [?] CRAN (R 4.2.0)
#>  P xtable                   1.8-4     2019-04-21 [?] CRAN (R 4.2.0)
#>  P XVector                  0.36.0    2022-04-26 [?] Bioconductor
#>  P yaml                     2.3.5     2022-02-21 [?] CRAN (R 4.2.0)
#>  P zlibbioc                 1.42.0    2022-04-26 [?] Bioconductor
#>
#>  [1] /Users/Peter/Library/Caches/org.R-project.R/R/renv/library/OSCA.workflows-5d495549/R-4.2/x86_64-apple-darwin17.0
#>  [2] /Users/Peter/GitHub/OSCA.workflows/renv/sandbox/R-4.2/x86_64-apple-darwin17.0/84ba8b13
#>  [3] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
#>
#>  P ── Loaded and on-disk path mismatch.
#>
#> ──────────────────────────────────────────────────────────────────────────────
```

Linux (CentOS)

suppressPackageStartupMessages(library(scRNAseq))
sce.muraro <- MuraroPancreasData()
#> snapshotDate(): 2022-04-26
#> see ?scRNAseq and browseVignettes('scRNAseq') for documentation
#> loading from cache
#> see ?scRNAseq and browseVignettes('scRNAseq') for documentation
#> loading from cache

suppressPackageStartupMessages(library(AnnotationHub))
edb <- AnnotationHub()[["AH73881"]]
#> snapshotDate(): 2022-04-25
#> loading from cache
#> require("ensembldb")

gene.symb <- sub("__chr.*$", "", rownames(sce.muraro))
gene.ids <- mapIds(edb, keys=gene.symb, 
                   keytype="SYMBOL", column="GENEID")
#> Warning: Unable to map 2110 of 19059 requested IDs.

keep <- !is.na(gene.ids) & !duplicated(gene.ids)
sce.muraro <- sce.muraro[keep,]
rownames(sce.muraro) <- gene.ids[keep]

Created on 2022-10-13 with reprex v2.0.2

Session info

``` r
sessionInfo()
#> R version 4.2.1 (2022-06-23)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: CentOS Linux 7 (Core)
#>
#> Matrix products: default
#> BLAS:   /stornext/System/data/apps/R/R-4.2.1/lib64/R/lib/libRblas.so
#> LAPACK: /stornext/System/data/apps/R/R-4.2.1/lib64/R/lib/libRlapack.so
#>
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats4    stats     graphics  grDevices utils     datasets  methods
#> [8] base
#>
#> other attached packages:
#>  [1] ensembldb_2.20.2            AnnotationFilter_1.20.0
#>  [3] GenomicFeatures_1.48.4      AnnotationDbi_1.58.0
#>  [5] AnnotationHub_3.4.0         BiocFileCache_2.4.0
#>  [7] dbplyr_2.2.1                scRNAseq_2.10.0
#>  [9] SingleCellExperiment_1.18.1 SummarizedExperiment_1.26.1
#> [11] Biobase_2.56.0              GenomicRanges_1.48.0
#> [13] GenomeInfoDb_1.32.4         IRanges_2.30.1
#> [15] S4Vectors_0.34.0            BiocGenerics_0.42.0
#> [17] MatrixGenerics_1.8.1        matrixStats_0.62.0
#>
#> loaded via a namespace (and not attached):
#>  [1] ProtGenerics_1.28.0           bitops_1.0-7
#>  [3] fs_1.5.2                      bit64_4.0.5
#>  [5] filelock_1.0.2                progress_1.2.2
#>  [7] httr_1.4.4                    tools_4.2.1
#>  [9] utf8_1.2.2                    R6_2.5.1
#> [11] lazyeval_0.2.2                DBI_1.1.3
#> [13] withr_2.5.0                   tidyselect_1.2.0
#> [15] prettyunits_1.1.1             bit_4.0.4
#> [17] curl_4.3.3                    compiler_4.2.1
#> [19] cli_3.4.1                     xml2_1.3.3
#> [21] DelayedArray_0.22.0           rtracklayer_1.56.1
#> [23] rappdirs_0.3.3                stringr_1.4.1
#> [25] digest_0.6.29                 Rsamtools_2.12.0
#> [27] rmarkdown_2.17                XVector_0.36.0
#> [29] pkgconfig_2.0.3               htmltools_0.5.3
#> [31] fastmap_1.1.0                 highr_0.9
#> [33] rlang_1.0.6                   rstudioapi_0.14
#> [35] RSQLite_2.2.18                shiny_1.7.2
#> [37] BiocIO_1.6.0                  generics_0.1.3
#> [39] BiocParallel_1.30.4           dplyr_1.0.10
#> [41] RCurl_1.98-1.9                magrittr_2.0.3
#> [43] GenomeInfoDbData_1.2.8        Matrix_1.5-1
#> [45] Rcpp_1.0.9                    fansi_1.0.3
#> [47] lifecycle_1.0.3               stringi_1.7.8
#> [49] yaml_2.3.5                    zlibbioc_1.42.0
#> [51] grid_4.2.1                    blob_1.2.3
#> [53] parallel_4.2.1                promises_1.2.0.1
#> [55] ExperimentHub_2.4.0           crayon_1.5.2
#> [57] lattice_0.20-45               Biostrings_2.64.1
#> [59] hms_1.1.2                     KEGGREST_1.36.3
#> [61] knitr_1.40                    pillar_1.8.1
#> [63] rjson_0.2.21                  codetools_0.2-18
#> [65] biomaRt_2.52.0                reprex_2.0.2
#> [67] XML_3.99-0.11                 glue_1.6.2
#> [69] BiocVersion_3.15.2            evaluate_0.17
#> [71] BiocManager_1.30.18           png_0.1-7
#> [73] vctrs_0.4.2                   httpuv_1.6.6
#> [75] purrr_0.3.5                   assertthat_0.2.1
#> [77] cachem_1.0.6                  xfun_0.33
#> [79] mime_0.12                     xtable_1.8-4
#> [81] restfulr_0.0.15               later_1.3.0
#> [83] tibble_3.1.8                  GenomicAlignments_1.32.1
#> [85] memoise_2.0.1                 ellipsis_0.3.2
#> [87] interactiveDisplayBase_1.34.0
```

LTLA commented 2 years ago

I have no idea. Seems like an AnnotationHub caching problem of some kind. When did this start happening?

Long shot: rebook caches very aggressively via knitr's chunk caching. One possibility is that multiple books are trying to compile the Muraro chapter at the same time, but they all are reading/writing from the same cache, which could cause some corruption. That said, this should not be possible, because the compilation of any given report is protected by multiple filelock locks, and besides, the knitr caching shouldn't have had a chance to save edb anyway.
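
To illustrate the kind of protection described above, here is a minimal sketch of serialising access to a shared cache directory with the filelock package. It is not rebook's actual code: `with_cache_lock()`, the `.lock` file name and the `chunk.rds` file are hypothetical names chosen for the example.

```r
library(filelock)

# Hypothetical helper (not taken from rebook): evaluate `expr` while holding
# an exclusive lock associated with `cache_dir`, so that two processes cannot
# read or write the cache at the same time.
with_cache_lock <- function(cache_dir, expr) {
    dir.create(cache_dir, showWarnings = FALSE, recursive = TRUE)
    lck <- lock(file.path(cache_dir, ".lock"), exclusive = TRUE, timeout = Inf)
    on.exit(unlock(lck), add = TRUE)
    # `expr` is a lazily evaluated promise, so it only runs here, under the lock.
    force(expr)
}

cache_dir <- file.path(tempdir(), "muraro-cache")
result <- with_cache_lock(cache_dir, {
    saveRDS(1:3, file.path(cache_dir, "chunk.rds"))
    readRDS(file.path(cache_dir, "chunk.rds"))
})
```

A second process calling `with_cache_lock()` on the same directory would block at `lock()` until the first one releases it, which is why concurrent compilation alone should not corrupt the cache.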

@Alanocallaghan's "$ operator is invalid for atomic vectors" error is a bit easier to explain. The failure of the Muraro compilation causes extractFromPackage to just emit a "NOT FOUND" string (or something like that, I can't remember) when it can't find the requested object because the compilation didn't complete. Ideally it would throw a more explicit error for easier debugging, but I just work with what I get out of knitr::load_cache. Maybe we should file a PR upstream.
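
The failure mode can be reproduced in base R alone. The sketch below assumes the placeholder is the literal string "NOT FOUND" (the exact value is uncertain, as noted above), and `check_extracted()` is a hypothetical guard, not part of rebook or knitr.

```r
# Stand-in for what the lookup returns when the upstream report never finished
# compiling: a plain character string rather than the expected object.
obj <- "NOT FOUND"

# Downstream code that treats it as a list-like object fails with the
# unhelpful message seen in the OSCA.basic build.
msg <- tryCatch(obj$counts, error = function(e) conditionMessage(e))
print(msg)

# Hypothetical guard: fail early with a message that names the missing object.
# `sentinel` is an assumption about the placeholder value.
check_extracted <- function(x, name, sentinel = "NOT FOUND") {
    if (identical(x, sentinel)) {
        stop("could not extract '", name,
             "'; did the upstream report finish compiling?")
    }
    x
}

guard_msg <- tryCatch(check_extracted(obj, "sce.muraro"),
                      error = function(e) conditionMessage(e))
print(guard_msg)
```

With a guard like this, the downstream book would point at the upstream chapter instead of failing on an unrelated `$` call.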

hpages commented 2 years ago

I just nuked the AnnotationHub and rebook caches on nebbiolo1. We'll see how that goes.
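
For anyone who hits the same joinTwoTables() error on their own machine, a lighter-weight alternative to clearing the whole cache is to re-download just the affected resource. This is a sketch only: `refresh_hub_resource()` is a hypothetical wrapper, and it assumes the `force` argument of AnnotationHub's `[[` method behaves as documented (re-download even if a cached copy exists). It needs network access when called.

```r
# Hypothetical wrapper (not from the thread): re-fetch a single AnnotationHub
# resource, replacing a possibly corrupted cached copy.
refresh_hub_resource <- function(id, hub = AnnotationHub::AnnotationHub()) {
    # Report where the local cache lives before touching it.
    message("Hub cache: ", AnnotationHub::hubCache(hub))
    hub[[id, force = TRUE]]
}

# Usage (requires network access and the AnnotationHub package):
# edb <- refresh_hub_resource("AH73881")
#
# To discard the entire local cache instead, AnnotationHub provides:
# AnnotationHub::removeCache(AnnotationHub::AnnotationHub(), ask = TRUE)
```

Neither call touches the rebook cache, which was cleared separately on the build machine.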

PeteHaitch commented 2 years ago

Thanks Hervé!

PeteHaitch commented 2 years ago

Success! Now building and passing checks in BioC 3.16 as well as BioC 3.15