Closed Adrian-Zet closed 1 year ago
@Adrian-Zet You're seeing that error with ensemblRelease = 100
because of a genome build mismatch. Are you using the built-in GRCh38 / hg38 reference? I believe that is built against Ensembl 94 currently.
Here's where the transcriptome is defined: https://github.com/chapmanb/cloudbiolinux/blob/master/ggd-recipes/hg38/transcripts.yaml
Hello, I managed to solve it yesterday by running the genes missing from the given ensemblrelease using http://www.ensembl.org/Homo_sapiens/Tools/IDMapper/Results?tl=TqzBcwAxZ9mZ1BWa-8812266
It seems the correct database was version 95.
Should I update the human genome in bcbio, currently I'm using the built-in hg38/GRCh38. Am I supposed to use a different one?
Also, in the folder results/date-time/kegg-plots I have no figures. I attached the resulting HTML file for easily taking a look at the code and my current results.
Thank you, Adrian.
Hi @Adrian-Zet , no need to update the bcbio hg38 genome -- it's just important that you match the annotation versions in downstream analysis. I'll see if I can make this a little more seamless in bcbioRNASeq in a future update.
I'll take a look at the functional analysis output and get back to you. The functional analysis template runs pathview and you could be running into an issue with that package.
Hi @mjsteinbaugh ,
I have encountered the same issue. The ensembl release should be 94, in the ~/local/share/bcbio/genomes/Hsapiens/hg38/rnaseq
folder, cat version.txt
returned the following information:
Created from: /n/data1/cores/bcbio/rory_kirchner/genome_setups/hg38-build94/rnaseq/hg38.gtf
library(bcbioRNASeq);
bcb <- bcbioRNASeq(uploadDir="/data1/m6a/pmid33961823/gse167075/results/rnaseq/gse167075_rnaseq/final",
interestingGroups=c("phenotype"),level="gene",
organism="Homo sapiens",
genomeBuild = "GRCh38",
ensemblRelease=94L);
flat <- AcidBase::coerceToList(bcb);
saveData(bcb, flat, dir="data")
But when doing functional annotation for the differentially expressed genes, the following error message popped up. I have tried to change ensemblRelease
to 90, 92, 94, 95, 96, 97, 98, 99, 100, 101
, and neither one worked. Can you check for me, please? Thank you in advance!
R version 4.2.3 (2023-03-15) -- "Shortstop Beagle"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> rmarkdown::render('METTL3_KD_vs_Control_FuncAnno_BP.Rmd')
processing file: METTL3_KD_vs_Control_FuncAnno_BP.Rmd
|.... | 9% (header)
processing file: ./_header.Rmd
|........ | 19% (deseqdataset) Quitting from lines 119-126 (METTL3_KD_vs_Control_FuncAnno_BP.Rmd)
Error in `stop()`:
! Assert failure.
[3] isSubset(c("geneId", "entrezId"), colnames(rowData(dds))) is not TRUE.
Cause: `c("geneId", "entrezId")` has elements not in `colnames(rowData(dds))`: entrezId
Backtrace:
x
1. \-goalie::assert(...)
2. \-AcidCLI (local) stop(...)
3. +-base::do.call(what = rlang::abort, args = args)
4. \-rlang (local) `<fn>`(message = "Assert failure.\n[3] isSubset(c(\"geneId\", \"entrezId\"), colnames(rowData(dds))) is not TRUE.\nCause: `c(\"geneId\", \"entrezId\")` has elements not in `colnames(rowData(dds))`: entrezId")
Execution halted
I also tried mouse data, and it popped up the same error message.
library(bcbioRNASeq);
bcb <- bcbioRNASeq(uploadDir="/data1/rnaseq/results/MD5VDAD/final",
interestingGroups=c("phenotype"),
level="gene",
organism="Mus musculus",
genomeBuild="GRCm38",
ensemblRelease=94);
flat <- AcidBase::coerceToList(bcb);
saveData(bcb, flat, dir="data")
The error message reads:
R version 4.2.3 (2023-03-15) -- "Shortstop Beagle"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> rmarkdown::render('C_vs_M_FuncAnno_BP.Rmd')
processing file: C_vs_M_FuncAnno_BP.Rmd
|.... | 9% (header)
processing file: ./_header.Rmd
|........ | 19% (deseqdataset) Quitting from lines 119-126 (C_vs_M_FuncAnno_BP.Rmd)
Error in `stop()`:
! Assert failure.
[3] isSubset(c("geneId", "entrezId"), colnames(rowData(dds))) is not TRUE.
Cause: `c("geneId", "entrezId")` has elements not in `colnames(rowData(dds))`: entrezId
Backtrace:
x
1. \-goalie::assert(...)
2. \-AcidCLI (local) stop(...)
3. +-base::do.call(what = rlang::abort, args = args)
4. \-rlang (local) `<fn>`(message = "Assert failure.\n[3] isSubset(c(\"geneId\", \"entrezId\"), colnames(rowData(dds))) is not TRUE.\nCause: `c(\"geneId\", \"entrezId\")` has elements not in `colnames(rowData(dds))`: entrezId")
And here is the output of sessionInfo()
:
R version 4.2.3 (2023-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.10
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] AcidGSEA_0.8.8 DESeqAnalysis_0.6.8 basejump_0.16.5 goalie_0.6.9
[5] pathview_1.38.0 enrichplot_1.18.4 clusterProfiler_4.7.1.003 DOSE_3.24.2
[9] DESeq2_1.38.3 withr_2.5.0 stringi_1.7.12 ggnewscale_0.4.8
[13] R.utils_2.12.2 R.oo_1.25.0 R.methodsS3_1.8.2 ensembldb_2.22.0
[17] AnnotationFilter_1.22.0 GenomicFeatures_1.50.4 AnnotationDbi_1.60.2 bcbioRNASeq_0.5.3
[21] SummarizedExperiment_1.28.0 Biobase_2.58.0 GenomicRanges_1.50.2 GenomeInfoDb_1.34.9
[25] IRanges_2.32.0 S4Vectors_0.36.2 BiocGenerics_0.44.0 MatrixGenerics_1.10.0
[29] matrixStats_0.63.0
loaded via a namespace (and not attached):
[1] utf8_1.2.3 tidyselect_1.2.0 RSQLite_2.3.1 htmlwidgets_1.6.2
[5] grid_4.2.3 BiocParallel_1.32.6 scatterpie_0.1.9 AcidPlots_0.5.5
[9] munsell_0.5.0 codetools_0.2-19 DT_0.27 AcidGenerics_0.6.7
[13] colorspace_2.1-0 GOSemSim_2.24.0 pipette_0.10.9 filelock_1.0.2
[17] knitr_1.42 rstudioapi_0.14 SingleCellExperiment_1.20.1 KEGGgraph_1.58.3
[21] tximport_1.26.1 GenomeInfoDbData_1.2.9 polyclip_1.10-4 bit64_4.0.5
[25] farver_2.1.1 downloader_0.4 treeio_1.22.0 vctrs_0.6.2
[29] generics_0.1.3 gson_0.1.0 xfun_0.39 BiocFileCache_2.6.1
[33] R6_2.5.1 graphlayouts_0.8.4 locfit_1.5-9.7 bitops_1.0-7
[37] cachem_1.0.7 fgsea_1.24.0 gridGraphics_0.5-1 DelayedArray_0.24.0
[41] promises_1.2.0.1 BiocIO_1.8.0 scales_1.2.1 vroom_1.6.3
[45] ggraph_2.1.0 gtable_0.3.3 tidygraph_1.2.3 AcidExperiment_0.4.7
[49] rlang_1.1.1 AcidMarkdown_0.2.5 splines_4.2.3 rtracklayer_1.58.0
[53] lazyeval_0.2.2 BiocManager_1.30.20 yaml_2.3.7 reshape2_1.4.4
[57] httpuv_1.6.9 qvalue_2.30.0 syntactic_0.6.6 tools_4.2.3
[61] ggplotify_0.1.0 ggplot2_3.4.2 ellipsis_0.3.2 RColorBrewer_1.1-3
[65] MultiAssayExperiment_1.24.0 Rcpp_1.0.10 plyr_1.8.8 progress_1.2.2
[69] zlibbioc_1.44.0 purrr_1.0.1 RCurl_1.98-1.12 prettyunits_1.1.1
[73] viridis_0.6.2 cowplot_1.1.1 ggrepel_0.9.3 magrittr_2.0.3
[77] data.table_1.14.8 ProtGenerics_1.30.0 AcidCLI_0.2.7 hms_1.1.3
[81] patchwork_1.1.2 mime_0.12 AcidBase_0.6.15 evaluate_0.20
[85] xtable_1.8-4 HDO.db_0.99.1 XML_3.99-0.14 gridExtra_2.3
[89] compiler_4.2.3 biomaRt_2.54.1 tibble_3.2.1 shadowtext_0.1.2
[93] crayon_1.5.2 htmltools_0.5.5 ggfun_0.0.9 later_1.3.0
[97] tzdb_0.3.0 tidyr_1.3.0 geneplotter_1.76.0 aplot_0.1.10
[101] DBI_1.1.3 tweenr_2.0.2 dbplyr_2.3.2 MASS_7.3-59
[105] rappdirs_0.3.3 Matrix_1.5-4 readr_2.1.4 cli_3.6.1
[109] parallel_4.2.3 AcidSingleCell_0.3.5 igraph_1.4.2 pkgconfig_2.0.3
[113] GenomicAlignments_1.34.1 xml2_1.3.4 ggtree_3.6.2 annotate_1.76.0
[117] AcidGenomes_0.5.0 XVector_0.38.0 yulab.utils_0.0.6 stringr_1.5.0
[121] digest_0.6.31 graph_1.76.0 Biostrings_2.66.0 rmarkdown_2.21
[125] AcidPlyr_0.3.10 fastmatch_1.1-3 tidytree_0.4.2 edgeR_3.40.2
[129] restfulr_0.0.15 curl_5.0.0 shiny_1.7.4 Rsamtools_2.14.0
[133] rjson_0.2.21 lifecycle_1.0.3 nlme_3.1-162 jsonlite_1.8.4
[137] viridisLite_0.4.1 limma_3.54.2 fansi_1.0.4 pillar_1.9.0
[141] lattice_0.21-8 KEGGREST_1.38.0 fastmap_1.1.1 httr_1.4.5
[145] GO.db_3.16.0 interactiveDisplayBase_1.36.0 glue_1.6.2 png_0.1-8
[149] Rgraphviz_2.42.0 BiocVersion_3.16.0 bit_4.0.5 ggforce_0.4.1
[153] bcbioBase_0.8.1 blob_1.2.4 org.Hs.eg.db_3.16.0 AnnotationHub_3.6.0
[157] memoise_2.0.1 dplyr_1.1.2 ape_5.7-1
Hi I'm working on an update to bcbioRNASeq that will fix this issue. Sorry for the breaking change!
@Adrian-Zet @jxshi Can you let me know if this draft updated functional analysis template works for you? I'm working on pushing a bcbioRNASeq package update soon.
Best, Mike
OK this should be resolved in the bcbioRNASeq 0.5.4 update, which is processing. If you continue to have issues with the functional analysis template, please reopen!
Best, Mike
I use it on both human and mouse RNA-seq data, the new template works perfectly. Thank you so much! Have a nice day!
Best, Jianxiang
Great to hear!! Thanks for the follow up.
Dear bcbioRNASeq team,
I come back with an error gathered from trying to run the other template of the bcBioRNASeq package. The functional analysis one. This is the error:
This is how I defined the object:
The Differential Expression Template ran flawlessly and generated the output deseq.rds correctly.
I also tried creating the object using :
But this approach gave me the following error:
I will also add project YAML (changed to .txt for uploading here) for helping out with the debugging. I actually don't know how to find out what version of ensmblRelease was used by bcbio.
project-summary (2).txt