Closed by G-Thomson 3 years ago
Hi @G-Thomson, proActiv is now hosted on Bioconductor (https://www.bioconductor.org/packages/release/bioc/html/proActiv.html). Can you try installing the Bioconductor version? If you still encounter this error, please post it to the Bioconductor support forum (https://support.bioconductor.org/) and tag proActiv. Ideally, include the output of sessionInfo() and the input data you used so that we can reproduce the error and address it. Thanks!
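For reference, here is a minimal sketch of the two steps suggested above: installing the Bioconductor release and capturing session details for a forum post. BiocManager::install() is the standard Bioconductor installer. The helper names install_proactiv() and collect_report() are hypothetical and used only for illustration.

```r
# Hypothetical helper: install proActiv from Bioconductor.
# It is not called automatically because it needs network access;
# run install_proactiv() yourself in an R session.
install_proactiv <- function() {
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install("proActiv")
}

# Hypothetical helper: capture the printed sessionInfo() as character lines
# so it can be pasted into a support post.
collect_report <- function() {
  utils::capture.output(print(utils::sessionInfo()))
}

report <- collect_report()
cat(report, sep = "\n")
```

After running install_proactiv(), loading the package with library(proActiv) before calling collect_report() makes the proActiv version appear in the report.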
Hi @G-Thomson, thanks for raising this issue! I tried creating the annotation object for Arabidopsis and it seems to work for me. The GFF I used can be found at: ftp://ftp.ensemblgenomes.org/pub/plants/release-48/gff3/arabidopsis_thaliana
> file <- "Arabidopsis_thaliana.TAIR10.48.gff3.gz"
> show(names(GenomeInfoDb::genomeStyles()))
[1] "Arabidopsis_thaliana" "Caenorhabditis_elegans" "Canis_familiaris" "Cyanidioschyzon_merolae"
[5] "Drosophila_melanogaster" "Homo_sapiens" "Mus_musculus" "Oryza_sativa"
[9] "Populus_trichocarpa" "Rattus_norvegicus" "Saccharomyces_cerevisiae" "Zea_mays"
> species <- names(GenomeInfoDb::genomeStyles())[1]
> annotation <- preparePromoterAnnotation(file = file, species = species)
Parsing input file...
Import genomic features from the file as a GRanges object ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Extract exons by transcripts...
Identify overlapping first exons for each gene...
Prepare mapping between transcripts, tss, promoters and genes...
Prepare annotated intron ranges...
Annotating reduced exon ranges...
Prepare promoter coordinates and first exon ranges...
Session Info:
sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19041)
Matrix products: default
locale:
[1] LC_COLLATE=English_Singapore.1252 LC_CTYPE=English_Singapore.1252 LC_MONETARY=English_Singapore.1252
[4] LC_NUMERIC=C LC_TIME=English_Singapore.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] proActiv_0.99.2 testthat_2.3.2
loaded via a namespace (and not attached):
[1] colorspace_1.4-1 ellipsis_0.3.1 rprojroot_1.3-2
[4] biovizBase_1.36.0 htmlTable_2.0.1 XVector_0.28.0
[7] GenomicRanges_1.40.0 base64enc_0.1-3 fs_1.4.2
[10] dichromat_2.0-0 rstudioapi_0.11 remotes_2.1.1
[13] bit64_0.9-7 AnnotationDbi_1.50.1 fansi_0.4.1
[16] splines_4.0.2 knitr_1.29 geneplotter_1.66.0
[19] pkgload_1.1.0 Formula_1.2-3 Rsamtools_2.4.0
[22] annotate_1.66.0 cluster_2.1.0 dbplyr_1.4.4
[25] png_0.1-7 compiler_4.0.2 httr_1.4.1
[28] backports_1.1.7 lazyeval_0.2.2 assertthat_0.2.1
[31] Matrix_1.2-18 cli_2.0.2 htmltools_0.5.0
[34] acepack_1.4.1 prettyunits_1.1.1 tools_4.0.2
[37] gtable_0.3.0 glue_1.4.1 GenomeInfoDbData_1.2.3
[40] dplyr_1.0.1 rappdirs_0.3.1 Rcpp_1.0.5
[43] Biobase_2.48.0 vctrs_0.3.2 Biostrings_2.56.0
[46] rtracklayer_1.48.0 xfun_0.15 stringr_1.4.0
[49] ps_1.3.3 lifecycle_0.2.0 ensembldb_2.12.1
[52] devtools_2.3.0 XML_3.99-0.4 zlibbioc_1.34.0
[55] scales_1.1.1 BSgenome_1.56.0 VariantAnnotation_1.34.0
[58] ProtGenerics_1.20.0 hms_0.5.3 parallel_4.0.2
[61] SummarizedExperiment_1.18.2 AnnotationFilter_1.12.0 RColorBrewer_1.1-2
[64] curl_4.3 memoise_1.1.0 gridExtra_2.3
[67] ggplot2_3.3.2 biomaRt_2.44.1 rpart_4.1-15
[70] latticeExtra_0.6-29 stringi_1.4.6 RSQLite_2.2.0
[73] genefilter_1.70.0 S4Vectors_0.26.1 desc_1.2.0
[76] checkmate_2.0.0 GenomicFeatures_1.40.1 BiocGenerics_0.34.0
[79] pkgbuild_1.1.0 BiocParallel_1.22.0 GenomeInfoDb_1.24.2
[82] rlang_0.4.7 pkgconfig_2.0.3 matrixStats_0.56.0
[85] bitops_1.0-6 lattice_0.20-41 purrr_0.3.4
[88] GenomicAlignments_1.24.0 htmlwidgets_1.5.1 bit_1.1-15.2
[91] processx_3.4.3 tidyselect_1.1.0 magrittr_1.5
[94] DESeq2_1.28.1 R6_2.4.1 IRanges_2.22.2
[97] generics_0.0.2 Hmisc_4.4-0 DelayedArray_0.14.1
[100] DBI_1.1.0 pillar_1.4.6 foreign_0.8-80
[103] withr_2.2.0 survival_3.1-12 RCurl_1.98-1.2
[106] nnet_7.3-14 tibble_3.0.3 crayon_1.3.4
[109] BiocFileCache_1.12.0 jpeg_0.1-8.1 progress_1.2.2
[112] usethis_1.6.1 locfit_1.5-9.4 grid_4.0.2
[115] data.table_1.13.0 blob_1.2.1 callr_3.4.3
[118] digest_0.6.25 xtable_1.8-4 openssl_1.4.2
[121] stats4_4.0.2 munsell_0.5.0 Gviz_1.32.0
[124] sessioninfo_1.1.1 askpass_1.1
Let me know if this works for you!
I would like to use this package to study some data generated from Arabidopsis. However, when I run preparePromoterAnnotation() I get:
Error in extractSeqlevels(species, style) : The style specified by 'UCSC' does not have a compatible entry for the species Arabidopsis_thaliana
Is this because the getTranscriptRanges() function (and other functions?) is trying to force GenomeInfoDb functions to use the UCSC naming scheme, which does not include Arabidopsis? Is there a downstream reason UCSC is used, or could NCBI or Ensembl conventions be used?
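One way to check which naming styles GenomeInfoDb knows for a species is to look at the columns of GenomeInfoDb::genomeStyles(species). The sketch below assumes that table has the logical columns circular, auto and sex followed by one column per naming style. The helpers available_styles() and has_style() are hypothetical names, and the toy table holds made-up values for illustration only.

```r
# Hypothetical helper: list the naming styles offered by a style table.
# `styles` is assumed to be shaped like GenomeInfoDb::genomeStyles("<species>"):
# columns circular/auto/sex, then one column per naming style.
available_styles <- function(styles) {
  setdiff(colnames(styles), c("circular", "auto", "sex"))
}

# Hypothetical helper: is a given style (e.g. "UCSC") present in the table?
has_style <- function(styles, style) {
  style %in% available_styles(styles)
}

# Toy table with made-up values, used only to show the helpers at work.
toy <- data.frame(
  circular = c(FALSE, TRUE),
  auto     = c(TRUE, FALSE),
  sex      = c(FALSE, FALSE),
  NCBI     = c("1", "MT"),
  Ensembl  = c("1", "Mt"),
  stringsAsFactors = FALSE
)
print(has_style(toy, "UCSC"))  # FALSE for this toy table
print(has_style(toy, "NCBI"))  # TRUE for this toy table

# Real lookup, run only if GenomeInfoDb is installed.
if (requireNamespace("GenomeInfoDb", quietly = TRUE)) {
  styles <- GenomeInfoDb::genomeStyles("Arabidopsis_thaliana")
  print(available_styles(styles))
  print(has_style(styles, "UCSC"))
}
```

If has_style(styles, "UCSC") returns FALSE for the real table, that would be consistent with the extractSeqlevels() error quoted above, although the result may differ between GenomeInfoDb versions.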