Bioconductor / GenomeInfoDb

Utilities for manipulating chromosome names, including modifying them to follow a particular naming style
https://bioconductor.org/packages/GenomeInfoDb

Changing seqlevelsStyle of BSgenome fails because of multiple genomes #14

Closed csoneson closed 4 years ago

csoneson commented 4 years ago

Hi,

I think this is related to https://github.com/Bioconductor/GenomeInfoDb/issues/12 and https://stat.ethz.ch/pipermail/bioc-devel/2020-July/016966.html (seqlevelsStyle now being able to rename contigs and scaffolds). Trying to convert the seqlevelsStyle of the UCSC hg19 BSgenome (same for hg38) fails:

```
library(BSgenome.Hsapiens.UCSC.hg19)
seqlevelsStyle(Hsapiens) <- "NCBI"
```

gives

```
Error in .normarg_genome(value, seqnames(x)) :
  when 'genome' vector is named and contains more than one distinct
  value, it cannot have duplicated names
```

For completeness, the reason I noticed this is that it causes the SGSeq package (on which one of my packages depends) to fail during the vignette building (http://bioconductor.org/checkResults/3.12/bioc-LATEST/SGSeq/merida1-buildsrc.html), and I guess I'm trying to figure out where it should be fixed 😃

Thanks!

Session info

```
> BiocManager::version()
[1] ‘3.12’
> BiocManager::valid()
[1] TRUE

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils
[7] datasets  methods   base

other attached packages:
 [1] BSgenome.Hsapiens.UCSC.hg19_1.4.3 BSgenome_1.57.4
 [3] rtracklayer_1.49.3                Biostrings_2.57.2
 [5] XVector_0.29.3                    GenomicRanges_1.41.5
 [7] GenomeInfoDb_1.25.8               IRanges_2.23.10
 [9] S4Vectors_0.27.12                 BiocGenerics_0.35.4

loaded via a namespace (and not attached):
 [1] rstudioapi_0.11             knitr_1.29
 [3] zlibbioc_1.35.0             GenomicAlignments_1.25.3
 [5] BiocParallel_1.23.2         lattice_0.20-41
 [7] tools_4.0.2                 grid_4.0.2
 [9] SummarizedExperiment_1.19.6 Biobase_2.49.0
[11] xfun_0.15                   matrixStats_0.56.0
[13] crayon_1.3.4                Matrix_1.2-18
[15] GenomeInfoDbData_1.2.3      bitops_1.0-6
[17] RCurl_1.98-1.2              DelayedArray_0.15.7
[19] compiler_4.0.2              Rsamtools_2.5.3
[21] XML_3.99-0.4
```

hpages commented 4 years ago

Thanks Charlotte for the report. This should be fixed in BSgenome 1.57.5 (see https://github.com/Bioconductor/BSgenome/commit/781c61d35049149e224eccb5f50cf80804a50ebd).

Best!

csoneson commented 4 years ago

Thanks Hervé! I can confirm that the code above works with BSgenome 1.57.5 👍
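
For anyone hitting the same error, here is a minimal sketch of how one might check that the installed BSgenome is at least the fixed version (1.57.5, per the comment above) before retrying the conversion. The helper `needs_bsgenome_update()` and the variable `fixed_version` are hypothetical names introduced for illustration only; they are not part of BSgenome or GenomeInfoDb. The conversion itself is only attempted if both packages are installed.

```r
# Version in which the fix landed, according to the maintainer's comment.
fixed_version <- "1.57.5"

# Hypothetical helper (not part of any package): TRUE if the given
# BSgenome version string predates the fixed version.
needs_bsgenome_update <- function(installed, fixed = fixed_version) {
  package_version(installed) < package_version(fixed)
}

# Only touch the Bioconductor packages if they are actually installed.
if (requireNamespace("BSgenome", quietly = TRUE) &&
    requireNamespace("BSgenome.Hsapiens.UCSC.hg19", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("BSgenome"))
  if (needs_bsgenome_update(installed)) {
    message("BSgenome ", installed, " predates ", fixed_version,
            "; consider updating, e.g. BiocManager::install(\"BSgenome\")")
  } else {
    # Same two lines as in the original report.
    library(BSgenome.Hsapiens.UCSC.hg19)
    seqlevelsStyle(Hsapiens) <- "NCBI"
    print(head(seqlevels(Hsapiens)))
  }
}
```

Comparing with `package_version()` rather than plain strings avoids lexical mistakes such as treating "1.57.10" as older than "1.57.5".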