Closed: clarkzor closed this issue 10 months ago
This is a duplicate of https://github.com/Bioconductor/BSgenome/issues/75
BTW your seed file does not contain xenTro, so I'm not sure how you get this error about xenTro not being registered in GenomeInfoDb. Note that there's actually no assembly named xenTro at NCBI (see https://www.ncbi.nlm.nih.gov/assembly/?term=xenTro), so that's not something we would be able to register anyway.
Finally, and FWIW, note that assembly UCB_Xtro_10.0 is already registered in GenomeInfoDb:
> library(GenomeInfoDb)
> registered_NCBI_assemblies("tropicalis")
            organism      assembly       date      extra_info
1 Xenopus tropicalis UCB_Xtro_10.0 2019/11/14 strain:Nigerian
2 Xenopus tropicalis  ASM1336827v1 2020/06/23 strain:Nigerian
  assembly_accession circ_seqs
1    GCF_000004195.4        MT
2    GCA_013368275.1
> Seqinfo(genome="UCB_Xtro_10.0")
Seqinfo object with 167 sequences (1 circular) from UCB_Xtro_10.0 genome:
  seqnames seqlengths isCircular        genome
  Chr1      217471166      FALSE UCB_Xtro_10.0
  Chr2      181034961      FALSE UCB_Xtro_10.0
  Chr3      153873357      FALSE UCB_Xtro_10.0
  Chr4      153961319      FALSE UCB_Xtro_10.0
  Chr5      164033575      FALSE UCB_Xtro_10.0
  ...             ...        ...           ...
  Sca152          786      FALSE UCB_Xtro_10.0
  Sca153          755      FALSE UCB_Xtro_10.0
  Sca154          748      FALSE UCB_Xtro_10.0
  Sca155          593      FALSE UCB_Xtro_10.0
  Sca156          582      FALSE UCB_Xtro_10.0
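The two lookups above can be folded into a quick pre-flight check to run before forging. This is only a sketch: is_registered_genome() is a hypothetical helper, not part of GenomeInfoDb, and it assumes that an exact match against the assembly column of registered_NCBI_assemblies() or the genome column of registered_UCSC_genomes() is what matters (the internal lookup behind the error message may match names differently).

```r
library(GenomeInfoDb)

# Hypothetical helper (not a GenomeInfoDb function): returns TRUE if `genome`
# exactly matches a registered NCBI assembly name or a registered UCSC genome
# name. Exact, case-sensitive matching is an assumption of this sketch.
is_registered_genome <- function(genome) {
    ncbi <- registered_NCBI_assemblies()   # data.frame with an "assembly" column
    ucsc <- registered_UCSC_genomes()      # data.frame with a "genome" column
    genome %in% ncbi$assembly || genome %in% ucsc$genome
}

is_registered_genome("UCB_Xtro_10.0")  # the assembly named in the seed file
is_registered_genome("xenTro")         # the name reported in the error message
```

If the value in the seed file's genome field passes this check but the error still mentions a different name, the seed file being read is probably not the one you think it is.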
Anyway, all this becomes irrelevant if you forge the BSgenome package as suggested here: https://github.com/Bioconductor/BSgenome/issues/75#issuecomment-1912753142
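The linked comment is not reproduced here, so the following is only a guess at the kind of approach it points to: forging directly from the NCBI assembly accession with forgeBSgenomeDataPkgFromNCBI() from the BSgenomeForge package, which needs no seed file. The accession is the one listed for UCB_Xtro_10.0 above. The wrapper forge_xtrop() and the maintainer string are placeholders made up for illustration. The call is wrapped in a function so that nothing is downloaded until it is invoked.

```r
library(BSgenomeForge)

# Hypothetical wrapper: forges a BSgenome data package source tree for
# UCB_Xtro_10.0 from its NCBI accession. Requires internet access when called.
forge_xtrop <- function(maintainer, destdir = ".") {
    forgeBSgenomeDataPkgFromNCBI(
        assembly_accession = "GCF_000004195.4",  # UCB_Xtro_10.0, per the table above
        pkg_maintainer = maintainer,             # "Name <email>" string
        destdir = destdir                        # where the package directory is written
    )
}

# Example call (placeholder maintainer; not run here):
# forge_xtrop("Jane Doe <jane.doe@example.org>")
```

The result is a package source directory under destdir, which still has to be built and installed (R CMD build, then R CMD INSTALL) before it can be loaded.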
sessionInfo():
R version 4.3.0 (2023-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 23.10
Matrix products: default
BLAS: /home/hpages/R/R-4.3.0/lib/libRblas.so
LAPACK: /home/hpages/R/R-4.3.0/lib/libRlapack.so; LAPACK version 3.11.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: America/Los_Angeles
tzcode source: system (glibc)
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] GenomeInfoDb_1.38.5 IRanges_2.36.0 S4Vectors_0.40.2
[4] BiocGenerics_0.48.1
loaded via a namespace (and not attached):
[1] compiler_4.3.0 GenomeInfoDbData_1.2.11 RCurl_1.98-1.14
[4] bitops_1.0-7
Hello, I am trying to use forgeBSgenomeDataPkg("../Xenopustropicalis_Seedfile.dcf") to create a new package from a seed file, but I receive this error message:
Error in .make_Seqinfo_from_genome(genome) : "xenTro" is not a registered NCBI assembly or UCSC genome (use registered_NCBI_assemblies() or registered_UCSC_genomes() to list the NCBI or UCSC assemblies/genomes currently registered in the GenomeInfoDb package)
This is what my seed file looks like:
Package: BSgenome.Xtrop.NCBI.UCB_Xtro_10.0
Title: Full genomic sequences for Xenopus Tropicalis VIA NCBI
Description: For Multiomic Analysis
Version: 1.0.0
organism: Xenopus tropicalis
common_name: Frog
provider: NCBI
genome: UCB_Xtro_10.0
release_date: Nov. 2019
source_url: https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_000004195.4/
organism_biocview: Xenopus_tropicalis
BSgenomeObjname: Xtrop
seqs_srcdir: /Users/coron/OneDrive/Desktop/10xGenomics/
seqfile_name: UCB_Xtro_10.0_genome.fasta
Could you please add the Xenopus tropicalis genome to the GenomeInfoDb package so that I can use this function?
Also, can I run this on a single genomic FASTA file that contains all the chromosomes, or do I have to split the chromosomes into individual FASTA files?
Thank you for your help.