Bioconductor / BSgenome

Software infrastructure for efficient representation of full genomes and their SNPs
https://bioconductor.org/packages/BSgenome

Error occurred when forging the Axolotl genome with BSgenome #32

Closed xiangyupan closed 2 years ago

xiangyupan commented 2 years ago

Hi Hervé @hpages,

I want to build a BSgenome package for the Axolotl genome for single-cell ATAC-seq analysis. I carefully read your comments in https://github.com/Bioconductor/BSgenome/issues/28, then followed your steps to split the huge genome into one FASTA file per sequence and created a seed file matching yours. But when I ran the command, the following error occurred:

> forgeBSgenomeDataPkg("/home/wangyu/pxy/Axolotl/BSgenome/BSgenome.Amexicanum.NCBI.ambMex60DD-seed")
Creating package in ./BSgenome.Amexicanum.NCBI.ambMex60DD
Error in h(simpleError(msg, call)) :
  error in evaluating the argument 'x' in selecting a method for function 'seqlengths': "AmbMex60DD" is not a registered NCBI assembly or UCSC genome (use registered_NCBI_assemblies() or registered_UCSC_genomes() to list the NCBI or UCSC assemblies/genomes currently registered in the GenomeInfoDb package)

My seed file is pasted below:

Package: BSgenome.Amexicanum.NCBI.ambMex60DD
Title: Full genome sequences for Ambystoma mexicanum (AmbMex60DD)
Description: Full genome sequences for Ambystoma mexicanum (Axolotl) as provided by NCBI (assembly AmbMex60DD, assembly accession GCA_002915635.3) and stored in Biostrings objects.
Version: 1.0.0
organism: Ambystoma mexicanum
common_name: Axolotl
genome: AmbMex60DD
provider: NCBI
release_date: 2021/04/01
source_url: https://www.ncbi.nlm.nih.gov/assembly/GCA_002915635.3
organism_biocview: Ambystoma_mexicanum
BSgenomeObjname: Amexicanum
seqnames: paste0("chr", rep(1:14, each=2), c("p", "q"))
circ_seqs: character(0)
SrcDataFiles: GCA_002915635.3_AmbMex60DD_genomic.fna.gz from https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/002/915/635/GCA_002915635.3_AmbMex60DD/
PkgExamples: genome$chr1p  # same as genome[["chr1p"]]
seqs_srcdir: /home/wangyu/pxy/Axolotl/sc-ATAC/fasta
seqfiles_suffix: .fna
ondisk_seq_format: rds

The FASTA folder (/home/wangyu/pxy/Axolotl/sc-ATAC/fasta) contains 28 .fna files, one per axolotl chromosome arm. I have browsed other similar issues posted on GitHub, but I still have no idea what is wrong. Looking forward to your reply and help.

All the best,
Pan
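A quick way to rule out a file-naming problem before forging is to expand the `seqnames:` expression from the seed file and confirm that each name has a matching file in `seqs_srcdir`. The sketch below uses base R only. `missing_seqfiles()` is a hypothetical helper written for illustration, not part of BSgenome, and it assumes the seed file sets no `seqfiles_prefix`, so each file is named `<seqname>.fna`.

```r
# Sequence names exactly as given in the seed file's `seqnames:` field.
# rep(1:14, each = 2) yields 1, 1, 2, 2, ..., and c("p", "q") is recycled,
# so the result is chr1p, chr1q, chr2p, chr2q, ..., chr14q (28 names).
seqnames <- paste0("chr", rep(1:14, each = 2), c("p", "q"))

# Hypothetical helper (not a BSgenome function): returns the names of the
# expected FASTA files that are absent from `srcdir`.
missing_seqfiles <- function(srcdir, seqnames, suffix = ".fna") {
  expected <- file.path(srcdir, paste0(seqnames, suffix))
  basename(expected[!file.exists(expected)])
}

# Directory taken from the `seqs_srcdir:` field of the seed file.
seqs_srcdir <- "/home/wangyu/pxy/Axolotl/sc-ATAC/fasta"

missing <- missing_seqfiles(seqs_srcdir, seqnames)
if (length(missing) > 0) {
  message("Missing sequence files: ", paste(missing, collapse = ", "))
} else {
  message("All ", length(seqnames), " sequence files are present.")
}
```

If this reports no missing files, the sequence files are unlikely to be the cause, which fits the error above: it is raised while looking up the assembly name, not while reading the FASTA files.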

hpages commented 2 years ago

Are you using the current version of Bioconductor i.e. BioC 3.15? The ambMex60DD assembly was registered in the GenomeInfoDb package in March 2022 so is only supported starting with BioC 3.15.

Run BiocManager::version() to see what version you are using. BioC 3.15 requires R 4.2.
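The checks described above can be sketched in a few lines of R. This is a minimal sketch, assuming the BiocManager and GenomeInfoDb packages are installed (both calls are guarded so the script still runs if they are not). It also assumes that the data frame returned by `registered_NCBI_assemblies()` has an `assembly` column, which is worth confirming against your installed version.

```r
# Minimum Bioconductor release that registers the axolotl assembly,
# as stated in the reply above.
required_bioc <- package_version("3.15")

message("R version: ", R.version.string)

if (requireNamespace("BiocManager", quietly = TRUE)) {
  bioc_version <- BiocManager::version()
  message("Bioconductor version: ", as.character(bioc_version))
  if (bioc_version < required_bioc) {
    message("Bioconductor is older than ", as.character(required_bioc),
            "; the assembly will not be registered.")
  }
} else {
  message("BiocManager is not installed.")
}

if (requireNamespace("GenomeInfoDb", quietly = TRUE)) {
  assemblies <- GenomeInfoDb::registered_NCBI_assemblies()
  # Assumption: the assembly names are stored in a column called `assembly`.
  # The match is case-insensitive because the name appears both as
  # "ambMex60DD" and "AmbMex60DD" in this thread.
  hits <- assemblies[grepl("ambMex60DD", assemblies$assembly,
                           ignore.case = TRUE), ]
  print(hits)
} else {
  message("GenomeInfoDb is not installed.")
}
```

An empty `hits` data frame would indicate that the installed GenomeInfoDb predates the registration of the assembly.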

H.

xiangyupan commented 2 years ago

Hi H., thanks for your reply. The axolotl genome was forged successfully under R 4.2. Running R CMD check myspecies.tar.gz reports an error (LaTeX errors when creating PDF version), but R CMD INSTALL myspecies.tar.gz completes without problems. I hope my experience helps others who run into the same issue. I will close this issue now.
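For readers who hit the same LaTeX error: it comes from the step of `R CMD check` that builds the PDF reference manual, and `R CMD check` accepts a `--no-manual` flag that skips that step. The sketch below runs the check from within R. It assumes the tarball is in the current working directory under the placeholder name `myspecies.tar.gz` used above, and it does nothing if the file is absent.

```r
# Placeholder tarball name from the comment above; replace with the real one.
pkg_tarball <- "myspecies.tar.gz"

# --no-manual tells R CMD check to skip building the PDF manual,
# which is the step that needs a working LaTeX installation.
check_args <- c("CMD", "check", "--no-manual", pkg_tarball)

if (file.exists(pkg_tarball)) {
  # Use the R executable belonging to the running R installation.
  status <- system2(file.path(R.home("bin"), "R"), check_args)
  message("R CMD check exit status: ", status)
} else {
  message("Tarball not found: ", pkg_tarball)
}
```

Skipping the manual only hides the LaTeX problem. Installing a LaTeX distribution would be needed to build the PDF itself.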

Pan