Closed balathumma10 closed 1 year ago
Here is my seed file:
Package: BSgenome.Egrandis.JGI.v2.0
Title: Full genome sequence for Eucalyptus grandis
Description: Full genome sequence of Eucalyptus grandis v2.0 provided by JGI
Version: 2.0
organism: Eucalyptus grandis
common_name: rose gum
provider: phytozome
genome: Eucalyptus grandis v2.0
release_date: 2014/12
release_name: JGI
source_url: https://phytozome-next.jgi.doe.gov/info/Egrandis_v2_0
BSgenomeObjname: Egrandis
seqnames: 1:11
seqfiles_prefix: Egv2.0_chr
seqfiles_suffix: .fasta
seqs_srcdir: /media/bala/Data21/crispr/seqs_srcdir
I have split the genome FASTA file into individual chromosome- and scaffold-level FASTA files. I still get this error:
Error in .make_Seqinfo_from_genome(genome) :
  "Eucalyptus grandis v2.0" is not a registered NCBI assembly or UCSC genome (use registered_NCBI_assemblies() or registered_UCSC_genomes() to list the NCBI or UCSC assemblies/genomes currently registered in the GenomeInfoDb package)
In addition: Warning message:
In forgeBSgenomeDataPkg(y, seqs_srcdir = seqs_srcdir, destdir = destdir, :
  field 'release_name' is deprecated
Hi @balathumma10,
Unfortunately we cannot register assemblies from JGI in the GenomeInfoDb package at the moment. Anyways, registration is not required for forging a BSgenome data package. For assemblies not registered in GenomeInfoDb, you need to do one of the following:
1. List the sequence names in the seqnames: field (like you did).
2. Convert the FASTA file (e.g. Egrandis_297_v2.0.fa.gz) to the 2-bit format and use the seqfile_name: field to specify the name of the .2bit file. When using a .2bit file, 1. is no longer needed. Note that using a .2bit file will also result in a BSgenome data package that is slightly smaller and more performant. Converting from FASTA to .2bit is generally easy. See issue #26 for some guidance.

Finally, whether you choose 1. or 2., you also need to specify which sequences are circular in the circ_seqs: field. Note that this is mandatory, even if there are no circular chromosomes (in which case circ_seqs should be set to character(0)).
Hope this helps, H.
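For anyone landing here later, option 2 from the reply above could look roughly like the sketch below. It assumes the Biostrings and rtracklayer packages are installed. The helper name fa_to_2bit and the output file name are made up for illustration, and trimming the FASTA headers to their first word is my own assumption about what makes convenient sequence names, not something the thread prescribes.

```r
# Sketch only: convert a (possibly gzipped) FASTA file to 2-bit format.
library(Biostrings)   # readDNAStringSet()
library(rtracklayer)  # export.2bit()

# Hypothetical helper, not part of any package.
fa_to_2bit <- function(fasta_file, twobit_file) {
  # Load every sequence (chromosomes and scaffolds) into a DNAStringSet.
  dna <- readDNAStringSet(fasta_file)
  # Assumption: keep only the first word of each FASTA header as the name.
  names(dna) <- sub("\\s.*$", "", names(dna))
  # Write all sequences to a single .2bit file.
  export.2bit(dna, twobit_file)
  invisible(twobit_file)
}

# Example call (the output name is hypothetical):
# fa_to_2bit("Egrandis_297_v2.0.fa.gz", "Egrandis_297_v2.0.2bit")
```

The seed file would then point at the result with seqfile_name: Egrandis_297_v2.0.2bit and declare circ_seqs: character(0) if the assembly has no circular sequences. Note that the 2-bit format only stores A, C, G, T and N, so a FASTA file containing other IUPAC ambiguity codes may need cleaning first.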
Thank you. I used a version of the genome available from NCBI and converted it to the .2bit format. I was able to generate the BSgenome data package using the forgeBSgenome package.
Sounds good. Just to be clear, there's no "forgeBSgenome package". I guess you meant you used the forgeBSgenomeDataPkg() function from the BSgenome package.
Yes, my bad. I meant the forgeBSgenomeDataPkg function of the BSgenome package.
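A minimal sketch of the forging step, with a small pre-check for the circ_seqs: field that caused the errors in this thread. The wrapper name forge_checked, the seed file name in the example call, and the short list of fields it checks are all my own illustrative choices, not the full set of fields the forging code validates. It calls forgeBSgenomeDataPkg() from BSgenome as discussed above. I believe newer Bioconductor releases provide this function in the separate BSgenomeForge package, so adjust the namespace if needed.

```r
# Sketch only: check a seed file for a few fields, then forge the package.
# forge_checked() is a hypothetical wrapper, not part of BSgenome.
forge_checked <- function(seed_file, destdir = ".") {
  # Seed files use the DCF format, so base R can parse them.
  seed <- read.dcf(seed_file)
  # Illustrative subset of fields; circ_seqs is the one missing in this thread.
  wanted <- c("Package", "BSgenomeObjname", "circ_seqs")
  missing_fields <- setdiff(wanted, colnames(seed))
  if (length(missing_fields) > 0L) {
    stop("seed file is missing field(s): ",
         paste(missing_fields, collapse = ", "))
  }
  # Creates the package source tree under destdir.
  BSgenome::forgeBSgenomeDataPkg(seed_file, destdir = destdir)
}

# Example call (the seed file name is hypothetical):
# forge_checked("BSgenome.Egrandis.JGI.v2.0-seed")
```

The resulting source directory still has to be built and installed from a shell with R CMD build and R CMD INSTALL before the genome can be loaded with library().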
Hi, I am trying to forge a BSgenome data package for the Eucalyptus grandis genome, which is provided by JGI. When I run forgeBSgenomeDataPkg with the seed file I prepared, I get the following error. Could you please help me solve this? Thanks.
Error in .make_Seqinfo_from_genome(genome) :
  "v2.0" is not a registered NCBI assembly or UCSC genome (use registered_NCBI_assemblies() or registered_UCSC_genomes() to list the NCBI or UCSC assemblies/genomes currently registered in the GenomeInfoDb package)
Error in .make_Seqinfo_from_genome(genome) :
  "Egrandis_297_v2.0" is not a registered NCBI assembly or UCSC genome (use registered_NCBI_assemblies() or registered_UCSC_genomes() to list the NCBI or UCSC assemblies/genomes currently registered in the GenomeInfoDb package)
In addition: Warning messages:
1: In forgeBSgenomeDataPkg(y, seqs_srcdir = seqs_srcdir, destdir = destdir, :
  field 'provider_version' is deprecated in favor of 'genome'
2: In forgeBSgenomeDataPkg(y, seqs_srcdir = seqs_srcdir, destdir = destdir, :
  field 'release_name' is deprecated