EDIT: I noticed this morning in the closed issues that this has already been asked, and all I need to do is put seqnames into the seed file. Now I have another error, haha, but I think this one is just a syntax error on my part.
Hello,
From reading Biostars, GitHub, and other sources, I now understand that the assembly needs to be registered. I have also read that it is best to convert the FASTA to 2bit.
2bit is not an option for me because the genome I am working with is too big. I need to find a way around 2bit, and I hope that you can help me register the genome. All of the resources do appear to be available. For instance, I get data when I run getChromInfoFromNCBI("GCA_002915635.3"), so I am hopeful that with a little more help I can make this work.
If it is possible to forge assemblies that are not found in registered_NCBI_assemblies(), then perhaps I could have someone look at my seed?
The seed is:
Package: BSgenome.Amexicanum.NCBI.ambMex60DD
Title: Ambystoma Mexicanum (Axolotl) full genome (Schloissnig version V6.0-DD)
Description: A chromosome-scale assembly of the axolotl genome as provided by Schloissnig (v6.0-DD, April. 2021)
Version: 1.0.0
organism: Ambystoma mexicanum
common_name: axolotl
genome: GCA_002915635.3
provider: Schloissnig
release_date: April, 2021
source_url: https://www.ncbi.nlm.nih.gov/assembly/GCA_002915635.3
organism_biocview: Ambystoma_mexicanum
seqs_srcdir: $SCRATCH/GCA_002915635.3
The error is:
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'x' in selecting a method for function 'seqlevels': "GCA_002915635.3" is not a registered NCBI assembly or UCSC genome (use registered_NCBI_assemblies() or registered_UCSC_genomes() to list the NCBI or UCSC assemblies/genomes currently registered in the GenomeInfoDb package)
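For anyone who lands here with the same error: as noted in the edit at the top, the fix was to add a seqnames field to the seed file. A minimal sketch of what that line might look like is below. The field takes an R expression that returns a character vector of sequence names. The two names shown are placeholders I made up for illustration, not the real sequence names of this assembly, and I am assuming they would need to match the sequence files found in seqs_srcdir.

```
seqnames: c("PLACEHOLDER_SEQ_1", "PLACEHOLDER_SEQ_2")
```

My understanding is that with seqnames given explicitly, the forge step no longer has to look up the sequence names from the genome field, which is where the registered-assembly lookup failed.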
Any help would be great, thank you.
Edit to add: Can I use multiple 2bit files to forge a BSgenome?