Open pailloufat-stack opened 3 years ago
Hi @pailloufat-stack,
This error could come from how emapper2gbk uses your GFF file. In default mode, emapper2gbk searches for genes in the GFF using the region, then extracts the CDS associated with these genes, as in this example (where emapper2gbk will extract cds_1 using the Parent=gene_1 relationship):
##gff file
region_1 RefSeq region 1 12642 . + . ID=region_1
region_1 RefSeq gene 1 2445 . - . ID=gene_1
region_1 RefSeq CDS 1 2445 . - 0 ID=cds_1;Parent=gene_1
But your GFF file contains only CDS features, so emapper2gbk fails to extract them because there is no gene feature.
There is an option, -gt cds_only, for when you have only CDS in your GFF.
Can you add it to your command and see if it solves this issue?
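To see which case a GFF file falls into, the gene/CDS relationship described above can be checked with a short script. This is only a minimal sketch using the Python standard library, not how emapper2gbk itself reads the file; the function name summarize_gff and the example list are hypothetical, and the example lines are the ones shown above.

```python
from collections import Counter


def summarize_gff(lines):
    """Count feature types and list CDS IDs whose Parent is not a gene in the file."""
    types = Counter()
    gene_ids = set()
    cds_parents = []  # (CDS ID, Parent ID or None)
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature line
        # Column 9 holds key=value pairs separated by ';'
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        types[cols[2]] += 1
        if cols[2] == "gene" and "ID" in attrs:
            gene_ids.add(attrs["ID"])
        elif cols[2] == "CDS":
            cds_parents.append((attrs.get("ID"), attrs.get("Parent")))
    orphans = [cid for cid, parent in cds_parents if parent not in gene_ids]
    return types, orphans


# Example lines from this thread, written with explicit tab separators.
example = [
    "##gff file",
    "\t".join(["region_1", "RefSeq", "region", "1", "12642", ".", "+", ".", "ID=region_1"]),
    "\t".join(["region_1", "RefSeq", "gene", "1", "2445", ".", "-", ".", "ID=gene_1"]),
    "\t".join(["region_1", "RefSeq", "CDS", "1", "2445", ".", "-", "0", "ID=cds_1;Parent=gene_1"]),
]

types, orphans = summarize_gff(example)
print(dict(types))  # feature types found in the file
print(orphans)      # CDS without a matching gene feature
```

If the counts show CDS but no gene features, or every CDS ends up in the orphan list, the file is probably in the CDS-only situation that -gt cds_only is meant for.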
Hi @ArnaudBelcour ,
Thanks for the answer. Unfortunately, I ran the command with -gt cds_only but I got the same GenBank file.
Hi @pailloufat-stack,
So it seems that there is another issue. I tested the example lines of your GFF with gffutils, and the package is not able to parse your input file correctly. This is strange because GFF files from Prodigal can normally be read with this package (I tested it recently).
Have you modified the GFF file created by Prodigal? One possible issue is that the tab-separated columns are not written correctly, which would prevent gffutils from parsing the input GFF and its columns.
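One quick way to test the tab hypothesis is to count the columns on each feature line. The following is a minimal sketch under the assumption that every feature line should have exactly 9 tab-separated columns, as in the GFF3 format; find_malformed_lines is a hypothetical helper and is independent of gffutils.

```python
def find_malformed_lines(lines):
    """Return (line_number, n_columns) for feature lines without 9 tab-separated columns."""
    problems = []
    for number, line in enumerate(lines, start=1):
        stripped = line.rstrip("\r\n")
        if stripped.startswith("##FASTA"):
            break  # an embedded sequence section follows, no more features
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines, comments and directives
        n_columns = len(stripped.split("\t"))
        if n_columns != 9:
            problems.append((number, n_columns))
    return problems


good = "\t".join(["region_1", "RefSeq", "CDS", "1", "2445", ".", "-", "0", "ID=cds_1"])
bad = good.replace("\t", " ")  # same fields, but separated by spaces

print(find_malformed_lines([good, bad]))  # reports only the space-separated line
```

To run it on a real file, pass the open file handle, for example find_malformed_lines(open(path)), where path is the location of your GFF. An empty result means the column layout is fine and the parsing problem lies elsewhere.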
Description
Hello,
I did an annotation with eggnog-mapper on my new PacBio assembly. The annotation and the output files look OK. What I want to do is convert my GFF annotation file into a GenBank file, which is why I am using emapper2gbk.
I have a single-chromosome assembly (>CP019962.1_RagTag), my gene predictions from Prodigal, my annotation.tsv file, and my GFF file (I only show the beginning of the files).
I get a GenBank file, but without the CDS annotations.
What I Did
I ran:
emapper2gbk genomes --fastanucleic EggMapper_Annot_Microbial_Assembly_v2_RagTag_Scaffolded.emapper.fna --fastaprot EggMapper_Annot_Microbial_Assembly_v2_RagTag_Scaffolded.emapper.genepred.faa --out test.gbk --gff EggMapper_Annot_Microbial_Assembly_v2_RagTag_Scaffolded.emapper.gff --annotation EggMapper_Annot_Microbial_Assembly_v2_RagTag_Scaffolded.emapper.annotation.tsv -n "Firmicutes"
Do you have any idea how to get a correct GenBank file?
Thanks