tseemann / prokka

:zap: :aquarius: Rapid prokaryotic genome annotation

no names for CDS #497

Open valery-shap opened 4 years ago

valery-shap commented 4 years ago

Hello,

I'm using an annotation program for the first time, so I'm sorry if my question is obvious. I have a contig file containing a plasmid and a .gb file that I pass as the reference (--proteins). Not every CDS feature in this reference has a corresponding gene feature. When I check the features in SnapGene (by importing the .gff3 file), I see names only for the features that have a gene feature in the reference. All other CDS are shown only with their locus tag (the default). I'd like the CDS to be labelled with their product names instead. Am I doing something wrong? I assume nobody replaces locus tags with product names by hand. I tried annotating without the reference file and got the same result. I also tried --rawproduct. I've attached 2 log files and a screenshot from SnapGene. I also have an .err file, but GitHub doesn't support that file type.

Many thanks, Valery

test.log 17_555_r.log with_proteins_wo_rawproduct

tseemann commented 4 years ago

Would this option help?

--addgenes         Add 'gene' features for each 'CDS' feature (default OFF)
valery-shap commented 4 years ago

with_add_gene

With --addgenes it seems I just get a second, duplicate feature with the same locus tag for each CDS. In the reference, only some genes have a gene feature, and those are the ones shown with names in the picture. The problem is with features that appear in the reference only as CDS. I'd like to show product names instead of locus tags for them.

valery-shap commented 4 years ago

prokka --prefix test --genus Escherichia --species coli --kingdom Bacteria --gcode 11 --usegenus contig.fasta

This is the command I used without the reference; the image is the same.

valery-shap commented 4 years ago

prokka --proteins reference.gb --rawproduct --prefix test contig.fasta

And this is the command I used with the reference.

valery-shap commented 4 years ago

In SnapGene I import the .gff3 file. The reference is https://www.ncbi.nlm.nih.gov/nuccore/KX032520

A CDS feature in the reference looks like this:

393..518
/codon_start=1
/transl_table=11
/product="hypothetical protein"
/protein_id="ANC48651.1"
/translation="MIDDVCFSLRVIIIGGQIYGKEPSSLLVLLSYPQFLNIRTH"

.gff3

NODE_3 Prodigal:002006 CDS 1159 3906 . + 0 ID=JIOBDGPA_00003;Name=virB4;gene=virB4;inference=ab initio prediction:Prodigal:002006,similar to AA sequence:UniProtKB:Q9RPY1;locus_tag=JIOBDGPA_00003;product=Type IV secretion system protein virB4
NODE_3 Prodigal:002006 CDS 3918 4634 . + 0 ID=JIOBDGPA_00004;inference=ab initio prediction:Prodigal:002006;locus_tag=JIOBDGPA_00004;product=hypothetical protein

So some CDS have no Name attribute, and SnapGene labels those positions with the locus tag.
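
One possible workaround, sketched below, is to post-process the .gff3 file and copy the product value into a Name attribute for every CDS that lacks one. This is not a Prokka option; the function names `add_name_from_product` and `rewrite_gff` are hypothetical, and the sketch assumes that the file is tab-separated with nine columns per feature line and that the viewer labels features by Name, as the screenshots suggest.

```python
def add_name_from_product(gff_line: str) -> str:
    """Return the GFF3 line with Name=<product> appended if a CDS has no Name.

    Hypothetical helper: comment lines, non-CDS features, and anything that is
    not a 9-column feature line (e.g. the FASTA section) pass through unchanged.
    """
    body = gff_line.rstrip("\n")
    newline = gff_line[len(body):]  # keep the original line ending, if any

    if not body or body.startswith("#"):
        return gff_line

    cols = body.split("\t")
    if len(cols) != 9 or cols[2] != "CDS":
        return gff_line

    attrs = [a for a in cols[8].split(";") if a]
    keys = {a.split("=", 1)[0] for a in attrs if "=" in a}
    if "Name" in keys:
        return gff_line  # already named, e.g. Name=virB4

    product = next(
        (a.split("=", 1)[1] for a in attrs if a.startswith("product=")), None
    )
    if product is None:
        return gff_line

    attrs.append("Name=" + product)
    cols[8] = ";".join(attrs)
    return "\t".join(cols) + newline


def rewrite_gff(src_path: str, dst_path: str) -> None:
    """Copy src_path to dst_path, adding Name attributes where missing."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(add_name_from_product(line))
```

Calling `rewrite_gff("test.gff", "test.named.gff")` (file names assumed) and importing the new file would label unnamed CDS with their product. Note that every unannotated CDS would then read "hypothetical protein", which may or may not be preferable to a locus tag.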