PATRIC3 / patric3_website

Legacy PATRIC Website (JBoss Portal Version)

importing genbank files without translations can result in features without translations #1788

Open olsonanl opened 6 years ago

olsonanl commented 6 years ago

The crash in the proteome comparison tool at this week's tutorial was caused by feature fig|1945512.3.peg.14 not having a translation. That feature was derived from the GenBank file https://www.ncbi.nlm.nih.gov/nuccore/CP019799.1, which contains no translations at all. The feature in question is this:

 gene            complement(8925..9241)
                 /locus_tag="B0D95_00035"
                 /pseudo
 CDS             complement(8925..9241)
                 /locus_tag="B0D95_00035"
                 /inference="COORDINATES: similar to AA
                 sequence:RefSeq:WP_007644411.1"
                 /note="frameshifted; Derived by automated computational
                 analysis using gene prediction method: Protein Homology."
                 /pseudo
                 /codon_start=1
                 /transl_table=11
                 /product="hypothetical protein"

Presumably the rest of the features were also called by our own gene callers and thus had translations. (Genome output is in /PATRIC@patricbrc.org/home/Reference Data/Genomes_from_GenBank/Mar2017/GCA_002007605.1/.GCA_002007605.1/GCA_002007605.1.genome)

We should probably add a pass in the annotation service that attempts a translation on any features missing one.
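
A rough sketch of what such a pass could look like, in Python. This is not the annotation service's actual code: the feature dictionary layout (`id`, `type`, `start`, `end`, `strand`, `codon_start`, `translation`) and the function names are assumptions made for illustration. It relies on the fact that translation table 11 assigns the same amino acids to codons as the standard code; alternative start codons are not forced to M here, and translation stops at the first stop codon, so a frameshifted pseudogene like the one above would likely come out truncated.

```python
from itertools import product

# Codon -> amino acid mapping in NCBI order (bases T, C, A, G).
# Table 11 shares these assignments with the standard code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(dna, codon_start=1):
    """Translate DNA up to the first stop codon; unknown codons become X."""
    dna = dna.upper()[codon_start - 1:]
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def fill_missing_translations(features, contig_seq):
    """Add a translation to every CDS feature that lacks one.

    `features` is a list of dicts using a hypothetical layout with
    1-based inclusive `start`/`end` and `strand` of "+" or "-".
    Returns the ids of the features that were filled in.
    """
    filled = []
    for feature in features:
        if feature.get("type") != "CDS" or feature.get("translation"):
            continue
        dna = contig_seq[feature["start"] - 1:feature["end"]]
        if feature["strand"] == "-":
            dna = reverse_complement(dna)
        feature["translation"] = translate(dna, feature.get("codon_start", 1))
        filled.append(feature["id"])
    return filled
```

Features that already carry a translation are left untouched, so the pass could run unconditionally after import without overwriting anything that came from the GenBank file or from the gene callers.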

olsonanl commented 5 years ago

This is fixed by the translate_untranslated_proteins pipeline step. Will resolve when it goes into production.