chasewnelson / SNPGenie

Program for estimating πN/πS, dN/dS, and other diversity measures from next-generation sequencing data
GNU General Public License v3.0

mid-sequence STOP codon in the sequence #41

Open jsan4christ opened 3 years ago

jsan4christ commented 3 years ago

Hi Chase,

Thanks for the useful tool.

I see this warning:

WARNING: Please be aware that there is a mid-sequence STOP codon in the sequence GU280_gp01|13483.

Please check your annotations for: (1) incorrect frame; or (2) incorrect starting or ending coordinates.

A premature STOP codon may also indicate a pseudogene, for which piN vs. piS analysis may not be appropriate.

Please advise on how to deal with this.

singing-scientist commented 3 years ago

Greetings! Thanks for using SNPGenie. This warning means that the GTF file you've provided includes coordinates for a gene, named GU280_gp01|13483, that has a STOP codon (TAA, TAG, or TGA) in the middle of its sequence (i.e. before the last codon). This could indicate an annotation error (i.e. the GTF or the reference sequence, or both, are wrong). If it's NOT an error, then typically a natural selection analysis would not be appropriate for the 3'-proximal part of the gene that follows the premature STOP codon — in principle that portion of the gene wouldn't be translated and therefore would not be subject to selection. Let me know if that helps!
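One way to check whether the warning reflects an annotation problem is to extract the annotated coding sequence from the reference and look for in-frame STOP codons yourself. The snippet below is a minimal sketch, not part of SNPGenie: the function names (`find_internal_stops`, `reverse_complement`) and the toy sequence are hypothetical, and it assumes a single contiguous CDS segment with 1-based, inclusive coordinates as used in GTF files.

```python
# Hypothetical helper, not SNPGenie code: report in-frame STOP codons
# that occur before the last codon of an annotated CDS.

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_internal_stops(reference, start, end, strand="+"):
    """Return 1-based codon indices of mid-sequence STOP codons.

    `start` and `end` are 1-based, inclusive coordinates (GTF convention).
    The last codon is skipped, since a STOP there is expected.
    """
    cds = reference[start - 1:end].upper()
    if strand == "-":
        cds = reverse_complement(cds)
    n_codons = len(cds) // 3
    hits = []
    for i in range(n_codons - 1):  # exclude the final codon
        codon = cds[3 * i:3 * i + 3]
        if codon in STOP_CODONS:
            hits.append(i + 1)
    return hits


# Toy reference: 2 flanking bases, then ATG AAA TAG GGG TAA, then 2 more.
toy_reference = "CC" + "ATGAAATAGGGGTAA" + "CC"
print(find_internal_stops(toy_reference, 3, 17))  # [3]
```

If a check like this reports STOP codons at many positions, an incorrect frame or wrong start coordinate is a likely cause. A single hit suggests either a wrong end coordinate or a genuine premature STOP.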