veg / hyphy

HyPhy: Hypothesis testing using Phylogenies
http://www.hyphy.org

meme output "Most common codon substitutions at this site" including Ns #1686

Closed (axeljen closed 2 months ago)

axeljen commented 5 months ago

Hi,

I have a question about the output of HyPhy MEME. In the output summary table, under 'Most common codon substitutions at this site', 'N' appears among the listed substitutions. For example, this is what I get in this column for one of my analyzed sites: [2]cgA>cgN|[1]cGa>cAa,CGa>NNa,Gga>Cga

As I understand it, it should be fine to keep Ns in the alignment, so I was a little confused to see this in the output. Could you confirm that the Ns indeed don't contribute to the detection of episodic selection?
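For anyone who wants to screen this column for ambiguous codons, here is a minimal Python sketch that splits the string quoted above and flags entries containing a non-ACGT character. It assumes, based only on this one example, that groups are separated by `|`, that each group starts with a bracketed count, and that the count applies to every comma-separated substitution in the group. The function names `parse_substitutions` and `is_ambiguous` are made up for illustration and are not part of HyPhy.

```python
import re


def parse_substitutions(field):
    """Split a summary string into (count, source, target) tuples.

    Assumed layout (inferred from one example, not from HyPhy docs):
    "[count]src>dst,src>dst|[count]src>dst"
    """
    records = []
    for group in field.split("|"):
        match = re.fullmatch(r"\[(\d+)\](.+)", group.strip())
        if match is None:
            raise ValueError(f"unrecognized group: {group!r}")
        count = int(match.group(1))
        for pair in match.group(2).split(","):
            source, target = pair.split(">")
            records.append((count, source, target))
    return records


def is_ambiguous(codon):
    """True if the codon contains anything other than A, C, G or T."""
    return any(base not in "ACGT" for base in codon.upper())


field = "[2]cgA>cgN|[1]cGa>cAa,CGa>NNa,Gga>Cga"
records = parse_substitutions(field)

# Keep only substitutions where either codon has an ambiguous base.
ambiguous = [r for r in records if is_ambiguous(r[1]) or is_ambiguous(r[2])]

for count, source, target in ambiguous:
    print(count, source, target)
```

On the example string, this flags cgA>cgN and CGa>NNa and leaves the two fully resolved substitutions alone.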

Thanks and sorry if I'm missing something obvious!

Best, Axel

spond commented 5 months ago

Dear @axeljen,

Yes, it's totally fine to keep N (or another ambiguous character) in the alignment. But unless you have a full "unknown" codon (NNN), HyPhy will handle such characters as partially missing data.

For example, CGa>NNa (CGA>AAA, CGA>ACA and so on) can be viewed as a 16-way ambiguity (4 x 4 possible resolutions of the two N positions) that contributes some (but not that much) information. Generally, most of the signal will come from resolved bases.
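To make the "16-way" count concrete, here is a small Python sketch that expands an ambiguous codon into every fully resolved codon it could stand for, using the standard IUPAC nucleotide codes. It only illustrates the counting; it is not HyPhy code and does not show how the likelihood is actually computed. The helper name `resolutions` is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def resolutions(codon):
    """List every unambiguous codon compatible with a possibly ambiguous one."""
    choices = [IUPAC[base] for base in codon.upper()]
    return ["".join(combo) for combo in product(*choices)]


for codon in ("cgN", "NNa", "NNN"):
    options = resolutions(codon)
    print(codon, len(options), options[:2])
```

Under this expansion, cgN has 4 resolutions, NNa has 16 (starting with AAA and ACA), and NNN has all 64 codons, which is why a full NNN carries no information about the site.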

HTH, Sergei
