veg / hyphy

HyPhy: Hypothesis testing using Phylogenies
http://www.hyphy.org

MEME Output Interpretation #1691

Closed oscarwallnoefer closed 4 months ago

oscarwallnoefer commented 4 months ago

Hi,

We are trying to understand which sites and branches are under selection in our phylogenetic tree. However, we are failing to interpret the output from MEME. How can we interpret the "0" in #branches? Why can't we set the EBF threshold on DataMonkey when we load our .log output file?

Thank you for your time,

Oscar

spond commented 4 months ago

Dear @oscarwallnoefer,

Not sure I understand your question. Please load the .json output files (which you can get from HyPhy or Datamonkey) into https://observablehq.com/@spond/meme for the most recent visualization options.
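For anyone who wants to inspect the .json file directly before loading it into the notebook, here is a minimal Python sketch. It rests on assumptions that are not stated in this thread: that the per-site table is stored under `MLE` → `content` → `"0"`, that the column names are listed under `MLE` → `headers`, and that the columns of interest are named `p-value` and `# branches under selection`. The `toy` dictionary is made-up data that only mimics that assumed layout, and `significant_sites` is a hypothetical helper, not part of HyPhy. Check the keys in your own file first.

```python
import json


def significant_sites(results, p_threshold=0.1, partition="0"):
    """Return (site, p-value, #branches) for sites with p-value <= p_threshold.

    ASSUMPTION: `results` follows the layout described in the lead-in
    (MLE -> headers / content); this is not confirmed in the thread.
    """
    # Each header entry is assumed to be a [name, description] pair.
    names = [header[0] for header in results["MLE"]["headers"]]
    p_col = names.index("p-value")
    n_col = names.index("# branches under selection")

    hits = []
    rows = results["MLE"]["content"][partition]
    for site, row in enumerate(rows, start=1):  # sites numbered from 1
        if row[p_col] <= p_threshold:
            hits.append((site, row[p_col], int(row[n_col])))
    return hits


# Made-up data in the assumed layout (NOT real MEME output).
toy = {
    "MLE": {
        "headers": [
            ["p-value", "site-level p-value"],
            ["# branches under selection", "branches with high EBF"],
        ],
        "content": {"0": [[0.01, 0], [0.60, 0], [0.04, 1]]},
    }
}

hits = significant_sites(toy)
for site, p_value, n_branches in hits:
    print(f"site {site}: p = {p_value}, # branches = {n_branches}")

# With a real file (the file name here is hypothetical):
# with open("alignment.MEME.json") as handle:
#     hits = significant_sites(json.load(handle))
```

In the toy data, site 1 is significant even though its branch count is 0, which is the situation asked about above.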

If you are asking why a particular site has a significant p-value without any individual branch showing up as having a high EBF, that's quite common and simply reflects "diffuse" support, i.e. no individual branch is highly influential, but taken together, they are. Alternatively, for sites where nearly all the ω weight is allocated to ω > 1, EBFs are not meaningful.
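To make the last point concrete, here is a small Python sketch. It assumes the usual definition of an empirical Bayes factor as posterior odds divided by prior odds that a branch belongs to the ω > 1 class. That definition, the function name, and all the numbers are illustrative assumptions, not values taken from a MEME run.

```python
def empirical_bayes_factor(prior, posterior):
    """Posterior odds over prior odds for the omega > 1 class.

    ASSUMPTION: `prior` is the site-level weight on omega > 1 and
    `posterior` is the branch-level posterior probability of that class.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    if not 0.0 <= posterior < 1.0:
        raise ValueError("posterior must be in [0, 1)")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = posterior / (1.0 - posterior)
    return posterior_odds / prior_odds


# One influential branch: small prior weight, high posterior.
single_branch = empirical_bayes_factor(prior=0.05, posterior=0.90)

# Diffuse support: each branch moves the odds only modestly.
diffuse = empirical_bayes_factor(prior=0.20, posterior=0.50)

# Nearly all weight already on omega > 1: the prior odds are so large
# that even a posterior of 0.99 barely changes them.
saturated = empirical_bayes_factor(prior=0.98, posterior=0.99)

print(single_branch, diffuse, saturated)
```

Under these made-up inputs the first case gives an EBF in the hundreds, the diffuse case stays in the single digits, and the saturated case stays close to 1 no matter how strongly the branch supports ω > 1. That is why a site can have a small p-value while no branch crosses an EBF threshold.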

Remember that EBF is only an exploratory tool: the statistically "principled" output is the p-value. EBF simply gives you a way to explore where the support for the p-value comes from, but it's not very precise.

Here's an example

image

Codon 50 (diffuse support)

image

Codon 96 (single 3x substitution is the source of signal)

image

Best, Sergei

oscarwallnoefer commented 4 months ago

Dear @spond, Thank you for the quick and excellent response, it was exactly what we were trying to understand!