veg / hyphy

HyPhy: Hypothesis testing using Phylogenies
http://www.hyphy.org

Understanding RELAX site proportions #1645

Closed NatJWalker-Hale closed 10 months ago

NatJWalker-Hale commented 10 months ago

Hi guys,

I'm looking at some RELAX fits from hyphy v2.5.53 where all the genes appear to have the same p0, p1, p2 between reference and test partitions under the RELAX alternative model, with significant results only varying in w0, w1, w2. I had a couple of questions about this.

  1. Is this legitimate? It's making me worry that p0, p1, p2 have been constrained to be equal, but this is not the case from the method description as far as I can tell.
  2. If legitimate, can we infer anything about the properties of the alignment leading to that result? In other words, does it imply anything particular about the site patterns between test and reference groups, compared to a hypothetical result where p vary and w are constant, or both p and w vary?

Thanks a lot,

Nat

spond commented 10 months ago

Dear @NatJWalker-Hale,

  1. You are correct that the p are constrained to be the same between Test and Reference. This is as intended in the original paper (see Fig. 1, also attached here). A different model (Partitioned Exploratory) does not impose this constraint.

  2. In "live" data without constraints, you would never (well, almost never) expect the estimates to coincide. Here they are forced to be the same by design.
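
To make the constraint concrete, here is a minimal sketch (not HyPhy code). It assumes the parameterization from the original RELAX paper, in which the test ω values are the reference ω values raised to a power k while the proportions are shared between the two branch sets. The function name `relax_test_distribution` and the numbers are hypothetical, chosen only for illustration.

```python
def relax_test_distribution(reference, k):
    """Derive the test-branch distribution from the reference one.

    reference: list of (omega, proportion) pairs for the reference branches.
    k: relaxation/intensification exponent (assumed non-negative).
    Proportions are copied unchanged; only the omega values are transformed.
    """
    if k < 0:
        raise ValueError("k is assumed to be non-negative")
    return [(omega ** k, proportion) for omega, proportion in reference]


# Hypothetical reference estimates: (omega, proportion) for three rate classes.
reference = [(0.1, 0.7), (1.0, 0.25), (5.0, 0.05)]
test = relax_test_distribution(reference, k=0.5)

for (w_ref, p_ref), (w_test, p_test) in zip(reference, test):
    print(f"p={p_ref:.2f}  omega_ref={w_ref:.3f}  omega_test={w_test:.3f}")
```

With k < 1 the test ω values move toward 1 and with k > 1 they move away from it, but in both cases the proportions printed for Test and Reference are identical, which matches the pattern described in the question.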

Best, Sergei

[image: Fig. 1 of the original paper]

NatJWalker-Hale commented 10 months ago

Dear @spond,

Ah, thanks so much for clarifying! I had misread the caption of Fig. 5 in the paper and interpreted those as results from the RELAX alternative. When presenting results in a publication, is it advisable to present results from the partitioned descriptive fit (alongside p-values from the comparison of RELAX null and alternative), as you do in Fig. 5 of Wertheim et al.?

Thanks again,

Nat

spond commented 10 months ago

Dear @NatJWalker-Hale,

I would view the partitioned descriptive fit almost like a "normality check" for t-tests or ANOVA. If the partitioned descriptive model has a much better fit to the data than the RELAX alternative model (measured by a sufficiently large Δ AIC-c, say 10), then you might infer that the distributional assumptions (same proportions) of the RELAX tests may not be the most appropriate for the data at hand, so the RELAX test should be interpreted with caution.
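
The check described above could be sketched as follows. This is only an illustration: the function names and the AIC-c values are hypothetical, and the threshold of 10 is the rule of thumb suggested above rather than a fixed cutoff. The two AIC-c values would be taken from the RELAX output for the alternative and partitioned descriptive fits.

```python
def delta_aicc(aicc_relax_alternative, aicc_partitioned_descriptive):
    """AIC-c improvement of the partitioned descriptive fit over the
    RELAX alternative fit (lower AIC-c means a better fit, so a positive
    value favors the partitioned descriptive model)."""
    return aicc_relax_alternative - aicc_partitioned_descriptive


def shared_proportions_questionable(aicc_relax_alternative,
                                    aicc_partitioned_descriptive,
                                    threshold=10.0):
    """Return True when the partitioned descriptive model fits so much
    better that the shared-proportions assumption of the RELAX test
    deserves caution."""
    delta = delta_aicc(aicc_relax_alternative, aicc_partitioned_descriptive)
    return delta >= threshold


# Hypothetical AIC-c values for one gene.
aicc_alternative = 10250.0
aicc_partitioned = 10235.0

if shared_proportions_questionable(aicc_alternative, aicc_partitioned):
    print("Interpret the RELAX test with caution for this gene.")
else:
    print("No strong evidence against the shared-proportions assumption.")
```

Running this per gene gives a simple flag to report alongside the RELAX p-value.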

Best, Sergei

NatJWalker-Hale commented 10 months ago

Okay, understood, many thanks for clarifying!