veg / hyphy

HyPhy: Hypothesis testing using Phylogenies
http://www.hyphy.org

RELAX - intensified selection on pseudogene branches? #582

Closed MareikeJaniak closed 7 years ago

MareikeJaniak commented 7 years ago

Good afternoon,

I am using RELAX to look at a gene across different primate species. The gene is pseudogenized in some primate species and functional in others. Similar to the bat opsin gene example that was used in the RELAX publication (Wertheim et al. 2015), the first test I ran used the pseudogenes as test branches and the functional genes as reference branches. I was expecting the pseudogene branches to be under relaxed selection, but RELAX found intensified selection (k = 1.88, p = 1.63798582475394e-8, LR = 31.88).
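
For reference, the reported p-value is consistent with comparing the likelihood ratio statistic against a chi-squared distribution with one degree of freedom, k being the single extra parameter of the alternative model. That choice of null distribution is an assumption made here for illustration, and the function name is hypothetical. A minimal sketch:

```python
import math


def lrt_p_value_1df(lr_statistic: float) -> float:
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom.

    Assumes the likelihood ratio statistic is asymptotically chi-squared(1)
    under the null hypothesis (k = 1).
    """
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(lr_statistic / 2.0))


# LR value taken from the post above.
p = lrt_p_value_1df(31.88)
print(f"p = {p:.3g}")
```

The result is on the order of 1.6e-8, in line with the value reported by RELAX.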

This seems counterintuitive! It also looks like the omega distributions are quite different under the alternative model vs. the partitioned exploratory model, with the partitioned exploratory model looking more like the result that I expected (pseudogene omega values pushed toward neutrality).

How do I interpret these results? http://test.datamonkey.org/relax/597a5a818c1240fb1a89b0c2#

Thanks!

spond commented 7 years ago

Dear @MareikeJaniak,

It would seem that the signal of intensification comes from the fact that your pseudogene branches appear to have a subset of sites with dN/dS higher than in the functional genes. For RELAX to infer relaxation, all selection classes (omega values) have to move closer to 1. Generally, even a few sites under stronger selection on the test branches will "pull" the estimate of k high. In other words, relaxation of selection can happen in many ways, and what RELAX is testing for is one particular way for it to occur.
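
To make that constraint concrete: in the RELAX model (Wertheim et al. 2015), each omega class on the test branches is the corresponding reference omega raised to the power k. The sketch below uses made-up reference omegas (they are not taken from this dataset) and a hypothetical helper name to show that a single k moves every class in the same direction relative to 1.

```python
import math


def apply_relax_k(reference_omegas, k):
    """Return test-branch omegas under the RELAX constraint omega_test = omega_ref ** k."""
    return [w ** k for w in reference_omegas]


def distance_from_neutrality(omega):
    """Distance of omega from 1 on the log scale."""
    return abs(math.log(omega))


# Hypothetical reference classes: purifying, weakly purifying, diversifying.
reference_omegas = [0.05, 0.6, 3.0]

intensified = apply_relax_k(reference_omegas, 1.88)  # k > 1, as reported in this thread
relaxed = apply_relax_k(reference_omegas, 0.5)       # k < 1, an illustrative value

for w, w_int, w_rel in zip(reference_omegas, intensified, relaxed):
    print(f"ref={w:.3f}  k=1.88 -> {w_int:.3f}  k=0.5 -> {w_rel:.3f}")
```

With k < 1 every class shrinks toward 1, low and high alike. A handful of test-branch sites with omega above the reference maximum cannot be fit that way, so the estimate of k can end up above 1 even when most sites look relaxed.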

There's another odd thing too: if you look at the "Tree" plot with the General Descriptive model, you will notice that the three lineages at the bottom of the plot (outgroups?) have both very long branches and very strong selective regimes; this extreme level of heterogeneity within the "background" set of branches is also something that could bias RELAX.

Best, Sergei

MareikeJaniak commented 7 years ago

Dear Sergei,

Thank you so much for your quick and helpful response!

The three extremely long branches are definitely peculiar (Tupaia is meant to be the outgroup), but based on the gene I am looking at it is actually not surprising to see evidence of strong selective regimes in these lineages. It does seem like they could be obscuring differences in the rest of the dataset, though.

Thanks again for shedding some light on this for me!

Best, Mareike