veg / hyphy

HyPhy: Hypothesis testing using Phylogenies
http://www.hyphy.org

Relative Rates methodology #840

Closed: nmathers closed this issue 5 years ago

nmathers commented 6 years ago

Hi,

I'm interested in comparing evolutionary rates between two individual phylogenies. Each phylogenies represents a gene family; the two families share a common ancestor but are too divergent to align in a single data set for phylogenetic analysis.

Is it possible to do such a relative rate comparison using the tools and models built into HyPhy? (Relative rates or ratios)

Thanks for any suggestions.

spond commented 6 years ago

Dear @nmathers,

The traditional relative ratio test from 1994 can be used, sure, assuming the phylogenies are the same for both genes. You can find the corresponding analysis under "Relative Ratio" in the standard analyses menu:

     /HYPHY 2.3.12.20180730beta(MP) for Darwin on x86_64\     
***************** TYPES OF STANDARD ANALYSES *****************

    (1) Selection Analyses
    (2) Evolutionary Hypothesis Testing
    (3) Relative evolutionary rate inference
    (4) Coevolutionary analysis
    (5) Basic Analyses
    (6) Codon Selection Analyses
    (7) Compartmentalization
    (8) Data File Tools
    (9) Miscellaneous
    (10) Model Comparison
    (11) Kernel Analysis Tools
    (12) Molecular Clock
    (13) Phylogeny Reconstruction
    (14) Positive Selection
    (15) Recombination
    (16) Selection/Recombination
    (17) Relative Rate
===>    (18) Relative Ratio
    (19) Substitution Rates

 Please select type of analyses you want to list (or press ENTER to process custom batch file):

Best, Sergei

nmathers commented 6 years ago

What do you mean by "assuming the phylogenies are the same for both genes"?

The two phylogenies contain identical taxa, but the number of paralogs within each taxon differs between the two gene trees. I've run into problems executing the data files. Perhaps my data do not meet the assumptions of the relative ratio test?

Thanks!

spond commented 6 years ago

Dear @nmathers,

The relative ratio test assumes you have two identical topologies (typically from two genes) and tests the following pair of hypotheses:

H_alternative: branch lengths of the two trees are independent (no correlation in lineage-specific rates)

H_0: branch lengths are proportional between the two topologies (uniform acceleration or slowdown in one gene relative to the other)

Hope this helps. What hypothesis are you trying to test?

Best, Sergei
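
As a rough illustration of the pair of hypotheses above, the following Python sketch shows the proportionality constraint and the parameter counting behind the likelihood ratio test. It is not HyPhy code: the function names, the branch labels, and the log-likelihood values are hypothetical, and the degrees of freedom assume that all other model parameters are handled the same way under both hypotheses.

```python
def proportional_lengths(lengths_gene1, ratio):
    """H_0: every gene-2 branch length equals the matching gene-1 length
    multiplied by one shared ratio."""
    return {branch: ratio * t for branch, t in lengths_gene1.items()}


def relative_ratio_lrt(lnl_null, lnl_alt, n_branches):
    """Return the LRT statistic and its degrees of freedom.

    H_alternative has 2 * B free branch lengths (B per tree).
    H_0 has B branch lengths plus 1 ratio.
    The difference is B - 1 (an assumption of this sketch: no other
    parameters differ between the two fits).
    """
    statistic = 2.0 * (lnl_alt - lnl_null)
    df = n_branches - 1
    return statistic, df


# Hypothetical branch lengths for gene 1 and a hypothetical ratio of 2.
gene1 = {"a": 0.1, "b": 0.2}
gene2_under_null = proportional_lengths(gene1, 2.0)

# Hypothetical log-likelihoods from fitting both hypotheses to 5 branches.
statistic, df = relative_ratio_lrt(lnl_null=-1010.0, lnl_alt=-1000.0, n_branches=5)
```

Because the constraint pairs each branch of one tree with the same branch of the other, it is only defined when the two topologies match, which is the assumption discussed in this thread.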

nmathers commented 6 years ago

I am interested in testing whether the rate of one gene tree is increased relative to the other gene tree. So I am attempting to test what you have described above.

However, the topologies of the two trees differ because of lineage-specific gene duplications that are present in only one gene or the other.

Perhaps there is a simple global rate test for this scenario that I’m not aware of?

Thanks for the help.

spond commented 6 years ago

Dear @nmathers,

If the two topologies are mostly shared, then you can run the test on the shared branches, and propose a sensible set of rules on how to handle rates on "orphan" branches. One possibility is that you don't impose any constraints on branches that are only present in one of the two trees.

Best, Sergei
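
One way to picture the split into shared and "orphan" branches described above is the minimal Python sketch below. It makes several assumptions that are not from this thread: trees are written as nested tuples of leaf labels, each branch is identified by the set of leaves below it (a rooted clade), and paralogs carry distinct labels such as the hypothetical `mouse_a` and `mouse_b`. It does not show how to mark branches in HyPhy input files.

```python
def clades(tree):
    """Return the clades (frozensets of leaf labels) of a nested-tuple tree.

    Each clade stands for the branch leading to it; the root is excluded
    because no branch leads to it.
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    root = walk(tree)
    found.discard(root)
    return found


def split_branches(tree1, tree2):
    """Return (shared, only_in_tree1, only_in_tree2) as sets of clades."""
    c1, c2 = clades(tree1), clades(tree2)
    return c1 & c2, c1 - c2, c2 - c1


# Hypothetical gene trees: gene 2 has a mouse-specific duplication.
gene1_tree = (("human", "chimp"), ("mouse", "rat"))
gene2_tree = (("human", "chimp"), (("mouse_a", "mouse_b"), "rat"))

shared, orphan1, orphan2 = split_branches(gene1_tree, gene2_tree)
```

Under the rule suggested above, a proportionality constraint would apply only to branches in `shared`, while branches in `orphan1` and `orphan2` would be left unconstrained. For unrooted trees, bipartitions would be a more natural branch identifier than rooted clades.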

nmathers commented 5 years ago

How do you designate "orphan" branches in the HyPhy input files?

Do you know of any test or program to compare the rate of evolution in two phylogenies with different topologies? I have two phylogenies of two divergent genes with an ancient shared common ancestor. So, the topologies vary substantially but I am interested in a rough comparison of their global rates.

Thanks, Nick

spond commented 5 years ago

Dear @nmathers,

There are a few things you could do.

First, because you have coding data, you could run RELAX analyses on your trees, to compare selective regimes between the two divergent genes. We did something like that in https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5391457/ (see Table 3 and surrounding text).

Second, you could devise a test that compares total tree lengths (I assume you don't have time-resolved phylogenies); in other words, you could check whether the total (or per-branch) length of gene tree 1 differs from the total (or per-branch) length of gene tree 2. This test can be run on nucleotide data.

Best, Sergei
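
The quantities in the second suggestion above (total and per-branch tree length) can be computed from a Newick string with a sketch like the one below. The Newick strings and function names are hypothetical, and the regular expression assumes plain numeric branch lengths with no quoted labels containing colons. The sketch only produces the summary values; it does not assess whether the difference between two trees is significant.

```python
import re

# Matches the number that follows ":" in a Newick string, e.g. ":0.05" or ":1e-3".
_BRANCH_LENGTH = re.compile(r":\s*([0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?)")


def branch_lengths(newick):
    """Return every branch length found in a Newick string, in order."""
    return [float(value) for value in _BRANCH_LENGTH.findall(newick)]


def tree_length_summary(newick):
    """Return (total tree length, mean length per branch)."""
    lengths = branch_lengths(newick)
    total = sum(lengths)
    return total, total / len(lengths)


# Hypothetical gene trees with branch lengths in substitutions per site.
gene1_newick = "((A:0.1,B:0.2):0.05,C:0.3);"
gene2_newick = "((A:0.2,B:0.4):0.1,(C1:0.3,C2:0.3):0.3);"

total1, mean1 = tree_length_summary(gene1_newick)
total2, mean2 = tree_length_summary(gene2_newick)
```

The per-branch mean is probably the more comparable number when the two trees have different numbers of tips, since total length grows with the number of branches.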

nmathers commented 5 years ago

Thanks for the suggestions! This helps a lot.

nmathers commented 5 years ago

Dear @spond,

I took your advice and ran RELAX on my two trees. I received the result I expected, with relaxed selection in one tree but not the other. However, the tree that was NOT significant for relaxation yielded an LRT of -306. Is a negative LR feasible within the parameters of the RELAX analysis?

https://www.datamonkey.org/relax/5bc3e4efed66d8724a8b921f

Thank you again!

spond commented 5 years ago

Dear @nmathers,

This is a convergence problem, which typically indicates some issues with the data / model (e.g. https://github.com/veg/hyphy/issues/779).

I have made some fixes (described in https://github.com/veg/hyphy/issues/817), some of which can be accessed at staging.datamonkey.org (pre-release version of datamonkey).

Let me know if this helps.

Best, Sergei
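
To see why a negative LRT points to a convergence problem, recall that for nested models the statistic is 2 * (lnL_alternative - lnL_null), and the alternative model can always fit at least as well as the null it contains. The Python sketch below is a hedged illustration, not part of HyPhy or Datamonkey: the function name, the tolerance, and the log-likelihood values are hypothetical, and the p-value assumes an asymptotic chi-square distribution with 1 degree of freedom.

```python
import math


def lrt_df1(lnl_null, lnl_alt, tolerance=1e-6):
    """Return (statistic, p_value) for a nested-model LRT with 1 degree of freedom.

    p_value is None when the statistic is clearly negative, which should not
    happen for nested models and suggests the optimizer did not converge.
    """
    statistic = 2.0 * (lnl_alt - lnl_null)
    if statistic < -tolerance:
        return statistic, None
    statistic = max(statistic, 0.0)  # clamp tiny negative round-off to zero
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods that reproduce an LRT of -306.
bad_statistic, bad_p = lrt_df1(lnl_null=-1000.0, lnl_alt=-1153.0)

# Hypothetical well-behaved fit.
ok_statistic, ok_p = lrt_df1(lnl_null=-1000.0, lnl_alt=-997.0)
```

In the first call the alternative model has a lower log-likelihood than the null, so no p-value is reported; rerunning the fit (for example with the fixes mentioned above) would be the next step rather than interpreting the statistic.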