veg / hyphy

HyPhy: Hypothesis testing using Phylogenies
http://www.hyphy.org

Inquiry About Relax Analysis Results #1760

Open aaannaw opened 3 days ago

aaannaw commented 3 days ago

Dear author,

I am conducting a RELAX analysis on 13,057 orthologous genes and have a question about interpreting the results. The command for one gene is:

hyphy relax --alignment /data/01/p1/user157/Bathyergidae-sociaty/2024812.newcactus/03.proteincoding-selected-analysis/01.getgene/07.deletestopcodon/evm.model.ptg000001l.100.fa --tree /data/01/p1/user157/Bathyergidae-sociaty/2024812.newcactus/03.proteincoding-selected-analysis/01.getgene/13.hyphy/02.relax/00.tree/evm.model.ptg000001l.100.phy.treefile.mark --test FG2 --reference FG1 --output /data/01/p1/user157/Bathyergidae-sociaty/2024812.newcactus/03.proteincoding-selected-analysis/01.getgene/13.hyphy/02.relax/01.relax.out/evm.model.ptg000001l.100.json > /data/01/p1/user157/Bathyergidae-sociaty/2024812.newcactus/03.proteincoding-selected-analysis/01.getgene/13.hyphy/02.relax/01.relax.out/evm.model.ptg000001l.100.relax.log 2>&1

I used an unrooted tree:

(Fan{FG1},((((((((Hcr,Cgu),(Tsw,Pty)),Hgl{FG1}),(Gca{FG2},Bsu{FG2})),Cho{FG1}),Fme{FG1}),Fda{FG1}),Fdm{FG1}),Fmi{FG1});

I specified --test FG2 and --reference FG1. The results indicate that the test branches (FG2) show intensified selection (K > 1, P < 0.05).

My questions are as follows:

1. If the test branches show intensified selection, can I interpret this as the reference branches (FG1) showing relaxed selection?
2. Would the results remain consistent if I reversed the designations, using --test FG1 and --reference FG2?

I would greatly appreciate any insights or suggestions you may have regarding the interpretation of these results.

Thank you for your time and guidance.

Best regards, Na Wan

spond commented 2 days ago

Dear @aaannaw,

  1. RELAX always makes a relative statement. So if FG1 is intensified relative to FG2, then FG2 is relaxed relative to FG1.
  2. Generally, this should be the case. However, RELAX may have convergence issues on smaller datasets or on small test/reference sets of branches, especially if the model is too complex for the dataset or there are other stability issues (for example, ω that is too high).
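To make the "relative statement" concrete, here is a minimal Python sketch of the exponent transformation behind K, in which each test-branch ω is the corresponding reference ω raised to the power K. The function name `transform_omegas` and the three reference rate classes are hypothetical and chosen only for illustration; they are not taken from any HyPhy output.

```python
def transform_omegas(reference_omegas, k):
    """Apply the RELAX-style exponent: omega_test = omega_reference ** k."""
    return [omega ** k for omega in reference_omegas]


# Hypothetical reference rate classes (two under negative selection, one positive).
reference = [0.1, 0.5, 4.0]

relaxed = transform_omegas(reference, 0.5)      # K < 1 pulls every omega toward 1
intensified = transform_omegas(reference, 2.0)  # K > 1 pushes every omega away from 1

# Swapping the roles of test and reference corresponds to using 1/K.
recovered = transform_omegas(relaxed, 1 / 0.5)

for ref, rel, inten, rec in zip(reference, relaxed, intensified, recovered):
    print(f"ref={ref:.3f}  K=0.5 -> {rel:.3f}  K=2 -> {inten:.3f}  back -> {rec:.3f}")
```

In an idealized fit, swapping the labels would therefore return exactly 1/K. In practice each run is optimized separately, so the two estimates only need to agree approximately.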

As an example, consider the dataset from Figure 4E in the RELAX paper (attached here as well). Please note that --models Minimal improves stability in most cases, because the general descriptive model may be biased by one or two outlier branches. With test and reference as defined, K = 0.17; reversing them gives K = 4.74 (so not quite 1/K, but close), and both runs give significant LRT results.
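One way to quantify "not quite 1/K but close" is to compare the two estimates on a log scale, where a perfectly reciprocal pair sums to zero. The sketch below is only an illustration: the helper name `log_reciprocity_gap` is hypothetical, and the two K values are the ones printed in the log output of the two runs in this thread.

```python
import math


def log_reciprocity_gap(k_forward, k_reversed):
    """Return log(K_forward) + log(K_reversed); zero means K_reversed == 1/K_forward."""
    return math.log(k_forward) + math.log(k_reversed)


k_forward = 0.17   # --test T --reference R
k_reversed = 4.74  # --test R --reference T

gap = log_reciprocity_gap(k_forward, k_reversed)
print(f"1/K_forward = {1 / k_forward:.2f}, K_reversed = {k_reversed:.2f}, log gap = {gap:.3f}")
```

A gap near zero suggests the two fits agree on the direction and rough magnitude of the effect. A large gap, or a change of sign in log K, would be a reason to suspect the convergence issues mentioned above.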

$hyphy relax --alignment tests/data/Fig4E.nex --reference R --test T --models Minimal

...

### Fitting the alternative model to test K != 1
* Log(L) = -4984.57, AIC-c = 10132.56 (81 estimated parameters)
* Relaxation/intensification parameter (K) =     0.17
* The following rate distribution was inferred for **test** branches

|          Selection mode           |     dN/dS     |Proportion, %|               Notes               |
|-----------------------------------|---------------|-------------|-----------------------------------|
|        Negative selection         |     0.627     |    8.545    |                                   |
|        Negative selection         |     0.681     |   91.427    |                                   |
|      Diversifying selection       |     1.799     |    0.029    |                                   |

...
----
## Test for relaxation (or intensification) of selection [RELAX]
Likelihood ratio test **p =   0.0000**.
>Evidence for *relaxation of selection* among **test** branches _relative_ to the **reference** branches at P<=0.05
----

....

$hyphy relax --alignment tests/data/Fig4E.nex --reference T --test R --models Minimal

...
### Fitting the alternative model to test K != 1
* Log(L) = -4984.75, AIC-c = 10132.92 (81 estimated parameters)
* Relaxation/intensification parameter (K) =     4.74
* The following rate distribution was inferred for **test** branches

|          Selection mode           |     dN/dS     |Proportion, %|               Notes               |
|-----------------------------------|---------------|-------------|-----------------------------------|
|        Negative selection         |     0.106     |   98.029    |                                   |
|        Negative selection         |     0.148     |    1.945    |                                   |
|      Diversifying selection       |    30.916     |    0.026    |                                   |

----
## Test for relaxation (or intensification) of selection [RELAX]
Likelihood ratio test **p =   0.0000**.
>Evidence for *intensification of selection* among **test** branches _relative_ to the **reference** branches at P<=0.05

Best, Sergei

[Fig4E.nex.zip](https://github.com/user-attachments/files/17831635/Fig4E.nex.zip)
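For a batch of thousands of genes, the same call can be read from each file written by --output instead of from the log. The following is a hedged sketch only: the JSON key names (`"test results"`, `"relaxation or intensification parameter"`, `"p-value"`) are assumptions about the RELAX output layout and should be checked against one of your own files, and the function names are hypothetical.

```python
import json

# ASSUMED key names for the RELAX JSON output; verify against a real file.
TEST_RESULTS_KEY = "test results"
K_KEY = "relaxation or intensification parameter"
P_KEY = "p-value"


def classify(relax_json, alpha=0.05):
    """Summarize one parsed RELAX result as K, p, and a relative-selection call."""
    results = relax_json[TEST_RESULTS_KEY]
    k = float(results[K_KEY])
    p = float(results[P_KEY])
    if p > alpha:
        call = "no significant difference"
    elif k > 1:
        call = "intensified in test relative to reference"
    else:
        call = "relaxed in test relative to reference"
    return {"K": k, "p": p, "call": call}


def classify_file(path, alpha=0.05):
    """Load one RELAX JSON file from disk and classify it."""
    with open(path) as handle:
        return classify(json.load(handle), alpha)
```

Because the statement is relative, a call of "intensified in test" for --test FG2 --reference FG1 reads equally as "relaxed in FG1 relative to FG2".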