Convergence issue with RELAX

veg / hyphy

HyPhy: Hypothesis testing using Phylogenies

Other

201 stars 68 forks source link

	LRP	p	k
REP_00	13.84	0.00020	0.04101
REP_01	15.22	0.00010	0.04210
REP_02	14.86	0.00012	0.04876
REP_03	16.39	0.00005	0.00002
REP_04	14.85	0.00012	0.04841
REP_05	15.24	0.00009	0.04116
REP_06	14.83	0.00012	0.04830
REP_07	14.85	0.00012	0.04829
REP_08	0.01	0.90859	2.70721
REP_09	14.83	0.00012	0.04827

LRP

REP_00

13.84

0.00020

0.04101

REP_01

15.22

0.00010

0.04210

REP_02

14.86

0.00012

0.04876

REP_03

16.39

0.00005

0.00002

REP_04

14.85

0.00012

0.04841

REP_05

15.24

0.00009

0.04116

REP_06

14.83

0.00012

0.04830

REP_07

14.85

0.00012

0.04829

REP_08

0.01

0.90859

2.70721

REP_09

14.83

0.00012

0.04827

Dear @shenglin-liu,

Unfortunately, there is no reliable way to diagnose, from a single run, whether or not a complex mixture model like RELAX converged. From the example you show above, 8/10 runs basically gave you the same answer (relaxed selection with a p-value on the order of 0.0001), so that's pretty good.

RELAX does some internal checks during a run and if detects some convergence issues, it will attempt to resolve them during the same run (transparent to the user). However, as these are heuristic, you are not guaranteed success.

--starting-points N does not perform N distinct RELAX runs. What is does instead, is perform N quick and dirty optimizations at the start of the run, then picks the best result, and runs the full model using it as the starting point. The benefit is that, as you noticed, runtimes are not strongly affected.

If all you want is the standard RELAX test, and you wish to reduce (but not eliminate) issues due to "misconvergence", I suggest the following options

hyphy CPU=1 relax --starting-points 100 --grid-size 2000 --models Minimal --alignment gene.fa --tree tree.nh --test test --reference ref --output out.json > out.log

Setting --grid-size to a higher value makes the runs a bit slower, but improves the ability of earlier optimization runs to find a good starting point. Setting --models Minimal avoids fitting even MORE complex models (like the General Descriptive).

HTH, Sergei

PS If your alignments are smaller, and if you see Collapsed rate classes in the output, consider reducing the number of rate classes to 2, using --rates 2

Selection mode	dN/dS	Proportion, %	Notes
Negative selection	0.000	47.395
Negative selection	0.000	5.352	Collapsed rate class
Diversifying selection	5.138	47.253

veg / hyphy

Convergence issue with RELAX #1581