veg / hyphy

HyPhy: Hypothesis testing using Phylogenies
http://www.hyphy.org

Convergence Issue With FEL #1618

Closed · gykoh closed this issue 10 months ago

gykoh commented 1 year ago

Hello!

I wanted to see whether running FEL twice under the default settings would yield the same or very similar results. For both p = 0.1 and p = 0.05, I seem to have run into a convergence issue.

Default Settings For FEL:

The bolded sites are sites found in only one of the two runs.

First Run (p = 0.1)

Second Run (p = 0.1)

First Run (p = 0.05)

Second Run (p = 0.05)

Questions:

  1. What property of the FEL method is creating this issue?
  2. What is the recommended approach for getting convergent results?
     a. Should I change the settings until this convergence issue is resolved?
     b. Should I run the analysis a number of times and consider only the sites that show up in all of the runs?
     c. Something else?
spond commented 1 year ago

Dear @gykoh,

Try running with the CPU=1 command line argument. The stochasticity comes from multithreading. I’ll adjust FEL’s initial value guessing and tighten the convergence criteria; I bet the unstable sites are near the significance cutoff.

Best, Sergei

gykoh commented 1 year ago

Dear @spond,

Thank you for your reply.

I wasn’t sure how to restrict the run to a single CPU, but here are the command lines I tried for p = 0.1 and p = 0.05:

hyphy FEL --code Universal --alignment my_alignment.fa --tree my_tree_unrooted.nex --branches All --srv Yes --pvalue 0.1 --ci No --resample 50 --output my_output.json --precision standard --full-model Yes --kill-zero-lengths No --CPU 1

hyphy FEL --code Universal --alignment my_alignment.fa --tree my_tree_unrooted.nex --branches All --srv Yes --pvalue 0.05 --ci No --resample 50 --output my_output.json --precision standard --full-model Yes --kill-zero-lengths No --CPU 1

The original command lines (which produced the results posted above) for p = 0.1 and p = 0.05 were:

hyphy FEL --code Universal --alignment my_alignment.fa --tree my_tree_unrooted.nex --branches All --srv Yes --pvalue 0.1 --ci No --resample 50 --output my_output.json --precision standard --full-model Yes --kill-zero-lengths No

hyphy FEL --code Universal --alignment my_alignment.fa --tree my_tree_unrooted.nex --branches All --srv Yes --pvalue 0.05 --ci No --resample 50 --output my_output.json --precision standard --full-model Yes --kill-zero-lengths No

The results are still inconsistent based on two runs with CPU=1:

First Run (p = 0.1)

Second Run (p = 0.1)

First Run (p = 0.05)

Second Run (p = 0.05)

The bolded sites are those found in only one of the two runs.

  1. I am not sure I entirely understand why using multithreading would lead to these inconsistencies. In any case, based on my attempt above, CPU=1 did not solve the issue.
  2. What could be the cause for the inconsistencies?

Thank you very much for your help.

stephenshank commented 1 year ago

Dear @gykoh,

Floating point arithmetic can depend on the order in which it is executed. For instance, (a+b)+c does not necessarily equal a+(b+c); they are only approximately equal, up to round-off error, and given that these problems can be ill-conditioned, those minor errors can accumulate and be amplified. Since the arithmetic does not always execute in the same order when multithreading is used (there is some stochasticity in the operating system's scheduling), this can yield discrepant results between runs.
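A minimal sketch of the non-associativity, in plain Python (nothing HyPhy-specific):

# Floating-point addition is not associative: the evaluation order matters.
a, b, c = 1e16, -1e16, 1.0
print((a + b) + c)  # 1.0
print(a + (b + c))  # 0.0 -- the 1.0 is absorbed when rounded into -1e16

The discrepancy is tiny per operation, but as noted above, these errors can accumulate over the many evaluations of an ill-conditioned likelihood surface.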

Based on output from hyphy --help, specifically:

  CPU=integer              if compiled with OpenMP multithreading support, requests this many threads; HyPhy could use fewer than this
                           but never more; default is the number of CPU cores (as computed by OpenMP) on the system

I think the invocation you want is

hyphy CPU=1 fel --code Universal --alignment my_alignment.fa --tree my_tree_unrooted.nex --branches All --srv Yes --pvalue 0.1 --ci No --resample 50 --output my_output.json --precision standard --full-model Yes --kill-zero-lengths No 

In particular, note the subtle difference between CPU=1 and --CPU 1. As a small sanity check, the presence of the CPU=1 argument yielded different runtimes on my system.

I'm curious to see the results! Please let us know if you have other questions.

Regards, Stephen

gykoh commented 1 year ago

Dear @stephenshank,

Thank you for your prompt reply.

Here is the command line I ran for p = 0.1 with CPU = 1:

hyphy CPU=1 fel --code Universal --alignment my_alignment.fa --tree my_tree_unrooted.nex --branches All --srv Yes --pvalue 0.1 --ci No --resample 50 --output my_output.json --precision standard --full-model Yes --kill-zero-lengths No

The results are still inconsistent based on the two runs below:

First Run (p = 0.1)

Second Run (p = 0.1)

The bolded sites are those found in only one of the two runs.

Any thoughts on what might be happening?

Thank you very much for your help!

spond commented 1 year ago

Dear @gykoh,

Because you are using the resample option, which implements a parametric bootstrap, you should expect stochasticity, especially since you are using only 50 replicates. For more stability, increase the number of replicates.
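As a rough sketch of the scale of that noise (plain Python, not HyPhy internals): a p-value estimated by counting exceedances among N bootstrap replicates carries a Monte Carlo standard error of roughly sqrt(p(1-p)/N).

import math

def bootstrap_pvalue_se(p, n_replicates):
    # Monte Carlo standard error of a p-value estimated by counting
    # exceedances among n_replicates bootstrap samples
    return math.sqrt(p * (1.0 - p) / n_replicates)

print(bootstrap_pvalue_se(0.05, 50))    # ~0.031
print(bootstrap_pvalue_se(0.05, 1000))  # ~0.007

With 50 replicates, any site whose true p-value sits within about 0.03 of the cutoff can cross it between runs, which matches the pattern of unstable sites near the significance threshold.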

Best, Sergei

spond commented 1 year ago

Dear @gykoh,

Just following up to see if you were able to resolve the issue.

Best, Sergei

gykoh commented 1 year ago

Dear @spond,

Thank you for the follow-up. I tried running FEL with the bootstrap set to 1000 on a cluster, but I still found a convergence issue (although smaller than before). Then I found out that the HyPhy version on the server was very old. We updated it and are currently re-running the analysis twice with 1,000 bootstrap replicates. I will send a full report as soon as that is done.

Note: the runs described in my previous posts used the most up-to-date HyPhy version. The outdated version was on the server we are using for the bootstrap runs.

Thank you!

gykoh commented 1 year ago

Dear @spond,

I have an update about the bootstraps and some follow-up questions. I am sorry that this is a long post.

I ran FEL with --resample 1000 on HyPhy version 2.5.51 twice for p = 0.1 and ci = Yes, but the sites from my two runs did not match up perfectly, as shown below:

hyphy CPU=20 fel --code Universal --alignment my_alignment.fa --tree my_tree_unrooted.nex --branches All --srv Yes --pvalue 0.1 --ci Yes --resample 1000 --output my_output.json --precision standard --full-model Yes --kill-zero-lengths No

Positive:

Negative:

The bolded sites are those found in only one of the two runs.

As shown by the two runs, the results improved but are not perfect. Interestingly, codon sites 197 and 285 do not show up in the second run, but codon site 99 shows up in both runs despite having a higher p-value than the other two sites:

p-values of first run for sites under diversifying positive selection:
202 (p = 0.0100), 178 (p = 0.0130), 86 (p = 0.0200), 92 (p = 0.0300), 254 (p = 0.0330), 64 (p = 0.0340), 243 (p = 0.0370), 251 (p = 0.0390), 262 (p = 0.0480), 250 (p = 0.0539), 187 (p = 0.0569), 252 (p = 0.0639), 70 (p = 0.0649), 242 (p = 0.0669), 9 (p = 0.0849), 197 (p = 0.0879), 285 (p = 0.0949), 99 (p = 0.0969)

p-values of second run for sites under diversifying positive selection:
202 (p = 0.0100), 178 (p = 0.0240), 92 (p = 0.0280), 86 (p = 0.0290), 243 (p = 0.0350), 64 (p = 0.0380), 262 (p = 0.0390), 254 (p = 0.0410), 251 (p = 0.0460), 70 (p = 0.0490), 252 (p = 0.0490), 250 (p = 0.0529), 187 (p = 0.0629), 242 (p = 0.0729), 9 (p = 0.0749), 99 (p = 0.0989)

Question 1: It seems that increasing the number of bootstrap replicates still did not perfectly resolve the convergence issue (although there are fewer discrepant sites now). Any thoughts on this?

Not only was there a discrepancy between these two runs with 1000 bootstrap replicates, but even the sites on which the two runs agree are not the same as the ones I get when I simply run FEL with --resample 0 and --ci No (3 runs, same results, as expected):

hyphy fel --code Universal --alignment my_alignment.fa --tree my_tree_unrooted.nex --branches All --srv Yes --pvalue 0.1 --ci No --resample 0 --output my_output.json --precision standard --full-model Yes --kill-zero-lengths No

First, Second, & Third Runs:
Sites under diversifying positive selection (20 sites): 9, 56, 64, 70, 86, 92, 99, 168, 178, 187, 197, 202, 242, 243, 250, 251, 252, 254, 262, 285
Sites under negative selection (30 sites): 17, 24, 25, 27, 28, 32, 37, 45, 91, 93, 96, 111, 115, 117, 123, 135, 162, 167, 170, 190, 207, 208, 221, 241, 274, 279, 284, 308, 312, 316

Question 2: If the results change this much between running with the default settings and running with a large number of bootstrap replicates, how would you recommend choosing a reliable set of sites with strong evidence for dN/dS > 1?
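For reference, one approach I am considering for option (b) from my opening post is to intersect the significant sites across runs. A sketch in Python; the field names assume the standard FEL JSON layout ("MLE" -> "headers"/"content" with a "p-value" header), which may vary between HyPhy versions, and the file names are hypothetical:

import json

def significant_sites(json_path, alpha=0.1):
    # Collect 1-based site indices whose FEL p-value is at or below alpha.
    # Assumes the usual FEL JSON layout; verify against your own output.
    with open(json_path) as fh:
        fel = json.load(fh)
    headers = [h[0] for h in fel["MLE"]["headers"]]
    p_col = headers.index("p-value")
    rows = fel["MLE"]["content"]["0"]
    return {i + 1 for i, row in enumerate(rows) if row[p_col] <= alpha}

runs = ["run1.json", "run2.json", "run3.json"]  # hypothetical file names
print(sorted(set.intersection(*(significant_sites(r) for r in runs))))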

Thank you!
