nextstrain / augur

Pipeline components for real-time phylodynamic analysis
https://docs.nextstrain.org/projects/augur/
GNU Affero General Public License v3.0
268 stars 128 forks source link

Refine throw `ValueError: x and y arrays must have at least 2 entries` #1061

Open corneliusroemer opened 1 year ago

corneliusroemer commented 1 year ago

Current Behavior

Running augur refine in the monkeypox build we get the following error: ValueError: x and y arrays must have at least 2 entries

Expected behavior

No error

How to reproduce

Augur 18.0.0 with treetime 0.9.4

wget https://github.com/nextstrain/augur/files/9794384/augur-1061.tar.zst.txt
tar xf augur-1061.tar.zst.txt
augur refine             --tree results/hmpxv1_big/tree_fixed.nwk    \
     --alignment results/hmpxv1_big/masked.fasta    --metadata results/hmpxv1_big/metadata.tsv   \
   --output-tree results/hmpxv1_big/tree.nwk        --timetree        --root MK783032 MK783030  \
   --precision 3       --keep-polytomies       --clock-rate 5.7e-05   --clock-std-dev 2e-5    \
   --output-node-data results/hmpxv1_big/branch_lengths.json    --coalescent opt        \
     --date-inference marginal      --date-confidence       --clock-filter-iqd 0

Probably an upstream issue in treetime but I'll open here just in case @anna-parker

Log:

augur refine is using TreeTime version 0.9.4
247.72      WARNING: Previous versions of TreeTime (<0.7.0) RECONSTRUCTED sequences
            of tips at positions with AMBIGUOUS bases. This resulted in unexpected
            behavior is some cases and is no longer done by default. If you want to
            replace those ambiguous sites with their most likely state, rerun with
            `reconstruct_tip_states=True` or `--reconstruct-tip-states`.
563.51      TreeTime.reroot: with method or node: ['MK783032', 'MK783030']
767.45      ###TreeTime.run: INITIAL ROUND
884.16      TreeTime.reroot: with method or node: ['MK783032', 'MK783030']
899.13      ###TreeTime.run: rerunning timetree after rerooting
1110.01     ###TreeTime.run: ITERATION 1 out of 2 iterations
1311.19     ###TreeTime.run: ITERATION 2 out of 2 iterations
1911.52     ###TreeTime.run: FINAL ROUND - confidence estimation via marginal
            reconstruction
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/treetime/treetime.py", line 57, in run
    return self._run(**kwargs)
  File "/usr/local/lib/python3.7/site-packages/treetime/treetime.py", line 336, in _run
    self.make_time_tree(**tt_kwargs)
  File "/usr/local/lib/python3.7/site-packages/treetime/clock_tree.py", line 378, in make_time_tree
    self._ml_t_marginal()
  File "/usr/local/lib/python3.7/site-packages/treetime/clock_tree.py", line 649, in _ml_t_marginal
    merger_contribution = self.merger_model.node_contribution(node, time_points)
  File "/usr/local/lib/python3.7/site-packages/treetime/merger_models.py", line 240, in node_contribution
    return NodeInterpolator(t, y, is_log=True)
  File "/usr/local/lib/python3.7/site-packages/treetime/distribution.py", line 208, in __init__
    bounds_error=False, assume_sorted=True)
  File "/usr/local/lib/python3.7/site-packages/scipy/interpolate/interpolate.py", line 562, in __init__
    "least %d entries" % minval)
ValueError: x and y arrays must have at least 2 entries
ERROR: x and y arrays must have at least 2 entries 
anna-parker commented 1 year ago

For me this has quite unexpected behavior - I obtain the same error on the docker image (even with the newest augur and treetime versions) - but cannot seem to reproduce locally... I have a feeling it is due to some version difference in scipy... but Im not sure... also not very reassuring if it is due to a version difference.

corneliusroemer commented 1 year ago

And it failed again, this time the hmpxv1, not big(!) build, which should be faster to debug @anna-parker

[batch] [2022-10-17T18:49:29+02:00]         augur refine             --tree results/hmpxv1/tree_fixed.nwk             --alignment results/hmpxv1/masked.fasta             --metadata results/hmpxv1/metadata.tsv             --output-tree results/hmpxv1/tree.nwk             --timetree             --root MK783032 MK783030             --precision 3             --keep-polytomies             --clock-rate 5.7e-05             --clock-std-dev 2e-5             --output-node-data results/hmpxv1/branch_lengths.json             --coalescent opt             --date-inference marginal             --date-confidence             --clock-filter-iqd 0
[batch] [2022-10-17T18:49:29+02:00]         
[batch] [2022-10-17T19:05:36+02:00] Traceback (most recent call last):
[batch] [2022-10-17T19:05:36+02:00]   File "/usr/local/lib/python3.7/site-packages/treetime/treetime.py", line 57, in run
[batch] [2022-10-17T19:05:36+02:00]     return self._run(**kwargs)
[batch] [2022-10-17T19:05:36+02:00]   File "/usr/local/lib/python3.7/site-packages/treetime/treetime.py", line 336, in _run
[batch] [2022-10-17T19:05:36+02:00]     self.make_time_tree(**tt_kwargs)
[batch] [2022-10-17T19:05:36+02:00]   File "/usr/local/lib/python3.7/site-packages/treetime/clock_tree.py", line 378, in make_time_tree
[batch] [2022-10-17T19:05:36+02:00]     self._ml_t_marginal()
[batch] [2022-10-17T19:05:36+02:00]   File "/usr/local/lib/python3.7/site-packages/treetime/clock_tree.py", line 649, in _ml_t_marginal
[batch] [2022-10-17T19:05:36+02:00]     merger_contribution = self.merger_model.node_contribution(node, time_points)
[batch] [2022-10-17T19:05:36+02:00]   File "/usr/local/lib/python3.7/site-packages/treetime/merger_models.py", line 240, in node_contribution
[batch] [2022-10-17T19:05:36+02:00]     return NodeInterpolator(t, y, is_log=True)
[batch] [2022-10-17T19:05:36+02:00]   File "/usr/local/lib/python3.7/site-packages/treetime/distribution.py", line 208, in __init__
[batch] [2022-10-17T19:05:36+02:00]     bounds_error=False, assume_sorted=True)
[batch] [2022-10-17T19:05:36+02:00]   File "/usr/local/lib/python3.7/site-packages/scipy/interpolate/interpolate.py", line 562, in __init__
[batch] [2022-10-17T19:05:36+02:00] augur refine is using TreeTime version 0.9.4
[batch] [2022-10-17T19:05:36+02:00]     "least %d entries" % minval)
[batch] [2022-10-17T19:05:36+02:00] ValueError: x and y arrays must have at least 2 entries
[batch] [2022-10-17T19:05:36+02:00] 102.12      WARNING: Previous versions of TreeTime (<0.7.0) RECONSTRUCTED sequences
[batch] [2022-10-17T19:05:36+02:00] ERROR: x and y arrays must have at least 2 entries 
[batch] [2022-10-17T19:05:36+02:00]             of tips at positions with AMBIGUOUS bases. This resulted in unexpected
[batch] [2022-10-17T19:05:36+02:00]  
[batch] [2022-10-17T19:05:36+02:00]             behavior is some cases and is no longer done by default. If you want to
[batch] [2022-10-17T19:05:36+02:00] ERROR in TreeTime.run: An error occurred which was not properly handled in TreeTime. If this error persists, please let us know by filing a new issue including the original command and the error above at: https://github.com/neherlab/treetime/issues 
[batch] [2022-10-17T19:05:36+02:00]             replace those ambiguous sites with their most likely state, rerun with
[batch] [2022-10-17T19:05:36+02:00]             `reconstruct_tip_states=True` or `--reconstruct-tip-states`.
[batch] [2022-10-17T19:05:36+02:00] ERROR from TreeTime: An error occurred in TreeTime (see above). This may be due to an issue with TreeTime or Augur.
[batch] [2022-10-17T19:05:36+02:00] 231.48      TreeTime.reroot: with method or node: ['MK783032', 'MK783030']
[batch] [2022-10-17T19:05:36+02:00] Please report you are calling TreeTime via Augur. 
[batch] [2022-10-17T19:05:36+02:00] 316.99      ###TreeTime.run: INITIAL ROUND
[batch] [2022-10-17T19:05:36+02:00] 376.92      TreeTime.reroot: with method or node: ['MK783032', 'MK783030']
[batch] [2022-10-17T19:05:36+02:00] 382.19      ###TreeTime.run: rerunning timetree after rerooting
[batch] [2022-10-17T19:05:36+02:00] 480.68      ###TreeTime.run: ITERATION 1 out of 2 iterations
[batch] [2022-10-17T19:05:36+02:00] 575.07      ###TreeTime.run: ITERATION 2 out of 2 iterations
[batch] [2022-10-17T19:05:36+02:00] 866.56      ###TreeTime.run: FINAL ROUND - confidence estimation via marginal
[batch] [2022-10-17T19:05:36+02:00]             reconstruction

Data is here: s3://nextstrain-jobs/e1009501-f005-404c-8b9f-e40533da59fe.zip

augur-1061-small.tar.zst.txt

anna-parker commented 1 year ago

Oh so much better! Thanks @corneliusroemer

anna-parker commented 1 year ago

This error is related to: https://github.com/neherlab/treetime/pull/212. The node distribution is much smaller than the branch distribution and we will have to return the convolution with a delta function - the only difference here is that the node distribution literally turns into a delta function much earlier than we expected and the merger models have not been implemented to handle this case.

So in ml_t_marginal when the merger contribution is calculated product_of_child_msgs is a delta function, meaning that merger_contribution is also a delta function (a case we have not yet covered) and additionally multiplication of two delta functions is not yet implemented.

                   if hasattr(self, 'merger_model') and self.merger_model:
                        time_points = node.product_of_child_messages.x
                        # set multiplicity of node to number of good child branches
                        if node.is_terminal():
                            merger_contribution = Distribution(time_points, -self.merger_model.integral_merger_rate(time_points), is_log=True)
                        else:
                            merger_contribution = self.merger_model.node_contribution(node, time_points)
                        node.subtree_distribution = Distribution.multiply([merger_contribution, node.product_of_child_messages])
                    else:
                        node.subtree_distribution = node.product_of_child_messages
corneliusroemer commented 1 year ago

How nice that monkeypox helps us catch edge cases :)

anna-parker commented 1 year ago

@corneliusroemer - as a workaround for the future monkeypox builds - the sequence OP536786 has very likely been incorrectly dated as much older than it actually is and is pushing the date of it's ancestor and other nodes far back into the past. This leads to a bunch of numerical issues as the probability of these older dates is very low based on the other sequences.

When I add the sequence OP536786 to the exclude_accessions_mpvx.txt file the jobs run.

Instead of us having to add in all these weird fixes for numerical edge cases we should really find a better way to mark such nodes as bad_branches - the monkeypox tree just has such high global deviation in clock rate that the typical "bad branch test" is not marking enough of these branches as bad.... Hopefully that could stop us from having the monkeypox build failing all the time!

corneliusroemer commented 1 year ago

I know @anna-parker I fixed this already https://github.com/nextstrain/monkeypox/commit/7d128530283fd0b7e825e9d40b7a96fe3d3e9025

It's probably a good idea though to highlight such bad nodes in any case, and not just crash with a weird error (crashing is ok if there's no good fix, but we should have a helpful error message as well).

See this issue for making it easier to identify problematic sequences using clock filter output: https://github.com/nextstrain/augur/issues/1065

There are other things we should be doing but this is an easy thing since it should be free and is just not exported at the moment.