nextstrain / augur

Pipeline components for real-time phylodynamic analysis
https://docs.nextstrain.org/projects/augur/
GNU Affero General Public License v3.0

Uncaught error when root not in tree. #1580

Open vanessazubach opened 2 months ago

vanessazubach commented 2 months ago

Current Behavior

 $ augur refine   --tree results/tree_raw.nwk   --alignment results/aligned.fasta   --metadata data/metadata.tsv   --output-tree results/tree.nwk   --output-node-data results/branch_lengths.json   --timetree   --coalescent opt   --date-confidence   --date-inference marginal  --stochastic-resolve  --root MMR24-0353_MVs/Quebec.CAN/14.24_B3_8395_2024-04-02 [MMR24-0424_MVs/Quebec.CAN/16.24_B3_8840_2024-04-15 MMR24-0425_MVs/Quebec.CAN/16.24/2_B3_8395_2024-04-15 MMR24-0440_MVs/Ontario.CAN/16.24_B3_8395_2024-04-16 MMR24-0494_MVs/Ontario.CAN/18.24_B3_8395_2024-04-29 MMR24-0496_MVs/Ontario.CAN/18.24/2_B3_8395_2024-05-01]
augur refine is using TreeTime version 0.11.3

1.41    TreeTime.reroot: with method or node:
        ['MMR24-0353_MVs/Quebec.CAN/14.24_B3_8395_2024-04-02',
        '[MMR24-0424_MVs/Quebec.CAN/16.24_B3_8840_2024-04-15',
        'MMR24-0425_MVs/Quebec.CAN/16.24/2_B3_8395_2024-04-15',
        'MMR24-0440_MVs/Ontario.CAN/16.24_B3_8395_2024-04-16',
        'MMR24-0494_MVs/Ontario.CAN/18.24_B3_8395_2024-04-29',
        'MMR24-0496_MVs/Ontario.CAN/18.24/2_B3_8395_2024-05-01]']
Traceback (most recent call last):
  File "/home/vzubach/.nextstrain/runtimes/conda/env/lib/python3.11/site-packages/treetime/treetime.py", line 57, in run
    return self._run(**kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/vzubach/.nextstrain/runtimes/conda/env/lib/python3.11/site-packages/treetime/treetime.py", line 228, in _run
    self.reroot(root=root, clock_rate=fixed_clock_rate)
  File "/home/vzubach/.nextstrain/runtimes/conda/env/lib/python3.11/site-packages/treetime/treetime.py", line 525, in reroot
    new_root = self.tree.common_ancestor(root)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vzubach/.nextstrain/runtimes/conda/env/lib/python3.11/site-packages/Bio/Phylo/BaseTree.py", line 439, in common_ancestor
    raise ValueError(f"target {t!r} is not in this tree")
ValueError: target '[MMR24-0424_MVs/Quebec.CAN/16.24_B3_8840_2024-04-15' is not in this tree

ERROR: target '[MMR24-0424_MVs/Quebec.CAN/16.24_B3_8840_2024-04-15' is not in this tree

ERROR in TreeTime.run: An error occurred which was not properly handled in TreeTime. If this error persists, please let us know by filing a new issue including the original command and the error above at: https://github.com/neherlab/treetime/issues

ERROR from TreeTime: An error occurred in TreeTime (see above). This may be due to an issue with TreeTime or Augur.
Please report you are calling TreeTime via Augur.

Expected behavior

I am trying to run augur refine. I added --root because I know sample 353 was the root of 424, 425, 440, 494 and 496, but I am getting errors saying sequence 424 is not in the tree. When I ran this previously without the --root argument it ran fine; however, the cluster with these samples did not start at 353.

joverlee521 commented 2 months ago

Hi @vanessazubach,

ERROR: target '[MMR24-0424_MVs/Quebec.CAN/16.24_B3_8840_2024-04-15' is not in this tree

It looks like you have an extra [ in front of the sample name. Can you try re-running after removing the [ from the command?
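For anyone hitting the same message, here is a minimal sketch of a pre-flight check that compares the values given to --root against the tip names in the tree and suggests a fix when a stray bracket is the likely cause. The function name `find_unknown_roots` and the short tip names are hypothetical, made up for illustration; this is not code from Augur or TreeTime.

```python
def find_unknown_roots(root_names, tip_names):
    """Return (name, hint) pairs for requested roots that are not tips.

    hint is a suggestion string when removing stray square brackets
    yields a real tip name, otherwise None.
    """
    tips = set(tip_names)
    problems = []
    for name in root_names:
        if name in tips:
            continue
        # Strip leading/trailing square brackets, the mistake seen in this issue.
        stripped = name.strip("[]")
        hint = None
        if stripped != name and stripped in tips:
            hint = f"did you mean {stripped!r}?"
        problems.append((name, hint))
    return problems


# Illustrative data only: short placeholder names instead of real sample IDs.
tips = ["A/1", "B/2", "C/3"]
requested = ["A/1", "[B/2", "C/3]"]

for name, hint in find_unknown_roots(requested, tips):
    suffix = f" ({hint})" if hint else ""
    print(f"root {name!r} is not in the tree{suffix}")
```

Running a check like this before the refine step would point at the bracket directly instead of ending in a traceback.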

vanessazubach commented 2 months ago

Thank you. That did the trick. However, the tree did not turn out how I was hoping. Is there a way to define clades as monophyletic or to define a common ancestor? I know epidemiologically that sample 353 is the ancestor of 424, 425, 440, 494 and 496, so why are those samples not branching off of that sample?

corneliusroemer commented 2 months ago

Hi @vanessazubach, I've also hit this error before. We should catch it and abort with a clear message to the user about their error.

You can pass a constraint tree to iqtree if you want to enforce a certain topology, but this might be overkill. I'd need more context to help with your question. Maybe you can post your general question about the tree topology being different than expected on discussion.nextstrain.org? The forum is better suited for that sort of broad question.

victorlin commented 2 months ago

This should be handled by TreeTime. @corneliusroemer are you able to transfer this issue to neherlab/treetime?

corneliusroemer commented 2 months ago

No, it should be handled by Augur. TreeTime can't do anything about a user error. The problem is that Augur doesn't catch it.

Augur should catch and abort with an appropriate message but not throw an exception.

victorlin commented 2 months ago

Would this not happen for users of TreeTime CLI? This is the error from TreeTime:

ERROR in TreeTime.run: An error occurred which was not properly handled in TreeTime. If this error persists, please let us know by filing a new issue including the original command and the error above at: https://github.com/neherlab/treetime/issues

victorlin commented 2 months ago

A better explanation: right now it is throwing TreeTimeUnknownError. There's no way to differentiate user error vs. code error with that exception class. Augur expects these to be handled by TreeTime throwing TreeTimeError with a meaningful error message, for example:

raise TreeTimeError(f"Malformed VCF file {vcf_file!r} - all the meta-information (lines starting with ##) must appear at the top of the file.")
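To make the distinction concrete, here is a rough sketch of how a caller could treat the two exception classes differently. The classes below are local stand-ins that only share names with the ones mentioned above, and `reroot` and `run_refine` are hypothetical helpers written for this example; none of this is TreeTime's or Augur's actual code.

```python
# Local stand-ins for the two exception classes named in this thread.
class TreeTimeError(Exception):
    """Expected failure carrying a message meant for the user."""


class TreeTimeUnknownError(Exception):
    """Unexpected failure; the caller cannot tell user error from a bug."""


def reroot(tip_names, root):
    """Hypothetical reroot step that reports an unknown target as a user error."""
    missing = [name for name in root if name not in tip_names]
    if missing:
        raise TreeTimeError(f"Cannot reroot: {missing[0]!r} is not in this tree.")
    return root


def run_refine(tip_names, root):
    """Hypothetical caller that turns each exception class into a message."""
    try:
        reroot(tip_names, root)
    except TreeTimeError as err:
        # Known user-facing error: print the message, no traceback needed.
        return f"ERROR: {err}"
    except TreeTimeUnknownError as err:
        # Unknown cause: ask the user to report it.
        return f"ERROR from TreeTime: {err} (possible bug, please report)"
    return "ok"


print(run_refine({"A", "B"}, ["A", "[B"]))
```

With the unknown root reported through the user-facing class, the caller can abort with a one-line message instead of the generic "not properly handled" text.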

corneliusroemer commented 2 months ago

You're right @victorlin!

vanessazubach commented 2 months ago

Thank you all for your quick attention to this! And thank you @corneliusroemer for your suggestion to provide more information about my scenario and post it on discussion.nextstrain.org. I will do that.