nextstrain / mpox

Nextstrain build for mpox virus
https://nextstrain.org/mpox
MIT License
45 stars 19 forks source link

Reached maximum recursion depth when building phylogenetic trees #282

Open mengfeip opened 2 weeks ago

mengfeip commented 2 weeks ago

I currently include more than 8000 sequences for a Nextstrain build.

When constructing the phylogenetic tree with augur tree, it always alerts maximum recursion depth reached as a fatal error, even when setting augur_recursion_limit to 100,000,000.

What could be the maximum value we can set for recursion depth, and can it be set as infinite?

rule tree: input: results/masked.fasta output: results/tree_raw.nwk reason: Missing output files: results/tree_raw.nwk; Input files updated by another job: results/masked.fasta

    export AUGUR_RECURSION_LIMIT=100000000
    augur tree             --alignment results/masked.fasta             --exclude-sites config/tree_mask.tsv             --tree-builder-args="-redo"             --output results/tree_raw.nwk             --nthreads 44

FATAL: Maximum recursion depth reached. You can set the env variable AUGUR_RECURSION_LIMIT to adjust this (current limit: 100000000) 5 masking sites read from config/tree_mask.tsv Building a tree via: iqtree2 -ntmax 44 -s results/Lineage-B-public/masked_masked-delim.fasta -m GTR -ninit 2 -n 2 -me 0.05 -nt AUTO -redo -redo > results/Lineage-B-public/masked_masked-delim.iqtree.log Nguyen et al: IQ-TREE: A fast and effective stochastic algorithm for estimating maximum likelihood phylogenies. Mol. Biol. Evol., 32:268-274. https://doi.org/10.1093/molbev/msu300 (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Removing output files of failed job tree since they might be corrupted: results/Lineage-B-public/tree_raw.nwk Shutting down, this might take some time. Exiting because a job execution failed. Look above for error message

jameshadfield commented 2 weeks ago

When constructing the phylogenetic tree with augur tree, it always alerts maximum recursion depth reached as a fatal error, even when setting augur_recursion_limit to 100,000,000. What could be the maximum value we can set for recursion depth, and can it be set as infinite?

You can try setting higher limits, the ceiling is platform dependent (for me it's 2**31-1), however I suspect the memory usage will be prohibitive and/or python will crash as the stack size allocated to the python interpreter is finite.

I currently include more than 8000 sequences for a Nextstrain build.

That's quite a lot. Certainly for downstream analysis in Auspice a tree with that many samples will be very slow to interact with. Is it possible to use fewer samples, or perhaps run individual clade-level analyses?

mengfeip commented 2 weeks ago

Thank you @jameshadfield

It is strange because I never need to set higher limits and have never encountered this recursion limit issue even with 8000-10000 sequences before. My best guess is some limits were introduced during an Augur version update, or some changes on Augur tree functions.

jameshadfield commented 2 weeks ago

My best guess is some limits were introduced during an Augur version update, or some changes on Augur tree functions.

I don't think any changes have been made to augur tree for a while. The recursion limits are often a sign of an unbalanced tree (although they will also be reached on any big-enough tree), so it is pathogen dependent.