Open corneliusroemer opened 1 year ago
We have also released the CMAPLE tool, https://github.com/iqtree/cmaple, which is 3 times more efficient than MAPLE (https://doi.org/10.1101/2024.05.15.594295). Is there any plan to integrate such a tool into the pipeline? We (with @trongnhanuit) can volunteer to integrate CMAPLE.
Thanks, @bqminh! Based on the IQ-TREE docs, it looks like Augur might already support CMAPLE by passing custom arguments to IQ-TREE, like augur tree --tree-builder-args="--pathogen-force" (where IQ-TREE is already the default tree-builder). Another option to surface CMAPLE, which would require a change to Augur, would be to wrap that custom IQ-TREE command in a new "method" option, like augur tree --method cmaple.
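As a rough illustration of the first option, here is a minimal Python sketch that assembles such an augur tree call. The helper name build_augur_tree_command and the file paths are hypothetical. The sketch only builds the argument list, since whether --pathogen-force actually triggers CMAPLE depends on the IQ-TREE build available in the runtime.

```python
import shlex
from typing import List


def build_augur_tree_command(alignment: str, output: str, nthreads: int = 4) -> List[str]:
    """Build an `augur tree` argv that forwards --pathogen-force to IQ-TREE.

    Hypothetical helper: it only assembles arguments and runs nothing.
    """
    return [
        "augur", "tree",
        "--alignment", alignment,
        "--output", output,
        "--method", "iqtree",  # IQ-TREE is already Augur's default tree builder
        "--nthreads", str(nthreads),
        # The "=" form keeps the leading "--" of the value from being
        # parsed as a separate Augur option.
        "--tree-builder-args=--pathogen-force",
    ]


if __name__ == "__main__":
    # Placeholder paths, for illustration only.
    cmd = build_augur_tree_command("results/aligned.fasta", "results/tree_raw.nwk")
    print(shlex.join(cmd))
```

To actually run it, the list can be handed to subprocess.run(cmd, check=True) in an environment where Augur and a CMAPLE-enabled IQ-TREE are installed.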
The other technical consideration is that we bundle IQ-TREE with Augur in our Nextstrain runtimes for Conda and Docker. For the Conda runtime, we pull the IQ-TREE package from Bioconda. For the Docker runtime, we download a (slightly out-of-date) binary from GitHub. For CMAPLE to work with augur tree across our various runtimes, we'd just need the Bioconda package and GitHub binaries to reflect the CMAPLE branch of the code. Separately, we are eager to include the latest version of IQ-TREE that supports ARM64 CPUs, but it looks like that development is happening in a separate branch from the CMAPLE work. Is there a plan to have a single release with both CMAPLE and ARM64 support or will these remain as separate development paths for a while?
Thank you for this information! We'll prioritise getting this IQ-TREE/CMAPLE version to work on ARM. It's good to know that it might already work with these tree builder arguments, but we'll also consider other options.
I've managed to build iqtree2+cmaple on my local machine (osx-arm64 macOS 14.6) with a few workarounds, see iqtree2 issue:
Per the logs, this time it really worked. (I first tried the Bioconda version, but that lacks CMAPLE support; see https://github.com/iqtree/iqtree2/issues/274.)
IQ-TREE multicore version 2.3.5 for MacOS ARM 64-bit built Jul 17 2024
Developed by Bui Quang Minh, Nguyen Lam Tung, Olga Chernomor, Heiko Schmidt,
Dominik Schrempf, Michael Woodhams, Ly Trong Nhan, Thomas Wong
Host: dyn-3-4-29.mobile.unibas.ch (SSE4.2, 32 GB RAM)
Command: iqtree2 -ntmax 4 -s results/hmpxv1/masked_masked-delim.fasta -m GTR -ninit 2 -n 2 -me 0.05 -nt AUTO -redo --pathogen-force
Seed: 131082 (Using SPRNG - Scalable Parallel Random Number Generator)
Time: Wed Jul 17 16:05:07 2024
Kernel: SSE2 - auto-detect threads (10 CPU cores detected)
Reading an alignment
Running [C]MAPLE algorithm...
Performing placement
243 sequences have been added to the tree.
Applying a normal tree search
Optimizing branch lengths
Tree log likelihood: -272539.7511423723
MODEL: GTR
ROOT FREQUENCIES
A C G T
0.365181 0.157898 0.157473 0.319448
MUTATION MATRIX
A C G T
A -2552.16 317.651 1864.83 369.68
C 734.649 -5636.53 455.989 4445.89
G 4324.56 457.223 -5524.77 742.987
T 422.604 2197.54 366.257 -2986.4
Analysis results written to:
Maximum-likelihood tree: results/hmpxv1/masked_masked-delim.fasta.treefile
Screen log file: results/hmpxv1/masked_masked-delim.fasta.log
CMAPLE Runtime: 0.9459710121s
Date and Time: Wed Jul 17 16:05:08 2024
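For anyone who wants to confirm from a workflow that CMAPLE was really used (rather than a regular IQ-TREE search), a small sketch like the following could scan the screen log. It assumes the log wording stays the same as in the log above ("Running [C]MAPLE algorithm", "CMAPLE Runtime: ...s"), which may differ between IQ-TREE versions. The function name summarize_cmaple_log is made up for this example.

```python
import re
from typing import Dict, Optional


def summarize_cmaple_log(log_text: str) -> Dict[str, object]:
    """Extract a few CMAPLE-related facts from an IQ-TREE screen log.

    Assumes the message wording seen in the IQ-TREE 2.3.5 log in this thread.
    """

    def first_match(pattern: str) -> Optional[str]:
        match = re.search(pattern, log_text)
        return match.group(1) if match else None

    runtime = first_match(r"CMAPLE Runtime:\s*([0-9.]+)s")
    added = first_match(r"(\d+) sequences have been added")
    loglik = first_match(r"Tree log likelihood:\s*(-?[0-9.]+)")

    return {
        # True only if the log reports that the [C]MAPLE algorithm ran.
        "cmaple_ran": "Running [C]MAPLE algorithm" in log_text,
        "runtime_seconds": float(runtime) if runtime is not None else None,
        "sequences_added": int(added) if added is not None else None,
        "log_likelihood": float(loglik) if loglik is not None else None,
    }
```

A workflow step could fail early if cmaple_ran is False, which would catch the case of an IQ-TREE build without CMAPLE support.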
On a small build (240 sequences, mpox clade IIb) things look good:
I'll keep exploring. I think I can edit the bioconda recipe to add cmaple so we can use it broadly across workflows. See:
I've managed to build IQ-TREE with the CMAPLE feature enabled in Bioconda! There's thus no need to change Augur code: one can simply pass the tree builder argument --pathogen-force and CMAPLE should be used automatically.
Context
IQ-TREE can struggle with large trees and take a long time. We may want to experiment with using UShER and/or MAPLE as alternatives. They are probably significantly faster and may be good enough for some use cases, maybe even better than IQ-TREE.