Closed corneliusroemer closed 2 years ago
It looks like @huddlej added that logic, maybe you could explain. https://github.com/nextstrain/augur/commit/2df3483dc9797d547d70263fda5f73eb4a2399bc
The change could be as simple as removing -nt
from the list of conflicting arguments.
Alternatively, we could change how --nthreads auto
passed to augur tree is understood: passing -nt AUTO
to IQtree rather than -nt NUM_OF_CORES
. I can't see how this would make things worse for IQtree as using more threads than useful can slow IQtree down.
Hey @corneliusroemer, you may be interested in some of the earlier PRs where we discussed the implementation of "auto" CPU allocation for tree building including when @tsibley originally added the --nthreads auto
option and a more recent bug fix where we fallback to IQ-TREE's own auto option when there are more CPUs requested than sequences. You can see in the first PR listed how the goal of the "auto" interface was to provide a standard implementation across augur's multi-threaded tools (align and tree for now).
The commit of mine that you reference in this issue prevents users from trying to override arguments in the tree builder when we already provide an interface through augur tree to set those arguments. Since we already have a --nthreads
interface, we should make this interface behave the way we want. For example, if we allow overriding threads in tree-builder arguments, we could produce ambiguous commands like:
# How many threads do we use when we run this command?
augur tree --nthreads 2 --tree-builder-args "-nt auto"
If we want to support tool-specific interpretation of --nthreads auto
(which totally makes sense for IQ-TREE), we would need to update the argparse type for that argument to use a modified version of nthreads_value
that supports tree-builder-specific logic and lives in tree.py
. As it is currently implemented, augur tree
never knows the user provided "auto" as an argument, since the argparse logic converts that string to an integer during argument parsing.
Thanks for the background @huddlej
I'm conflicted over the recent commit that prevents users from setting certain options through augur tree. I get that it helps prevent confusion for users, but it comes at a definite cost of not allowing certain important IQtree options to be set. This feels like a bug to me.
Instead of erroring, we could warn instead - that would feel more appropriate.
Or if that's not enough, we could add a flag that allows this kind of override to be allowed.
Do you think it's feasible to rewrite how --nthreads
is used? We could make it work by changing the default value, too.
With IQtree's auto option it is simply not possible to say how many threads are used. And using all can be harmful.
We may want to make --nthreads
set -ntmax
as opposed to -nt
. That could solve all our problems.
http://www.iqtree.org/doc/Tutorial
For me this is less of a question whether we should address it but how to address it since the current implementation can cost a lot of extra compute time for no good reason.
The more I think about it, what I suggested above seems like the obvious solution:
--nthreads
as writing -ntmax
, the max number of threads to use and map this 1:1, throwing the override error as previously but for -ntmax
instead of -nt
-nt
to AUTO, but let users override if they want to.This would be a tiny change that comes without any disadvantages - as far as I can tell.
See IQtree docs:
http://www.iqtree.org/doc/Frequently-Asked-Questions#what-is-the-good-number-of-cpu-cores-to-use
I just commented on the PR above, which I think is a great solution. Regarding:
I get that it helps prevent confusion for users, but it comes at a definite cost of not allowing certain important IQtree options to be set. This feels like a bug to me.
There has always been this tension about how much we should protect users from ambiguous situations vs. give them complete control over everything behind the scenes. The general consensus to date has been that Augur should provide a convenient wrapper around tools with a standardized interface even if that comes at the expense of a little flexibility, but users are always free to replace Augur commands in their workflows with more sophisticated commands for the underlying tools (e.g., augur tree
-> iqtree
). This is what happened for us when we needed a non-standard mafft feature for the ncov workflow and we just called mafft directly instead of modifying augur align
to support that non-standard feature.
The tension is definitely a thing - luckily we can side step it here 😄
Context
IQtree has a flag
--nt AUTO
that automatically sets the optimal number of threads.Right now, when you use
augur tree
with--threads auto
, augur uses all threads, which is suboptimal. It can lead to slower runtime.To make matters worse: augur prevents me from setting
--nt AUTO
in IQtree myself, failing with:ERROR: The following tree builder arguments conflict with hardcoded defaults. Remove these arguments and try again: -nt
I can't see how this is helpful. Does anyone know how this came about? Was there a rationale?
We should change behaviour to allow IQtree to choose threads.