nextstrain / augur

Pipeline components for real-time phylodynamic analysis
https://docs.nextstrain.org/projects/augur/
GNU Affero General Public License v3.0
268 stars 129 forks source link

BUG: tree should accept `--nt AUTO` for IQtree, not error #1041

Closed corneliusroemer closed 2 years ago

corneliusroemer commented 2 years ago

Context

IQtree has a flag --nt AUTO that automatically sets the optimal number of threads.

Right now, when you use augur tree with --threads auto, augur uses all threads, which is suboptimal. It can lead to slower runtime.

To make matters worse: augur prevents me from setting --nt AUTO in IQtree myself, failing with:

image

ERROR: The following tree builder arguments conflict with hardcoded defaults. Remove these arguments and try again: -nt

I can't see how this is helpful. Does anyone know how this came about? Was there a rationale?

We should change behaviour to allow IQtree to choose threads.

corneliusroemer commented 2 years ago

It looks like @huddlej added that logic, maybe you could explain. https://github.com/nextstrain/augur/commit/2df3483dc9797d547d70263fda5f73eb4a2399bc

The change could be as simple as removing -nt from the list of conflicting arguments.

Alternatively, we could change how --nthreads auto passed to augur tree is understood: passing -nt AUTO to IQtree rather than -nt NUM_OF_CORES. I can't see how this would make things worse for IQtree as using more threads than useful can slow IQtree down.

huddlej commented 2 years ago

Hey @corneliusroemer, you may be interested in some of the earlier PRs where we discussed the implementation of "auto" CPU allocation for tree building including when @tsibley originally added the --nthreads auto option and a more recent bug fix where we fallback to IQ-TREE's own auto option when there are more CPUs requested than sequences. You can see in the first PR listed how the goal of the "auto" interface was to provide a standard implementation across augur's multi-threaded tools (align and tree for now).

The commit of mine that you reference in this issue prevents users from trying to override arguments in the tree builder when we already provide an interface through augur tree to set those arguments. Since we already have a --nthreads interface, we should make this interface behave the way we want. For example, if we allow overriding threads in tree-builder arguments, we could produce ambiguous commands like:

# How many threads do we use when we run this command?
augur tree --nthreads 2 --tree-builder-args "-nt auto"

If we want to support tool-specific interpretation of --nthreads auto (which totally makes sense for IQ-TREE), we would need to update the argparse type for that argument to use a modified version of nthreads_value that supports tree-builder-specific logic and lives in tree.py. As it is currently implemented, augur tree never knows the user provided "auto" as an argument, since the argparse logic converts that string to an integer during argument parsing.

corneliusroemer commented 2 years ago

Thanks for the background @huddlej

I'm conflicted over the recent commit that prevents users from setting certain options through augur tree. I get that it helps prevent confusion for users, but it comes at a definite cost of not allowing certain important IQtree options to be set. This feels like a bug to me.

Instead of erroring, we could warn instead - that would feel more appropriate.

Or if that's not enough, we could add a flag that allows this kind of override to be allowed.

Do you think it's feasible to rewrite how --nthreads is used? We could make it work by changing the default value, too.

With IQtree's auto option it is simply not possible to say how many threads are used. And using all can be harmful.

We may want to make --nthreads set -ntmax as opposed to -nt. That could solve all our problems.

image

http://www.iqtree.org/doc/Tutorial

For me this is less of a question whether we should address it but how to address it since the current implementation can cost a lot of extra compute time for no good reason.

corneliusroemer commented 2 years ago

The more I think about it, what I suggested above seems like the obvious solution:

  1. Interpret --nthreads as writing -ntmax, the max number of threads to use and map this 1:1, throwing the override error as previously but for -ntmax instead of -nt
  2. We should then default -nt to AUTO, but let users override if they want to.

This would be a tiny change that comes without any disadvantages - as far as I can tell.

See IQtree docs:

image

http://www.iqtree.org/doc/Frequently-Asked-Questions#what-is-the-good-number-of-cpu-cores-to-use

huddlej commented 2 years ago

I just commented on the PR above, which I think is a great solution. Regarding:

I get that it helps prevent confusion for users, but it comes at a definite cost of not allowing certain important IQtree options to be set. This feels like a bug to me.

There has always been this tension about how much we should protect users from ambiguous situations vs. give them complete control over everything behind the scenes. The general consensus to date has been that Augur should provide a convenient wrapper around tools with a standardized interface even if that comes at the expense of a little flexibility, but users are always free to replace Augur commands in their workflows with more sophisticated commands for the underlying tools (e.g., augur tree -> iqtree). This is what happened for us when we needed a non-standard mafft feature for the ncov workflow and we just called mafft directly instead of modifying augur align to support that non-standard feature.

corneliusroemer commented 2 years ago

The tension is definitely a thing - luckily we can side step it here 😄