Closed huddlej closed 2 years ago
After talking more with users who have tried to use standalone IQ-TREE instead of augur tree, there is still enough value provided by augur's wrapping of IQ-TREE that we should address this problem in augur.
A specific example of what we would like to avoid is in this IQ-TREE command generated by augur tree when the user provides custom arguments to --tree-builder-args
that duplicate arguments in our default command (the -ninit
and -n
arguments):
iqtree2 -ninit 2 -n 2 -me 0.05 -nt 10 -s results/filtered-delim.fasta -m GTR -ninit 10 -n 4
Augur should handle this by running the following command instead:
iqtree2 -me 0.05 -nt 10 -s results/filtered-delim.fasta -m GTR -ninit 10 -n 4
One potential technical solution that could work for all supported tree builders would be to 1) compose an ordered dictionary of all default arguments, 2) parse key/value pairs from user-provided arguments to --tree-builder-args
, and 3) update the dictionary with the user's arguments. This solution could be complicated by the need to provide flags in addition to arguments where there is no explicit key/value pair. We would need to parse these as "flag/True" pairs, perhaps. Another bigger issue with this approach is that we would need to support all of the various ways tree builders define command line arguments. Even if we make this solution work, the user won't have a way to know what the default arguments are that they are overriding. This lack of transparency could lead to unexpected behavior from any of the tree builders when users unknowingly provide arguments that conflict with our defaults.
Another solution that might be more generic and surface the defaults to users would be to move the per-builder default arguments into the default value of --tree-builder-args
. When users type augur tree -h
, they could see the default arguments for each tree builder. When users provide their own --tree-builder-args
, the given values would need to completely replace the original defaults. In the example above, running augur tree -h
would print:
iqtree2 -ninit 2 -n 2 -me 0.05 -nt 10 -s <input> -m GTR
When the user runs augur tree with --tree-builder-args "-ninit 10 -n 4"
, the resulting command would be:
iqtree2 -ninit 10 -n 4 -s <input>
Note that we have to allow augur tree to manage the input to iqtree because we create a copy of the input alignment prior to building the tree. As I write out this example, I also see a conflict between --tree-builder-args
and --substitution-model
, for IQ-TREE (and --nthreads
). We could let Augur handle inputs and substitution models as special cases to the general rule where the --tree-builder-args
replace the defaults. We just need this to be clear to users. For example, maybe the augur tree -h
command should print something like this instead:
iqtree2 -s <alignment> -m <substitution-model> -nt <nthreads> -ninit 2 -n 2 -me 0.05
Then users could at least see which values will be supplied by Augur and which defaults can be overridden. Maybe we could template the commands more generically like so:
iqtree2 -s <alignment> -m <substitution-model> -nt <nthreads> <tree-builder-args>
such that the default value of --tree-builder-args
for IQ-TREE is -ninit 2 -n 2 -me 0.05
.
Another solution that might be more generic and surface the defaults to users would be to move the per-builder default arguments into the default value of --tree-builder-args.
This is what I'd like to do.
We could let Augur handle inputs and substitution models as special cases to the general rule where the --tree-builder-args replace the defaults.
Yes - special cases will be unavoidable I think (e.g. the sequence input is technically a "tree builder arg"). I think explaining the behavior of augur tree
to users by
iqtree2 -s <alignment> -m <substitution-model> -nt <nthreads> <tree-builder-args>
is really clean. It clashes a little bit with other options, such as -fast
(a flag which we convert to tree-builder-args) and in these cases I'd propose erroring along the lines of "you can't run both of these together, add ... to your tree builder args to mimic behavior here".
Hi John and James,
This issue is currently hindering some of my work so I'd like to take it on board if that's okay with everyone?
It should be relatively easy to update augur tree to replace the 'fast' defaults with the user inputed tree args. One issue the documentation for the users but that can be simply changed.
Something I noticed in augur tree
is that the three build_*
functions are not harmonised. For example, build_iqtree
has a list of strings for the -fast
options but the other build functions do not. I think alongside the above changes, it's a good opportunity to create consistent tree builder functions where possible.
General overview of the things I plan to do:
tree_op
to contain the default tree building options (replaces fast_op
in build_iqtree
)tree_builder_args
is not supplied, tree_op
contains the default tree builder options, else tree_builder_args
is chosen.call
variable to unpack tree_op
augur tree -h
as indicated above and add a comment so user understands inputs-m
for --substitution-model
because it's really useful short hand!Thanks
Ammar
All yours @ammaraziz - thanks in advance!
Your approach looks generally good. (2) will be complicated by -fast
, but see above for a solution here. (5) should be a separate PR, or at least a separate commit in this PR.
Hi @ammaraziz, I'm just checking in to see how things are going with this issue. I've run into a similar need to override the defaults with custom arguments, so if you need someone to test your code, let me know. :)
Hi @huddlej,
A few things came up at work so I've had to put this aside. But I think by the end of this week or beginning next week I will have a working prototype.
I'd love for someone else to have a gander on my code and to test it. I will ping you here when it's ready :)
Thanks for your patience!
Hey @jameshadfield,
I have to be honest I don't understand the comments about -fast
. I understand that -fast
is converted to -ninit 2 -n 2 -me 0.05
for compatibility with older versions of iqtree
. But when it comes to in tree builder args, is the worry that -fast
will cause issues due to incompatibility with other iqtree
arguments? If so I assumed iqtree
would handle the error and report the issue. Or would we like this to be caught by augur
and present a friendlier error message.
Let me know!
Thanks,
Ammar
@ammaraziz No worries! I ended up hacking together one possible solution to this issue for a related project where it unavoidably necessary (calculating bootstrap values from IQ-TREE, something that doesn't work with our IQ-TREE defaults).
Would you want to try out my implementation and see if it solves your use case, too?
Hi @huddlej Your implementation was nearly identical to mine! Glad to see I was on the right path. I tested your 3e664c32077feca7d280a13ad7f5a411b2e69a40 commit and it works nicely. This was the command I ran:
... -substitution-model GTR+I --nthreads 10 --tree-builder-args "-B 1000 -alrt 1000"
worked nicely!
I tried breaking it by adding -fast
, this was the error:
Command output was:
Ultrafast bootstrap (-bb) does not work with -fast option
Which is nice.
Interestingly, these commands work:
... --substitution-model GTR+I --nthreads 10 --tree-builder-args "-m GTR+F"
... --substitution-model GTR+I --nthreads 10 --tree-builder-args "-nt 11"
Oddly iqtree does not complain. It seems iqtree takes the last argument given.
Finally broke it with the example for --tree-builder-args
:
... --substitution-model GTR+I --nthreads 10 --tree-builder-args "-czb"
error:
augur tree: error: argument --tree-builder-args: expected one argument
I think this is due to argparse
issue with negative values:
https://docs.python.org/3.2/library/argparse.html#arguments-containing
Not sure what the optimal solution is but I did notice that adding a space at the end of the supplied value fixes this issue, eg: --tree-builder-args "-czb "
. It's a bit hacky but we could append a space to the start/end of the tree-builder-args
value as a work around.
I hope this information was useful.
Excellent! Thank you for the detailed feedback and testing, @ammaraziz!
You're right that it's a little strange how IQ-TREE accepts duplicate input arguments. This behavior appears to be the root of #780, leading to segmentation faults under some conditions. A nicer interface might be for augur to check for these hardcoded flags in the user-provided tree-builder arguments and raise a helpful error message when they exist.
For the broken example in the help, the issue you describe is why the example uses an assignment operator style of argument instead of the space-delimited version. This code works for me, for example:
... --substitution-model GTR+I --nthreads 10 --tree-builder-args="-czb"
It looks like argparse will never get a fix for this bug, so the assignment style used above is the safest usage for augur tree.
There is now a PR #839 where we can continue this conversation, if you like. I'll try adding the check for hardcoded defaults and post a comment there when it's done.
Hi @huddlej, Glad I could be of help. I will put some comments in the PR.
Current Behavior
@llk578496 notes in issue 619 that they cannot provide all valid IQ-TREE arguments to augur tree through the
--tree-builder-args
argument.The specific error was:
The issue is that users would like to provide valid IQ-TREE arguments to augur tree, but the default arguments we use can conflict with those user-provided values.
Expected behavior
Users should be able to provide all valid arguments to IQ-TREE through augur tree.
Possible solution
One possible solution would be for the value of
--tree-builder-args
to completely replace any IQ-TREE default arguments we use in augur tree. The same logic would need to apply for FastTree and RAxML.A major issue with this solution is that users don't necessarily know what the original defaults were, so it could be tricky to create the proper
--tree-builder-args
string that includes the relevant defaults and new arguments.Another solution is to say we do not support arbitrary IQ-TREE arguments and that users should call IQ-TREE directly instead of using augur tree.