nextstrain / augur

Pipeline components for real-time phylodynamic analysis
https://docs.nextstrain.org/projects/augur/
GNU Affero General Public License v3.0

Need for improved documentation for augur clades #849

Open corneliusroemer opened 2 years ago

corneliusroemer commented 2 years ago

Right now, the documentation for augur clades is lacking.

The docs section contains nothing but the CLI help: https://docs.nextstrain.org/projects/augur/en/stable/usage/cli/clades.html

There's a little bit scattered around. @emmahodcroft helpfully dug it up:

It would be good if the CLI docs were fleshed out with a description of how the algorithm works and a brief explanation of input and output files and their definitions.

To outsiders, the way augur clades works is currently totally opaque. Improved docs could potentially help in situations like these: https://github.com/nextstrain/ncov/issues/856

Emma wrote this:

Do we have documentation on exactly how clades works? As in, that it works phylogenetically, that the 'first' node is one where all specified mutations have accumulated (but may not all occur on that node), and that all children of that node will be part of the clade, until another clade definition is hit? I couldn't find anything, but it might be worth adding this somewhere. I found this, this, and this, but none of them really explains 'how it works'. Might just be worth adding a sentence on the page about labelling clades? But wanted to check I've not missed it somewhere else before seeing if I can submit a PR or similar.
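
To make the behaviour described above concrete, here is a minimal Python sketch of that logic. It is a toy illustration under stated assumptions, not augur's actual implementation: `Node`, `assign_clades`, the clade names, and the positions are all hypothetical, and the tie-break used when several definitions are first satisfied on the same node (prefer the one with the most defining alleles) is an assumption made for this example only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical tree node; `mutations` holds changes on the branch leading to it."""
    name: str
    mutations: Dict[int, str] = field(default_factory=dict)  # position -> derived base
    children: List["Node"] = field(default_factory=list)
    clade: Optional[str] = None


def assign_clades(node, clade_defs, inherited=None, parent_clade=None,
                  parent_matches=frozenset()):
    """Label `node` and its descendants.

    clade_defs maps clade name -> {position: base}. A node starts a clade when
    the mutations accumulated from the root first satisfy every allele in a
    definition; descendants inherit the label until another definition is hit.
    """
    # Accumulate mutations along the path from the root to this node.
    state = dict(inherited or {})
    state.update(node.mutations)

    # Definitions fully satisfied by the accumulated state at this node.
    matches = frozenset(
        name for name, alleles in clade_defs.items()
        if all(state.get(pos) == base for pos, base in alleles.items())
    )

    # Only definitions satisfied here for the first time start a new clade.
    newly_matched = matches - parent_matches
    if newly_matched:
        # Assumed tie-break: most defining alleles first, then name.
        node.clade = max(newly_matched, key=lambda n: (len(clade_defs[n]), n))
    else:
        node.clade = parent_clade

    for child in node.children:
        assign_clades(child, clade_defs, state, node.clade, matches)


# Toy tree: root -> a -> b -> (c, d -> e)
e = Node("e")
d = Node("d", {300: "A"}, [e])
c = Node("c")
b = Node("b", {200: "G"}, [c, d])
a = Node("a", {100: "T"}, [b])
root = Node("root", {}, [a])

# Hypothetical clade definitions.
clade_defs = {
    "X": {100: "T", 200: "G"},  # 100T arises on `a`, 200G on `b`
    "Y": {300: "A"},
}

assign_clades(root, clade_defs)
labels = {n.name: n.clade for n in (root, a, b, c, d, e)}
```

In this toy tree, `b` is the first node of clade `X` even though one of its two defining mutations arose on the parent branch `a`; `c` inherits `X`; `d` hits the definition of `Y`, so `d` and its child `e` are labelled `Y` instead.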

jameshadfield commented 2 years ago

See also https://github.com/nextstrain/augur/issues/735

tsibley commented 2 years ago

See also the Slack discussion, which includes ideas about how to reuse the docs in various places.