Description
We define cluster confidence biologically by the number of cluster-specific mutations we find in the mutation tables produced by the cluster_mutation.py script. To generalize this idea of cluster confidence or support, we added a new pathogen-cluster-mutations command to the pathogen-embed toolkit, which replaces the cluster_mutation.py script.
Development checklist
[x] Update workflows to use pathogen-cluster-mutations instead of cluster_mutation.py
[x] Add rules to workflows to generate mutation tables for Nextstrain clades
[x] Rebuild the final mutation table across all workflows to include Nextstrain clade mutations
[x] Update the mutation table caption to reflect the new data included
[x] Summarize the distribution of the number of cluster-specific mutations per pathogen dataset and genetic group
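For reviewers, the summary of cluster-specific mutations per pathogen dataset and genetic group could look roughly like the sketch below. It is not the code in this PR. The column names (dataset, genetic_group, cluster, mutation), the function name, and the toy rows are all hypothetical and are not taken from the actual output of pathogen-cluster-mutations. It also assumes one row per cluster-specific mutation, so clusters with no specific mutations would not appear in the summary.

```python
import csv
import io
from collections import Counter, defaultdict
from statistics import median


def summarize_cluster_mutations(rows):
    """Summarize mutation counts per cluster for each (dataset, genetic_group).

    Assumes each row describes one cluster-specific mutation and has the
    hypothetical keys "dataset", "genetic_group", "cluster", and "mutation".
    """
    # Count the cluster-specific mutations observed for every cluster.
    per_cluster = Counter()
    for row in rows:
        per_cluster[(row["dataset"], row["genetic_group"], row["cluster"])] += 1

    # Collect the per-cluster counts for each dataset and genetic group.
    counts_by_group = defaultdict(list)
    for (dataset, group, _cluster), n_mutations in per_cluster.items():
        counts_by_group[(dataset, group)].append(n_mutations)

    # Describe the distribution of counts with a few simple statistics.
    return {
        key: {
            "clusters": len(counts),
            "min": min(counts),
            "median": median(counts),
            "max": max(counts),
        }
        for key, counts in counts_by_group.items()
    }


# Toy table for illustration only; these are not real results.
EXAMPLE_TABLE = """dataset,genetic_group,cluster,mutation
dataset_a,method_1,0,A10G
dataset_a,method_1,0,C25T
dataset_a,method_1,0,G40A
dataset_a,method_1,1,T55C
dataset_a,method_2,0,A10G
dataset_a,method_2,0,C25T
dataset_b,method_1,0,G7A
"""

rows = list(csv.DictReader(io.StringIO(EXAMPLE_TABLE)))
summary = summarize_cluster_mutations(rows)

for (dataset, group), stats in sorted(summary.items()):
    print(dataset, group, stats)
```

Counting per cluster first and summarizing afterwards keeps the sketch independent of how many clusters each genetic group defines.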
Related issues
Closes #111
Closes #112