huddlej closed this issue 7 months ago
Questions:
Leave annotated_embeddings at the tip level; table.tsv will include internal nodes for plotting the divergence tree. Add another parameter for the table.tsv dataframe (add a true/false column marking internal nodes, and filter on it where necessary).
root_and_prune_tree (under the HA/NA scripts): use it to remove a strain (between augur tree and augur refine).
Filter by size, order, and color by MCC afterwards.
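A minimal sketch of the internal-node flag described above, using only the Python standard library. The column names (`strain`, `divergence`, `is_internal_node`) and the sample rows are hypothetical; the real table.tsv may be laid out differently.

```python
import csv
import io

# Hypothetical contents of table.tsv: tips plus one internal node.
TABLE_TSV = (
    "strain\tdivergence\tis_internal_node\n"
    "A/Example/1/2016\t0.012\tFalse\n"
    "NODE_0000001\t0.008\tTrue\n"
    "A/Example/2/2017\t0.015\tFalse\n"
)


def read_table(handle):
    """Read a tab-delimited table and parse the true/false column into booleans."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        row["is_internal_node"] = row["is_internal_node"] == "True"
        rows.append(row)
    return rows


def tips_only(rows):
    """Keep only tip rows, e.g. for embedding plots that should skip internal nodes."""
    return [row for row in rows if not row["is_internal_node"]]


rows = read_table(io.StringIO(TABLE_TSV))
tips = tips_only(rows)
print([row["strain"] for row in tips])
```

The divergence tree would use all of `rows`, while the embedding panels would use `tips`.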
Although we can generate stress values for MDS, I don't think including stress in the figures is a blocker to submission, so I'm closing this issue.
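For reference, one way a stress value could be computed for an MDS embedding is sketched below. It is an illustration rather than the project's actual calculation: the function names are hypothetical, and the normalization (dividing by the sum of squared original distances) is only one of several common conventions.

```python
import math


def pairwise_euclidean(points):
    """Return the full matrix of Euclidean distances between points."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def normalized_stress(original, embedded_points):
    """Compare original pairwise distances with distances in the embedding.

    Assumed convention: sqrt(sum((d_ij - e_ij)^2) / sum(d_ij^2)) over i < j,
    where d is the original distance and e is the embedding distance.
    """
    embedded = pairwise_euclidean(embedded_points)
    n = len(original)
    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            numerator += (original[i][j] - embedded[i][j]) ** 2
            denominator += original[i][j] ** 2
    return math.sqrt(numerator / denominator)


# Two samples at genetic distance 2 that the embedding places 1 unit apart.
genetic_distances = [[0.0, 2.0], [2.0, 0.0]]
embedding = [(0.0, 0.0), (1.0, 0.0)]
stress = normalized_stress(genetic_distances, embedding)
print(stress)
```

A value of 0 means the embedding reproduces the original distances exactly; larger values indicate more distortion.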
Considerations for Altair figures:
Considerations for Euclidean vs. genetic distance figures:
For the H3N2 HA/NA analysis:
(flu-2016-2018-ha-na-embeddings-by-mcc.png) to the supplement and rename to flu-2016-2018-ha-only-vs-ha-na-embeddings-by-mcc.png
(flu-2016-2018-ha-na-embeddings-by-mcc.png) with tree and PCA, MDS, t-SNE, and UMAP embeddings from HA/NA sequences and VI distances annotated in the titles from both the HA-only and HA/NA clusters
For late SARS-CoV-2 analysis: