epigen / unsupervised_analysis

A general purpose Snakemake workflow to perform unsupervised analyses (dimensionality reduction & cluster analysis) and visualizations of high-dimensional data.
MIT License

clustification: add visualization/diagnostics of performance/convergence over time #28

Open sreichl opened 11 months ago

sreichl commented 11 months ago

Determine metrics at every iteration and plot their time course at the end. At minimum for the stopping criterion (max. edge weight), but maybe also for F1 score, accuracy, etc.
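A minimal sketch of what the per-iteration bookkeeping could look like. `MetricHistory`, its methods, and the metric names passed to `log` are hypothetical and not part of the workflow; plotting assumes matplotlib is available in the environment.

```python
class MetricHistory:
    """Collects named metrics once per iteration (hypothetical helper)."""

    def __init__(self):
        self.records = []  # one dict per iteration

    def log(self, iteration, **metrics):
        # Store e.g. max_edge_weight, f1, accuracy for this iteration.
        self.records.append({"iteration": iteration, **metrics})

    def series(self, name):
        # Values of one metric in the order they were logged.
        return [r[name] for r in self.records if name in r]

    def plot(self, path):
        # Imported lazily so that logging works without matplotlib installed.
        import matplotlib
        matplotlib.use("Agg")  # no display needed on a cluster
        import matplotlib.pyplot as plt

        names = sorted({k for r in self.records for k in r} - {"iteration"})
        if not names:
            return  # nothing was logged
        fig, axes = plt.subplots(len(names), 1, sharex=True, squeeze=False,
                                 figsize=(6, 2.5 * len(names)))
        # One panel per metric, iteration on the shared x-axis.
        for ax, name in zip(axes[:, 0], names):
            xs = [r["iteration"] for r in self.records if name in r]
            ax.plot(xs, self.series(name), marker="o")
            ax.set_ylabel(name)
        axes[-1, 0].set_xlabel("iteration")
        fig.savefig(path, bbox_inches="tight")
        plt.close(fig)


# Usage with made-up numbers, for illustration only.
history = MetricHistory()
history.log(0, max_edge_weight=0.42, accuracy=0.71)
history.log(1, max_edge_weight=0.30, accuracy=0.78)
history.log(2, max_edge_weight=0.12, accuracy=0.90)
```

Calling `history.plot("convergence.png")` after the last iteration would write one panel per metric, so the stopping criterion can be compared against the other metrics on a shared x-axis.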

sreichl commented 8 months ago

From @bednarsky: In addition to adding a stopping criterion and a recommended clustering, could you keep the clusterings at each merging step and return an interactive plot with a slider, where the accuracy of the last merge is shown as text in the corner? That way one could check how the clusters look at different thresholds and potentially pick one that looks good.

Of course, the clusterings would then not be directly comparable via indices, but this might be a good way to troubleshoot if only one big cluster is left, and it should not be too expensive to store the labels and recolor the UMAP.
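A rough sketch of how the stored labels and the slider plot could fit together. Everything here is an assumption for illustration: the merge format `(kept, absorbed, accuracy)`, the function names `track_merges` and `slider_figure`, and the use of Plotly for the interactive part.

```python
def track_merges(labels, merges):
    """Return one label snapshot per merging step (step 0 = initial labels).

    merges: list of (kept, absorbed, accuracy) tuples (hypothetical format).
    """
    snapshots = [list(labels)]
    for kept, absorbed, _accuracy in merges:
        # Relabel every member of the absorbed cluster with the kept cluster.
        snapshots.append([kept if lab == absorbed else lab
                          for lab in snapshots[-1]])
    return snapshots


def slider_figure(embedding, snapshots, accuracies):
    """Build a Plotly figure with a slider that recolors a 2D embedding."""
    import plotly.graph_objects as go  # assumed to be installed

    def codes(labels):
        # Map labels to consecutive integers so they can drive a colorscale.
        lookup = {lab: i for i, lab in enumerate(sorted(set(labels)))}
        return [lookup[lab] for lab in labels]

    def note(text):
        # Text anchored in the top-left corner of the plotting area.
        return dict(text=text, xref="paper", yref="paper",
                    x=0.01, y=0.99, showarrow=False)

    texts = ["no merge yet"] + [f"accuracy of last merge: {a:.2f}"
                                for a in accuracies]
    fig = go.Figure(go.Scattergl(
        x=[p[0] for p in embedding], y=[p[1] for p in embedding],
        mode="markers",
        marker=dict(color=codes(snapshots[0]), colorscale="Turbo")))
    # Each slider step swaps the point colors and the corner annotation.
    steps = [dict(method="update", label=str(i),
                  args=[{"marker.color": [codes(s)]},
                        {"annotations": [note(t)]}])
             for i, (s, t) in enumerate(zip(snapshots, texts))]
    fig.update_layout(
        annotations=[note(texts[0])],
        sliders=[dict(active=0, steps=steps,
                      currentvalue=dict(prefix="merge step: "))])
    return fig


# Toy example: five points, three initial clusters, two merges.
labels = [0, 1, 2, 2, 1]
merges = [(1, 2, 0.8), (0, 1, 0.6)]
snapshots = track_merges(labels, merges)
```

With a 2D embedding at hand (for example the UMAP coordinates as a list of `(x, y)` pairs), `slider_figure(embedding, snapshots, [m[2] for m in merges]).write_html("clusterings.html")` would produce a standalone file. Only one label list per step is stored, so the memory cost grows with the number of merges times the number of observations.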