Closed by sreichl 1 year ago
Pseudocode
Results with 0.025 (i.e., 2.5%) as the max edge weight cutoff. Does that mean 5% of cells want to go from cluster A to B or vice versa?
100 trees: ARI 0.8619293526483519, NMI 0.8905384726712815
1000 trees: ARI 0.8738051377073799, NMI 0.8995254588822036
5000 trees: ARI 0.8689756028623556, NMI 0.8984957984189852
Results with 0.975 as the accuracy threshold:
100 trees: ARI 0.8570580695592698, NMI 0.898311524053469
1000 trees: ARI 0.82217666507298, NMI 0.8769193958701921
5000 trees: ARI 0.853381198544215, NMI 0.8879365153842752
Stopping criterion candidates: max_weight, accuracy, f1_score, or a change in accuracy below, e.g., 0.05%.
from sklearn.metrics import accuracy_score, f1_score
# Check if the accuracy threshold is met
accuracy = accuracy_score(labels, new_labels)
print(f"Accuracy: {accuracy}")
# Weighted F1 accounts for unequal cluster sizes
f1 = f1_score(labels, new_labels, average='weighted')
print(f"F1: {f1}")
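The candidate criteria above could be combined into a single check, roughly as sketched below. This is only an illustration, not the implementation used in this repository: the function name `should_stop`, its parameters, and the order in which the criteria are tested are all assumptions. The default cutoffs (0.025, 0.975, and 0.05% = 0.0005) are the values mentioned in this thread.

```python
def should_stop(max_weight, accuracy, prev_accuracy=None,
                max_weight_cutoff=0.025, accuracy_threshold=0.975,
                min_accuracy_gain=0.0005):
    """Return the name of the criterion that triggers a stop, or None.

    Hypothetical helper: all names and the check order are assumptions.
    """
    # Largest edge weight between any two clusters is below the cutoff
    if max_weight < max_weight_cutoff:
        return "max_weight"
    # Classifier accuracy on the current labels reached the threshold
    if accuracy >= accuracy_threshold:
        return "accuracy"
    # Accuracy improved by less than the minimum gain since the last merge
    if prev_accuracy is not None and accuracy - prev_accuracy < min_accuracy_gain:
        return "accuracy_gain"
    return None
```

Returning the name of the triggering criterion instead of a plain boolean makes it easier to report why merging stopped.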
@sreichl In addition to adding a stopping criterion and a recommended clustering, could you keep the clusterings at each merging step and return an interactive plot with a slider where the accuracy of the last merger is shown as text in the corner? So one could check how the clusters look for different thresholds and potentially pick one that looks good.
Of course, the results would then not be directly comparable via indices, but it might be useful for troubleshooting if only one big cluster is left, and it should not be too expensive to store the labels and recolor the UMAP.
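A minimal sketch of the bookkeeping this would need, assuming one label vector and one accuracy value are available after each merge. The class `MergeHistory` and its methods are hypothetical names, not part of the existing code. A slider in any plotting library could then index into the stored steps, recolor the UMAP with `labels[step]`, and show `caption(step)` in the corner.

```python
from dataclasses import dataclass, field


@dataclass
class MergeHistory:
    """Hypothetical container for the clustering after each merge step."""
    labels: list = field(default_factory=list)      # one label list per step
    accuracies: list = field(default_factory=list)  # accuracy after each step

    def record(self, step_labels, accuracy):
        # Copy the labels so later in-place merges do not alter the history
        self.labels.append(list(step_labels))
        self.accuracies.append(float(accuracy))

    def n_clusters(self, step):
        return len(set(self.labels[step]))

    def caption(self, step):
        # Text that a slider callback could display next to the plot
        return (f"step {step}: {self.n_clusters(step)} clusters, "
                f"accuracy {self.accuracies[step]:.3f}")

    def first_step_reaching(self, threshold):
        # First merge step whose accuracy meets the threshold, else None
        for step, acc in enumerate(self.accuracies):
            if acc >= threshold:
                return step
        return None


# Toy example with made-up labels and accuracies, for illustration only
history = MergeHistory()
history.record([0, 1, 2, 2], 0.90)
history.record([0, 1, 1, 1], 0.98)
history.record([0, 0, 0, 0], 1.00)
recommended = history.first_step_reaching(0.975)
print(history.caption(recommended))
```

Storing one label list per step costs memory proportional to the number of cells times the number of merges, which stays small as long as the number of merge steps is modest.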
thanks, added it to #28