init clustering with "User" specified doesn't work if user supplied clusters are not consecutive integers

kieranrcampbell commented 3 months ago

When utility.init_clustering is called with initial_clustering_method == "User", it assumes that the user-supplied clusters in labels are consecutive integers, which then breaks at

init_e[c, :] = adata.X[init_l == c].mean(0)

if, e.g., init_l contains strings.

The solution isn't as simple as casting labels to int, as there is no guarantee that the labels are either integers or consecutive.

Would recommend something like

cluster_idx_map = dict(zip(range(len(unique_labels)), unique_labels))

then doing

adata.X[init_l == cluster_idx_map[c]]
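A minimal sketch of the mapping idea above, using plain NumPy rather than the actual starling code. The helper name init_centroids_from_labels, the toy matrix X (standing in for adata.X), and the example labels are all assumptions made for illustration; this is not necessarily how the fix in the repository is implemented.

```python
import numpy as np


def init_centroids_from_labels(X, labels):
    """Hypothetical helper: per-cluster mean of X for arbitrary labels."""
    labels = np.asarray(labels)
    unique_labels = np.unique(labels)  # sorted distinct labels

    # Map consecutive row index -> original label (strings, gaps, any offset).
    cluster_idx_map = dict(enumerate(unique_labels))

    init_e = np.zeros((len(unique_labels), X.shape[1]))
    for c, label in cluster_idx_map.items():
        # Select rows by the original label, store them at the consecutive index.
        init_e[c, :] = X[labels == label].mean(0)
    return init_e, cluster_idx_map


# Toy data standing in for adata.X and the user-supplied labels (assumed values).
X = np.array([[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]])
labels = ["T cell", "T cell", "B cell"]

init_e, cluster_idx_map = init_centroids_from_labels(X, labels)
```

Keeping cluster_idx_map around also lets the fitted cluster indices be translated back to the user's original labels afterwards.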

SarahAsbury commented 3 months ago

Also ran into issues when using 1-indexed integers instead of 0-indexed integers for clusters.
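One way to sidestep the indexing offset, sketched here under the assumption that labels are a 1-D array, is to let np.unique re-encode them as 0-based consecutive integers. The label values below are made up for illustration, and init_l is reused only as a name from the snippet quoted above.

```python
import numpy as np

# Hypothetical 1-indexed user labels.
labels = np.array([1, 3, 3, 1, 2])

# return_inverse gives, for each entry, its position in the sorted unique labels,
# i.e. a 0-based consecutive encoding regardless of the original offset.
unique_labels, init_l = np.unique(labels, return_inverse=True)

# The original labels can be recovered by indexing back into unique_labels.
recovered = unique_labels[init_l]
```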

cklamann commented 2 months ago

Closed with #45. @SarahAsbury if your problem persists, please open a separate issue. Thanks!

camlab-bioml / starling

init clustering with "User" specified doesn't work if user supplied clusters are not consecutive integers #44