TL;DR: Why is cluster 344 larger than cluster 6? I thought clusters were numbered based on cell count.
I am working with a very large (180k+ cells) scRNA+scATAC multiome dataset. After clustering the data in WNN space using FindClusters(mydata, graph.name = "wsnn", algorithm = 3, resolution = 0.01), I am getting hundreds of clusters, as described previously (https://github.com/satijalab/seurat/discussions/5427). Most of these clusters contain only about 5 cells on average. However, cluster 344 has almost 7k cells. The count of cells per cluster is shown below.
I was under the impression that cluster numbers were assigned by the number of cells in each cluster, where 0 is the cluster with the greatest number of cells. Is that incorrect? Additionally, I was using algorithm 3 based on the WNN scRNA+scATAC vignette (https://satijalab.org/seurat/articles/weighted_nearest_neighbor_analysis#wnn-analysis-of-10x-multiome-rna-atac). Was there a reason that algorithm 3 was selected for this analysis over the other options?
Hi Megan, you are correct that cluster numbers are assigned in order by cluster size. However, after initial cluster assignments, any singletons that are present are then re-assigned to other larger clusters. I would guess that in this case, a very large number of singletons were assigned to cluster 344, and as a result, it ended up being one of the larger clusters despite starting off much smaller.
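To make the explanation above concrete, here is a minimal toy sketch in base R (no Seurat required). All cluster sizes are made up, and the rule that every singleton goes to the same cluster is an assumption for illustration only; in Seurat the target for each singleton is chosen from the graph, not fixed in advance.

```r
# Toy illustration only: sizes and the reassignment target are hypothetical.
# Initial labels follow size order: cluster 0 is the largest, 3 the smallest.
initial_sizes <- c("0" = 50, "1" = 30, "2" = 10, "3" = 4)
ids <- rep(names(initial_sizes), times = initial_sizes)

# 100 singletons initially receive their own high-numbered labels (4 to 103).
singleton_ids <- as.character(seq(4, length.out = 100))
ids <- c(ids, singleton_ids)

# Assumption: every singleton happens to be best connected to cluster 3,
# so all of them are merged into it.
ids[ids %in% singleton_ids] <- "3"

# Labels are not re-ranked by size after the merge, so the
# high-numbered cluster 3 now outgrows cluster 0.
final_sizes <- table(factor(ids, levels = names(initial_sizes)))
print(final_sizes)
```

On the real object, table(Idents(mydata)) gives the final per-cluster counts. FindClusters also has a group.singletons argument (default TRUE); setting it to FALSE should leave singletons unmerged, which would be one way to check whether this is what inflated cluster 344.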