atarashansky / self-assembling-manifold

The Self-Assembling-Manifold (SAM) algorithm.
MIT License

SAM Algorithm - Clusters #38

Closed Sayyam-Shah closed 2 years ago

Sayyam-Shah commented 2 years ago

Hello,

How do I see Leiden clusters in my SAM output? I ran the SAM algorithm on my HSC clusters and got the output below. I cannot run the Leiden algorithm with scanpy because the neighbour graph is not stored in the uns slot. How, then, do I distinguish cell groups in the SAM output?

image

Sayyam-Shah commented 2 years ago

It seems the neighbors slot is being stored as an overloaded key in uns. How do I unpack this?

image

@atarashansky Could you please comment on this?

atarashansky commented 2 years ago

sam.leiden_clustering(res=1.0) (res is the resolution parameter) will give you sam.adata.obs['leiden_clusters'].

Use sam.scatter(c='leiden_clusters') to plot it.
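For readers who want the two calls above in context, here is a minimal sketch. It assumes the SAM package is importable as `samalg`, that `adata` is an already preprocessed AnnData object, and that `SAM(counts=adata)` followed by `run()` is an acceptable way to fit the model; `cluster_with_sam` is a hypothetical helper name, not part of SAM.

```python
def cluster_with_sam(adata, res=1.0):
    """Fit SAM on `adata` and Leiden-cluster the result.

    Hypothetical helper for illustration; assumes `adata` is preprocessed.
    """
    # Imported here so the dependency is only needed when the helper is called.
    from samalg import SAM

    sam = SAM(counts=adata)         # wrap the AnnData object
    sam.run()                       # fit the self-assembling manifold
    sam.leiden_clustering(res=res)  # res is the resolution parameter
    return sam
```

After `sam = cluster_with_sam(adata)`, the labels should be in sam.adata.obs['leiden_clusters'], and sam.scatter(c='leiden_clusters') plots them, as described above.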

Sorry for the delay in my response!

andygxzeng commented 2 years ago

Hi Alex,

Just to add onto this - I think the issue may be more one of compatibility with scanpy, which relies on the adata.uns['neighbors'] slot that it can no longer access after the recent update.

One issue at present with running sc.external.tl.sam with the default inplace=True parameter is that it returns an adata object for further downstream analysis with scanpy. However, running sc.tl.leiden on that object raises an error (KeyError: 'No "neighbors" in .uns'), and the sam.leiden_clustering function cannot be called on the adata object returned by the default sc.external.tl.sam call.

This would mean that a user would need to specifically call:

    samobj, adata = sc.external.tl.sam(adata, inplace=False)
    samobj.leiden_clustering(res=1)
    adata = samobj.adata

in order to perform the clustering and return to downstream analysis with scanpy after calling SAM.

Would it be possible to ameliorate this, potentially by storing the neighbors as a normal key within adata.uns rather than an overloaded key, so that scanpy will know where to find the neighbourhood matrix for standard leiden clustering with sc.tl.leiden? I believe this was the case a few years/months ago but may have changed after the recent update as we are only now experiencing these compatibility issues with scanpy.

Thanks so much and thanks again for the great tool!

Andy

atarashansky commented 2 years ago

Hi Andy,

I have limited bandwidth at the moment so I'm not sure when I'll be able to post a fix - in the meantime, what you could do is:

    sc.external.tl.sam(adata, inplace=True)
    sc.pp.neighbors(adata)
    sc.tl.leiden(adata)

The neighbors key in uns contains auxiliary metadata about the kNN method used (e.g. the number of nearest neighbors, the method, etc.), which for some reason is now required by sc.tl.leiden. To generate that mandatory dictionary, you can just run the neighbors method again. It will use X_pca by default, which will be SAM's PCA coordinates.
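The workaround can be wrapped in a small function, sketched below. `sam_then_leiden` is a hypothetical helper name, and passing use_rep="X_pca" explicitly is an assumption made here so that the choice of representation does not depend on scanpy's defaults.

```python
def sam_then_leiden(adata, resolution=1.0):
    """Run SAM in place, rebuild scanpy's neighbors metadata, then run Leiden.

    Hypothetical helper for illustration; not part of scanpy or SAM.
    """
    # Imported here so the heavy dependencies load only when the helper is used.
    import scanpy as sc
    import scanpy.external as sce

    # Run SAM; with inplace=True the results are written into `adata`.
    sce.tl.sam(adata, inplace=True)

    # Recompute the kNN graph so that adata.uns['neighbors'] is populated.
    # use_rep is set explicitly instead of relying on the default choice.
    sc.pp.neighbors(adata, use_rep="X_pca")

    # Standard scanpy Leiden clustering; labels go to adata.obs['leiden'].
    sc.tl.leiden(adata, resolution=resolution)
    return adata
```

Note that this clusters on the kNN graph scanpy builds from SAM's PCA coordinates, so the labels may not match those from sam.leiden_clustering exactly.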

Hopefully this works in the interim.