labsyspharm / scimap

Spatial Single-Cell Analysis Toolkit
https://scimap.xyz/
MIT License

strange spatial_LDA -> spatial_cluster results #28

Closed yerahko closed 2 years ago

yerahko commented 2 years ago

Hello again!

When clustering (spatial_cluster) on spatial_LDA results, I am getting strange results, as below.

I always get reasonable spatial_cluster results when training on a single ROI. But with as few as 2 ROIs, I start to get this artifactual-seeming result, visible as clusters forming vertical stripes in one or more of the ROIs.

I have tried both 'knn' and 'radius' as spatial_LDA methods, with varying values of motifs, knn, and radius. The clustering method was always kmeans, because leiden and phenograph always gave me 99 clusters, even with resolution set to 0.1. So I am not actually sure whether it is spatial_LDA or the clustering that is contributing to this.

Conditions which promote the appearance of this "artifact":

Example of a "sensible" spatial clustering result:

image

when one additional ROI is trained together with it, with all the same spatial_LDA and spatial_cluster parameters, that ROI becomes:

image

Some real structure is retained in the lower left corner, while the right side no longer makes sense...

Any idea what could be causing this, or parameters to try which could mitigate?

Thank you again!!
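
One way to tell whether the stripes come from spatial_LDA or from the clustering step is to measure them rather than judge them from the plots. The sketch below is not part of scimap; `extent_fractions`, the `label` column, and the 0.3 / 0.9 thresholds are hypothetical choices for illustration. For every image and label it reports how much of the image's X and Y range the label spans; a vertical stripe is narrow in X but covers almost the whole Y range.

```python
import numpy as np
import pandas as pd

def extent_fractions(obs, label_col, imageid="Unique_ID", x="X", y="Y"):
    """Fraction of each image's X and Y range spanned by every (image, label) group."""
    per_image = obs.groupby(imageid, observed=True)[[x, y]]
    image_span = per_image.max() - per_image.min()
    per_group = obs.groupby([imageid, label_col], observed=True)[[x, y]]
    group_span = per_group.max() - per_group.min()
    # Repeat each image's span once per label so the division is row-aligned.
    images = group_span.index.get_level_values(imageid)
    frac = group_span / image_span.loc[images].to_numpy()
    frac.columns = ["x_frac", "y_frac"]
    frac["n_cells"] = per_group.size()
    # Hypothetical thresholds: narrow in X, nearly full height in Y.
    frac["stripe_like"] = (frac["x_frac"] < 0.3) & (frac["y_frac"] > 0.9)
    return frac

# Synthetic demo: one image labelled in vertical bands, one labelled by quadrant.
rng = np.random.default_rng(0)
xs, ys = rng.uniform(0, 100, 1000), rng.uniform(0, 100, 1000)
demo = pd.concat([
    pd.DataFrame({"Unique_ID": "striped", "X": xs, "Y": ys,
                  "label": (xs // 25).astype(int)}),
    pd.DataFrame({"Unique_ID": "quadrants", "X": xs, "Y": ys,
                  "label": (xs > 50).astype(int) * 2 + (ys > 50).astype(int)}),
], ignore_index=True)
print(extent_fractions(demo, "label"))
```

On real data the same function could be run twice: once with `label_col='spatial_kmeans'`, and once on each cell's dominant motif. Assuming the motif weights are stored as a DataFrame in `adata.uns['spatial_lda']` (which `df_name='spatial_lda'` suggests, but this thread does not confirm), the dominant motif would be `adata.uns['spatial_lda'].idxmax(axis=1)`. If the dominant motif is already stripe-like, the artifact is probably upstream of the clustering.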

ajitjohnson commented 2 years ago

@yerahko can you post the commands you ran and also share a snippet of what your adata.obs looks like? Thank you.

yerahko commented 2 years ago

Hi @ajitjohnson, here is an example of code and what adata.obs looks like.

Of the parameters below, I am mostly varying radius, knn, and lda_method.

Thank you!

import matplotlib.pyplot as plt
import scimap as sm

# Parameters varied between runs (adata is assumed to be loaded already)
num_motifs = 10
radius = 20
knn = 10
lda_method = 'radius'  # 'knn' or 'radius'
cluster_method = 'kmeans'

# Fit spatial LDA on all images together
adata = sm.tl.spatial_lda(adata, num_motifs=num_motifs, radius=radius, knn=knn, method=lda_method,
                          x_coordinate='X', y_coordinate='Y', phenotype='cellsimple', imageid='Unique_ID',
                          random_state=0)

# Cluster cells on the spatial_lda output
adata = sm.tl.spatial_cluster(adata, random_state=0, df_name='spatial_lda',
                              method=cluster_method, k=10)

def voronoi_plots(unique_ID, color_by='spatial_kmeans'):
    # Subset manually: passing imageid='Unique_ID', subset=unique_ID to sm.pl.voronoi was breaking
    selected = adata[adata.obs['Unique_ID'].isin([unique_ID]), :].copy()
    sm.pl.voronoi(adata=selected,
                  color_by=color_by, x_coordinate='X', y_coordinate='Y')
    plt.show()

# Plot phenotypes and spatial clusters for every image
for i in adata.obs['Unique_ID'].unique():
    print(i)
    voronoi_plots(color_by='cellsimple', unique_ID=i)
    voronoi_plots(color_by='spatial_%s' % cluster_method, unique_ID=i)

adata.obs:

X         Y           Area      cellsimple  Patient_ID  ROI_ID  Unique_ID  leiden  spatial_kmeans
1.400000  294.500000  0.047619  B           PT5         ROI_9   PT5.ROI_9  1       2
1.428571  329.380952  0.057143  B           PT5         ROI_9   PT5.ROI_9  1       2
1.764706  345.529412  0.180952  Unknown     PT5         ROI_9   PT5.ROI_9  22      2
2.300000  634.600000  0.047619  B           PT5         ROI_9   PT5.ROI_9  1       4
2.096154  219.807692  0.352381  Unknown     PT5         ROI_9   PT5.ROI_9  1       2

79308 rows × 9 columns

ajitjohnson commented 2 years ago

Hi @yerahko, it all looks good to me. I have never run into this issue before. I just want to confirm that your Unique_ID is unique to each image/ROI (it looks like it is). For background: in a dataset with multiple images, each unique category within Unique_ID is processed independently. So if you have multiple ROIs within a single image, each ROI should be treated as an individual image for this purpose.

If that is what you did, I am not sure what else is going on, and I might need some example data from you to debug it.
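
The one-ID-per-image requirement described above can be checked directly on adata.obs. This is a minimal sketch, assuming the Patient_ID and ROI_ID columns shown earlier in the thread together identify an image; `check_image_ids` is a hypothetical helper, not a scimap function. It lists IDs that cover more than one patient/ROI pair, and pairs that are split across more than one ID.

```python
import pandas as pd

def check_image_ids(obs, imageid="Unique_ID", parts=("Patient_ID", "ROI_ID")):
    """Return (IDs mapping to several patient/ROI pairs, pairs mapping to several IDs)."""
    # One row per distinct (image ID, patient, ROI) combination.
    pairs = obs[[imageid, *parts]].drop_duplicates()
    id_counts = pairs[imageid].value_counts()
    ids_with_many_pairs = id_counts[id_counts > 1].index.tolist()
    pair_counts = pairs.groupby(list(parts), observed=True)[imageid].nunique()
    pairs_with_many_ids = pair_counts[pair_counts > 1].index.tolist()
    return ids_with_many_pairs, pairs_with_many_ids

# Synthetic example in which the ID "PT5" wrongly covers two ROIs.
obs = pd.DataFrame({
    "Unique_ID": ["PT5", "PT5", "PT6.ROI_1"],
    "Patient_ID": ["PT5", "PT5", "PT6"],
    "ROI_ID": ["ROI_9", "ROI_10", "ROI_1"],
})
print(check_image_ids(obs))
```

Two empty lists mean every ID corresponds to exactly one patient/ROI pair and vice versa. With real data the call would be `check_image_ids(adata.obs)`.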

yerahko commented 2 years ago

Hi @ajitjohnson yup, that is how I'm using Unique_ID. Each value corresponds to a single image.

Instead of spatial_LDA, I ran spatial_count -> spatial_cluster and that workflow did run successfully without similar artifacts, so for the time being we will work with the spatial_count results.

Thank you for your work on creating and maintaining this package—it's been a great tool for us!
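
For readers unfamiliar with the count-based alternative mentioned above, the sketch below illustrates the general idea of a radius neighborhood count: for every cell, count the phenotypes of the cells within a fixed radius, one image at a time. It is an illustration under stated assumptions, not scimap's implementation; `radius_phenotype_counts` is a hypothetical name, and whether scimap counts the center cell or normalizes the rows is not stated in this thread.

```python
import numpy as np
import pandas as pd

def radius_phenotype_counts(obs, radius=20.0, imageid="Unique_ID",
                            x="X", y="Y", phenotype="cellsimple"):
    """Per-cell phenotype counts within `radius`, computed separately for each image."""
    pieces = []
    for _, img in obs.groupby(imageid, observed=True):
        xy = img[[x, y]].to_numpy(dtype=float)
        # Brute-force pairwise distances: fine for a sketch, too slow for ~80k cells.
        dist = np.sqrt(((xy[:, None, :] - xy[None, :, :]) ** 2).sum(axis=2))
        near = dist <= radius  # assumption: the center cell counts itself
        onehot = pd.get_dummies(img[phenotype]).astype(int)
        counts = near.astype(int) @ onehot.to_numpy()
        pieces.append(pd.DataFrame(counts, index=img.index, columns=onehot.columns))
    # Phenotypes absent from an image get a count of zero there.
    return pd.concat(pieces).fillna(0).astype(int).reindex(obs.index)

# Tiny synthetic example: image "a" has three cells, image "b" has one.
obs = pd.DataFrame(
    {"Unique_ID": ["a", "a", "a", "b"],
     "X": [0.0, 1.0, 100.0, 0.0],
     "Y": [0.0, 0.0, 0.0, 0.0],
     "cellsimple": ["B", "T", "B", "T"]},
    index=["a0", "a1", "a2", "b0"],
)
print(radius_phenotype_counts(obs, radius=5.0))
```

Cells in different images never count as neighbors, even at identical coordinates, which mirrors the per-image processing described earlier in the thread.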

ajitjohnson commented 2 years ago

Weird. If you would like me to debug this, feel free to send me a subset of the data later on. Glad you are enjoying it :)