JinmiaoChenLab / SpatialGlue

SpatialGlue is a novel deep learning method for spatial multi-omics data integration.
GNU Affero General Public License v3.0

Questions about calculating two unsupervised metrics #12

Open AIYUE1211 opened 3 months ago

AIYUE1211 commented 3 months ago

Dr. Yahui Long,

Sorry to bother you. I have a couple of questions and issues regarding the calculation of two unsupervised metrics. I have read your SpatialGlue paper and code, but I'm encountering some discrepancies between my results and those presented in your paper when implementing the unsupervised metrics of Moran's I and Jaccard Similarity. I'm writing to respectfully inquire if there are any specific precautions or nuances that I should be aware of when calculating these metrics. I've outlined below how I'm currently calculating Moran's I scores and Jaccard Similarity, and I would greatly appreciate any guidance or insights you can provide.

  1. Moran's I:
    I have adopted the implementation from Squidpy, utilizing the integrated embeddings and spatial coordinates of each cluster as input. I have set the n_neighs parameter to either 4 or 6, depending on the data type, and my code snippet is as follows:

    moranI_scores_per_cluster = {}
    for cluster in unique_clusters:
        sub_adata = adata[adata.obs['SG'] == cluster, :].copy()
        sub_adata.obsm['spatial'] = adata[sub_adata.obs_names, :].obsm['spatial'].copy()

        sq.gr.spatial_neighbors(sub_adata, n_neighs=n_neighs, coord_type='grid', n_rings=1)
        sq.gr.spatial_autocorr(sub_adata, mode='moran', transformation=True, seed=2022)
        moranI_scores = sub_adata.uns["moranI"]['I'].mean(axis=0)

        moranI_scores_per_cluster[cluster] = moranI_scores

Regarding Moran's I, could you confirm whether this calculation process is correct, and whether there are any additional precautions or considerations I should be aware of when selecting and adjusting the parameters?
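For reference when comparing numbers, the textbook definition of global Moran's I for a single feature can be written out directly. The sketch below is only an illustration of that formula, I = (N / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2. The function name `morans_i` and the dictionary-based weight format are hypothetical choices made here, and Squidpy's own weighting, row normalization, and feature selection may differ from this plain version.

```python
def morans_i(values, weights):
    """Global Moran's I for one feature (textbook formula, illustrative only).

    values  : list of floats, one value per spot
    weights : dict mapping (i, j) index pairs to the spatial weight w_ij
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(weights.values())
    # Cross-products of deviations over all weighted neighbour pairs.
    num = sum(w * dev[i] * dev[j] for (i, j), w in weights.items())
    # Total squared deviation of the feature.
    den = sum(d * d for d in dev)
    return (n / w_sum) * num / den


# Toy example: four spots on a line, each linked to its immediate neighbours.
chain = {(0, 1): 1, (1, 0): 1, (1, 2): 1, (2, 1): 1, (2, 3): 1, (3, 2): 1}
smooth_score = morans_i([1.0, 2.0, 3.0, 4.0], chain)         # smooth gradient
alternating_score = morans_i([1.0, -1.0, 1.0, -1.0], chain)  # checkerboard-like
```

A smooth gradient gives a positive score and an alternating pattern gives a negative one, which is a quick sanity check before comparing values computed on full data or on per-cluster subsets.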

  2. Jaccard similarity: Based on the formula provided in your supplementary file, I have attempted to compute the Jaccard similarity using the integrated embedding and the “modality data” (I use the SpatialGlue input data, that is, the data after PCA processing). The output is the Jaccard similarity of every spot, and I take the average over all spots as the final result. Is this calculation correct? What is the input modality data: is it the PCA input of the model, or some other data? Additionally, I would like to know if there are any other details I should pay attention to during this calculation.
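To make the procedure described above concrete, here is a minimal sketch of one possible per-spot Jaccard calculation. It assumes (this is not confirmed by the authors) that the score compares each spot's k nearest neighbours in the integrated embedding with its k nearest neighbours in the modality feature space, and that the final value is the mean over spots. The helper names `knn_sets` and `mean_jaccard`, the Euclidean distance, and the choice of k are all hypothetical.

```python
import math


def knn_sets(points, k):
    """For each point, return the set of indices of its k nearest neighbours (Euclidean)."""
    neighbours = []
    for i, p in enumerate(points):
        # Sort all other points by distance; ties are broken by index.
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbours.append({j for _, j in dists[:k]})
    return neighbours


def mean_jaccard(embedding, modality, k):
    """Mean and per-spot Jaccard overlap between kNN sets in two feature spaces."""
    emb_nn = knn_sets(embedding, k)
    mod_nn = knn_sets(modality, k)
    per_spot = [len(a & b) / len(a | b) for a, b in zip(emb_nn, mod_nn)]
    return sum(per_spot) / len(per_spot), per_spot


# Toy example with four spots and one-dimensional features.
embedding = [(0.0,), (1.0,), (5.0,), (6.0,)]
modality_same = [(0.0,), (1.0,), (5.0,), (6.0,)]
modality_shuffled = [(0.0,), (5.0,), (1.0,), (6.0,)]

same_mean, _ = mean_jaccard(embedding, modality_same, k=1)
shuffled_mean, _ = mean_jaccard(embedding, modality_shuffled, k=1)
```

Under these assumptions, identical neighbourhood structure yields a mean of 1 and fully disjoint neighbourhoods yield 0. The result depends on k, the distance metric, and which representation of the modality (raw, normalized, or PCA) is used, so these are the settings worth confirming with the authors.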

Thank you very much for your time and assistance.

Enderlogic commented 1 month ago

Hi,

I'm facing the same problem as you. Can you reproduce the results of the unsupervised metrics presented in the authors' paper?

HelloWorldLTY commented 1 month ago

Hi,

I'm facing the same problem as you and raised it a couple of weeks ago. It would be great if you could share your thoughts. Thanks.