saezlab / visium_heart

Spatial transcriptomics of heart tissue
GNU General Public License v3.0

can you please elaborate how the cell type proportions were calculated to define niches. #7

Closed deconvolute closed 2 years ago

deconvolute commented 2 years ago

Hi,

We are trying to understand how your find_niche_ct.R script finds niches, but we are not sure how the composition matrix is generated. Could you please explain how the cell-type proportions used to define niches were calculated? Would it also be possible to share the input files needed to run find_niche_ct.R?

Thank you

roramirezf commented 2 years ago

Hi,

I'm copying the methods section from the paper, hoping it helps:

"To identify groups of spots in the different samples that shared similar cell-type compositions, we transformed the estimated cell-type proportions of each spatial transcriptomics spot and slide into isometric log ratios (ILR) [88], and clustered spots into groups. These niches represent groups of spots that are similar in cell composition and represent potential shared structural building blocks of our different slides; we refer to these groups of spots as cell-type niches. Louvain clustering of spots was performed by first creating a shared nearest neighbour graph with k different number of neighbours (10, 20, 50) using scran’s [89] (v1.18.5) buildSNNGraph function. Then, we estimated the clustering resolution that maximized the mean silhouette score of each cluster. We assigned overrepresented cell types in each structure by comparing the distribution of cell-type compositions within a cell-type niche versus the rest using Wilcoxon tests (FDR < 0.05). We tested if a given cell state was more representative of a cell-type niche by performing Wilcoxon tests between each niche and the rest (FDR < 0.05). Only positive state scores were considered in this analysis."
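For readers unfamiliar with the ILR step in the quoted methods, here is a minimal base-R sketch of one possible ILR transform (the "pivot" basis). It is only an illustration under assumptions: the function name `ilr_pivot`, the pseudocount used to avoid `log(0)`, the toy proportions and the cell-type labels are all hypothetical, and the basis used in the actual script may differ.

```r
# Isometric log-ratio (ILR) transform of one composition, base R only.
# Assumption: a small pseudocount replaces zeros; its value is illustrative.
ilr_pivot <- function(x, pseudocount = 1e-6) {
  x <- (x + pseudocount) / sum(x + pseudocount)  # re-close so parts sum to 1
  D <- length(x)
  log_x <- log(x)
  # Coordinate i contrasts the geometric mean of parts 1..i with part i + 1
  vapply(seq_len(D - 1), function(i) {
    sqrt(i / (i + 1)) * (mean(log_x[1:i]) - log_x[i + 1])
  }, numeric(1))
}

# Toy matrix: 3 spots (rows) x 4 hypothetical cell types (columns)
props <- rbind(
  spot_1 = c(0.70, 0.10, 0.10, 0.10),
  spot_2 = c(0.25, 0.25, 0.25, 0.25),
  spot_3 = c(0.05, 0.05, 0.10, 0.80)
)
colnames(props) <- c("CM", "Fib", "Endo", "Myeloid")

# apply() returns coordinates in rows, so transpose back to spots x coordinates
ilr_mat <- t(apply(props, 1, ilr_pivot))
dim(ilr_mat)  # 3 spots x 3 ILR coordinates (one fewer than the cell types)
```

A composition with D cell types maps to D - 1 unconstrained coordinates, and Euclidean distances between the transformed spots equal Aitchison distances between the original compositions, which is why the transformed values are a reasonable input for neighbour-graph clustering.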

In line 38 of the mentioned script you can see how I create the composition of each spot: I divide each per-cell-type cell2location estimate by the sum of all the estimates in that spot:

Here, each matrix contains spots in rows and cell types in columns.

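# Generate a list of matrices of c2l scores converted to proportions.
# c2l_df provides the path to each cell2location matrix (c2l_file)
# and the name of its sample (sample); map2() comes from purrr.
list_matrices <- map2(c2l_df$c2l_file, c2l_df$sample, function(f, s) {
  mat <- readRDS(f)
  # Prefix the spot names with the sample name
  rownames(mat) <- paste0(s, "..", rownames(mat))
  # Divide each spot (row) by its total so that its values sum to 1
  prop_mat <- base::apply(mat, 1, function(x) {
    x / sum(x)
  })
  # apply() returns cell types in rows, so transpose back to spots x cell types
  return(t(prop_mat))
})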
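To make the normalisation concrete, here is a small self-contained sketch that applies the same row-wise division to a toy matrix. The spot names, cell-type labels, sample name and values are all made up for illustration; they are not taken from the real cell2location output.

```r
# Toy cell2location-like abundance matrix: 3 spots (rows) x 3 cell types (columns).
# All names and values are hypothetical.
mat <- matrix(
  c(2,   1,   1,
    0.5, 0.5, 4,
    3,   3,   3),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("AAAC-1", "AAAG-1", "AACT-1"), c("CM", "Fib", "Endo"))
)

sample_name <- "sample_A"  # hypothetical sample identifier
rownames(mat) <- paste0(sample_name, "..", rownames(mat))

# Same operation as in the snippet above: divide each row by its sum
prop_mat <- t(apply(mat, 1, function(x) x / sum(x)))

# Equivalent vectorised form: each row is divided by its own total
prop_mat_alt <- mat / rowSums(mat)

rowSums(prop_mat)                  # each spot's proportions sum to 1
all.equal(prop_mat, prop_mat_alt)  # both forms give the same matrix
```

The vectorised form avoids the transpose, but both produce a spots x cell-types matrix in which every row is a composition.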

If you want to look at the cell2location scores and proportions separately, I suggest downloading the slide objects provided in HCA or Zenodo :)

hope this helps