satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat
Other
2.29k stars 912 forks source link

BuildNicheAssay on multiple images (request + workaround) #8545

Open swbioinf opened 8 months ago

swbioinf commented 8 months ago

Hi,

I've been wanting to identify cellular niches within my cosmx dataset - which has two slides (images/ or 'fovs' in seurat's examples).

The 'BuildNicheAssay' function expects an fov to be supplied. But I want niches to be calculated across both images together.

It looks like that the fov is needed used to get the tissue coordinates to calculate the spatial neighbour network. I hacked up a version of that function (below) that builds a number of spatial networks then glues them all together (making a disjoint network where cells on different slides are never adjacent.).

My code is not great, but the general approach seems to work. Any chance of such a modification making it into the main Seurat function?

This issue is also described here: https://github.com/satijalab/seurat/issues/8126

Thanks


# cells on different slides will never be neighbours. (take that 3d nonsense elshewre.)
merge_neighbours <- function (neighbors1, neighbors2) {
  # in the main objects, onlysnn is a Graph object. I wonder if nn is only accidentally not.
  neighbors <- list()
  neighbors$nn <- merge_non_overlapping_sparse_matricies(neighbors1$nn, neighbors2$nn)
  neighbors$snn <- SeuratObject::as.Graph(merge_non_overlapping_sparse_matricies(neighbors1$snn, neighbors2$snn) ) 
  return(neighbors)
}

merge_non_overlapping_sparse_matricies <- function(n1, n2) {
  # I'm fairly sure there's a better  function to do this.
  # N1 + N2 => 
  #
  # n1 X
  # X n2
  #

  filler_top_right   <- SparseEmptyMatrix(nrow     = nrow(n1),     ncol     = ncol(n2), 
                                          rownames = rownames(n1), colnames = colnames(n2) )
  filler_bottom_left <- SparseEmptyMatrix(nrow     = nrow(n2),     ncol     = ncol(n1),
                                          rownames = rownames(n2), colnames = colnames(n1) )
  n <-   rbind( cbind(n1,                  filler_top_right), 
                cbind(filler_bottom_left, n2              ) )
  return(n)
}

# Modified version of Seurats BuildNicheAssay that runs acros all fovs at once.
BuildNicheAssay.using_all_fovs <- function(
  object,
  group.by,
  assay = "niche",
  neighbors.k = 20,
  niches.k = 4
) {

  # empty
  neighbors.all <- list()
  neighbors.all$nn  <- SparseEmptyMatrix(0,0)
  neighbors.all$snn <- SparseEmptyMatrix(0,0)

  for (fov in names(object@images)) {
    coords <- GetTissueCoordinates(object[[fov]], which = "centroids")
    cells <- coords$cell

    rownames(coords) <- cells
    coords <- as.matrix(coords[ , c("x", "y")])
    neighbors    <- FindNeighbors(coords, k.param = neighbors.k)
    neighbors$nn <- neighbors$nn[Cells(object[[fov]]), Cells(object[[fov]])]

    # Merge with previous
    neighbors.all <- merge_neighbours(neighbors.all, neighbors) 
  }
  # rename
  neighbors <- neighbors.all

  # Need a list of all those cells, used later.
  # put all cells into one list (being sure order matches network, might not match object order.)
  cells <- colnames(neighbors$nn)

  # Continuouing on the BuildNicheAssay function
  # build cell x cell type matrix
  ct.mtx <- matrix(
    data = 0,
    nrow = length(cells),
    ncol = length(unlist(unique(object[[group.by]])))
  )
  rownames(ct.mtx) <- cells
  colnames(ct.mtx) <- unique(unlist(object[[group.by]]))
  cts <- object[[group.by]]
  for (i in 1:length(cells)) {
    ct <- as.character(cts[cells[[i]], ])
    ct.mtx[cells[[i]], ct] <- 1
  }

  # create niche assay
  sum.mtx <- as.matrix(neighbors$nn %*% ct.mtx)
  niche.assay <- CreateAssayObject(counts = t(sum.mtx))
  object[[assay]] <- niche.assay
  DefaultAssay(object) <- assay

  # cluster niches assay
  object <- ScaleData(object)
  results <- kmeans(
    x = t(object[[assay]]@scale.data),
    centers = niches.k,
    nstart = 30
  )
  object$niches <- results[["cluster"]]

  return(object)

}
zskylarli commented 7 months ago

Hi - glad you were able to find a solution! Just out of curiosity, do you think you could achieve something similar if you ran BuildNicheAssay on the images separately, joined those niche assays, and then ran FindClusters on that joined assay? Since niches are calculated based on spatial proximity, it doesn't make much sense to me to try to join 2 FOVs unless you expect larger neighborhoods across them, but if you could expand that would be great!

swbioinf commented 7 months ago

Hi @zskylarli ah yes, that does sound more efficient! (and avoids the adjacentcy patchwork quilt!). Counts of neighbourhood cell types can be calculated independanly per layer, as you say its only the clustering step where we need to pool it.

Have moved on with my analysis for now though, If I get the chance to tidy that up next time I'll drop an update here. (or do a pull request if I get keen).

Sarah.

andrewsong777 commented 7 months ago

Hello! I have a single seurat object with 13 different tissues. I am wondering if I can just run BuildNicheAssay on this single object (which runs successfully) or if I need to run 13 separate BuildNicheAssays separately as described above and then combine?

swbioinf commented 7 months ago

Hi @andrewsong777 Interesting. Where these run on 13 different slides/runs? ie. Are the 13 images or one? object@images

My issue was for a few tissue samples spread over two slides (so two images confusingly called 'fov' and 'fov.1' -, but with multiple fovs on each (NB: What seurat calls an fov is different to what nanostring calls a fov)). I wrote this workaround to handle that.

If the current function works for you then great - maybe your object is built differently? (For my interest - how did you prepare it? How are the global x/y coordinates on different slides handled?)