welch-lab / liger

R package for integrating and analyzing multiple single-cell datasets
GNU General Public License v3.0
380 stars 78 forks source link

Bug in Seurat's wrapper function #292

Open soheilj opened 10 months ago

soheilj commented 10 months ago

Hi,

I think the following part in the RunOptimizeALS() SeuratWrapper function has potentially a bug.

   split.cells <- split(x = colnames(x = object), f = split.by)
   scale.data <- lapply(X = split.cells, FUN = function(x) {
      return(t(x = GetAssayData(object = object, slot = "scale.data",
                                assay = assay)[, x]))
   })
   out <- rliger::optimizeALS(object = scale.data, k = k, lambda = lambda,
                              thresh = thresh, max.iters = max.iters, nrep = nrep,
                              H.init = H.init, W.init = W.init, V.init = V.init, rand.seed = rand.seed,
                              print.obj = print.obj)
   colnames(x = out$W) <- VariableFeatures(object = object)

Fetching the scaled data matrix from the Seurat object without specifying the gene order would not necessarily have the same order of genes as appears in the VariableFeatures() function. So in the last line, gene names can be assigned to column names of the out$W matrix in an incorrect order.

I think adding the gene order after the GetAssayData() would be an easy fix.

   scale.data <- lapply(X = split.cells, FUN = function(x) {
      return(t(x = GetAssayData(object = object, slot = "scale.data",
                                assay = assay)[VariableFeatures(object = object), x]))
   })
mvfki commented 5 months ago

That's a great catch!

We just made a new release of rliger package 2.0. And we are abandoning the seurat wrappers since we allowed the native Seurat object support in the new version. I.e. you can directly call a rliger function using a Seurat object now. The rliger wrapper functions in SeuratWrappers package is not yet updated accordingly but we are planning on deprecating them.

For the specific point you're making, I'm very sure that now the feature names come directly from the scaled matrix objects. And it doesn't matter if you use a Seurat or liger object.

Yichen