kstreet13 / slingshot

Functions for identifying and characterizing continuous developmental trajectories in single-cell data.

Error running slingshot: system is computationally singular #210

Closed sopenaml closed 1 year ago

sopenaml commented 1 year ago

Hi, I'm running slingshot on a dataset with 95,000 cells. When I subset my data by lineage, so that I can then run tradeSeq on each slingshot lineage, I run into a problem:

sce <- as.SingleCellExperiment(seurat)
reducedDim(sce, "PCA") <- Embeddings(seurat, reduction = "pca")
sce <- slingshot(sce, clusterLabels = 'seurat_clusters', reducedDim = 'PCA', start.clus = 0)
pt <- slingPseudotime(sce)

seurat <- AddMetaData(seurat,
                      metadata = pt,
                      col.name = colnames(pt))
# subset the Seurat object by lineage
lin1 <- subset(seurat, subset = Lineage1 != "NA")
DefaultAssay(lin1) <- "RNA"
sce <- as.SingleCellExperiment(lin1)
# get the PCA embeddings and add them to the reducedDim slot of sce
reducedDim(sce, "PCA") <- Embeddings(lin1, reduction = "pca")
slingshot(sce, clusterLabels = 'seurat_clusters',
          reducedDim = 'PCA',
          start.clus = 0)
Error in solve.default(s1 + s2) : 
  system is computationally singular: reciprocal condition number = 2.38147e-20

I have looked at previous issues https://github.com/kstreet13/slingshot/issues/35 and https://github.com/kstreet13/slingshot/issues/87 and, using the code from the latter, I have checked whether my clusters are highly linear. If I understood the thread properly, that is not the case for my data; see the attached example for one cluster (image: clusters).

I have also tried passing dist_clusters_diag as dist.fun, with no success.

slingshot(sce, clusterLabels = 'seurat_clusters',
          reducedDim = 'PCA',
          start.clus = 0, dist.fun = dist_clusters_diag)
Error in solve.default(s1 + s2) : 
  system is computationally singular: reciprocal condition number = 2.38147e-20

Some of my clusters have few cells or no cells at all:

table(lin1@meta.data$seurat_clusters)

    0     1     2     3     4     5     6     7     8     9    10    11    12 
19693  9550  8648  1216  6309     6   388  4545   340  1875   505   198    51 
   13    14 
  137     0 

Do you think this is the problem? If so, should I subset my data by the clusters belonging to one lineage, or by the Lineage values generated by slingshot? Why would Lineage1 (composed of clusters 0, 2, 9, 1, 4, 7) have cells in other clusters? Is this expected, or do you think there is something wrong with my data?

Apologies for the long thread, and thank you for your time.
Miriam

kstreet13 commented 1 year ago

Hi Miriam,

Thanks for submitting and I appreciate you taking the time to look at some related issues. If I had to guess, I would say it's probably the small clusters that are causing issues (specifically, when Slingshot tries to calculate the distance between cluster 5 and cluster 14, there are only 6 cells present in what looks to be at least 10 dimensions, so it won't be able to use its standard method).
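To illustrate why so few cells can trigger this error, here is a minimal base-R sketch. It is not Slingshot's actual code: it only assumes, based on the `solve.default(s1 + s2)` call in the error message, that the distance between two clusters involves inverting a sum of two covariance matrices. The cluster sizes, the 10 dimensions, and all variable names are made up for the example.

```r
# Hypothetical toy data: two tiny "clusters" in a 10-dimensional space
set.seed(1)
n_dims <- 10
small_a <- matrix(rnorm(6 * n_dims), nrow = 6)  # 6 cells
small_b <- matrix(rnorm(4 * n_dims), nrow = 4)  # 4 cells

# A covariance matrix estimated from n cells has rank at most n - 1,
# so this sum has rank at most 5 + 3 = 8, which is less than 10
s_sum <- cov(small_a) + cov(small_b)
rank_sum <- qr(s_sum)$rank
cat("rank of s1 + s2:", rank_sum, "out of", n_dims, "dimensions\n")

# A rank-deficient matrix has no inverse, so solve() is expected to fail
result <- tryCatch(solve(s_sum), error = function(e) conditionMessage(e))
if (is.character(result)) cat("solve() failed:", result, "\n")
```

Whenever the two clusters together contribute fewer cells than are needed to span the reduced-dimensional space, the summed covariance matrix is rank deficient, which matches the explanation above.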

That said, I think you should be able to avoid this issue entirely. I actually wouldn't recommend subsetting the data before running tradeSeq, as tradeSeq is designed to work with (standard) Slingshot output. I'm also not sure why you are running Slingshot a second time on the subset data. In any case, after you run Slingshot the first time, you should be able to go straight into running fitGAM on the same SingleCellExperiment object.

Also, some minor points: I'm pretty sure as.SingleCellExperiment automatically converts the reduced dimensional spaces. And you don't need to convert back to a Seurat object in order to subset (SingleCellExperiment objects can be subset just like matrices, so you could achieve the same thing with sce <- sce[, !is.na(sce$slingPseudotime_1)]).

Let me know if I'm missing something or if you have any follow-up issues. Kelly

kstreet13 commented 1 year ago

Closing due to lack of response. Feel free to re-open.

saphir746 commented 2 months ago

I had a similar problem with a subsetted Seurat object in which some clusters had fewer than 10 cells. After removing those clusters, it worked:

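library(Seurat)
library(dplyr)
# count cells per cluster and keep only the clusters with fewer than 10 cells
discard.pile <- Seur_obj@meta.data$seurat_clusters %>%
  table() %>%
  as.data.frame() %>%
  dplyr::rename(., Clust = .) %>%
  filter(Freq < 10)
# only discard clusters that are also identity levels of the object
Discard <- discard.pile$Clust %>% as.character() %>% intersect(Idents(Seur_obj) %>% levels())
# drop the cells belonging to those clusters
Seur_obj <- subset(Seur_obj, idents = Discard, invert = TRUE)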
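For readers who prefer to avoid the dplyr pipeline, the same filtering idea can be sketched in base R. This is only an illustration on made-up cluster labels. The `clusters` vector, the `min_cells` threshold, and the other variable names are hypothetical and do not come from any package.

```r
# Hypothetical cluster labels; level "14" is deliberately left empty,
# as happens after subsetting an object whose factor levels are kept
clusters <- factor(
  c(rep("0", 50), rep("1", 30), rep("5", 6), rep("12", 12)),
  levels = c("0", "1", "5", "12", "14")
)

min_cells <- 10  # assumed threshold; adjust to the dimensionality in use

# Count cells per cluster (empty levels show up with a count of 0)
counts <- table(clusters)
keep_levels <- names(counts)[counts >= min_cells]

# Logical index of the cells to keep, then drop the unused factor levels
keep_cells <- clusters %in% keep_levels
filtered <- droplevels(clusters[keep_cells])
print(table(filtered))
```

Since SingleCellExperiment objects can be subset like matrices, as noted earlier in the thread, a logical vector such as `keep_cells` could in principle be used to index the columns of the object before running Slingshot.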