constantAmateur / SoupX

R package to quantify and remove cell free mRNAs from droplet based scRNA-seq data

Error in autoEstCont #93

Closed twoneu closed 2 years ago

twoneu commented 2 years ago

Hello,

Thank you so much for this great resource. I am encountering the following error when trying to run autoEstCont: Error in h(simpleError(msg, call)) : error in evaluating the argument 'x' in selecting a method for function 't': subscript out of bounds

Below is the code I am trying to run:

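library(Seurat)  # provides Read10X_h5
library(SoupX)   # provides SoupChannel, setClusters, autoEstCont

toc <- Read10X_h5("filtered_feature_bc_matrix.h5", use.names = TRUE, unique.features = TRUE)
tod <- Read10X_h5("raw_feature_bc_matrix.h5", use.names = TRUE, unique.features = TRUE)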

#Create SoupX object
sc = SoupChannel(tod, toc)
sc = setClusters(sc, setNames(seur_obj@meta.data$seurat_clusters, rownames(seur_obj@meta.data)))

#Auto estimate contamination fraction
sc = autoEstCont(sc)

Thank you again!

constantAmateur commented 2 years ago

I can't see an obvious issue here. If you want to send me your inputs I can debug it locally. To do this run

saveRDS(list(toc=toc, tod=tod, clust=setNames(seur_obj@meta.data$seurat_clusters, rownames(seur_obj@meta.data))), 'soupX_debug.RDS')

Close-your-eyes commented 2 years ago

Hi,

I ran into the same error and investigated the issue: In my case, the error comes from line 52 in autoEstCont: ute = t(ute[m,,drop=FALSE])

The problem is that some entries of m are NA. This seems to be due to the fact that I am running SoupX on a subset of cells from a merged Seurat object. This subset does not have cells in every one of the clusters, which were annotated based on all cells in the Seurat object. This is why match(rownames(ssc$metaData),sc$metaData$clusters) in line 51 of autoEstCont produces NAs.
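The mechanism can be reproduced in base R without SoupX. The sketch below is only an illustration under assumptions: the cell names, cluster labels, and the matrix `mat` are made up, and the steps imitate the collapse-by-cluster and lookup described above rather than reproducing the package's code.

```r
# Hypothetical cluster labels: a factor that still carries levels "0".."3"
# from the full object, although this subset only has cells in "0" and "2".
clusters <- factor(c("0", "2", "2", "0"), levels = c("0", "1", "2", "3"))
names(clusters) <- paste0("cell", 1:4)

# split() on a factor creates one group per level, including the empty ones.
groups <- split(names(clusters), clusters)

# Looking each group name up among the per-cell labels yields NA for every
# level that has no cells in the subset.
m <- names(clusters)[match(names(groups), clusters)]

# Hypothetical matrix with one row per cell; indexing its rows with a
# character vector containing NA raises an error, which is caught here.
mat <- matrix(1, nrow = 4, ncol = 2,
              dimnames = list(names(clusters), c("g1", "g2")))
res <- tryCatch(mat[m, , drop = FALSE],
                error = function(e) conditionMessage(e))
```

Printing `m` and `res` shows which clusters are empty and the message raised by the failed lookup.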

I may change my procedure (the clustering). Other than that, do you think it is worth fixing this issue?

Yours, Chris

constantAmateur commented 2 years ago

Hi Chris,

People seem to have issues relating to setting of clusters often enough that it is worth adding some checks to the code. However, I'm struggling to see how your use case would cause things to fail. If you only run SoupX on a subset of cells in a Seurat object, you should only pull across cluster IDs relating to those cells that you use.
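Pulling across only the relevant cluster IDs could look roughly like the base R sketch below. The matrix `toc` and the vector `all_clust` here are small hypothetical stand-ins, not real data, and the early check is just one possible way to catch a mismatch.

```r
# Hypothetical count matrix standing in for toc: genes x cells, 3 cells only.
toc <- matrix(0, nrow = 2, ncol = 3,
              dimnames = list(c("geneA", "geneB"),
                              c("cell1", "cell3", "cell4")))

# Hypothetical cluster labels for all 5 cells of the larger object.
all_clust <- setNames(c("0", "1", "1", "2", "0"), paste0("cell", 1:5))

# Stop early if any cell in toc has no cluster label.
missing_cells <- setdiff(colnames(toc), names(all_clust))
stopifnot(length(missing_cells) == 0)

# Keep only labels for cells present in toc, in the order of its columns.
clust <- all_clust[colnames(toc)]
```

The resulting `clust` has exactly one named entry per column of `toc`, which is the shape the original post passes to setClusters.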

I suspect the issue in your case is more subtle, perhaps due to clusters being a factor when you run setClusters? Hopefully it should be resolved by the next version of SoupX, but if not feel free to reopen this issue.
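If the clusters are indeed a factor with leftover levels, one possible workaround is to convert them to character before calling setClusters. This is a minimal sketch under that assumption: the data frame `meta` is a hypothetical stand-in for seur_obj@meta.data, and the final call is shown only as a comment because it needs a real SoupChannel object.

```r
# Hypothetical stand-in for seur_obj@meta.data after subsetting: the factor
# keeps levels "1" and "3" even though no remaining cell belongs to them.
meta <- data.frame(
  seurat_clusters = factor(c("0", "2", "2", "0"),
                           levels = c("0", "1", "2", "3")),
  row.names = paste0("cell", 1:4)
)

# Converting to character discards levels that have no cells in this subset.
clust <- setNames(as.character(meta$seurat_clusters), rownames(meta))

# Every group produced by split() now maps back to at least one cell.
groups <- split(names(clust), clust)
m <- names(clust)[match(names(groups), clust)]

# With SoupX loaded, the cleaned vector would then be passed on, e.g.:
# sc = setClusters(sc, clust)
```

Calling droplevels() on the factor is an alternative that keeps the factor type while removing the empty levels.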