satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

FindClusters() error for larger datasets #56

Closed alexcharney closed 7 years ago

alexcharney commented 7 years ago

Hi Seurat team,

I am running through the steps in the pbmc-tutorial.Rmd file on a dataset from the dropseq paper (~49k cells). At the FindClusters() step I encountered the following error:

> mydat <- FindClusters(mydat, pc.use = 1:10, resolution = 2, print.output = 0, save.SNN = T)
Error in which(SNN != 0, arr.ind = TRUE) :
  long vectors not supported yet: ../../src/include/Rinlinedfuns.h:138

As a sanity check, I tried a few different resolution values (0.6, 1.2, 2) but hit the same error each time. I also ran a smaller dataset (~10K cells) and did not encounter the error. I am wondering whether you have seen this before with larger datasets; if so, I would be grateful for any troubleshooting suggestions.

(using R version 3.3.1 on x86_64-pc-linux-gnu platform)

andrewwbutler commented 7 years ago

For larger datasets, you're going to want to have do.sparse = TRUE in the FindClusters function call.
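
A likely reason the error appears at ~49k cells but not at ~10k: the failing call, which(SNN != 0, arr.ind = TRUE), operates on SNN, and if SNN is held as a dense cells-by-cells matrix, its element count passes R's regular vector limit of 2^31 - 1 somewhere between those two sizes. The sketch below only checks that arithmetic in base R. The cell counts are the approximate figures from the report, and the dense-matrix layout is an assumption rather than something confirmed in this thread.

```r
# Largest length of a regular (non-long) R vector: 2^31 - 1.
int_max <- .Machine$integer.max

# Approximate cell counts from the report above (assumed, not exact).
n_large <- 49000
n_small <- 10000

# Entries in a dense cells-by-cells matrix. The counts are doubles,
# so squaring them does not overflow.
entries_large <- n_large^2
entries_small <- n_small^2

# Does each matrix need a "long vector"?
exceeds_large <- entries_large > int_max
exceeds_small <- entries_small > int_max

# Largest cell count whose dense square matrix still fits under the limit.
max_cells <- floor(sqrt(int_max))

print(c(large = exceeds_large, small = exceeds_small))
print(max_cells)
```

Under these assumptions, a dense matrix stops fitting at roughly 46k cells, which matches the sizes reported here. A sparse representation stores only the non-zero entries, so it would sidestep the limit, which is consistent with do.sparse = TRUE resolving the error.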

mjsteinbaugh commented 7 years ago

Thanks for this post. I just ran into the same problem with a large dataset on our HPC cluster as well. I'll re-run FindClusters() with do.sparse = TRUE enabled and report back.

Update: Using do.sparse = TRUE fixed the issue for me.