dviraran / SingleR

SingleR: Single-cell RNA-seq cell types Recognition (legacy version)
GNU General Public License v3.0

CreateBigSingleRObject error #33

Closed danshu closed 5 years ago

danshu commented 5 years ago

Hi,

I can run SingleR using the following commands:

    singler = CreateSinglerObject(seu@data, project.name=pname, min.genes=as.integer(nGene),
                                  technology="10X", species=species, normalize.gene.length=F,
                                  variable.genes="de", fine.tune=T, do.signatures=T,
                                  do.main.types=T, reduce.file.size=T, numCores=30)
    singler$meta.data$xy = seu@dr$umap@cell.embeddings  # the UMAP coordinates
    singler$meta.data$clusters = as.character(seu@meta.data$DBclust.ident)

However, because some samples have a large number of cells, I then ran SingleR using "CreateBigSingleRObject":

    singler = CreateBigSingleRObject(seu@data, annot=NULL, xy=seu@dr$umap@cell.embeddings,
                                     clusters=as.character(seu@meta.data$DBclust.ident),
                                     project.name=pname, min.genes=as.integer(nGene),
                                     technology="10X", species=species, normalize.gene.length=F,
                                     variable.genes="de", fine.tune=T, reduce.file.size=T,
                                     do.signatures=T, do.main.types=T, temp.dir=getwd(),
                                     numCores=30)

Then I get this error:

    Error in singler.list[[i]] : subscript out of bounds
    Calls: CreateBigSingleRObject -> SingleR.Combine
    In addition: Warning messages:
    1: In .local(expr, gset.idx.list, ...) :
      4553 genes with constant expression values throuhgout the samples.
    2: In .local(expr, gset.idx.list, ...) :
      5227 genes with constant expression values throuhgout the samples.
    3: In .local(expr, gset.idx.list, ...) :
      5094 genes with constant expression values throuhgout the samples.
    4: In .local(expr, gset.idx.list, ...) :
      5696 genes with constant expression values throuhgout the samples.
    Execution halted

Does this error result from running "CreateBigSingleRObject" on samples with fewer than 10000 cells?

dviraran commented 5 years ago

It seems to be failing in SingleR.Combine, which is the last call in this function. The objects seem to have been created successfully. Try running the last section of the CreateBigSingleRObject function to see exactly what is going wrong:

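    # Collect every RData file written to the temporary directory
    singler.objects.file <- list.files(paste0(temp.dir,'/singler.temp/'),
                                       pattern='RData', full.names=TRUE)

    # Each file holds one object named 'singler'; load them in turn
    singler.objects = list()
    for (i in seq_along(singler.objects.file)) {
      load(singler.objects.file[[i]])
      singler.objects[[i]] = singler
    }

    # Merge the partial objects back into a single SingleR object
    singler = SingleR.Combine(singler.objects, order = colnames(counts),
                              clusters=clusters, xy=xy)

danshu commented 5 years ago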

In singler.temp, there is only a single file named "project_name.1.RData" for this project, plus another RData file from a different project. The previous error is reproduced by running:

    load("./singler.temp/project_name.1.RData")
    singler.objects = list()
    singler.objects[[1]] = singler
    singler = SingleR.Combine(singler.objects,order = colnames(counts))

which gives:

    Error in singler.list[[i]] : subscript out of bounds

dviraran commented 5 years ago

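Well, use a different 'temp.dir' for each project. The function loads every RData file it finds in temp.dir/singler.temp, so files left there by another project get mixed in.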
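One way to follow this advice is to derive the scratch directory from the project name. The sketch below is only an illustration: `project_temp_dir` and the `singler_` prefix are hypothetical names, not part of SingleR, and the commented-out call simply reuses the `temp.dir` argument shown earlier in the thread.

```r
# Hypothetical helper (not part of SingleR): build and create one scratch
# directory per project so temporary RData files never share a folder.
project_temp_dir <- function(base.dir, project.name) {
  dir <- file.path(base.dir, paste0("singler_", project.name))
  dir.create(dir, recursive = TRUE, showWarnings = FALSE)
  dir
}

# Demo under a throwaway base directory.
base  <- tempfile("singler_demo_")
dir_a <- project_temp_dir(base, "projectA")
dir_b <- project_temp_dir(base, "projectB")

# The result would then be passed as temp.dir (not run here):
# singler <- CreateBigSingleRObject(..., temp.dir = project_temp_dir(getwd(), pname))
```

With separate directories, the `list.files()` call in the snippet above only sees the files that belong to the current project.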

danshu commented 5 years ago

I have two samples that have more than 20000 cells. For one sample, all sub-objects were created successfully, although it fails when running SingleR.Combine. I can manually combine those objects. For the other sample, it seems to have failed while creating the second singler object. So I decided to manually divide my samples into subsets and run SingleR on each one:

    sce <- readRDS(infile)
    singler = CreateSinglerObject(logcounts(sce), project.name=pname, min.genes=as.integer(nGene),
                                  technology="10X", species=species, normalize.gene.length=F,
                                  variable.genes="de", fine.tune=T, do.signatures=T,
                                  do.main.types=T, reduce.file.size=T, numCores=30)
    saveRDS(singler,file=paste0(outprefix,".SingleR.nGene",nGene,".RData"))

However, I failed to load the resulting SingleR objects:

    load("Batch2.1.SingleR.nGene200.RData")
    Error in load("Batch2.1.SingleR.nGene200.RData") :
      bad restore file magic number (file may be corrupted) -- no data loaded
    In addition: Warning message:
    file ‘Batch2.1.SingleR.nGene200.RData’ has magic number 'X'
      Use of save versions prior to 2 is deprecated
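The manual split described here could look roughly like the following base R sketch. It is an assumption about how the subsets might be formed, not the code actually used in this thread: `chunk_columns` is a hypothetical helper, the chunk size is arbitrary, and a small toy matrix stands in for the real expression data.

```r
# Hypothetical helper: split cell (column) indices into consecutive chunks
# of at most chunk.size cells each.
chunk_columns <- function(n.cells, chunk.size) {
  idx <- seq_len(n.cells)
  split(idx, ceiling(idx / chunk.size))
}

# Toy genes-by-cells matrix standing in for the real expression data.
expr <- matrix(seq_len(5 * 23), nrow = 5,
               dimnames = list(paste0("gene", 1:5), paste0("cell", 1:23)))

chunks  <- chunk_columns(ncol(expr), chunk.size = 10)

# drop = FALSE keeps each subset a matrix even if a chunk has one cell.
subsets <- lapply(chunks, function(cols) expr[, cols, drop = FALSE])
```

Each element of `subsets` can then be annotated separately, and because the chunks are consecutive the original cell order is preserved when the results are put back together.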

dviraran commented 5 years ago

To load an RDS file (one saved with saveRDS) you need to use readRDS.
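The two pairs of base R functions are not interchangeable, and the file extension does not determine the format. A minimal sketch with a toy list standing in for the SingleR object (the object contents and temporary file names are illustrative assumptions):

```r
# Toy stand-in for a SingleR result; any R object behaves the same way.
result <- list(project = "demo", labels = c("T cell", "B cell"))

rds_path   <- tempfile(fileext = ".rds")
rdata_path <- tempfile(fileext = ".RData")

# saveRDS() serializes a single object without its name;
# readRDS() returns it, so the caller chooses the variable name.
saveRDS(result, file = rds_path)
restored <- readRDS(rds_path)

# save() stores named objects; load() recreates them under their
# original names and returns those names as a character vector.
save(result, file = rdata_path)
env <- new.env()
loaded_names <- load(rdata_path, envir = env)

# Calling load() on a file written by saveRDS() raises an error,
# which is the situation reported in this thread.
load_failed <- suppressWarnings(tryCatch({
  load(rds_path, envir = new.env())
  FALSE
}, error = function(e) TRUE))
```

For the file in this thread, that means `singler <- readRDS("Batch2.1.SingleR.nGene200.RData")`, even though the name ends in `.RData`.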

danshu commented 5 years ago

Thanks!