dviraran / SingleR

SingleR: Single-cell RNA-seq cell types Recognition (legacy version)
GNU General Public License v3.0

MCA-dataset tutorial error #15

Closed alexandruioanvoda closed 5 years ago

alexandruioanvoda commented 5 years ago

Hi! I've got the following error when running this tutorial (http://comphealth.ucsf.edu/sample-apps/SingleR/SingleR.MCA.html) line by line:

> for (i in s) {
+   print(i)
+   A = seq(i,min(i+20000-1,length(mca@cell.names)))
+   annot=mca@meta.data$Tissue[A]
+   names(annot) = rownames(mca@meta.data)[A]
+
+   singler = CreateSinglerObject(mca@raw.data[,A], annot = annot, project.name='MCA',
+                                 min.genes = 0,  technology = "Microwell-Seq",
+                                 species = "Mouse", citation = "Han et al. 2018",
+                                 do.signatures = F, clusters = mca@ident[A])
+
+   save(singler,file=paste0('/gfs/work/avoda/repli_mca/MCA/singler.partial.mca.',i,'.RData'))
+ }
[1] 1
[1] "Dimensions of counts data: 39855x20000"
[1] "Annotating data with Immgen..."
[1] "Variable genes method: de"
[1] "Number of DE genes:4171"
[1] "Number of cells: 20000"
Error in foreach(i = 0:length(s)) %dopar% { :
  could not find function "%dopar%"

This seems to be caused by a parallelism library (the one that provides %dopar%) not being loaded beforehand (https://stackoverflow.com/questions/33250475/r-could-not-find-function-dopar).
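
For anyone hitting the same message: `%dopar%` is an operator exported by the foreach package, so R can only find it once foreach is attached (or imported by the calling package). The following is a minimal sketch, not taken from SingleR itself, assuming the foreach and doParallel packages are installed; the worker count of 2 is arbitrary.

```r
library(foreach)     # exports the %dopar% operator
library(doParallel)  # provides registerDoParallel()

# Start a small local cluster and register it as the foreach backend.
# The worker count (2) is an arbitrary choice for illustration.
cl <- parallel::makeCluster(2)
registerDoParallel(cl)

# Each iteration runs on a worker; results are combined with c().
squares <- foreach(i = 1:4, .combine = c) %dopar% {
  i^2
}
print(squares)

parallel::stopCluster(cl)
```

Without the `library(foreach)` call, the loop above fails with the same "could not find function" error.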

dviraran commented 5 years ago

Not sure. It's working for me. I added foreach to the namespace. You can try it now.

dviraran commented 5 years ago

Also note that I added a function - CreateBigSingleRObject - that you can use instead of this code.

alexandruioanvoda commented 5 years ago

Dear Dr. Dvir Aran,

Thank you for your reply!

I managed to get past the foreach error by installing Rmpi on the cluster with these instructions: https://web.archive.org/web/20160620011930/http://jovingelabsoftware.github.io/blog/2016/02/15/installing-openmpi-and-rmpi-from-source/ (linked in case anybody else has the same issue on their HPC).

However, I just wanted to point out an error in CreateBigSingleRObject():

The number of cores to use is mistakenly not passed to the underlying function: https://github.com/dviraran/SingleR/blob/0271066668c50f2e0eced771ff8254e22c51c788/R/SingleR.Create.R#L740

Even though I gave it 31 cores to work on, it only uses 16:

Estimating ssGSEA scores for 5 gene sets.
  |                                                                      |   0%Using parallel with 16 cores
  |======================================================================| 100%
Estimating ssGSEA scores for 5 gene sets.
  |                                                                      |   0%Using parallel with 16 cores
  |======                                                                |   8%
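
To illustrate the kind of bug described here, below is a minimal sketch using only base R. The function names (`score_chunk`, `run_buggy`, `run_fixed`) are hypothetical and are not the SingleR source; only the default of 16 is taken from the output above. A wrapper that accepts `numCores` but does not forward it leaves the inner function on its own default.

```r
# Hypothetical inner function with a default core count of 16.
score_chunk <- function(x, numCores = 16) {
  message("Using parallel with ", numCores, " cores")
  numCores  # return the value actually used, for inspection
}

# Buggy wrapper: accepts numCores but never forwards it.
run_buggy <- function(x, numCores = 16) {
  score_chunk(x)
}

# Fixed wrapper: forwards the caller's value to the inner function.
run_fixed <- function(x, numCores = 16) {
  score_chunk(x, numCores = numCores)
}

buggy_cores <- run_buggy(1:10, numCores = 31)  # inner call falls back to 16
fixed_cores <- run_fixed(1:10, numCores = 31)  # inner call receives 31
```

The fix is simply to pass the argument along at every level of the call chain that has its own default.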

dviraran commented 5 years ago

Thanks! Fixed it.

-- Dvir

alexandruioanvoda commented 5 years ago

Thanks! But now (with the updated version) the ssGSEA call is still using the hard-coded number of cores (16 instead of 70):


[1] "Annotating data with HPCA..."
[1] "Variable genes method: de"
[1] "Number of DE genes:4394"
[1] "Number of cells: 5620"
[1] "Fine-tuning round on top cell types (using 70 CPU cores):"
[1] "Number of DE genes:4394"
[1] "Number of clusters: 10"
[1] "Fine-tuning round on top cell types (using 70 CPU cores):"
[1] "Annotating data with HPCA (Main types)..."
[1] "Number of DE genes:3305"
[1] "Number of cells: 5620"
[1] "Fine-tuning round on top cell types (using 70 CPU cores):"
[1] "Number of DE genes:3305"
[1] "Number of clusters: 10"
[1] "Fine-tuning round on top cell types (using 70 CPU cores):"
[1] "Annotating data with Blueprint_Encode..."
[1] "Variable genes method: de"
[1] "Number of DE genes:3791"
[1] "Number of cells: 5620"
[1] "Fine-tuning round on top cell types (using 70 CPU cores):"
[1] "Number of DE genes:3791"
[1] "Number of clusters: 10"
[1] "Fine-tuning round on top cell types (using 70 CPU cores):"
[1] "Annotating data with Blueprint_Encode (Main types)..."
[1] "Number of DE genes:3220"
[1] "Number of cells: 5620"
[1] "Fine-tuning round on top cell types (using 70 CPU cores):"
[1] "Number of DE genes:3220"
[1] "Number of clusters: 10"
[1] "Fine-tuning round on top cell types (using 70 CPU cores):"
[1] "Using sets of 1000 cells. Running 6 times."
Estimating ssGSEA scores for 5 gene sets.
  |                                                                      |   0%Using parallel with 16 cores
  |======================================================================| 100%
Estimating ssGSEA scores for 5 gene sets.
  |                                                                      |   0%Using parallel with 16 cores
  |====                                                                  |   6%

alexandruioanvoda commented 5 years ago

Probably because SingleR.numCores isn't passed to this function: https://github.com/dviraran/SingleR/blob/293653c2ad129e6f8f15f0e5a2c5d6d347c5e51d/R/SingleR.Create.R#L460

dviraran commented 5 years ago

Thanks. Fixed.