YosefLab / scone


BiocParallel error in scone(run=TRUE): attempt to apply non-function #98

Closed aarzalluz closed 6 years ago

aarzalluz commented 6 years ago

When using scone() in run = TRUE mode, I'm getting the following error:

Error in scaling[[sc_params[i, 2]]](imputed[[sc_params[i, 1]]]) : attempt to apply non-function

I've researched a bit, and it seems to arise from a BiocParallel error, presumably because the workers in the parallel computation are unable to access the functions in the global environment. But this is just speculation, because I had previously run BiocParallel::register(BiocParallel::SerialParam()), which I thought would solve exactly that. I don't know how to get around the error, or what I could be missing.

I'm making the following call to scone(), and my dataset consists of ~12K features and 1725 cells.

scone <- scone(scone, scaling = scaling,
               k_qc = 0, k_ruv = 3, 
               adjust_bio = "no",
               run = TRUE, verbose = TRUE,
               eval_kclust = 2:7, stratified_pam = TRUE, stratified_cor = TRUE,
               return_norm = "in_memory", zero = "postadjust")

And here's my sessionInfo():

R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.4 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=es_ES.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=es_ES.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=es_ES.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] BiocInstaller_1.28.0       scales_0.5.0               edgeR_3.21.6               limma_3.34.9              
 [5] SCnorm_1.0.0               DESeq2_1.18.1              scone_1.3.0                scran_1.6.9               
 [9] BiocParallel_1.12.0        scater_1.6.3               ggplot2_2.2.1              SingleCellExperiment_1.0.0
[13] SummarizedExperiment_1.8.1 DelayedArray_0.4.1         matrixStats_0.53.1         Biobase_2.38.0            
[17] GenomicRanges_1.30.3       GenomeInfoDb_1.14.0        IRanges_2.12.0             S4Vectors_0.16.0          
[21] BiocGenerics_0.24.0       

loaded via a namespace (and not attached):
  [1] backports_1.1.2          Hmisc_4.1-1              aroma.light_3.8.0        igraph_1.1.2             plyr_1.8.4              
  [6] lazyeval_0.2.1           shinydashboard_0.7.0     splines_3.4.4            digest_0.6.15            htmltools_0.3.6         
 [11] viridis_0.5.0            gdata_2.18.0             checkmate_1.8.5          magrittr_1.5             memoise_1.1.0           
 [16] cluster_2.0.7-1          mixtools_1.1.0           Biostrings_2.46.0        annotate_1.56.1          bayesm_3.1-0.1          
 [21] R.utils_2.6.0            rARPACK_0.11-0           prettyunits_1.0.2        colorspace_1.3-2         blob_1.1.0              
 [26] dplyr_0.7.4              RCurl_1.95-4.10          tximport_1.6.0           hexbin_1.27.2            genefilter_1.60.0       
 [31] bindr_0.1                survival_2.42-3          zoo_1.8-1                glue_1.2.0               gtable_0.2.0            
 [36] zlibbioc_1.24.0          XVector_0.18.0           MatrixModels_0.4-1       compositions_1.40-1      kernlab_0.9-25          
 [41] prabclus_2.2-6           DEoptimR_1.0-8           SparseM_1.77             DESeq_1.30.0             mvtnorm_1.0-7           
 [46] DBI_0.7                  Rcpp_0.12.17             htmlTable_1.11.2         viridisLite_0.3.0        xtable_1.8-2            
 [51] progress_1.1.2           foreign_0.8-70           bit_1.1-12               mclust_5.4               Formula_1.2-2           
 [56] DT_0.4                   htmlwidgets_1.0          httr_1.3.1               FNN_1.1                  gplots_3.0.1            
 [61] RColorBrewer_1.1-2       fpc_2.1-11               acepack_1.4.1            modeltools_0.2-21        pkgconfig_2.0.1         
 [66] XML_3.98-1.10            R.methodsS3_1.7.1        flexmix_2.3-14           nnet_7.3-12              locfit_1.5-9.1          
 [71] dynamicTreeCut_1.63-1    rlang_0.2.0              reshape2_1.4.3           AnnotationDbi_1.40.0     munsell_0.4.3           
 [76] tools_3.4.4              moments_0.14             RSQLite_2.0              stringr_1.3.0            yaml_2.1.16             
 [81] knitr_1.20               bit64_0.9-7              robustbase_0.92-8        caTools_1.17.1           bindrcpp_0.2            
 [86] EDASeq_2.12.0            quantreg_5.35            mime_0.5                 R.oo_1.22.0              biomaRt_2.34.2          
 [91] rstudioapi_0.7           compiler_3.4.4           beeswarm_0.2.3           tibble_1.4.2             statmod_1.4.30          
 [96] geneplotter_1.56.0       stringi_1.1.6            GenomicFeatures_1.30.3   RSpectra_0.12-0          lattice_0.20-35         
[101] trimcluster_0.1-2        Matrix_1.2-14            tensorA_0.36             pillar_1.1.0             data.table_1.10.4-3     
[106] bitops_1.0-6             httpuv_1.3.5             rtracklayer_1.38.3       R6_2.2.2                 latticeExtra_0.6-28     
[111] hwriter_1.3.2            RMySQL_0.10.14           ShortRead_1.36.1         KernSmooth_2.23-15       gridExtra_2.3           
[116] vipor_0.4.5              boot_1.3-20              energy_1.7-2             MASS_7.3-49              gtools_3.5.0            
[121] assertthat_0.2.0         rhdf5_2.22.0             rjson_0.2.15             RUVSeq_1.12.0            GenomicAlignments_1.14.1
[126] Rsamtools_1.30.0         GenomeInfoDbData_1.0.0   diptest_0.75-7           rpart_4.1-13             grid_3.4.4              
[131] class_7.3-14             segmented_0.5-3.0        base64enc_0.1-3          shiny_1.0.5              ggbeeswarm_0.6.0    

Thanks,

Ángeles

drisso commented 6 years ago

What does your scaling vector look like?

The error tells you that you are trying to apply a non-function, which might mean that you have one or more non-functions in the scaling vector. Or somewhere else.
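One quick way to test this hypothesis is to ask R which entries of the list are callable. The sketch below uses only base R. The toy `scaling` list and its entry names are made up for illustration and are not the reporter's actual object.

```r
# Toy scaling list (illustrative only): the last entry is a character
# string rather than a function, which is one way the list can go wrong.
scaling <- list(
  none   = identity,
  log1p  = function(ei) log1p(ei),
  broken = "SCNORM_FN"
)

# TRUE for every entry that can actually be called as a function.
is_fn <- vapply(scaling, is.function, logical(1))

# Names of the entries that would trigger "attempt to apply non-function".
bad_entries <- names(scaling)[!is_fn]
print(bad_entries)
```

If `bad_entries` is empty for the real list, the non-function is probably coming from somewhere other than the scaling argument.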

It would be helpful to know the output of scone() with run=FALSE as well.

aarzalluz commented 6 years ago

Sorry, I forgot to include that info. I checked, and I seem to have failed to define the custom scaling functions correctly. Once I removed them from the scaling vector and kept only the scone wrappers, I no longer got the error.

So... How would I go about that? Say I intended to include the SCnorm method as a user defined function for testing. I tried the following, and sourced the function into the global environment:

SCNORM_FN <- function(ei){
  eo <- SCnorm(ei)

  return(eo)
}

Also, I assume there is no way to pass arguments to the SCnorm function? I tried this, but I guess it's not possible:

SCNORM_FN2 <- function(ei){
  eo <- SCnorm(ei, Conditions = as.character(scone$bio),
         useSpikes = TRUE)
  return(eo)
}

drisso commented 6 years ago

I would think that both your custom functions should work, @mbcole any idea why they're not working?

mbcole commented 6 years ago

First - this probably doesn't explain the error, but it looks like the current version of SCnorm has a list output value. I think you want to return SCnorm::results(eo) to return a matrix - this is a requirement of the scaling argument.
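The matrix requirement can be checked on a small toy input before handing a custom function to scone. This is a minimal base R sketch: `check_scaling_output` is a hypothetical helper, not part of scone, and the check that the output has the same dimensions as the input is an assumption about what a scaling function is expected to return.

```r
# Hypothetical helper (not a scone function): apply a candidate scaling
# function to a toy matrix and check that a matrix of the same shape
# comes back. The same-dimensions check is an assumption.
check_scaling_output <- function(fn, ei) {
  eo <- fn(ei)
  is.matrix(eo) && identical(dim(eo), dim(ei))
}

# Toy expression matrix: 3 genes x 2 cells.
toy <- matrix(c(1, 0, 3, 4, 5, 0), nrow = 3,
              dimnames = list(paste0("gene", 1:3), paste0("cell", 1:2)))

# A wrapper that returns a matrix (simple per-cell sum scaling).
returns_matrix <- function(ei) t(t(ei) / colSums(ei))

# A wrapper that returns a list, mimicking a method whose result object
# still needs the normalized matrix extracted from it.
returns_list <- function(ei) list(NormalizedData = ei)

ok_matrix <- check_scaling_output(returns_matrix, toy)
ok_list   <- check_scaling_output(returns_list, toy)
print(c(matrix_wrapper = ok_matrix, list_wrapper = ok_list))
```

A wrapper that fails this check needs an extraction step before its `return()`.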

Other than that, I agree with @drisso that there shouldn't be anything wrong with the functions you defined in your global environment (before calling bplapply). If you register MulticoreParam as your BiocParallel back end, you should be able to find any functions or objects in the environment from which bplapply was called, including the initialized scone object stored there.

My best guess is that the scaling argument is malformed, since the error was not

could not find function "SCNORM_FN"

but rather

attempt to apply non-function
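The difference between the two messages can be reproduced in base R. In this sketch, `error_message` and `some_undefined_fn_xyz` are made-up names for illustration; the first case indexes a list by a name it does not contain, which is only one possible way a malformed scaling argument could produce a non-function.

```r
# Hypothetical helper: evaluate an expression and return its error
# message, or NA if no error occurs.
error_message <- function(expr) {
  tryCatch({ expr; NA_character_ },
           error = function(e) conditionMessage(e))
}

scaling <- list(none = identity)

# Case 1: the list has no entry named "scnorm", so [[ ]] returns NULL,
# and calling NULL fails because it is not a function.
msg_non_function <- error_message(scaling[["scnorm"]](1))

# Case 2: the name is looked up as a function and does not exist anywhere.
msg_missing <- error_message(some_undefined_fn_xyz(1))

print(msg_non_function)
print(msg_missing)
```

The first message means R found something under that name or index but could not call it; the second means the lookup itself failed.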

The scaling argument should generally have the form of a named list of functions:

scaling = list(none = identity, sum = SUM_FN, scnorm = SCNORM_FN)

Can you share your scaling argument?

aarzalluz commented 6 years ago

I defined scaling as follows:

scaling <- list(none = identity,
                sum = SUM_FN,
                tmm = TMM_FN,
                uq = UQ_FN,     
                fq = FQT_FN,
                deseq = DESEQ_FN,
                scnorm = SCNORM_FN)

Where SCNORM_FN is:

SCNORM_FN <- function(ei){
  eo <- SCnorm(ei)

  return(eo)
}

I changed the BiocParallel back end to MulticoreParam instead of SerialParam, as you suggested, and got the error you mentioned:

Error: BiocParallel errors element index: 7 first error: could not find function "SCnorm"

I then changed SCNORM_FN to return(results(eo)), which should output only the normalized matrix instead of the list output. I will run that now and see...

mbcole commented 6 years ago

Ok - that's good because now SCNORM_FN is being applied within scone, just SCnorm isn't recognized. Have you run library(SCnorm) in advance of running scone(...)?
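An alternative to relying on library(SCnorm) being attached is to namespace-qualify the calls inside the wrapper, so the function carries its own reference to the package. The sketch below is untested against SCnorm itself: it reuses the calls as they appear in this thread (`SCnorm(ei)` and `results(eo)`), and whether further arguments such as `Conditions` are required depends on the installed SCnorm version, which is an assumption to verify. Defining the function does not require SCnorm to be installed; calling it does.

```r
# Sketch of a wrapper with namespace-qualified calls, so that it does not
# depend on SCnorm being attached in the session or worker that runs it.
# Assumption: SCnorm(ei) and results(eo) behave as described in this
# thread; additional arguments may be needed for your SCnorm version.
SCNORM_FN <- function(ei) {
  eo <- SCnorm::SCnorm(ei)
  # Extract the normalized matrix, since the scaling argument expects a
  # matrix rather than the full result object.
  SCnorm::results(eo)
}

# The wrapper takes a single argument, the expression matrix.
print(names(formals(SCNORM_FN)))
```

With this form, a missing package produces an explicit namespace-loading error rather than could not find function "SCnorm".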

Also - to clarify - you get the non-function error whenever you switch back to SerialParam?

aarzalluz commented 6 years ago

Weirdly, switching to SerialParam does not produce the non-function error anymore... Which is strange, because having loaded SCnorm and defined the SCNORM_FN functions as above before, I did get the error with SerialParam. The only difference I can think of was in the results(eo) line to return a matrix... Sorry I cannot reproduce the error anymore. Somehow, scone() is now working.

Thanks for your help :)

mbcole commented 6 years ago

Glad you got it working! I'll close the issue for now, but please let us know if you encounter a similar bug again.