rietho / IPO

A Tool for automated Optimization of XCMS Parameters
http://bioconductor.org/packages/IPO/
Other
34 stars 20 forks source link

Uses FORK instead of PSOCK R clusters on Linux/Mac #35

Closed pcm32 closed 8 years ago

pcm32 commented 8 years ago

On unix platforms, R can use a different clustering mode called FORK, which is supposed to be less memory hungry than PSOCK. Changes on this pull request allow this mode to be used automatically by IPO if available on the platform, and warn the user if PSOCK is being used.

rietho commented 8 years ago

With FORK clustering on Red Hat Enterprise Linux I get the following error: Error in UseMethod("sendData") : no applicable method for 'sendData' applied to an object of class "c('forknode', 'SOCK0node')" I couldn't find any good resources about that on the internet. Do you have any suggestions?

pcm32 commented 8 years ago

@rietho which version of R are you using there?

pcm32 commented 8 years ago

I'm running this now on R 3.2.3 with RHEL 6.6, I don't see any issues yet at least in the peak picking (running with 4 cores).

rietho commented 8 years ago

I'm also running R 3.2.3, but on RHEL 7.2. Here's the session info:

R version 3.2.3 (2015-12-10)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] snow_0.4-1          IPO_1.7.4           CAMERA_1.26.0       igraph_1.0.1       
 [5] rsm_2.7-4           xcms_1.46.0         Biobase_2.30.0      ProtGenerics_1.2.1 
 [9] BiocGenerics_0.16.1 mzR_2.4.1           Rcpp_0.12.4        

loaded via a namespace (and not attached):
 [1] graph_1.48.0        Formula_1.2-1       cluster_2.0.3       magrittr_1.5       
 [5] splines_3.2.3       munsell_0.4.3       colorspace_1.2-6    lattice_0.20-33    
 [9] plyr_1.8.3          tools_3.2.3         nnet_7.3-12         grid_3.2.3         
[13] gtable_0.2.0        latticeExtra_0.6-28 survival_2.38-3     RBGL_1.46.0        
[17] gridExtra_2.2.1     RColorBrewer_1.1-2  ggplot2_2.1.0       acepack_1.3-3.3    
[21] codetools_0.2-14    rpart_4.1-10        scales_0.4.0        Hmisc_3.17-3       
[25] stats4_3.2.3        foreign_0.8-66   
rietho commented 8 years ago

OK, after lots of thinking, testing and asking Google I think that I got it now: The error seems to arise in connection with the package snow. snow is silently loaded by xcms, if it is installed. Once snow is loaded, there error posted above arises on calling parallel::stopCluster(cl).

Here's a minimal working example, which should reproduce the error:

library(snow)
cl <- parallel::makeCluster(2, type = "FORK")
data <- data.frame(x1 = 10, x2 = 5)
fun <- function(x, mean) rnorm(x)
parallel::parSapply(cl, data, fun)
parallel::stopCluster(cl)

Not loading snow (or alternatively unloading the namespace via unloadNamespace("snow") should not lead to an error.

@pcm32: Can you cornform that? I guess that you do not have snow installed, which explains why IPO runs without error on your system. I'd add unloadNamespace("snow") to the code, so that it should work for everybody again.

pcm32 commented 8 years ago

Thanks @rietho I can indeed reproduce what you suggested, and unloading snow does prevent the error. I was having troubles in reproducing the initial issue, but with your last addition, road is a lot clearer. Should we add an unload for snow, or avoid switching to FORK if snow has been previously loaded?

If you say that this is silently loaded by xcms, then I would agree that we could just unloaded, but the issue is that I don't think we have a way of knowing whether either the user loaded it before or it was xcms. What do you think?

rietho commented 8 years ago

Too me using FORK on unix systems seems to be a really good idea, so I'd suggest to unload the namespace of snow right after if(nSlaves > 1) {. I'll push the changes tomorrow directly to the master branch. We also forgot to increase the version number after merging your code, which I'll fix as well. I was also thinking that it might be a task of xcms to unload snow again, after silently loading it. What do you think @sneumann?