HenrikBengtsson / parallelly

R package: parallelly - Enhancing the 'parallel' Package
https://parallelly.futureverse.org

plan(multisession) leaks connections #56

Closed tzakharko closed 3 years ago

tzakharko commented 3 years ago

Describe the bug

Using plan(multisession) generates warnings about unused connections. I see this at least on macOS with R 4.1.0; I have not had a chance to test other platforms.

Reproducible example

future::plan(future::multisession)
gc()
# In .Internal(gc(verbose, reset, full)) :
#  closing unused connection 3 (localhost)

Session information

R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_4.1.0    parallelly_1.26.1 tools_4.1.0       parallel_4.1.0   
[5] listenv_0.8.0     codetools_0.2-18  digest_0.6.27     globals_0.14.0   
[9] future_1.21.0    

HenrikBengtsson commented 3 years ago

Thank you for reporting this. I can reproduce this in a vanilla R session:

$ R --vanilla
R version 4.1.0 Patched (2021-06-26 r80566) -- "Camp Pontanezen"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
...
> future::plan(future::multisession, workers = 2)
> dummy <- gc()
Warning message:
In .Internal(gc(verbose, reset, full)) :
  closing unused connection 3 (localhost)

> future::plan(future::multisession, workers = 2)
> dummy <- gc()
Warning message:
In .Internal(gc(verbose, reset, full)) :
  closing unused connection 3 (localhost)
>

Since 'multisession' uses parallelly::makeClusterPSOCK(), here's a reproducible example without using the future package:

$ R --vanilla
R version 4.1.0 Patched (2021-06-26 r80566) -- "Camp Pontanezen"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
...
> cl <- parallelly::makeClusterPSOCK(1)
> dummy <- gc()
Warning message:
In .Internal(gc(verbose, reset, full)) :
  closing unused connection 3 (localhost)
>

Troubleshooting

## stdin=0, stdout=1, stderr=2
> getAllConnections()
[1] 0 1 2

> cl <- parallelly::makeClusterPSOCK(1)
> getAllConnections()
[1] 0 1 2 3 4

## the connection to the single worker has index 4
> as.integer(cl[[1]]$con)
[1] 4

## the connection with index 3 is the stray connection
> dummy <- gc()
Warning message:
In .Internal(gc(verbose, reset, full)) :
  closing unused connection 3 (localhost)

FWIW, parallel::makePSOCKcluster(1) does not have this problem.
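The before/after comparison used in the troubleshooting steps can be wrapped in a small helper. This is only a sketch: `new_connections()` is a hypothetical name and not part of parallelly, and a text connection stands in for the cluster's socket connections so that the snippet runs with base R alone.

```r
# Hypothetical helper (not part of parallelly): list the connections that
# were opened since a snapshot taken with getAllConnections().
new_connections <- function(before) {
  idx <- setdiff(getAllConnections(), before)
  # showConnections(all = TRUE) returns a character matrix whose row names
  # are the connection indices
  info <- showConnections(all = TRUE)
  info[as.character(idx), , drop = FALSE]
}

before <- getAllConnections()

# Stand-in for parallelly::makeClusterPSOCK(1); any call that opens
# connections could go here instead
con <- textConnection("hello")

opened <- new_connections(before)
print(opened[, c("description", "class"), drop = FALSE])

close(con)
```

With `cl <- parallelly::makeClusterPSOCK(1)` in place of the stand-in, any row whose index differs from `as.integer(cl[[1]]$con)` would be a candidate for the stray connection.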

Workaround

This stray connection comes from the new setup_strategy = "parallel" in parallelly (>= 1.26.0). One way to avoid it is to use setup_strategy = "sequential":

> getAllConnections()
[1] 0 1 2
> cl <- parallelly::makeClusterPSOCK(1, setup_strategy = "sequential")
> getAllConnections()
[1] 0 1 2 3
> dummy <- gc()
> 

This can also be set via an option:

options(parallelly.makeNodePSOCK.setup_strategy = "sequential")

or the corresponding environment variable, cf. help("parallelly.options", package="parallelly").
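If the option should only apply to part of a script, it can be set temporarily and restored afterwards. The following is a minimal sketch using base R only; `with_sequential_setup()` is a hypothetical helper name, not something parallelly provides.

```r
# Hypothetical helper: evaluate 'expr' with the workaround option set,
# then restore whatever value the option had before.
with_sequential_setup <- function(expr) {
  old <- options(parallelly.makeNodePSOCK.setup_strategy = "sequential")
  on.exit(options(old), add = TRUE)
  # 'expr' is a lazily evaluated argument, so it is evaluated here,
  # while the option is in effect
  force(expr)
}

before <- getOption("parallelly.makeNodePSOCK.setup_strategy")

# Inside the helper the option reads "sequential"
strategy <- with_sequential_setup(
  getOption("parallelly.makeNodePSOCK.setup_strategy")
)

# Afterwards the previous value (possibly NULL) is back
after <- getOption("parallelly.makeNodePSOCK.setup_strategy")
```

In practice the expression passed in would be the cluster creation itself, for example `with_sequential_setup(parallelly::makeClusterPSOCK(1))`.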

This also works with plan(multisession):

> future::plan(future::multisession, workers = 2, setup_strategy = "sequential")
> getAllConnections()
[1] 0 1 2 3 4
> dummy <- gc()
> 

HenrikBengtsson commented 3 years ago

Since this is a bug in parallelly, I'll transfer this issue over to that repo.

HenrikBengtsson commented 3 years ago

Fixed in the develop version of parallelly, cf. commit 67836def. To install that version, use:

remotes::install_github("HenrikBengtsson/parallelly", ref="develop")

Details: The problem was that the socket connection was indeed set up to be closed automatically when exiting makeClusterPSOCK():

https://github.com/HenrikBengtsson/parallelly/blob/67836defe0bee750d9629803e9b40fc55371a839/R/makeClusterPSOCK.R#L200

However, an older strategy would override this at the very end:

https://github.com/HenrikBengtsson/parallelly/blob/67836defe0bee750d9629803e9b40fc55371a839/R/makeClusterPSOCK.R#L350-L351
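The thread does not spell out how the override happened. As a general illustration only, and not a reconstruction of the parallelly source, one common way a registered cleanup gets lost in R is a later `on.exit()` call without `add = TRUE`, which replaces the handlers registered earlier. All function and variable names below are made up for the example.

```r
# Illustration only: NOT the parallelly code. Records which exit
# handlers actually run.
handlers_run <- character(0)

f_overridden <- function() {
  on.exit(handlers_run <<- c(handlers_run, "close socket"))
  # Default is add = FALSE, so this replaces the handler above
  on.exit(handlers_run <<- c(handlers_run, "other cleanup"))
  invisible(NULL)
}

f_appended <- function() {
  on.exit(handlers_run <<- c(handlers_run, "close socket"))
  # add = TRUE keeps the earlier handler and appends this one
  on.exit(handlers_run <<- c(handlers_run, "other cleanup"), add = TRUE)
  invisible(NULL)
}

f_overridden()
overridden <- handlers_run   # only "other cleanup" ran

handlers_run <- character(0)
f_appended()
appended <- handlers_run     # both handlers ran, in registration order
```

In the first function the "close socket" handler never runs, which is the kind of situation that would leave a connection open until the garbage collector closes it.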

tzakharko commented 3 years ago

Thanks for the quick fix! Can confirm that the warning is gone. As far as I am concerned, this issue can be closed.