futureverse / parallelly

R package: parallelly - Enhancing the 'parallel' Package
https://parallelly.futureverse.org

NOTES: parallelly:::pid_exists() might not work on MS Windows #93

Closed HenrikBengtsson closed 1 year ago

HenrikBengtsson commented 1 year ago

I just got the following from win-builder-{release,devel}. It suggests that tasklist might not work on some machines:

* using log directory 'd:/RCompile/CRANguest/R-devel/parallelly.Rcheck'
* using R Under development (unstable) (2022-12-12 r83438 ucrt)
* using platform: x86_64-w64-mingw32 (64-bit)
* using session charset: UTF-8
* checking for file 'parallelly/DESCRIPTION' ... OK
* this is package 'parallelly' version '1.33.0'
* checking CRAN incoming feasibility ... [12s] Note_to_CRAN_maintainers
Maintainer: 'Henrik Bengtsson <henrikb@braju.com>'
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking serialization versions ... OK
* checking whether package 'parallelly' can be installed ... OK
* checking installed package size ... OK
* checking package directory ... OK
* checking for future file timestamps ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... [0s] OK
* checking whether the package can be loaded with stated dependencies ... [0s] OK
* checking whether the package can be unloaded cleanly ... [0s] OK
* checking whether the namespace can be loaded with stated dependencies ... [0s] OK
* checking whether the namespace can be unloaded cleanly ... [0s] OK
* checking loading without being on the library search path ... [0s] OK
* checking use of S3 registration ... OK
* checking dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... [8s] OK
* checking Rd files ... [1s] OK
* checking Rd metadata ... OK
* checking Rd line widths ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking examples ... [8s] OK
* checking for unstated dependencies in 'tests' ... OK
* checking tests ... [25s] ERROR
  Running 'as.cluster.R' [2s]
  Running 'availableCores.R' [0s]
  Running 'availableWorkers.R' [1s]
  Running 'cgroups.R' [0s]
  Running 'cpuLoad.R' [0s]
  Running 'freeCores.R' [0s]
  Running 'freePort.R' [0s]
  Running 'isConnectionValid.R' [0s]
  Running 'isForkedChild.R' [1s]
  Running 'killNode.R' [6s]
  Running 'makeClusterMPI.R' [0s]
  Running 'makeClusterPSOCK.R' [9s]
  Running 'options-and-envvars.R' [0s]
  Running 'r_bug18119.R' [0s]
  Running 'startup.R' [1s]
  Running 'utils.R' [3s]
Running the tests in 'tests/killNode.R' failed.
Complete output:
  > source("incl/start.R")
  > 
  > if (.Platform$OS.type == "windows") {
  +   killNode <- function(cl) {
  +     parallel::stopCluster(cl)
  +     rep(TRUE, times = length(cl))
  +   }
  + }
  > 
  > options(parallelly.debug = FALSE)
  > 
  > message("*** killNode() and isNodeAlive() ...")
  *** killNode() and isNodeAlive() ...
  > 
  > cl <- makeClusterPSOCK(2L, autoStop = FALSE)
  > names(cl) <- sprintf("Node %d", seq_along(cl))
  > print(cl)
  Socket cluster with 2 nodes where 2 nodes are on host 'localhost' (R Under development (unstable) (2022-12-12 r83438 ucrt), platform x86_64-w64-mingw32)
  > 
  > ## WORKAROUND: On MS Windows, each R process creates a temporary Rscript<hexcode>
  > ## file. In this test we terminate the workers such that these temporary files
  > ## are not cleaned up, which will trigger a NOTE by 'R CMD check'. Because of
  > ## this, we have to make sure to remove such files manually in this test.
  > if (.Platform$OS.type == "windows") {
  +   files <- setdiff(dir(path = tempdir(), all.files = TRUE), c(".", ".."))
  +   files <- file.path(tempdir(), files)
  +   tmpfiles <- files
  +   files <- parallel::clusterEvalQ(cl, {
  +     files <- setdiff(dir(path = tempdir(), all.files = TRUE), c(".", ".."))
  +     file.path(tempdir(), files)
  +   })
  +   files <- unlist(files)
  +   tmpfiles <- unique(c(tmpfiles, files))
  +   message(sprintf("- files: [n=%d] %s", length(tmpfiles),
  +                     paste(sQuote(tmpfiles), collapse = ", ")))
  + }
  - files: [n=3] 'D:\temp\RtmpoLi4AU/working_dir\Rtmpme443U/file28cd4276f1258', 'D:\temp\RtmpoLi4AU/working_dir\Rtmpme443U/file28cd452ba1bf4', 'D:\temp\RtmpoLi4AU/working_dir\Rtmpme443U/worker.rank=1.parallelly.parent=167124.28cd451c34a07.pid'
  > 
  > alive <- isNodeAlive(cl)
  FEHLER: Ungültige Klasse
  Warning in system2("tasklist", stdout = TRUE) :
    running command '"tasklist"' had status 1
  FEHLER: Ungültige Klasse
  Warning in system2("tasklist", stdout = TRUE) :
    running command '"tasklist"' had status 1
  FEHLER: Ungültige Klasse
  Warning in system2("tasklist", stdout = TRUE) :
    running command '"tasklist"' had status 1
  FEHLER: Ungültige Klasse
  Warning in system2("tasklist", stdout = TRUE) :
    running command '"tasklist"' had status 1
  FEHLER: Ungültige Klasse
  Warning in system2("tasklist", stdout = TRUE) :
    running command '"tasklist"' had status 1
  FEHLER: Ungültige Klasse
  Warning in system2("tasklist", args = args, stdout = TRUE) :
    running command '"tasklist" /FI "PID eq 167124" /NH' had status 1
  FEHLER: Ungültige Klasse
  Warning in system2("tasklist", args = args, stdout = TRUE) :
    running command '"tasklist" /FI "PID eq 167124" /NH' had status 1
  FEHLER: Ungültige Klasse
  Warning in system2("tasklist", args = args, stdout = TRUE) :
    running command '"tasklist" /FI "PID eq 167124" /NH' had status 1
  FEHLER: Ungültige Klasse
  Warning in system2("tasklist", args = args, stdout = TRUE) :
    running command '"tasklist" /FI "PID eq 167124" /NH' had status 1
  FEHLER: Ungültige Klasse
  Warning in system2("tasklist", args = args, stdout = TRUE) :
    running command '"tasklist" /FI "PID eq 167124" /NH' had status 1
  FEHLER: Ungültige Klasse
  Warning in system2("tasklist", stdout = TRUE) :
    running command '"tasklist"' had status 1
  FEHLER: Ungültige Klasse
  Warning in system2("tasklist", stdout = TRUE) :
    running command '"tasklist"' had status 1
  FEHLER: Ungültige Klasse
  Warning in system2("tasklist", stdout = TRUE) :
    running command '"tasklist"' had status 1
  FEHLER: Ungültige Klasse
  Warning in system2("tasklist", stdout = TRUE) :
    running command '"tasklist"' had status 1
  FEHLER: Ungültige Klasse
  Warning in system2("tasklist", stdout = TRUE) :
    running command '"tasklist"' had status 1
  FEHLER: Ungültige Klasse
  Warning in system2("tasklist", args = args, stdout = TRUE) :
    running command '"tasklist" /FI "PID eq 167124" /NH' had status 1
  FEHLER: Ungültige Klasse
  Warning in system2("tasklist", args = args, stdout = TRUE) :
    running command '"tasklist" /FI "PID eq 167124" /NH' had status 1
  FEHLER: Ungültige Klasse
  Warning in system2("tasklist", args = args, stdout = TRUE) :
    running command '"tasklist" /FI "PID eq 167124" /NH' had status 1
  FEHLER: Ungültige Klasse
  Warning in system2("tasklist", args = args, stdout = TRUE) :
    running command '"tasklist" /FI "PID eq 167124" /NH' had status 1
  FEHLER: Ungültige Klasse
  Warning in system2("tasklist", args = args, stdout = TRUE) :
    running command '"tasklist" /FI "PID eq 167124" /NH' had status 1
  > print(alive)
  Node 1 Node 2 
      NA     NA 
  > stopifnot(
  +   length(alive) == length(cl),
  +   is.logical(alive),
  +   !anyNA(alive),
  +   isTRUE(alive[[1]]), isTRUE(alive[[2]]),
  +   all(alive)
  + )
  Error: !anyNA(alive) is not TRUE
  Execution halted
* checking PDF version of manual ... [14s] OK
* checking HTML version of manual ... [5s] OK
* checking for detritus in the temp directory ... NOTE
Found the following files/directories:
  'Rscript2802c89539c6d' 'Rscript298d489539c5e'
* DONE
Status: 1 ERROR, 1 NOTE

Because there is no alternative on MS Windows, we have to relax that test there.

HenrikBengtsson commented 1 year ago

This actually revealed two bugs in the internal pid_exists() function, which are now fixed. I also took the opportunity to make the failure less noisy by generating only one warning per tasklist command instead of five:

* using log directory 'd:/RCompile/CRANguest/R-devel/parallelly.Rcheck'
* using R Under development (unstable) (2022-12-13 r83440 ucrt)
* using platform: x86_64-w64-mingw32 (64-bit)
* using session charset: UTF-8
* checking for file 'parallelly/DESCRIPTION' ... OK
* this is package 'parallelly' version '1.32.1-9015'
...
  > alive <- isNodeAlive(cl)
  FEHLER: Ungültige Klasse
  FEHLER: Ungültige Klasse
  FEHLER: Ungültige Klasse
  FEHLER: Ungültige Klasse
  FEHLER: Ungültige Klasse
  FEHLER: Ungültige Klasse
  FEHLER: Ungültige Klasse
  FEHLER: Ungültige Klasse
  FEHLER: Ungültige Klasse
  FEHLER: Ungültige Klasse
  Warning in system2("tasklist", stdout = TRUE, stderr = "") :
    running command '"tasklist"' had status 1
  Warning in system2("tasklist", args = args, stdout = TRUE, stderr = "") :
    running command '"tasklist" /FI "PID eq 101288" /NH' had status 1
  Warning: The 'parallelly' package is not capable of checking whether a process is alive based on its process ID, on this machine (4.3.0, platform x86_64-w64-mingw32)
  > print(alive)
  Node 1 Node 2 
      NA     NA 

The FEHLER: Ungültige Klasse lines (German for "ERROR: Invalid class") are the standard error output from calling tasklist. I could silence them, but I think it's more helpful to keep them. Capturing them and outputting them only once would be too tedious. It would also require a lot of careful testing to make sure I don't break the existing functionality, so I'll skip that.
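
The "one warning instead of five" idea can be sketched roughly as follows. This is only an illustration, not the code used in parallelly: retry_quietly() and always_fails() are hypothetical names, and the five attempts are an assumption based on the repeated output in the first log. Each attempt has its warnings silenced, and a single summary warning is raised if no attempt gives a definite answer.

```r
# Hypothetical helper (not part of parallelly): call `check()` up to
# `max_tries` times, silence per-attempt warnings, and warn once at the end.
retry_quietly <- function(check, max_tries = 5L) {
  for (attempt in seq_len(max_tries)) {
    res <- tryCatch(
      suppressWarnings(check()),
      error = function(e) NA
    )
    # Stop as soon as we get a definite TRUE/FALSE answer
    if (isTRUE(res) || isFALSE(res)) return(res)
  }
  warning(sprintf("Check gave no answer after %d attempts", max_tries),
          call. = FALSE)
  NA
}

# Usage: a check that always warns and never gives an answer
always_fails <- function() {
  warning("running command had status 1")
  NA
}
res <- withCallingHandlers(
  retry_quietly(always_fails),
  warning = function(w) {
    # Only the single summary warning reaches this handler
    message("Got warning: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
```

The result stays NA rather than FALSE, so callers can tell "could not check" apart from "process does not exist".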

HenrikBengtsson commented 1 year ago

I've tweaked the warning to give more troubleshooting info on the host where this fails;

* using R Under development (unstable) (2022-12-13 r83440 ucrt)
* using platform: x86_64-w64-mingw32 (64-bit)
* using session charset: UTF-8
* checking for file 'parallelly/DESCRIPTION' ... OK
* this is package 'parallelly' version '1.32.1-9019'
...
  > isNodeAliveSupported <- isTRUE(parallelly:::pid_exists(Sys.getpid()))
  FEHLER: Ungültige Klasse
  FEHLER: Ungültige Klasse
  FEHLER: Ungültige Klasse
  FEHLER: Ungültige Klasse
  FEHLER: Ungültige Klasse
  FEHLER: Ungültige Klasse
  FEHLER: Ungültige Klasse
  FEHLER: Ungültige Klasse
  FEHLER: Ungültige Klasse
  FEHLER: Ungültige Klasse
  Warning in system2("tasklist", stdout = TRUE, stderr = "") :
    running command '"tasklist"' had status 1
  Warning in system2("tasklist", args = args, stdout = TRUE, stderr = "") :
    running command '"tasklist" /FI "PID eq 101144" /NH' had status 1
  Warning: The 'parallelly' package is not capable of checking whether a process is alive based
  on its process ID, on this machine [R Under development (unstable) (2022-12-13 r83440 ucrt),
   platform x86_64-w64-mingw32, Windows Server x64 (build 20348), CRAN@CRANWIN2]
  > message("isNodeAlive() works: ", isNodeAliveSupported)
  isNodeAlive() works: FALSE
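
A message with this kind of troubleshooting detail could be assembled along the following lines. This is a minimal sketch under assumptions, not the package's actual implementation: session_details() is a hypothetical helper, and the OS description it builds from Sys.info() will not match the exact wording in the log above.

```r
# Hypothetical helper (not parallelly's actual code): describe the current
# R session and host in one line, for use in a warning message.
session_details <- function() {
  info <- Sys.info()
  sprintf("%s, platform %s, %s %s, %s@%s",
          R.version.string,
          R.version$platform,
          info[["sysname"]],
          info[["release"]],
          info[["user"]],
          info[["nodename"]])
}

msg <- sprintf(
  "Cannot check whether a process is alive on this machine [%s]",
  session_details()
)
# The text could then be passed to warning(msg, call. = FALSE)
cat(msg, "\n")
```

Including the user and host name makes it easier to tell which check machine produced a given report.
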

HenrikBengtsson commented 1 year ago

The test has been relaxed to tolerate the above:

https://github.com/HenrikBengtsson/parallelly/blob/e5eed15b82b38cf7be5fd6abb484577504556ead/tests/killNode.R#L45-L51
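
The linked lines are not reproduced here. As a rough sketch of what such a relaxation could look like, assuming a flag like isNodeAliveSupported from the log above, the strict assertions are applied only when PID checks work. check_alive() is a hypothetical helper and not the code in tests/killNode.R.

```r
# Hypothetical helper (not the actual tests/killNode.R code): validate an
# isNodeAlive()-style result, tolerating NA where PID checks are unsupported.
check_alive <- function(alive, n_nodes, supported) {
  stopifnot(length(alive) == n_nodes, is.logical(alive))
  if (supported) {
    # PID checks work: every node must be reported as alive
    stopifnot(!anyNA(alive), all(alive))
  } else {
    # PID checks unavailable: accept NA, but never a definite FALSE
    stopifnot(all(is.na(alive) | alive))
  }
  invisible(TRUE)
}

# Usage with the values seen in the log above (NA for both nodes)
alive <- c("Node 1" = NA, "Node 2" = NA)
check_alive(alive, n_nodes = 2L, supported = FALSE)
```

With supported = TRUE, the same NA result would fail, which mirrors the original strict test.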