rstudio / promises

A promise library for R
https://rstudio.github.io/promises

promise_all not Executing as Expected in Rscript or R -e Invocation #105

Closed clbenoit closed 2 months ago

clbenoit commented 3 months ago

Hi,

My code behaves differently depending on whether I run it from an interactive R session or via the R -e "" and Rscript commands. I define the following function:

#' @export
buildDB_sarek <- function() {

  library(future)
  library(promises)
  csq_promise <- future_promise({
    csq.vcf <- data.frame("a" = "a")
    message("I am inside csq_promise")
    return(csq.vcf)
  })

  info_promise <- future_promise({
    info.vcf <- data.frame("a" = "a")
    message("I am inside info_promise")
    return(info.vcf)
  })

  geno_promise <- future_promise({
    geno.vcf <- data.frame("a" = "a")
    message("I am inside geno_promise")
    return(geno.vcf)
  })

  promise_all(geno_promise, csq_promise, info_promise) %...>% {
    geno.vcf <- environment(geno_promise[["then"]])[["private"]][["value"]]
    info.vcf <- environment(info_promise[["then"]])[["private"]][["value"]]
    csq.vcf <- environment(csq_promise[["then"]])[["private"]][["value"]]
    message("I am inside promise_all")
    print(head(geno.vcf))
    print(head(info.vcf))
    print(head(csq.vcf))
  }
}
buildDB_sarek()

Calling buildDB_sarek() behaves as expected within an interactive R session. However, when using R -e "buildDB_sarek()" or Rscript code.R, the promise_all step is never triggered. I am certain that .libPaths(), getwd(), buildDB_sarek, and the R version are exactly the same ...

sessionInfo():

R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.6 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] future_1.33.2  promises_1.2.1

loaded via a namespace (and not attached):
 [1] compiler_4.2.1    parallelly_1.37.1 magrittr_2.0.3    R6_2.5.1
 [5] cli_3.6.2         later_1.3.2       tools_4.2.1       parallel_4.2.1
 [9] listenv_0.9.1     Rcpp_1.0.12       codetools_0.2-18  digest_0.6.35
[13] globals_0.16.3    rlang_1.1.3

clbenoit commented 3 months ago

I tried defining various strategies inside the function, but I am still observing the same issue 😢 Did I miss something basic?

wch commented 3 months ago

I think the problem is that promise_all() returns a promise, but once the R process finishes running the function, it doesn't know that it should wait until the promise is resolved before exiting.
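
As a minimal sketch of that behaviour (not part of the original report; `handler_ran` is just an illustrative flag): a handler attached with then() is only queued on the later event loop, so in a non-interactive script it runs only if something drives that loop, here with later::run_now() until later::loop_empty() reports that nothing is left.

```r
library(later)
library(promises)

handler_ran <- FALSE

# An already-resolved promise; its handler is still scheduled asynchronously
p <- promise_resolve("Done")
then(p, function(value) {
  handler_ran <<- TRUE
})

# Nothing has driven the event loop yet, so the handler is expected to be
# queued but not run at this point. Under Rscript, exiting here would skip it.

# Drive the event loop until no callbacks remain
while (!later::loop_empty()) {
  later::run_now(1)
}

print(handler_ran)
```

In an interactive session the idle console drives the same loop, which would explain why the function appeared to work there.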

Here's a function, block_until_settled(), that blocks until the promise is resolved or rejected.

library(later)
library(promises)

# This function creates a promise that takes a few seconds to resolve
make_promise <- function() {
  promise(function(resolve, reject) {
    cat("Entering promise creation\n")
    later(function() {
      cat("Resolving promise now\n")
      resolve("Done")
    }, 3)
    cat("Exiting promise creation\n")
  })
}

# Block until a promise is resolved or rejected
block_until_settled <- function(p) {
  promise_resolved <- FALSE
  p$finally(function(value) {
    promise_resolved <<- TRUE
  })

  while(!promise_resolved) {
    later::run_now(1)
  }
}

p <- make_promise()

block_until_settled(p)
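
If the script also needs the resolved value, a variant along these lines might help. This is only a sketch under assumptions: `block_and_get` is a hypothetical name, and it relies on then() from promises accepting onFulfilled and onRejected callbacks. A rejection is re-raised as an ordinary R error once the loop ends.

```r
library(later)
library(promises)

# Hypothetical helper: block until the promise settles, then return its value
# or signal its rejection reason as an error
block_and_get <- function(p) {
  state <- "pending"
  value <- NULL
  reason <- NULL

  then(
    p,
    onFulfilled = function(v) {
      value <<- v
      state <<- "fulfilled"
    },
    onRejected = function(e) {
      reason <<- e
      state <<- "rejected"
    }
  )

  # Drive the event loop until one of the callbacks above has run
  while (state == "pending") {
    later::run_now(1)
  }

  if (state == "rejected") {
    stop(reason)
  }
  value
}

# Example: a promise that resolves after half a second
p <- promise(function(resolve, reject) {
  later(function() resolve("Done"), 0.5)
})

result <- block_and_get(p)
print(result)
```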

clbenoit commented 3 months ago

It works!

#' @export
buildDB_sarek <- function() {

  library(future)
  library(promises)
  csq_promise <- future_promise({
    csq.vcf <- data.frame("a" = "a")
    message("I am inside csq_promise")
    return(csq.vcf)
  })

  info_promise <- future_promise({
    info.vcf <- data.frame("a" = "a")
    message("I am inside info_promise")
    return(info.vcf)
  })

  geno_promise <- future_promise({
    geno.vcf <- data.frame("a" = "a")
    message("I am inside geno_promise")
    return(geno.vcf)
  })

  promise_all <- promise_all(geno_promise, csq_promise, info_promise) %...>% {
    geno.vcf <- environment(geno_promise[["then"]])[["private"]][["value"]]
    info.vcf <- environment(info_promise[["then"]])[["private"]][["value"]]
    csq.vcf <- environment(csq_promise[["then"]])[["private"]][["value"]]
    message("I am inside promise_all")
    print(head(geno.vcf))
    print(head(info.vcf))
    print(head(csq.vcf))
  }

  block_until_settled <- function(p) {
    promise_resolved <- FALSE
    p$finally(function(value) {
      promise_resolved <<- TRUE
    })

    while(!promise_resolved) {
      later::run_now(1)
    }
  }

  print(promise_all)

  block_until_settled(promise_all)

}
buildDB_sarek()

Thanks a lot for the trick, you saved me 😄
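
A side note on the promise_all() handler: promise_all() resolves to a list of the input promises' values, named after the arguments when they are named, so the values can probably be read from that list rather than from each promise's private environment. A minimal sketch, where `make_df_promise` is a hypothetical stand-in for the future_promise() calls above and `results` is only there to keep the values for inspection:

```r
library(later)
library(promises)

# Hypothetical stand-in for future_promise(): resolves with a one-row data
# frame after a short delay
make_df_promise <- function(label, delay = 0.1) {
  promise(function(resolve, reject) {
    later(function() resolve(data.frame(source = label)), delay)
  })
}

results <- NULL

all_done <- then(
  promise_all(
    geno = make_df_promise("geno"),
    info = make_df_promise("info"),
    csq = make_df_promise("csq")
  ),
  function(values) {
    # values is a named list with elements geno, info and csq
    results <<- values
    message("I am inside promise_all")
    print(head(values$geno))
  }
)

# Block until everything scheduled on the event loop has run
while (!later::loop_empty()) {
  later::run_now(1)
}
```

This avoids depending on the internal layout of the promise object, which is not part of the documented interface.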

wch commented 3 months ago

Oh, one more thing I just thought of: if needed, you can also add a timeout like this:

library(later)
library(promises)

# This function creates a promise that takes a few seconds to resolve
make_promise <- function() {
  promise(function(resolve, reject) {
    cat("Entering promise creation\n")
    later(function() {
      cat("Resolving promise now\n")
      resolve("Done")
    }, 3)
    cat("Exiting promise creation\n")
  })
}

# Block until a promise is resolved or rejected, or timeout has elapsed.
# Note that when running in an interactive session, the event loop will keep
# running even after this function returns, so a timeout will not stop the
# promise from running; it will just allow this function to return instead of
# continuing to block.
block_until_settled <- function(p, timeout = 10) {
  start_time <- as.numeric(Sys.time())
  promise_resolved <- FALSE
  p$finally(function(value) {
    promise_resolved <<- TRUE
  })

  while(!promise_resolved) {
    if (as.numeric(Sys.time()) > start_time + timeout) {
      stop("Timed out!")
    }
    later::run_now(0.5)
  }
}

p <- make_promise()

block_until_settled(p, timeout=2)

Note that if you run it from an interactive session, it will print the timeout message, and then the promises will keep running (while the console is sitting idle) and still resolve later.

block_until_settled(p, timeout=2)
#> Error in block_until_settled(p, timeout = 2) : Timed out!
#> Resolving promise now

But if you are running it via Rscript, then it will exit the process after the function stops blocking:

❯ Rscript test.R
Warning message:
Entering promise creation
Exiting promise creation
Error in block_until_settled(p, timeout = 2) : Timed out!
Execution halted
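
For batch jobs, one option is to turn the timeout into an exit status rather than an error. A rough sketch, assuming a hypothetical helper `wait_for()` that returns TRUE if the promise settled in time and FALSE otherwise; the names `job`, `finished` and `exit_status` are illustrative only:

```r
library(later)
library(promises)

# Hypothetical helper: TRUE if p settles within `timeout` seconds, else FALSE
wait_for <- function(p, timeout = 10) {
  settled <- FALSE
  finally(p, function() {
    settled <<- TRUE
  })

  deadline <- as.numeric(Sys.time()) + timeout
  while (!settled) {
    if (as.numeric(Sys.time()) > deadline) {
      return(FALSE)
    }
    later::run_now(0.5)
  }
  TRUE
}

# Example: a promise that resolves well before the deadline
job <- promise(function(resolve, reject) {
  later(function() resolve("Done"), 0.2)
})

finished <- wait_for(job, timeout = 5)
exit_status <- if (finished) 0L else 1L

# In a script run with Rscript, this could be followed by:
# quit(status = exit_status)
```

A scheduler can then tell a timed-out job from a successful one by its exit code.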

clbenoit commented 2 months ago

👍 Nice, this will definitely be useful for avoiding ghost processes on the cluster, thanks a lot!

jcheng5 commented 2 months ago

Please NEVER call block_until_settled (or later::run_now, or anything else that directly or indirectly calls later::run_now) from inside of a Shiny app or Plumber app--it's extremely dangerous, as those frameworks assume they're the only ones who are calling it. Other than that, 👍 to this solution.