HenrikBengtsson / future

:rocket: R package: future: Unified Parallel and Distributed Processing in R for Everyone
https://future.futureverse.org
Error in unserialize: MultisessionFuture (<none>) failed to receive results from cluster RichSOCKnode #1 (PID 235 on ‘localhost’) #569

Closed can-taslicukur closed 2 years ago

can-taslicukur commented 2 years ago

Hi, I am using promises with future in my Shiny app to make asynchronous queries to a MySQL database. This is the code I use to send an asynchronous query to the database:

library(promises)
library(future)
plan(multisession)

dat <- future_promise({
    con <- dbConnect(MySQL(),
                     user = DB_USER, password = DB_PASSWORD,
                     host = DB_HOST, port = DB_PORT,
                     dbname = DB_NAME, encoding = "latin1")
    dbSendQuery(con, "SET NAMES utf8mb4;")
    dbSendQuery(con, "SET CHARACTER SET utf8mb4;")
    dbSendQuery(con, "SET character_set_connection=utf8mb4;")
    dat <- dbGetQuery(con,query$df_sql)
    dbDisconnect(con)
    return(dat)
})

### after some time
dat

However, when I run this code, I randomly get the following error:

Warning: Error in unserialize: MultisessionFuture (<none>) failed to receive results from cluster RichSOCKnode #1 (PID 235 on ‘localhost’). The reason reported was ‘error reading from connection’. Post-mortem diagnostic: The total size of the 6 globals exported is 74.26 KiB. The three largest globals are ‘query’ (18.91 KiB of class ‘list’) and ‘dbSendQuery’ (10.30 KiB of class ‘function’)

Why do I get this error at random? Sometimes it works exactly as I want, but sometimes I get this error. How can I fix it?

Session Info:

R version 4.1.1 (2021-08-10)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.0.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets 
[6] methods   base     

loaded via a namespace (and not attached):
[1] compiler_4.1.1 DBI_1.1.1      tools_4.1.1   

HenrikBengtsson commented 2 years ago

Thanks for reporting. It's hard to say why this happens - to me, the code looks valid and there are no obvious pitfalls as far as I can tell.

FWIW, out of the millions(!) of multisession futures I've created over time, I might have run into this problem once or twice. What is more common is that one exhausts the worker so that it crashes, but then the error message says that the worker is no longer alive - that's not the problem in your case.

The only thing I can imagine right now is that the returned results are really large, and that this somehow causes problems for R's serialize/unserialize. What range of object.size(dat) do you expect here?
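As a rough sketch of how one might check the size of a result, base R's object.size() (from the utils package) gives an estimate of the memory an object occupies. The data frame below is a made-up stand-in for a query result, not data from this issue:

```r
# Hypothetical stand-in for the data frame returned by dbGetQuery()
dat <- data.frame(id = seq_len(1e5), value = rnorm(1e5))

# Estimated in-memory size of the object, in bytes
size <- object.size(dat)
print(size)

# The same estimate in a more readable unit
print(format(size, units = "MiB"))
```

This is only an estimate of the in-memory size; the serialized size sent over the socket can differ somewhat.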

To check this, or to see whether it happens only for a specific SQL query (I don't see why it would, though), you could try writing debug info to a file. (We can't use cat() or message() here, because their output won't be relayed when this kind of "infrastructure" error occurs.) See the logging I've added to the example at the very end.

Another immediate thing you can do is to try with another, completely different, parallel backend, e.g.

plan(future.callr::callr)

That transfers data back and forth between the main R session and the workers differently from multisession (which relies on parallel's PSOCK framework, where this unserialize error comes from). The callr backend has more overhead, but if you cannot reproduce this problem with it, then that would help point toward stability issues with the PSOCK-based backend.

Since you're on macOS, you could of course also try with plan(multicore). Again, the purpose is to figure out if this is backend specific or not.
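A minimal sketch of such a backend comparison might look like the following. The run_task() helper and its trivial workload are hypothetical placeholders for the real database query, and the sketch assumes the future.callr package may or may not be installed:

```r
library(future)

# Hypothetical helper: run one small future under the current plan() and
# return its value, or the error condition if something goes wrong
run_task <- function() {
  tryCatch({
    f <- future(sum(seq_len(1000)))
    value(f)
  }, error = function(e) e)
}

# PSOCK-based backend (the one used in the report)
plan(multisession)
str(run_task())

# callr-based backend, only if the future.callr package is installed
if (requireNamespace("future.callr", quietly = TRUE)) {
  plan(future.callr::callr)
  str(run_task())
}

# Forked processing, only where it is supported (e.g. not on Windows)
if (parallelly::supportsMulticore()) {
  plan(multicore)
  str(run_task())
}

# Shut down the workers again
plan(sequential)
```

If the error shows up only under multisession, that points at the PSOCK layer rather than at the code inside the future.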

Beyond that, I'd suggest you try to reproduce the issue without using shiny and promises. For example, you could stress test this with something like:

library(future)
library(DBI)
library(RMySQL)  # provides MySQL()
plan(multisession)

log <- function(..., logfile = "troubleshoot.log") {
  msg <- sprintf(...)
  cat(msg, "\n", sep = "", file = logfile, append = TRUE)
}

query <- "<a typical SQL query>"

repeat({
  f <- future({
      log("Connecting to DB")
      con <- dbConnect(MySQL(),
                       user = DB_USER, password = DB_PASSWORD,
                       host = DB_HOST, port = DB_PORT,
                       dbname = DB_NAME, encoding = "latin1")
      log("Initiate DB query")
      dbSendQuery(con, "SET NAMES utf8mb4;")
      dbSendQuery(con, "SET CHARACTER SET utf8mb4;")
      dbSendQuery(con, "SET character_set_connection=utf8mb4;")
      log("Querying DB: %s", sQuote(query))
      dat <- dbGetQuery(con, query)
      log("Disconnecting from DB")
      dbDisconnect(con)
      log("Returning results of size: %g bytes", as.numeric(object.size(dat)))
      return(dat)
  })
  v <- value(f)
  str(v)
})

If you can reproduce it this way, that would also help narrow in on the problem.
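Because an uncaught error stops the loop at the first failure, it may also help to catch the error and count how often it occurs. The following is a sketch under assumptions: the iteration count and the placeholder workload are invented for illustration, and it relies on the future package signaling infrastructure problems as conditions of class FutureError:

```r
library(future)
plan(multisession)

n_runs <- 20L              # hypothetical number of iterations
failures <- character(0)   # messages from failed runs

for (i in seq_len(n_runs)) {
  f <- future({
    # Placeholder for the database query; returns a small data frame
    data.frame(x = rnorm(1000))
  }, seed = TRUE)

  # Catch only future-infrastructure errors; other errors still propagate
  res <- tryCatch(value(f), FutureError = function(e) e)
  if (inherits(res, "FutureError")) {
    failures <- c(failures, sprintf("run %d: %s", i, conditionMessage(res)))
  }
}

cat(sprintf("%d of %d runs failed\n", length(failures), n_runs))
print(failures)

# Shut down the workers again
plan(sequential)
```

The failure rate across many runs gives a rough idea of how reproducible the problem is under each backend.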

PS. Your session info lists neither future nor promises.

PS 2. I'll migrate this to 'Discussions', because at the moment I don't think it's a bug in the future framework per se.