HenrikBengtsson / future

:rocket: R package: future: Unified Parallel and Distributed Processing in R for Everyone
https://future.futureverse.org

Multisession futures and user interrupts #409

Open DavisVaughan opened 4 years ago

DavisVaughan commented 4 years ago

When I interrupt a multisession future by hitting Escape on my keyboard, what exactly happens?

In the following example, do the following:

For me, on the second call it never gets to the "starting future...done" message, no matter how long I wait.

Looking at Activity Monitor shows that the two R sessions are still alive, so I'm wondering what happens here. My guess is that the R sessions were supposed to send some signal back to the main R process saying that they 'finished' their work to indicate that that worker is now available for more work. The early interrupt prevents that from happening, so that worker never becomes available.

library(future)

# initialize workers
plan(multisession, workers = 2)

# The code chunk below is what gets run twice
futures <- vector("list", 2L)

for (i in 1:2) {
  print("starting future")

  futures[[i]] <- future({
    lapply(1:50, function(x) Sys.sleep(.1))
  })

  print("starting future...done")
}

value(futures)

DavisVaughan commented 4 years ago

Yeah, as expected, in debug mode I see a lot of:

[09:34:25.403] Poll #1 (0): usedNodes() = 2, workers = 2
[09:34:26.014] Poll #2 (0.61 secs): usedNodes() = 2, workers = 2
[09:34:26.640] Poll #3 (1.24 secs): usedNodes() = 2, workers = 2
[09:34:27.251] Poll #4 (1.85 secs): usedNodes() = 2, workers = 2
[09:34:27.868] Poll #5 (2.47 secs): usedNodes() = 2, workers = 2
[09:34:28.482] Poll #6 (3.08 secs): usedNodes() = 2, workers = 2
[09:34:29.103] Poll #7 (3.7 secs): usedNodes() = 2, workers = 2
[09:34:29.731] Poll #8 (4.33 secs): usedNodes() = 2, workers = 2
[09:34:30.352] Poll #9 (4.95 secs): usedNodes() = 2, workers = 2
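
For readers who want to see the same polling output, here is a minimal sketch of turning on the package's debug messages. It assumes the `future.debug` R option is what "debug mode" refers to in this thread; the timestamps and poll counts will differ from run to run.

```r
# Assumption: "debug mode" above means the 'future.debug' R option.
# Save the previous setting so it can be restored afterwards.
old_opts <- options(future.debug = TRUE)

# ... run the reproducible example from the first comment here ...

# To switch the verbose output off again:
# options(old_opts)
```

With the option set, calls such as future() and value() print timestamped progress lines, including the "Poll #n" lines shown above while waiting for a free worker.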

DavisVaughan commented 4 years ago

It looks like the main cause of this is that socketSelect() never returns TRUE for these interrupted connections. This happens deep in resolved.ClusterFuture().
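
To illustrate what socketSelect() reports, here is a small base-R sketch, independent of the future package. It opens a local socket pair in one R session and polls the receiving end before and after the other end writes to it. The port number and the variable names are hypothetical, and serverSocket()/socketAccept() require R 4.1.0 or later; this is not how the package itself sets up its workers.

```r
# Hypothetical port; any free local port would do.
port <- 11234L

# Listening socket standing in for the main R process (R >= 4.1.0).
srv <- serverSocket(port)

# The "worker" end connects, then the "main" end accepts the connection.
worker <- socketConnection("127.0.0.1", port = port,
                           blocking = TRUE, open = "a+b")
main <- socketAccept(srv, blocking = TRUE, open = "a+b")

# Nothing has been sent yet, so a zero-timeout poll should say "not ready".
ready_before <- socketSelect(list(main), write = FALSE, timeout = 0)

# The "worker" sends something back, as a finished worker would.
writeLines("done", worker)

# Now the poll should report that there is something to read.
ready_after <- socketSelect(list(main), write = FALSE, timeout = 5)

print(c(before = ready_before, after = ready_after))

# Clean up all three connections.
close(worker)
close(main)
close(srv)
```

If a worker never writes anything back on its connection, a poll like this keeps reporting that nothing is ready, which would match the endless "Poll #n" lines in the debug output.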

HenrikBengtsson commented 4 years ago

Could be related to the immediateCondition protocol on the same channel. On my to-do to robustify, if so.

Probably not related to this

HenrikBengtsson commented 4 years ago

Is this in RStudio? Escape signals SIGINT, correct? Just like Ctrl-C in the terminal. Or is there more to Escape in RStudio?

DavisVaughan commented 4 years ago

It is in RStudio. I can't seem to reproduce this in the R Console (where I also press Escape to stop the value() call). I don't know what RStudio does when I hit Escape; I'll try to ask.

HenrikBengtsson commented 4 years ago

OK. Rereading, I don't think it's related to the added immediateCondition protocol. As you say, you can probably whack the PSOCK protocol out of sync too.

HenrikBengtsson commented 3 years ago

I've created Issue #438 (ROBUSTNESS: Protect against user interrupts for calls that need to be atomic) that will cover this + other things.