Closed rubenarslan closed 6 years ago
Hi. Hard to say what's going on without a reproducible example and other details. For instance, it may be that the master R process is busy retrieving a large amount of data from one of the R workers. See also if you can reproduce this in a plain R session outside of RStudio - could be an RStudio thing.
... and yes, resolved() should be non-blocking and return momentarily with either FALSE or TRUE.
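For anyone who wants to guard against waiting forever in value(), here is a minimal sketch that polls resolved() with a deadline. It assumes a local multisession plan as a stand-in for the remote workers, and wait_for() is a hypothetical helper, not part of the future package. Note that this only helps when resolved() itself returns; it cannot work around a resolved() call that hangs.

```r
library(future)

# Local stand-in for the remote workers discussed in this thread (assumption)
plan(multisession, workers = 2)

f <- future({ Sys.sleep(1); 42 })

# Hypothetical helper: poll resolved() until it is TRUE or the deadline passes
wait_for <- function(f, timeout = 30, interval = 0.2) {
  deadline <- Sys.time() + timeout
  while (!resolved(f)) {
    if (Sys.time() > deadline) return(FALSE)
    Sys.sleep(interval)
  }
  TRUE
}

ok <- wait_for(f)

# Only call value() once the future is known to be resolved
result <- if (ok) value(f) else NULL

# Switch back to sequential processing, which also releases the workers
plan(sequential)
```

If ok comes back FALSE, the future can be left alone or the workers restarted, instead of blocking the session in value().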
Can you help me make reproducible examples for these problems? Obviously, it's a bit harder since it involves private servers... I don't know what you'd need to know about my server's setups etc.
Like I said, this doesn't always happen.
Just now, I tried to send a job to one computation server (without an error message, no R process was ever spawned when viewed via top on the server). When I tried accessing the future's value, R hung and I had to force-quit. It seems unlikely that a huge amount of data was being transmitted, since this was right after job submission and the data isn't huge. After this, I restarted R.
Trying to call
login <- tweak(remote, workers = rep("arslan@arc-srv-cpt7.mpib-berlin.mpg.de", 1), persistent = FALSE); plan(login)
led to this error message:
Error: Internal error: Unexpected result retrieved for ClusterFuture future (‘’): ‘NA’
The second server is from what I can tell identical in setup to another, for which it worked immediately. I can ssh in and run R on both, same R version etc, except the first one (with the error) runs Ubuntu 16, the other 14.
Before anything else, for:
Error: Internal error: Unexpected result retrieved for ClusterFuture future (‘’): ‘NA’
see Issue #215
Ok, so this was a version mismatch (1.7.0 vs. 1.8.0). I'm not 100% sure that the new version was already loaded before the restart after the hang. I'll see if it recurs.
Thanks for the follow up. Yes, running with new future 1.8.0 on master and future (< 1.8.0) on workers will cause problems and non-informative error messages like what you've seen. It could also be that it explains the silent "stalls" you're observing. Not detecting that future is not installed or is outdated on the workers is an oversight by me in the future 1.8.0 release - I'll improve this in the next release (Issue #216).
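Until that check is built in, one way to spot such a mismatch by hand is to ask each worker which version of future it has. The sketch below is only an illustration: it uses two local workers from the parallel package as a stand-in for the real servers, and pkg_version() is a hypothetical helper. For real remote machines, the cluster would have to be created with their hostnames instead.

```r
library(parallel)

# Hypothetical helper: installed version of a package, or NA if it is missing
pkg_version <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    as.character(utils::packageVersion(pkg))
  } else {
    NA_character_
  }
}

# Two local workers as a stand-in for the remote servers (assumption)
cl <- makeCluster(2)

# Version on the master and on each worker
local_version <- pkg_version("future")
worker_versions <- clusterCall(cl, pkg_version, "future")

# TRUE for every worker whose version differs from the master's
mismatch <- vapply(
  worker_versions,
  function(v) !identical(v, local_version),
  logical(1)
)

if (any(mismatch)) {
  message("future version differs on worker(s): ",
          paste(which(mismatch), collapse = ", "))
}

stopCluster(cl)
```

A worker that reports NA does not have the package installed at all, which is the second case described in the update below.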
UPDATE: I stumbled upon a similar "non-responsiveness" in futures that occurred when the worker didn't have the future package installed. I could reproduce it, added a test, and fixed it in commit 634729e. Later I added code to detect when the future package is missing, so that an earlier and more informative error is signaled. Both layers of protection help avoid this non-responsiveness of workers.
Hopefully, these updates help in your case as well. To test the new code, use:
remotes::install_github('HenrikBengtsson/future@develop')
I'm closing this issue, but please feel free to reopen if the new code is not helping.
I frequently find myself having to restart an RStudio session because resolved or value don't return (or don't seem to ever return). I understand why value does not return if the future isn't finished, but I thought resolved would always return immediately. Once this happens, Esc and the stop button in RStudio don't help, only force-quitting R. I haven't yet been able to isolate why this works sometimes and not others. These are usually implicit futures with a nested topology: list(remote, multicore).
Is this a known problem or would a reproducible example help?
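As a starting point for a reproducible example, here is a minimal sketch of implicit futures with a nested topology. It is an assumption-laden stand-in: multisession replaces remote and sequential replaces multicore, so that it runs on a single machine without my servers. futureOf() is used to get at the future behind an implicit assignment so that resolved() can be called on it without touching the value.

```r
library(future)

# Nested topology; local stand-ins for list(remote, multicore) (assumption)
plan(list(tweak(multisession, workers = 2), sequential))

# Outer implicit future, evaluated in a background R session
x %<-% {
  # Inner implicit future, evaluated by the second layer of the plan
  y %<-% Sys.getpid()
  c(outer = Sys.getpid(), inner = y)
}

# Get the underlying future without forcing the value of x
f <- futureOf(x)

# Should return immediately with TRUE or FALSE
is_done <- resolved(f)

# Accessing x blocks until the outer future has finished
pids <- x

plan(sequential)
```

With the real setup, the first layer would be the tweaked remote strategy from the call quoted above.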