rstudio / learnr

Interactive Tutorials with R Markdown
https://pkgs.rstudio.com/learnr
Apache License 2.0
710 stars 240 forks source link

Forked evaluators don't work as expected on macOS 11 with R 4.1 #641

Open gadenbuie opened 2 years ago

gadenbuie commented 2 years ago

Surfaced here — https://github.com/rstudio/learnr/runs/4874272940?check_suite_focus=true#step:7:205 — and reproducible locally.

The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec().
Break on __THE_PROCESS_HAS_FORKED_AND_YOU_CANNOT_USE_THIS_COREFOUNDATION_FUNCTIONALITY___YOU_MUST_EXEC__() to debug.
objc[72577]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called.
objc[72577]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
sh: line 0: kill: (72577) - No such process
gadenbuie commented 2 years ago
pkgload::load_all()

ex <- mock_exercise("Sys.sleep(1)\n1:100", check = I("last_value"))
forked_eval_ex <- forked_evaluator_factory(evaluate_exercise(ex, new.env()), 2)

forked_eval_ex$start()
objc[73177]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called.
objc[73177]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
gadenbuie commented 2 years ago

Seems like we really shouldn't be forking from macOS either, so we'd probably want to use something other than parallel::mcparallel(). I'm going to skip testing the forked evaluator on Mac; for now the feature is intended for Unix only.

gadenbuie commented 2 years ago

Yeah, *forked* parallelization on macOS #rstats is not stable. I\'ve got a pkg test using parallel::makeCluster(type=\"FORK\") that ends with:

source("incl/end.R")

but the GitHub Action logs show it runs beyond that using stray code (that's not forked)

--- Henrik Bengtsson (\@henrikbengtsson) May 15, 2020

HenrikBengtsson commented 2 years ago

Yeah, forked parallelization on macOS #rstats is not stable. I've got a pkg test using parallel::makeCluster(type="FORK") that ends with:

source("incl/end.R")

but the GitHub Action logs show it runs beyond that using stray code (that's not forked) --- Henrik Bengtsson (@HenrikBengtsson) May 15, 2020

So, that particular problem was because I used a too harsh method to test what happens when a worker get killed (the purpose of my test script). I used quit() inside the worker, which turns out to be a major no-no and wreaks havoc in forked R processing. I have since switched to tools::pskill(Sys.getpid()) and I no longer have the issue quoted.

Having said that, yes, forked processing in R can be finicky. Strictly speaking, it's only safe to use if you are in 100% control of the full code, including any package dependencies. Since you cannot control package dependencies, forked parallelization code that has been validated but break suddenly if a dependency introduces something that is not fork safe. So, multicore/forked parallelization can only be one alternative among several. Unfortunately.

gadenbuie commented 2 years ago

ooh thanks @HenrikBengtsson -- I didn't intend to tag you into this issue 🙈 but I appreciate your insight! I think in this case there's something about the code we're evaluating in the forked process that's causing the issue. The good news is that we're going to move toward a process based on future (via promises), so thanks also for the work you do there!