Closed by rkrug 7 months ago
This is an interaction between {progress} and {parallel}. We use {progress} to print the progress bar, and the progress bar is stateful - I don't know the internals of {parallel}, but my suspicion is that you have a race condition with each thread updating the same progress state.
I think this should go away if you disable the progress bar, but now I also realize that oa_request() still creates a progress object even with verbose = FALSE. Maybe this is trivial but - @trangdata was there a reason why the progress bar's creation is outside the verbose if-clause?
@yjunechoe you're right. oa_progress should be inside the if clause.
Thanks for looking into this - I will try it out as soon as it is changed.
So it looks like oa_progress is actually called outside of the verbose check in some other functions too, such as oa_ngrams. Should we wrap it in an if (verbose) {} clause @yjunechoe?
Yeah I think that'd be safest!
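For readers following along, the guard being discussed could look roughly like the sketch below. This is not openalexR code: fetch_pages() and its arguments are hypothetical, and base R's txtProgressBar stands in for the {progress} bar so the example runs without extra packages.

```r
# Hypothetical sketch: fetch_pages() is illustrative, not an openalexR function.
# Base R's txtProgressBar stands in for the {progress} progress bar.
fetch_pages <- function(n_pages, verbose = FALSE) {
  pb <- NULL
  if (verbose) {
    # Create the stateful progress object only when it will be displayed.
    pb <- utils::txtProgressBar(min = 0, max = n_pages, style = 3)
    on.exit(close(pb), add = TRUE)
  }
  results <- vector("list", n_pages)
  for (i in seq_len(n_pages)) {
    results[[i]] <- i  # placeholder for the real per-page request
    if (verbose) utils::setTxtProgressBar(pb, i)
  }
  results
}

# With verbose = FALSE no progress object is ever created.
res <- fetch_pages(3, verbose = FALSE)
```

With this shape, a call with verbose = FALSE never touches any progress state, which is the behaviour asked for above.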
Unfortunately, this did not solve the issue. I installed from GitHub and it still crashes:
r$> parallel::mclapply(1:10, function(x){oa_request(oa_query("biodiversity"), count_only = TRUE, verbose = FALSE)})
objc[9825]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called.
objc[9824]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called.
objc[9825]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
objc[9824]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
Just to be sure, I used debugonce(openalexR:::oa_progress) before running on one core, and it did not go into that function. So the problem must be somewhere else.
OK - the problem is upstream in httr:
library(httr)
parallel::mclapply(1:2, function(x){httr::GET("http://google.com/", path = "search")})
and it is independent of https://community.rstudio.com/t/running-parallel-on-mac/142580/6 (although I don't know if it only affects M1 Macs). I filed a bug at https://github.com/r-lib/httr/issues/749.
I do not know if the error occurs on Intel Macs, Windows or Linux - I have an M1 Mac.
It also occurs in httr2, which superseded httr:
r$> library(httr2)
req <- httr2::request("http://google.com")
parallel::mclapply(1:2, function(x){httr2::req_perform(req)})
objc[50637]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called.
objc[50637]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
objc[50638]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called.
objc[50638]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
[[1]]
NULL
[[2]]
NULL
Warning message:
In parallel::mclapply(1:2, function(x) { :
scheduled cores 1, 2 did not deliver results, all values of the jobs will be affected
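Since the objc messages point at fork(), one thing that may be worth trying is a PSOCK cluster, which starts fresh R worker processes instead of forking the current session. This is only a sketch and has not been confirmed in this thread as a fix: fetch_one() is a hypothetical, network-free stand-in for the real request (for example a call to oa_request()).

```r
library(parallel)

# Hypothetical stand-in for the real HTTP request; kept network-free here
# so the sketch runs anywhere.
fetch_one <- function(x) x * 2

# makeCluster() defaults to a PSOCK cluster: workers are new R processes,
# so no fork() of the current session is involved.
cl <- makeCluster(2)
res <- tryCatch(
  parLapply(cl, 1:4, fetch_one),
  finally = stopCluster(cl)  # always shut the workers down
)
```

Because the workers are separate sessions, any package used inside the function has to be referenced with pkg::fun() or loaded on the workers, for instance via clusterEvalQ().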
Hi
I am using parallel::mclapply() to make parallel API calls, and these fail when no single-core call has been issued beforehand:
The error message is:
It might be necessary to have an OpenAlex Premium key for testing.
But if you have an idea, I would be happy to test.