Bug: Executing `httr::GET()` in parallel results in an error when no single-core GET request was issued before. #749

Closed rkrug closed 9 months ago

rkrug commented 9 months ago

macOS Sonoma, MacBook Pro, M1 Pro chip

Using parallel::mclapply() to issue GET requests in parallel results in errors on all cores.

Once a single-core request has been executed, the error disappears.

r$> library(httr)

r$> parallel::mclapply(1:2, function(x){httr::GET("http://openalex.org/")})
objc[45797]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called.
objc[45797]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
objc[45796]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called.
objc[45796]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.


Warning message:
In parallel::mclapply(1:2, function(x) { :
  scheduled cores 1, 2 did not deliver results, all values of the jobs will be affected

r$> httr::GET("http://openalex.org/")
Response [https://openalex.org/]
  Date: 2023-11-12 11:16
  Status: 200
  Content-Type: text/html; charset=UTF-8
  Size: 1.02 kB

r$> parallel::mclapply(1:2, function(x){httr::GET("http://openalex.org/")})
Response [https://openalex.org/]
  Date: 2023-11-12 11:17
  Status: 200
  Content-Type: text/html; charset=UTF-8
  Size: 1.02 kB

Response [https://openalex.org/]
  Date: 2023-11-12 11:17
  Status: 200
  Content-Type: text/html; charset=UTF-8
  Size: 1.02 kB

r$> sessioninfo::session_info()
─ Session info ──────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.1 (2023-06-16)
 os       macOS Sonoma 14.1
 system   aarch64, darwin20
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       Europe/Zurich
 date     2023-11-12
 pandoc   3.1.9 @ /opt/homebrew/bin/pandoc
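The transcript above suggests a workaround: issue one GET in the parent process before forking. A minimal sketch of that order of operations follows. `get_after_warmup()` is a hypothetical helper name, not part of httr, and the thread only shows this working on the reporter's machine, so treat it as an observation rather than a fix.

```r
library(httr)
library(parallel)

# Hypothetical helper (not part of httr): perform one GET in the parent
# process first, then fork the workers with mclapply().
get_after_warmup <- function(urls, cores = 2L) {
  # Warm-up request in the parent, mirroring the single-core call above.
  invisible(httr::GET(urls[[1]]))
  # Forked workers; per the transcript, these succeed after the warm-up.
  parallel::mclapply(urls, function(u) httr::GET(u), mc.cores = cores)
}

# Only hits the network when run interactively.
if (interactive()) {
  responses <- get_after_warmup(rep("http://openalex.org/", 2))
  sapply(responses, httr::status_code)
}
```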

rkrug commented 9 months ago

Based on, among other sources, https://community.rstudio.com/t/running-parallel-on-mac/142580/6, I have set OBJC_DISABLE_INITIALIZE_FORK_SAFETY in environ to YES:

[1] "YES"

But no change.

hadley commented 9 months ago

httr has been superseded by httr2, so no further development work will happen. I'd recommend giving httr2::req_perform_parallel() a go since it does parallel requests in a way that actually works (i.e. using curl's parallel request facilities).
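For reference, a minimal sketch of what the suggested httr2 approach could look like for the two requests from the report. It assumes httr2 is installed. The URL is the one from the issue, and the variable names are illustrative.

```r
library(httr2)

# Same URL as in the report, requested twice.
urls <- rep("http://openalex.org/", 2)

# Build one request object per URL; nothing is sent yet.
reqs <- lapply(urls, httr2::request)

# Only hits the network when run interactively.
if (interactive()) {
  # Requests run concurrently inside a single R process (no fork()).
  resps <- httr2::req_perform_parallel(reqs)
  sapply(resps, httr2::resp_status)
}
```

Because the concurrency is handled by curl within one process, no fork() is involved, which is the step the objc messages above complain about.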

rkrug commented 9 months ago

Thanks - I'll look into httr2. Although httr2::req_perform_parallel() is unfortunately not an option, as the call is in a package.

hadley commented 9 months ago

Why isn't it an option?

rkrug commented 9 months ago

It is not my package....

rkrug commented 9 months ago

Also, parallel calls can cause problems due to the restrictions of that specific API, so they need to be handled with care.
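One way to respect such restrictions, sketched here with httr2 rather than httr, is to throttle the requests and run them one after another. The limit of one request per second is an assumed placeholder, not the actual limit of any API mentioned in this thread.

```r
library(httr2)

urls <- rep("http://openalex.org/", 2)

# Attach a throttle to every request. rate = 1 (one request per second)
# is an assumed value; substitute the documented limit of the target API.
reqs <- lapply(urls, function(u) {
  httr2::req_throttle(httr2::request(u), rate = 1)
})

# Only hits the network when run interactively.
if (interactive()) {
  # Sequential execution, so the throttle spaces the requests out.
  resps <- httr2::req_perform_sequential(reqs)
  sapply(resps, httr2::resp_status)
}
```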

hadley commented 9 months ago

Oh got it.