apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
https://arrow.apache.org/
Apache License 2.0

[R] Re-allow some multithreading on Windows #29747

Open asfimport opened 2 years ago

asfimport commented 2 years ago

Followup to ARROW-8379, which set use_threads = FALSE on Windows. See discussion about adding more controls, disabling threading in some places and not others, etc. We want to do this soon after release so that we have a few months to see how things behave on CI before releasing again.


Collecting some CI hangs after ARROW-8379

  1. Rtools35, 64bit test suite hangs:

https://github.com/apache/arrow/pull/11290/checks?check_run_id=3767787034


** running tests for arch 'i386' ...
  Running 'testthat.R' [17s]
 OK
** running tests for arch 'x64' ...

Error: Error: <rlib_error_2_0 in process_get_error_connection(self, private):
 stderr is not a pipe.>

Reporter: Neal Richardson / @nealrichardson

Note: This issue was originally created as ARROW-14159. Please see the migration documentation for further details.

asfimport commented 2 years ago

Jonathan Keane / @jonkeane: One way to do this would be to add something like the below to the file r/tests/testthat/helper-arrow.R (which is where we had previously disabled those until ARROW-8379 was merged).


if (tolower(Sys.info()[["sysname"]]) == "windows") {
  options(arrow.use_threads = TRUE)
}

asfimport commented 2 years ago

Neal Richardson / @nealrichardson: I think what I would do is replace the .onLoad block with something that sets the cpu_count and io_thread_count to some fraction of Ncpus, so we'd have some parallelism but hopefully in a safe way.
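
A minimal sketch of that idea, not the actual .onLoad code. The helper name `fraction_of_cores`, the 0.5 fraction, and the use of `parallel::detectCores()` as the source of the core count are assumptions made for illustration; `arrow::set_cpu_count()` and `arrow::set_io_thread_count()` are the arrow R functions for the two thread pools.

```r
# Hypothetical helper (not part of arrow): choose a thread count as a
# fraction of the available cores, never going below 1.
fraction_of_cores <- function(n_cores, fraction = 0.5) {
  # detectCores() can return NA on some platforms; fall back to 1 core
  if (is.na(n_cores) || n_cores < 1) n_cores <- 1L
  max(1L, as.integer(floor(n_cores * fraction)))
}

# Assumption: the core count comes from parallel::detectCores()
n_threads <- fraction_of_cores(parallel::detectCores())

# Only touch the thread pools on Windows, and only if arrow is installed
if (tolower(Sys.info()[["sysname"]]) == "windows" &&
    requireNamespace("arrow", quietly = TRUE)) {
  arrow::set_cpu_count(n_threads)        # CPU thread pool
  arrow::set_io_thread_count(n_threads)  # I/O thread pool
}
```

Whether half of the cores is a safe fraction is an open question in this thread; the value is a placeholder.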

asfimport commented 2 years ago

Jonathan Keane / @jonkeane: That also works — for testing in CI, I would recommend using the max cpu_count and io_thread_count we can to have the highest chance of running into deadlocks (unless we conclusively prove that having fewer eliminates the deadlocks totally).
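
For the CI case, a sketch of what using the maximum counts could look like in a test helper. The function name `ci_thread_count`, the reliance on the `CI` environment variable (which GitHub Actions sets to "true"), and the fallback of 1 thread outside CI are assumptions for illustration, not what the arrow test suite does.

```r
# Hypothetical helper (not part of arrow): use every core when running in
# CI, to maximise the chance of hitting a deadlock, and 1 thread otherwise.
ci_thread_count <- function(ci_value, n_cores) {
  if (is.na(n_cores) || n_cores < 1) n_cores <- 1L
  if (identical(tolower(ci_value), "true")) as.integer(n_cores) else 1L
}

# Assumption: CI is detected through the CI environment variable
n_threads <- ci_thread_count(Sys.getenv("CI"), parallel::detectCores())

if (requireNamespace("arrow", quietly = TRUE)) {
  arrow::set_cpu_count(n_threads)
  arrow::set_io_thread_count(n_threads)
}
```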

asfimport commented 2 years ago

Dewey Dunnington / @paleolimbot: Another failure with a similar error is here: https://github.com/apache/arrow/runs/4084014846?check_suite_focus=true#step:17:667

asfimport commented 2 years ago

Sam Albers / @boshek: Though I don't have time right now to distill this into a minimal reprex (in fact it is sort of the opposite; sorry!), the targets pipeline in this repo reliably (at least on my machine) experiences an issue with multithreading on Windows, to the extent that I had to add this in to stop it from simply hanging on Windows. I'm not sure how helpful this is, but up to this point I have been able to reliably reproduce this issue.