apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
https://arrow.apache.org/
Apache License 2.0

[Dev] r_valgrind image doesn't use full parallelism #28749

Open asfimport opened 3 years ago

asfimport commented 3 years ago

I have a 12-core, 24-thread CPU, but the ubuntu_r_valgrind Docker image only seems to use 4 threads when I build Arrow:


     35 pts/0    S+     0:00 /bin/sh /usr/local/RDvalgrind/lib/R/bin/INSTALL /arrow/r
     39 pts/0    S+     0:00  \_ /usr/local/RDvalgrind/lib/R/bin/exec/R --no-save --no-restore --no-restore --no-echo --args nextArg/arrow/r
     55 pts/0    S+     0:00      \_ sh -c _R_SHLIB_BUILD_OBJECTS_SYMBOL_TABLES_=false  ./configure 
     56 pts/0    S+     0:00          \_ /bin/sh ./configure
     86 pts/0    S+     0:00              \_ /usr/local/RDvalgrind/lib/R/bin/exec/R --no-save --no-restore --no-echo --no-restore --file=tools/nixlibs.R --args 4.0.1.900
    205 pts/0    S+     0:00                  \_ sh -c SOURCE_DIR="../cpp" BUILD_DIR="/tmp/Rtmpz9DiSI/file561e1b1cec" DEST_DIR="libarrow/arrow-4.0.1.9000" CMAKE="/tmp/Rt
    206 pts/0    S+     0:00                      \_ /bin/bash inst/build_arrow_static.sh
    388 pts/0    S+     0:00                          \_ /tmp/Rtmpz9DiSI/file565b17078b/cmake-3.19.2-Linux-x86_64/bin/cmake --build . --target install
    389 pts/0    S+     0:00                              \_ /bin/make install
    392 pts/0    S+     0:00                                  \_ make -s -f CMakeFiles/Makefile2 all
   2571 pts/0    S+     0:00                                      \_ make -s -f src/arrow/CMakeFiles/arrow_objlib.dir/build.make src/arrow/CMakeFiles/arrow_objlib.dir/bu
   2622 pts/0    S+     0:00                                          \_ /bin/sh -c cd /tmp/Rtmpz9DiSI/file561e1b1cec/src/arrow && /bin/g++  -std=gnu++11 -DARROW_EXPORTI
   2623 pts/0    S+     0:00                                          |   \_ /bin/g++ -std=gnu++11 -DARROW_EXPORTING -DARROW_HAVE_RUNTIME_AVX2 -DARROW_HAVE_RUNTIME_BMI2 
   2624 pts/0    R+     0:28                                          |       \_ /usr/lib/gcc/x86_64-linux-gnu/10/cc1plus -quiet -I /tmp/Rtmpz9DiSI/file561e1b1cec/src -I
   2628 pts/0    S+     0:00                                          \_ /bin/sh -c cd /tmp/Rtmpz9DiSI/file561e1b1cec/src/arrow && /bin/g++  -std=gnu++11 -DARROW_EXPORTI
   2629 pts/0    S+     0:00                                          |   \_ /bin/g++ -std=gnu++11 -DARROW_EXPORTING -DARROW_HAVE_RUNTIME_AVX2 -DARROW_HAVE_RUNTIME_BMI2 
   2646 pts/0    R+     0:00                                          |       \_ as -I /tmp/Rtmpz9DiSI/file561e1b1cec/src -I /arrow/cpp/src -I /arrow/cpp/src/generated -
   2635 pts/0    S+     0:00                                          \_ /bin/sh -c cd /tmp/Rtmpz9DiSI/file561e1b1cec/src/arrow && /bin/g++  -std=gnu++11 -DARROW_EXPORTI
   2636 pts/0    S+     0:00                                          |   \_ /bin/g++ -std=gnu++11 -DARROW_EXPORTING -DARROW_HAVE_RUNTIME_AVX2 -DARROW_HAVE_RUNTIME_BMI2 
   2637 pts/0    R+     0:07                                          |       \_ /usr/lib/gcc/x86_64-linux-gnu/10/cc1plus -quiet -I /tmp/Rtmpz9DiSI/file561e1b1cec/src -I
   2642 pts/0    S+     0:00                                          \_ /bin/sh -c cd /tmp/Rtmpz9DiSI/file561e1b1cec/src/arrow && /bin/g++  -std=gnu++11 -DARROW_EXPORTI
   2643 pts/0    S+     0:00                                              \_ /bin/g++ -std=gnu++11 -DARROW_EXPORTING -DARROW_HAVE_RUNTIME_AVX2 -DARROW_HAVE_RUNTIME_BMI2 
   2644 pts/0    R+     0:02                                                  \_ /usr/lib/gcc/x86_64-linux-gnu/10/cc1plus -quiet -I /tmp/Rtmpz9DiSI/file561e1b1cec/src -I

Reporter: Antoine Pitrou / @pitrou

Note: This issue was originally created as ARROW-13038. Please see the migration documentation for further details.

asfimport commented 3 years ago

Antoine Pitrou / @pitrou: @jonkeane

asfimport commented 3 years ago

Jonathan Keane / @jonkeane: It looks like this is part of the upstream image:

https://github.com/wch/r-debug/blob/master/r-devel/buildR.sh#L108

We could send a PR removing that (or one that uses $(nproc), as is already done 13 lines above: make --jobs=$(nproc)).
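For illustration only, here is a minimal sketch of deriving the job count from the machine instead of hard-coding it. The helper name `jobs_for_build` and the `JOBS` variable are hypothetical, and this is not the contents of the upstream buildR.sh; it only shows the `$(nproc)` idea mentioned above, with a fallback for systems where `nproc` is missing.

```shell
#!/bin/sh
# Hypothetical helper (not from buildR.sh): report how many parallel
# jobs to hand to make, based on the CPUs available to this process.
jobs_for_build() {
  # nproc is part of GNU coreutils; fall back to getconf, then to 1.
  nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
}

JOBS=$(jobs_for_build)

# Print the command rather than running it, so the sketch has no side effects.
echo "make --jobs=${JOBS}"
```

On the 12-core, 24-thread machine described in this issue, `nproc` would normally report 24 unless the container is restricted to fewer CPUs.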

asfimport commented 3 years ago

Jonathan Keane / @jonkeane: Git blame didn't show an obvious "we need to limit this because X broke" reason for the limit.

asfimport commented 3 years ago

Antoine Pitrou / @pitrou: If this can be fixed upstream then great :)