Closed jsilve24 closed 2 years ago
Hi there. Thanks for raising an issue but it is actually somewhat woefully short on details:
I effectively work only on Linux where things generally work. Other OSs have issues even getting OpenMP to you--the recommended way seem to change every year on macOS from what I can tell. Windows is a different beast again but we do not run configure
there.
All that said, it is of course entirely possibly that over the rewrites we dropped a ball and don't set a variable right. But here I see
So for example on my system
edd@rob:~/git/rcpparmadillo(master)$ ./configure
checking whether the C++ compiler works... yes
checking for C++ compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C++ compiler... yes
checking whether ccache g++-11 accepts -g... yes
checking how to run the C++ preprocessor... ccache g++-11 -E
checking whether we are using the GNU C++ compiler... (cached) yes
checking whether ccache g++-11 accepts -g... (cached) yes
checking whether we have a suitable tempdir... /tmp
checking whether R CMD SHLIB can already compile programs using OpenMP... yes ## We see R already has it
checking LAPACK_LIBS... system LAPACK found
configure: creating ./config.status
config.status: creating inst/include/RcppArmadilloConfigGenerated.h
config.status: creating src/Makevars
edd@rob:~/git/rcpparmadillo(master)$
edd@rob:~/git/rcpparmadillo(master)$ grep -i openmp src/Makevars ## and we do not add it here
## Also, OpenMP support in Armadillo prefers C++11 support. However, for wider
## And with R 3.4.0, and RcppArmadillo 0.7.960.*, we turn C++11 on as OpenMP
edd@rob:~/git/rcpparmadillo(master)$
edd@rob:~/git/rcpparmadillo(master)$ grep openmp /etc/R/Makeconf ## but it really is in R's
DYLIB_LDFLAGS = -shared -fopenmp# $(CFLAGS) $(CPICFLAGS)
MAIN_LDFLAGS = -Wl,--export-dynamic -fopenmp
SHLIB_OPENMP_CFLAGS = -fopenmp
SHLIB_OPENMP_CXXFLAGS = -fopenmp
SHLIB_OPENMP_FFLAGS = -fopenmp
edd@rob:~/git/rcpparmadillo(master)$
and when I then compile this one-liner
Rcpp::cppFunction("arma::mat doubleup(arma::mat a) { return a+a; }", depends="RcppArmadillo", verbose=TRUE)
I do in fact see what I need to see:
[...]
ccache g++-11 -std=gnu++11 -I"/usr/share/R/include" -DNDEBUG -I../inst/include -fopenmp -I"/usr/local/lib/R/site-library/Rcpp/include" -I"/usr/local/lib/R/site-library/RcppArmadillo/include" -I"/tmp/RtmpPMv6Rv/sourceCpp-x86_64-pc-linux-gnu-1.0.8.4" -fpic -g -O3 -Wall -pipe -pedantic -Wno-ignored-attributes -c file105530bf0498.cpp -o file105530bf0498.o
ccache g++-11 -std=gnu++11 -Wl,-S -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -flto=auto -Wl,-z,relro -o sourceCpp_2.so file105530bf0498.o -fopenmp -llapack -lblas -lgfortran -lm -lquadmath -L/usr/lib/R/lib -lR
>
which has -fopenmp
in the compile and the link step. As it should.
I am on Arch Linux using the most recent CRAN version of rcpparmadillo and R. Don't have a reproducible example.
Sent From My Mobile Device
From: Dirk Eddelbuettel @.> Sent: Tuesday, May 10, 2022 5:23:39 PM To: RcppCore/RcppArmadillo @.> Cc: Justin Silverman @.>; Author @.> Subject: Re: [RcppCore/RcppArmadillo] Possible bug in configure.ac (Issue #375)
Hi there. Thanks for raising an issue but it is actually somewhat woefully short on details:
I effectively work only on Linux where things generally work. Other OSs have issues even getting OpenMP to you--the recommended way seem to change every year on macOS from what I can tell. Windows is a different beast again but we do not run configure there.
— Reply to this email directly, view it on GitHubhttps://github.com/RcppCore/RcppArmadillo/issues/375#issuecomment-1122881240, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AADOORXYAPQX4BQHAU63Q7DVJLHVXANCNFSM5VS2MPBA. You are receiving this because you authored the thread.Message ID: @.***>
Openmp, R, and gcc we installed from official Arch repos and are up to date.
Sent From My Mobile Device
From: Justin Silverman @.> Sent: Tuesday, May 10, 2022 5:36:16 PM To: RcppCore/RcppArmadillo @.>; RcppCore/RcppArmadillo @.> Cc: Author @.> Subject: Re: [RcppCore/RcppArmadillo] Possible bug in configure.ac (Issue #375)
I am on Arch Linux using the most recent CRAN version of rcpparmadillo and R. Don't have a reproducible example.
Sent From My Mobile Device
From: Dirk Eddelbuettel @.> Sent: Tuesday, May 10, 2022 5:23:39 PM To: RcppCore/RcppArmadillo @.> Cc: Justin Silverman @.>; Author @.> Subject: Re: [RcppCore/RcppArmadillo] Possible bug in configure.ac (Issue #375)
Hi there. Thanks for raising an issue but it is actually somewhat woefully short on details:
I effectively work only on Linux where things generally work. Other OSs have issues even getting OpenMP to you--the recommended way seem to change every year on macOS from what I can tell. Windows is a different beast again but we do not run configure there.
— Reply to this email directly, view it on GitHubhttps://github.com/RcppCore/RcppArmadillo/issues/375#issuecomment-1122881240, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AADOORXYAPQX4BQHAU63Q7DVJLHVXANCNFSM5VS2MPBA. You are receiving this because you authored the thread.Message ID: @.***>
Well when you are back at your computer excute the line I added in my last post:
Rcpp::cppFunction("arma::mat doubleup(arma::mat a) { return a+a; }", depends="RcppArmadillo", verbose=TRUE)
It just adds two matrices (when you call it) but that is immaterial. Setting verbose=TRUE
will show you the actual compile and link steps (as it did for me and as I showed above) which in my case contain -fopenmp
. I suspect they will for you too, but it will good to know for sure.
If you have no reproducible example, how do you know that (per your first post) "[the suspected issue is] causing parallelization to not work for me". I.e. how do you tell if parallelization is, or is not, working for you?
Sorry for the delay. I realized I was becoming "that guy" who is super unhelpful and wanted to take my time in responding.
Here is the relevant ssection of loading / building the package:
setwd('/home/jds6696/tmp/RcppArmadillo/')
> devtools::load_all('~/tmp/RcppArmadillo')
ℹ Loading RcppArmadillo
Exports from /home/jds6696/tmp/RcppArmadillo/src/RcppArmadillo.cpp:
Rcpp::IntegerVector armadillo_version(bool single)
void armadillo_set_seed_random()
void armadillo_set_seed(unsigned int val)
Exports from /home/jds6696/tmp/RcppArmadillo/src/fastLm.cpp:
List fastLm_impl(const arma::mat& X, const arma::colvec& y)
Exports from /home/jds6696/tmp/RcppArmadillo/src/test-ncores.cpp:
NumericVector get_n_threads()
/home/jds6696/tmp/RcppArmadillo/src/RcppExports.cpp updated.
/home/jds6696/tmp/RcppArmadillo/R/RcppExports.R updated.
Re-compiling RcppArmadillo
─ installing *source* package ‘RcppArmadillo’ ...
** using staged installation
checking whether the C++ compiler works... yes
checking for C++ compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C++ compiler... yes
checking whether g++ -std=gnu++14 accepts -g... yes
checking how to run the C++ preprocessor... g++ -std=gnu++14 -E
checking whether we are using the GNU C++ compiler... (cached) yes
checking whether g++ -std=gnu++14 accepts -g... (cached) yes
checking whether we have a suitable tempdir... /tmp
checking whether R CMD SHLIB can already compile programs using OpenMP... yes
checking LAPACK_LIBS... system LAPACK found
configure: creating ./config.status
config.status: creating inst/include/RcppArmadilloConfigGenerated.h
config.status: creating src/Makevars
** libs
g++ -std=gnu++11 -I"/usr/include/R/" -DNDEBUG -I'/home/jds6696/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/usr/local/include -I../inst/include -fpic -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fstack-clash-protection -fcf-protection -Wp,-D_GLIBCXX_ASSERTIONS -flto=auto -UNDEBUG -Wall -pedantic -g -O0 -c RcppArmadillo.cpp -o RcppArmadillo.o
In file included from /usr/include/c++/11.2.0/x86_64-pc-linux-gnu/bits/os_defines.h:39,
I think the critical parts look like yours.
Here is the output of configure wrt src/Makevars:
## -*- mode: makefile; -*-
PKG_CXXFLAGS = -I../inst/include
PKG_LIBS= $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS)
## With R 3.1.0 or later, you can uncomment the following line to tell R to
## enable compilation with C++11 (where available)
##
## Also, OpenMP support in Armadillo prefers C++11 support. However, for wider
## availability of the package we do not yet enforce this here. It is however
## recommended for client packages to set it.
##
## And with R 3.4.0, and RcppArmadillo 0.7.960.*, we turn C++11 on as OpenMP
## support within Armadillo prefers / requires it
CXX_STD = CXX11
## Armadillo itself use the following define which we also set
## automatically if we see USE_CXX1X defined; outside of a package it
## may be needed explicitly
## In general, this can be enabled here via
## PKG_CXXFLAGS = -DARMA_USE_CXX11
## or via
## #define ARMA_USE_CXX11
## before RcppArmadillo.h is included
Critical thing I noticed is that the openmp_flag must have been "" because there is no $(SHLIB_OPENMP_CXXFLAGS)
anywhere.
Functionally here is my evidence of the lack of parallelization (unless I am misunderstanding something which is possible):
I forked RcppArmadillo and added a file src/test-ncores.cpp
which is as follow:
// -*- mode: C++; c-indent-level: 4; c-basic-offset: 4; indent-tabs-mode: nil; -*-
//
// test number of cores
#include <RcppArmadillo.h>
#include <omp.h>
using namespace Rcpp;
// [[Rcpp::export]]
void get_threads(int nthreads) {
omp_set_num_threads(nthreads);
#pragma omp parallel for
for(int i = 0; i < 5; i++){
int tid = omp_get_thread_num();
std::cout<<tid<<"\t tid"<<std::endl;
int nThreads = omp_get_num_threads();
std::cout<<nThreads<<"\t nThreads"<<std::endl;
}
}
I load the package and run this function and here is the ouput:
> get_threads(5)
0 tid
1 nThreads
0 tid
1 nThreads
0 tid
1 nThreads
0 tid
1 nThreads
0 tid
1 nThreads
(no parallelization).
Now if I modify configure.ac as I mentioned above (run autoreconf) then load the package, my Makevars now reads:
## -*- mode: makefile; -*-
PKG_CXXFLAGS = -I../inst/include $(SHLIB_OPENMP_CXXFLAGS)
PKG_LIBS= $(SHLIB_OPENMP_CXXFLAGS) $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS)
## With R 3.1.0 or later, you can uncomment the following line to tell R to
## enable compilation with C++11 (where available)
##
## Also, OpenMP support in Armadillo prefers C++11 support. However, for wider
## availability of the package we do not yet enforce this here. It is however
## recommended for client packages to set it.
##
## And with R 3.4.0, and RcppArmadillo 0.7.960.*, we turn C++11 on as OpenMP
## support within Armadillo prefers / requires it
CXX_STD = CXX11
## Armadillo itself use the following define which we also set
## automatically if we see USE_CXX1X defined; outside of a package it
## may be needed explicitly
## In general, this can be enabled here via
## PKG_CXXFLAGS = -DARMA_USE_CXX11
## or via
## #define ARMA_USE_CXX11
## before RcppArmadillo.h is included
and the output of that same function now is:
> get_threads(5)
0 tid
5 nThreads
4 tid
5 nThreads
2 tid
5 nThreads
3 tid
5 nThreads
1 tid
5 nThreads
Note. I think one difference between out setups is that I don't have a global system setting (e.g., /etc/R/Makeconf) that specifies to use openmp. My guess is that many users will not have that either.
Justin
Oh and here is the output you asked for:
Rcpp::cppFunction("arma::mat doubleup(arma::mat a) { return a+a; }", depends="RcppArmadillo", verbose=TRUE)
>
Generated code for function definition:
--------------------------------------------------------
// [[Rcpp::depends(RcppArmadillo)]]
#include <RcppArmadillo.h>
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
arma::mat doubleup(arma::mat a) { return a+a; }
Generated extern "C" functions
--------------------------------------------------------
#include <Rcpp.h>
#ifdef RCPP_USE_GLOBAL_ROSTREAM
Rcpp::Rostream<true>& Rcpp::Rcout = Rcpp::Rcpp_cout_get();
Rcpp::Rostream<false>& Rcpp::Rcerr = Rcpp::Rcpp_cerr_get();
#endif
// doubleup
arma::mat doubleup(arma::mat a);
RcppExport SEXP sourceCpp_1_doubleup(SEXP aSEXP) {
BEGIN_RCPP
Rcpp::RObject rcpp_result_gen;
Rcpp::RNGScope rcpp_rngScope_gen;
Rcpp::traits::input_parameter< arma::mat >::type a(aSEXP);
rcpp_result_gen = Rcpp::wrap(doubleup(a));
return rcpp_result_gen;
END_RCPP
}
Generated R functions
-------------------------------------------------------
`.sourceCpp_1_DLLInfo` <- dyn.load('/tmp/RtmpYeuMNL/sourceCpp-x86_64-pc-linux-gnu-1.0.8.3/sourcecpp_4c13203aebfc/sourceCpp_2.so')
doubleup <- Rcpp:::sourceCppFunction(function(a) {}, FALSE, `.sourceCpp_1_DLLInfo`, 'sourceCpp_1_doubleup')
rm(`.sourceCpp_1_DLLInfo`)
Building shared library
--------------------------------------------------------
DIR: /tmp/RtmpYeuMNL/sourceCpp-x86_64-pc-linux-gnu-1.0.8.3/sourcecpp_4c13203aebfc
/usr/lib64/R/bin/R CMD SHLIB -o 'sourceCpp_2.so' 'file4c136fd95387.cpp'
g++ -std=gnu++11 -I"/usr/include/R/" -DNDEBUG -I../inst/include -fopenmp -I"/home/jds6696/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include" -I"/home/jds6696/R/x86_64-pc-linux-gnu-library/4.2/RcppArmadillo/include" -I"/tmp/RtmpYeuMNL/sourceCpp-x86_64-pc-linux-gnu-1.0.8.3" -I/usr/local/include -fpic -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fstack-clash-protection -fcf-protection -Wp,-D_GLIBCXX_ASSERTIONS -flto=auto -c file4c136fd95387.cpp -o file4c136fd95387.o
g++ -std=gnu++11 -shared -L/usr/lib64/R/lib -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -flto=auto -o sourceCpp_2.so file4c136fd95387.o -fopenmp -llapack -lblas -lgfortran -lm -lquadmath -L/usr/lib64/R/lib -lR
Overall I think the logic is fairly straight forward. As configure.ac is currently written, if openmp_already_works is yes, then the openmp_flag never gets set and remains an empty string.
I am sorry but I do not have time for 5000 word three message flood submissions.
If you can demonstrate something replicably and concisely without using extraneous package (no devtools, please: stick to R CMD check or Rcpp functions) then I will reconsider.
Look at
g++ -std=gnu++11 -I"/usr/include/R/" -DNDEBUG -I../inst/include -fopenmp -I"/home/jds6696/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include" -I"/home/jds6696/R/x86_64-pc-linux-gnu-library/4.2/RcppArmadillo/include" -I"/tmp/RtmpYeuMNL/sourceCpp-x86_64-pc-linux-gnu-1.0.8.3" -I/usr/local/include -fpic -march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fstack-clash-protection -fcf-protection -Wp,-D_GLIBCXX_ASSERTIONS -flto=auto -c file4c136fd95387.cpp -o file4c136fd95387.o g++ -std=gnu++11 -shared -L/usr/lib64/R/lib -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -flto=auto -o sourceCpp_2.so file4c136fd95387.o -fopenmp -llapack -lblas -lgfortran -lm -lquadmath -L/usr/lib64/R/lib -lR
You have -fopenmp
in there (and I made it bold) so you have OpenMP support.
Lastly your example works if you enable the OpenMP plugin for Rcpp so that is a different issue:
> Rcpp::sourceCpp("/tmp/rcpparma375.cpp")
> get_threads(5)
0 tid
5 nThreads
4 tid
5 nThreads
2 tid1
5 nThreads
3 tid
5 nThreads
tid
5 nThreads
>
Modified code:
#include <RcppArmadillo.h>
#include <omp.h>
using namespace Rcpp;
// [[Rcpp::plugins(openmp)]]
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
void get_threads(int nthreads) {
omp_set_num_threads(nthreads);
#pragma omp parallel for
for(int i = 0; i < 5; i++){
int tid = omp_get_thread_num();
std::cout<<tid<<"\t tid"<<std::endl;
int nThreads = omp_get_num_threads();
std::cout<<nThreads<<"\t nThreads"<<std::endl;
}
}
So first off, my appologies for comming off as "flooding".
I realize that I have -fopenmp in there, and I am not the Cpp expert you are. That said, even when I add the pluggins, I still don't get multithreading (the following was run on my system using exactly the same modifed code as you provided).
> get_threads(5)
0 tid
1 nThreads
0 tid
1 nThreads
0 tid
1 nThreads
0 tid
1 nThreads
0 tid
1 nThreads
That said, when I made the change to configure.ac mentioned in my original post, I get multithreading.
What it boils down to, I think, is whether Arch gets in the way. Do you have 'fopenmp' in your Makeconf? You can run the one-line expression I use here:
edd@rob:~$ grep fopenmp $(R RHOME)/etc/Makeconf
DYLIB_LDFLAGS = -shared -fopenmp# $(CFLAGS) $(CPICFLAGS)
MAIN_LDFLAGS = -Wl,--export-dynamic -fopenmp
SHLIB_OPENMP_CFLAGS = -fopenmp
SHLIB_OPENMP_CXXFLAGS = -fopenmp
SHLIB_OPENMP_FFLAGS = -fopenmp
edd@rob:~$
If you have it, then we don't need your fix.
If you don't have it, we need a fix either to the system Arch or maybe the one you suggested.
But so far I do not have a clear view into what's going on.
Actually I do it turns out:
➜ grep fopenmp $(R RHOME)/etc/Makeconf
DYLIB_LDFLAGS = -shared -fopenmp# $(CFLAGS) $(CPICFLAGS)
MAIN_LDFLAGS = -Wl,--export-dynamic -fopenmp
SHLIB_OPENMP_CFLAGS = -fopenmp
SHLIB_OPENMP_CXXFLAGS = -fopenmp
SHLIB_OPENMP_FFLAGS = -fopenmp
That's not bad, and what I suspected (as R is really rugged here and can be trusted). I suspect we have more of an issue with documentation here. You may be building your testcase wrong, no more, no less. And I don't want to sound whiny but you haven't exactly made clear how you are building your sample code and what options may or may not be set but I have the suspicion that that is the issue -- not how RcppArmadillo sets itself up.
While I don't understand what you mean by documentation. Here is a more cookbook set of steps and details that may satisfy you other point:
> RcppArmadillo:::get_threads(5)
0 tid
1 nThreads
0 tid
1 nThreads
0 tid
1 nThreads
0 tid
1 nThreads
0 tid
1 nThreads
So thats not expected. But I can get the expected behavior if I modify autoconfigure.ac
if test x"${openmp_already_works}" = x"yes"; then
arma_have_openmp="#define ARMA_USE_OPENMP"
openmp_flag='$(SHLIB_OPENMP_CXXFLAGS)' ## added by jds
fi
then I run autoreconf to regenerate configure script then I repeat steps 4-7 from above. Then the output is:
> RcppArmadillo:::get_threads(5)
0 tid3
5 nThreads
tid
5 nThreads
4 tid
5 nThreads
1 tid
5 nThreads
2 tid
5 nThreads
So all this said, it seems very likely to me that there is something suboptimal in configure.ac
Edit: I am not trying to make more work for you. I am the author of another package jsilve24/fido that takes motivation from the configure.ac script in RcppArmadillo. In authoring that package I noticed this bug in the configure.ac script and figured I would be nice and point it out / provide the fix I came up with for fido.
To help, I have put together a pull request that you can accept or not. I am happy to help out if that fix is not ideal but I think I am done trying to convince that the bug exists.
That is exactly the right test of adding an OpenMP using tester to the package.
In my (weak) excuse, I think the configure.ac
file was thrown into the mixer at some point (not by me) trying to figure out if one could get sanity into the ever-evolving (I am trying hard to remain polite here) "situation" known as macOS. This may have been a collateral damage we had not noticed.
(As an aside: don't just quote suggested code changes. Show them as a diff
only that gives important context.)
Re mac OS: wholehartedly agree. Actually trying to develop C++ header libraries with R bindings (thank you for Rcpp btw) with OpenMP support is actually what finally got me away from mac to linux. Actually I posted one some online forum asking for advise and I believe it was you that said something like "come to fedora" or something.
Anyways. Thanks for the advice about diff output. Will work that into future bug reports.
The key giveway, and I am once again going to harp on the world's smalled violin here, is that devtools
hides the build from you. When you just run R CMD INSTALL
or (as I do) run install.r
from littler
you see the command-line invocation and all our eyes should have been burning from lack of -fopenmp
.
And it must have been my evil twin saying that. I know nothing about Fedora, but a bit about Debian + Ubuntu. Close enough for a refugee macOS user like you :wink:
This (what I think is a) bug was causing parallelization to not work for me.
If I read it correctly, configure.ac doesn't set
can_use_openmp="yes"
ifopenmp_already_works="yes"
. As a result openmp_flag never gets set (e.g., stays an empty string) and then parallelization may not work.For me, I was able to fix this by adding the commented line