RcppCore / RcppArmadillo

Rcpp integration for the Armadillo templated linear algebra library
193 stars 56 forks source link

LTO mismatch with `dgemm` with RcppArmadillo and R core definitions #343

Closed mattfidler closed 3 years ago

mattfidler commented 3 years ago

Hi @eddelbuettel

I think there may be a RcppArmadillo LTO issue that is described by the following log:

https://www.stats.ox.ac.uk/pub/bdr/LTO/RxODE.out

To save you time in parsing, it boils down to the following:

/data/gannet/ripley/R/R-devel/include/R_ext/BLAS.h:207:1: warning: type of ‘dgemm_’ does not match original declaration [-Wlto-type-mismatch]
  207 | F77_NAME(dgemm)(const char *transa, const char *transb, const int *m,
      | ^
/data/gannet/ripley/R/test-4.2/RcppArmadillo/include/armadillo_bits/def_blas.hpp:115:8: note: type mismatch in parameter 14
  115 |   void arma_fortran(arma_dgemm)(const char* transA, const char* transB, const blas_int* m, const blas_int* n, const blas_int* k, const double*   alpha, const double*   A, const blas_int* ldA, const double*   B, const blas_int* ldB, const double*   beta, double*   C, const blas_int* ldC, blas_len transA_len, blas_len transB_len) ARMA_NOEXCEPT;
      |        ^
/data/gannet/ripley/R/test-4.2/RcppArmadillo/include/armadillo_bits/def_blas.hpp:115:8: note: type ‘blas_len’ should match type ‘void’
/data/gannet/ripley/R/test-4.2/RcppArmadillo/include/armadillo_bits/def_blas.hpp:115:8: note: ‘dgemm_’ was previously declared here

My guess is the LTO compile options have changed and become more restrictive (ie flto=10). I did (with your help) also change the headers to strict R headers, though I think that is unrelated

Perhaps my reading of the error message is incorrect. I thought I would bring it to your attention.

While I do not wish to, I may have to move and modify the armadillo headers (because of the fix by 2021-09-02 CRAN deadline).

eddelbuettel commented 3 years ago

Hm, I have yet to turn on LTO on the Debian (and Ubuntu) side of things, i.e. I have not enabled it for R, or Rcpp, or RcppArmadillo. So maybe it is something triggered by Prof Ripley on his side.

That said, it may be reproducible, and @conradsnicta is generally very receptive to clear and concise examples. So could I ask you to maybe try to cook up a (non-R, standalone) example program that with/without LTO options shows the difference? I probably have dgemm callers for BLAS here somewhere though we probably want to tickle it via RcppArmadillo.

eddelbuettel commented 3 years ago

Took another look at your full report here (reference from the standard results page here) and its a delicate issue as R (and LTO) look back to the basic C linkage yet Armadillo has a different (stacked, via arma_fortran() calling arma_dgemm()) setup. I am sure that will come up again. Hm. I may need to consult with @conradsnicta and see what we can do here medium-term as LTO will surely become more common and standard...

mattfidler commented 3 years ago

For a quick fix you could inline the blas routines and rename them; I don't think it is an ideal solution, but it worked for some of the matrix exponential calls used in the RxODE ode solving

https://github.com/nlmixrdevelopment/RxODE/blob/346edfe7be78c8998eccbb51b89a36929de84488/src/matexp.f#L12

This is not ideal it means you cant choose the blas that you want to do calculations.

When reading through the C code myself, I noticed the number of arguments for dgemm may be different in R than in standard blas. I think they used this trick as a safety check to get past the problems passing strings between C and fortran that we saw in #254, but I'm unsure their motivation

eddelbuettel commented 3 years ago

Quick fixes sometimes bite your backside in the medium / longer-term. A minimal yet reproducible example (outside of R) would be a good first step in getting towards this.

mattfidler commented 3 years ago

Hm. I could likely do this inside R but I dont think I can outside R. The blas R and standard blas definitions are clashing. I will try a simple R example first.

eddelbuettel commented 3 years ago

An R-only example would still help greatly. I can then either try to work from that or discuss it with @conradsnicta .

eddelbuettel commented 3 years ago

The other thing to do (for at least, while we work through this) is that you could add a -fno-lto flag.

mattfidler commented 3 years ago

Well I found a work around for the package for now. Interestingly enough, this came because of a code refactoring that actually didn't directly involve armadillo (found from a bisection)

https://github.com/nlmixrdevelopment/RxODE/commit/52294dcc98e45681ee361c0d1942ff165ece88f0

Alas however, I am unsure how this came up. I believe there is some bug somewhere, but since I can't figure out how to create it in a small reproducible example, and I know how to fix it for now, feel free to close this bug (unless you want to try to figure it out)

eddelbuettel commented 3 years ago

Yeah happy to close if that is ok with you.

Sometime strange things happen just with header reordering and alike -- maybe something was shadowing something elsewhere. It does smell a little local to your repo which is why closing may be wisest. I still have (optional?) conversion to lto for the R installation as a TODO so maybe we get to revisit this...

And by all means let's reopen if there is something new to review.

mattfidler commented 3 years ago

Thanks Dirk

mattfidler commented 3 years ago

You are right @conradsnicta,

The one thing I don't understand is why it thinks it needs to be void. To me, that means that is somehow being mucked up somewhere.

That could definitely cause the LTO errors that were observed.

At the same time, it doesn't always occur and I'm unsure what is causing it here.