RcppCore / RcppArmadillo

Rcpp integration for the Armadillo templated linear algebra library

is ARMA_CRIPPLED_LAPACK still required? #456

Closed conradsnicta closed 1 week ago

conradsnicta commented 1 week ago

@eddelbuettel @coatless Upcoming changes to a new Armadillo release require a large bunch of additional LAPACK functions, including zhetrf, zhetri, zhecon.

Since R developers persist in shipping a subset of LAPACK, I checked whether these functions are available in their so-called Rlapack.

Turns out these LAPACK functions appear to be available, as of R 4.4.0. It also turns out that a bunch of LAPACK functions previously "missing" have been added to Rlapack as of R 4.2.2. (See extract from R-4.4.1/src/modules/lapack/README below).

This raises the question: is the ARMA_CRIPPLED_LAPACK workaround inside Armadillo still required?

Removing the workaround would simplify a lot of code inside Armadillo. It would also allow speedups that were previously disabled under Rlapack. A possible downside is that RcppArmadillo would then require at least R 4.4.0.
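For readers unfamiliar with this kind of workaround, here is a minimal sketch of how a compile-time guard such as ARMA_CRIPPLED_LAPACK typically gates a code path. Only the macro name comes from this thread; the `Route` enum and `pick_route` function are hypothetical and are not Armadillo's actual internals.

```cpp
// Hypothetical illustration only: Route and pick_route are invented names.
// ARMA_CRIPPLED_LAPACK is the macro discussed in this thread.
enum class Route { hermitian_lapack, general_fallback };

constexpr Route pick_route()
{
#if defined(ARMA_CRIPPLED_LAPACK)
  // Reduced LAPACK (for example an older Rlapack): code that would call
  // routines such as zhetrf/zhetri is never compiled, so a slower
  // general-purpose path is selected instead.
  return Route::general_fallback;
#else
  // Full LAPACK assumed: the specialised Hermitian routines may be used.
  return Route::hermitian_lapack;
#endif
}
```

Compiling with `-DARMA_CRIPPLED_LAPACK` flips the selected branch. Each guarded function needs two code paths like this, which is the maintenance cost described above.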

To make sure that ARMA_CRIPPLED_LAPACK is no longer required, could you test RcppArmadillo against Rlapack without ARMA_CRIPPLED_LAPACK being set? (I don't have the bandwidth nor a windoze machine to do this)


Looking at R-4.4.1/src/modules/lapack/README:

(...)

R 4.2.2 added

zgbcon zgbequ zgbrfs zgbsv zgbsvx zgbtf2 zgbtrf zgbtrs zgeevx zgtcon zgtrfs zgtsv zgtsvx zgttrf zgttrs zgtts2 zlagtm zlangb zlangt zlansy zlantb zlaqgb zlaqhe zlatbs zpbtf2 zpbtrf zpocon zpoequ zporfs zposv zposvx zpotrs zpstf2 zpstrf ztrsna

for recent RcppArmadillo with LLVM clang 15.

R 4.4.0 added

zrscl needed for zgetf2 in 3.12.0

zlansp zlantp zlatps zppcon zpptrf zpptri zpptrs zspcon zspr zsptrf zsptri zsptrs zsycon zsytrs ztpcon ztptri ztptrs zhecon zhetrf zhetri zhetrs zhpcon zhptrf zhptri zhptrs zlanhp (...)

coatless commented 1 week ago

@conradsnicta sure, I'll test it out.

That said, it's unlikely Dirk will move to bump the required version to R 4.4.0 as that's the current patch line and the majority of the community isn't there. For context, R has 3 flavors: oldrel (R 4.3.3), release (R 4.4.1), and devel (R 4.5*). No matter what we need to support oldrel. There is a strong preference for supporting at least 3 prior patches, e.g. R 4.1.z - 4.4.z.

conradsnicta commented 1 week ago

@coatless According to [1], major R releases are on an approximately yearly cadence, with a release each April.

The current R 4.4.x release will become oldrel in April 2025. I'll need to release Armadillo 14.2 well before that. There is no urgent need to do a corresponding RcppArmadillo release when Armadillo 14.2 is released.

Another possible approach is that folks using old releases of R can always install old versions of RcppArmadillo, available from https://cran.r-project.org/src/contrib/Archive/RcppArmadillo/

Either way, moving forward I want to remove all ARMA_CRIPPLED_LAPACK workarounds from the Armadillo codebase, as they are a maintenance burden and are causing problems with new development.

I did a semi-automated check of all the LAPACK functions available in Rlapack as of R 4.4.1 against the LAPACK functions used by Armadillo. It appears that it is safe to remove all ARMA_CRIPPLED_LAPACK workarounds, though a proper reverse-deps check would still be required to be sure.
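The exact procedure used for that check is not shown in the thread. As a rough sketch of the idea, assuming the two lists of routine names have already been extracted (for instance from the Rlapack README and from Armadillo's sources), the comparison is a set difference. The function name `missing_routines` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: returns the routines in `required` that are absent
// from `available`. std::set iterates in sorted order, which is the
// precondition std::set_difference needs.
std::vector<std::string> missing_routines(const std::set<std::string>& required,
                                          const std::set<std::string>& available)
{
  std::vector<std::string> missing;
  std::set_difference(required.begin(), required.end(),
                      available.begin(), available.end(),
                      std::back_inserter(missing));
  // The result can never be larger than the required set.
  assert(missing.size() <= required.size());
  return missing;
}
```

An empty result for the full list of routines Armadillo calls would support the conclusion that the workaround is no longer needed for that R version. It says nothing about older Rlapack builds, which would need their own lists.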

[1] https://en.wikipedia.org/wiki/R_(programming_language)

eddelbuettel commented 1 week ago

tl;dr: I would keep it, but possibly replace the simple binary variable with a configure or cmake based check.

The reality is that we have little control over where and how our packages are used. From occasional bug reports, we know that some folks are forced to

Now, R still permits Rlapack-based builds (even if Linux distros typically do as I do for Debian and use external BLAS/LAPACK) and, as @conradsnicta noticed, the coverage has been getting better release by release and is now (near?) complete. But as @coatless replied, this does not resolve things for us, as many older R installations are in use.

So in short, we can do whatever we want, as such a change will not be a blocker at CRAN, where new-ish versions are used. It will, however, with absolute certainty make life a lot harder for a number of users on either older R or weird R. Upstream Armadillo has to decide whether it wants to be that software. It is easy for RcppArmadillo to follow.

My recommendation is to keep the current setup [^1]

[^1]: Maybe until you convince yourself 'sufficient' coverage is achieved across Rlapack out there. I am unsure how to reliably compute that.
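One possible shape for the configure-time check mentioned above is a version gate: enable the workaround only when R's bundled Rlapack is in use and R predates 4.4.0, the release cited earlier in the thread as having the needed routines. The sketch below shows only the decision logic. `RVersion`, `at_least`, `needs_crippled_lapack`, and the `uses_rlapack` flag are all hypothetical names, and a real configure script would more likely be written in shell and might probe for individual symbols instead.

```cpp
// Hypothetical decision logic for a configure-style check.
// Assumption (taken from the thread): Rlapack in R >= 4.4.0 provides the
// LAPACK routines Armadillo needs.
struct RVersion { int major_v; int minor_v; int patch_v; };

// True when version v is the same as or newer than version min.
constexpr bool at_least(RVersion v, RVersion min)
{
  return v.major_v != min.major_v ? v.major_v > min.major_v
       : v.minor_v != min.minor_v ? v.minor_v > min.minor_v
       : v.patch_v >= min.patch_v;
}

// The workaround is only relevant when R's own Rlapack is linked; an
// external (full) LAPACK never needs it.
constexpr bool needs_crippled_lapack(RVersion v, bool uses_rlapack)
{
  return uses_rlapack && !at_least(v, RVersion{4, 4, 0});
}
```

Under these assumptions, R 4.3.3 with Rlapack would still get the workaround, while R 4.4.1 with Rlapack, or any R version built against an external LAPACK, would not.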

eddelbuettel commented 1 week ago

> There is a strong preference for supporting at least 3 prior patches, e.g. R 4.1.z - 4.4.z.

Package data.table, which is rather well managed and has zero external dependencies, shoots for R 3.4.0 or later (which may be overdoing it, but Rcpp, for what it is worth, also does it via CI). Many other packages try for at least R 4.0.0. Whether or not some other popular assembly of packages uses three releases is largely irrelevant. We know from experience that old R versions are being used.

More importantly, we take pride in our 1100+ reverse dependencies. With that comes a responsibility to keep these users operational when we make changes. Imposing the most recent R release (or the one before it) is technically possible but still the wrong move in the bigger picture.

Now, ARMA_CRIPPLED_LAPACK is a nuisance and I can see how it leads to (a lot of) extra work. But the reality is as described above, and I think in all fairness we cannot make such a switch until R 4.4.* is at least three or four or five releases old. Now is not the time to do this.

conradsnicta commented 1 week ago

I will (reluctantly) retain handling of ARMA_CRIPPLED_LAPACK for now. However, as R 4.4.0 has all the necessary LAPACK functions in its Rlapack incantation, the ARMA_CRIPPLED_LAPACK option is deprecated and will be eventually removed.

coatless commented 1 week ago

@conradsnicta we appreciate it greatly. Trust me when I say I'll be singing in the street when all the LAPACK functions appear in Rlapack across supported R versions.