Closed kthohr closed 6 years ago
No, there isn't, sadly, as one @baptiste 's package still failed over a complex decomposition. As long as the (partial) LAPACK copy that ships with R and is used by default on Windows, we are bound to have these issues. I fought this twice already.
What is your underlying question? Are you missing a particular routine? Over the years it was generally just the somewhat more esoteric complex functions that were seen missing, and R Core helped twice and added them.
In general, until the R build changes and always uses an external (complete) LAPACK I fear that is little we can do better. Whenever we have left the guard down and said "we're good" we ended up being bitten by something we actually did not have.
For sure. I guess what I should have asked is: can we update the list of unavailable Lapack symbols on the Armadillo side? (What gets ifdef'd out by ARMA_CRIPPLED_LAPACK.) For example, I've tested qz() in the complex case on Windows (vanilla R v4.3) and there are no issues.
Is @baptiste 's package calling routines that weren't covered by the updates?
It think when @baptiste was bitten last it was due to a change on the Armadillo side when @conradsnicta invoked a different (sparse?) solver. I am a little fuzzy on the details of that conversation -- may have been mostly on the r-package-devel or rcpp-devel mailing lists.
As for
there are no issues
that tends to be my sentiment too. As of today, we do have 458 packages on CRAN that use RcppArmadillo. In general, things just work.
This README in the R sources has a good list of what is included. Again, for RcppArmadillo, ex ante we do not know which library we are facing hence the need to dial down. So we set ARMA_CRIPPLED_LAPACK
if we're on Windows, or if we see Rlapack.*
being used. That #define
then affects Conrad's code.
Sorry I wasn't clear with that example. Right now qz() for complex-valued matrices is ifdef'd out on the Armadillo side, as zgges was missing until R version 3.3.0. However, I've tested RcppArmadillo on Windows with that guard removed and it works fine. So my feeling is that it should be okay to start removing some of the guards based on the 3.3 update.
And again, to be crystal clear, that guard is in the Armadillo code, correct?
We could test this in a local patch we could try to noodle through CRAN. If it all works we can discuss with Conrad if/where/how the #ifdef is still needed. A quick recursive grep reveals that it is only used in
inst/include/armadillo_bits/auxlib_meat.hpp
but I fear we have to test each and every use case against different R versions and their Rlapack.*
. (Or at least 3.3.0 and later...)
That is quite a bit of work. Are you volunteering?
Yep, that guard is on the Armadillo side.
I'm happy to do the heavy lifting on this---I'm familiar with Rlapack and the internal Armadillo code. Would you prefer to start with the single qz case and then branch out to other functionality?
I really appreciate this. And the four-eyes principle is good. So how about something like this:
#ifdef
guardHow does that sound?
Sounds good to me.
For the record since my cda package was mentioned: the package actually broke for Linux on CRAN, which apparently also rely on the internal R Lapack (I thought it did only for Windows). The package itself does not use any unusual routine, it just calls solve() on a standard complex linear system, but Armadillo recently introduced a new special-case for banded matrices that would require, if it was ever called, a Lapack routine that R doesn't provide. Linux CRAN checks find this routine in the headers missing, and break (in contrast Windows is always flagged as crippled). Someone kindly provided a workaround (to basically flag Lapack as crippled) on the mailing list so that it would pass CRAN checks like on Windows, but I never found the time to test this (when I tried on my Linux box an external Lapack was used, and when I tried to switch that didn't seem to affect anything; maybe I'd need to compile R with some flags for the purpose of this test).
Bottom line: cda is now archived on CRAN; I hope to return to it someday.
It sounds like we're facing opposing problems: you want to tell CRAN not to flag 'crippled' now that a routine is available, while I'd need to enforce the flag manually because some routine is missing.
I thought it did only for Windows
It's a compile time option. On the non-Windows systems, if you do configure --help
you get (inter alia)
--with-blas use system BLAS library (if available), or specify it [no]
--with-lapack use system LAPACK library (if available), or specify it [no]
so the default may actually be to use the 'crippled' lapack (I think my local r-devel builds do that as I don't override it). But most distribution builds will probably use the external BLAS.
I supposed one can alter it on Windows too which must be how Microsoft injects the MKL.
Bottom line: "it's complicated" and when we build RcppArmadillo we have to track it.
@conradsnicta I think a suitably edited version of your question would be a useful discussion to have on R-devel. I don't know the reasons myself, but my guess would be that R has always been very conservative with respect to numerical accuracy, and R-core may have found it necessary to manually patch/review some routines on a case-by-case basis. Or it could be that the effort of writing all the appropriate wrappers and ensuring their correctness has always been the limiting factor, leading to this cherry-picked, add-only-when-necessary situation.
Here's a link to the last time R added a bunch of Lapack routines; I'm trying to figure out from the (large) diff what was involved in the process.
@conradsnicta as I said I don't know the reasons (and therefore should not have replied); I made a wild guess based on my recollection of discussions around flavours of Blas and Lapack (cf openBLAS and other alternatives), with R favouring a conservative (inefficient but more reliably accurate) choice by default. It may very well be completely irrelevant here; R-devel would have much more pertinent things to say.
This is a 30 year old discussion. Statistician prefer eg the QR solver with pivoting which is why a modified LINPACK gets used in the standard lm()
. But these days with double precision you have to work hard for finding actual differences. The example fastLm()
in RcppEigen has a worked example thanks to @dmbates , the one in for fastLm()
in RcppArmadillo does not but at least I put a note in the documentation.
And for 99.95% of all users the shipped LAPACK subset is actually sufficient. In all those years the only user I encountered who had issues was @baptiste . Now, his uses were all legit so we tried (and managed) to get this improved -- which can be done. But running around, screaming murder and calling other people names rarely does.
The part I have control over -- how R is built and shipped for Debian and hence Debian derivatives -- does that: full external Lapack. Other parts and builds I cannot control, but I invite you to engage with R Core and argue for change there.
I've emailed R-devel.
Thanks. In the meantime we can try refining the tests.
Thanks for starting the discussion, @baptiste . This is a bane of mine, too.
On the CRIPPLED_LAPACK
side, I've identified which guards can be removed; I'm just figuring out an intelligent way to test this to unit.tests. Dirk, would you prefer I add a new set of tests (say runit.Rlapack)?
Awesome news @kthohr !
Maybe we can discuss a bit more. What we currently have is blunt: one test, one variable, being used in a good half dozen instances inside that 'meat' file. Do you think we should have variables, or just do away with the ones you identified as being available?
As for new tests, if it is a new topic, a new top-level R file inst/unitTests/runit.$name.R
with a new matching cpp/$name.cpp
seems the best. Rlapack
works for me as $name
.
Oh, and should we remove the guard or refine the test to use when we know it works? Where do you think is the cutoff? R 3.3.0* ? That would be old enough not to worry.
I think we should remove the guards entirely, where possible. Here's a list of affected functions (with cutoffs):
eig_pair() for complex matrices
Lapack: zggev. Added: R 3.3.0
qz() complex matrices
Lapack: zgges. Added: R 3.3.0
svd_dc() and svd_dc_econ() for complex matrices
Lapack: zgesdd. Added: R 3.1.0
solve_square_refine() for complex matrices
Lapack: zgesvx. Added: R 3.3.0
solve_approx_svd() for complex matrices
Lapack: zgelsd. Added: R 3.1.0
schur() for complex matrices
Lapack: zgees. Added: R 3.3.0
So some are related to changes made in R version 3.1.0, but most are from version 3.3.0. The general CRIPPLED_LAPACK
guard will remain for some decompositions of band matrices (see chol_band
and solve_band_fast
).
@kthohr this is great stuff! Thank you for compiling this list.
So my quick and dirty changes can be found here:
devtools:::install_github("kthohr/RcppArmadillo",ref="lapack-checks")
(Tests are in RcppArmadillo.cpp instead of a separate unit test file, while I will of course change for the PR.) I have checked this on a Windows machine with our beloved Rlapack. ARMA_EXTRA_DEBUG
is enabled, so calling
library(RcppArmadillo)
set.seed(123)
n <- 5
norm(cx_eig_pair_test(n),"2")
norm(cx_qz_test(n),"2")
cx_rank_test(n) - n
norm(cx_solve_test(n),"2")
norm(cx_pinv_test(n),"2") - 1
norm(cx_schur_test(n),"2")
will show the internal pings of the Lapack functions. Any volunteers with broken Lapack distributions: please check this if you can.
Would the following make most sense?
runit.Rlapack
tests.Good plan. I agree that we should just remove the relevant test on the side of @conradsnicta / Arma.
I think I will play with your snippet above and submit this to the various winbuilders and friends. Maybe also add a simple
// [Rcpp::export]]
int whatAreWe() {
#if defined(ARMA_CRIPPLED_LAPACK)
return 1;
#else
return 0;
#endif;
}
to know where we are at.
Great! I just added a armadillo_Rlapack()
function to cover your suggestion.
On my end, I tested the code with fresh installations of R---versions 3.3.0 and 3.4.3---on Windows. I also checked against R-devel on my main system (macOS), but that uses a system supplied Lapack (the Accelerate library).
Are you set up for R-hub? That should give you a wide variety of different systems? (And I am tied up with something else...)
Nope, never worked with R-hub. And no rush on my side.
It is the awesomest thing ever. Brew a fresh cup and look into it:
That is the webview at https://builder.r-hub.io/; I tend to use rhub package...
@conradsnicta to get a fixed link (e.g. current commit / the canonical form) press Y and then copy the URL:
All keyboard shortcuts c.f.
https://help.github.com/articles/using-keyboard-shortcuts/#source-code-browsing
It's a little different. Release candidates do not go to CRAN. I only upload to CRAN once every two months -- it is too involved otherwise. Details in the ChangeLog and NEWS at the package page.
I do roll up interim releases, and do eg the reverse-depends checking I did today. But the systems used for those have full LAPACK anyway.
So I would suggest the following. We really only need two blocks commented out, so I suggest that for the upcoming release we keep things mostly as they are but change
#if defined(ARMA_CRIPPLED_LAPACK)
to
#if defined(ARMA_CRIPPLED_LAPACK) || defined(ARMA_CRIPPLED_LAPACK_BANDED)
and I define that new variable, but not the old one. That gives us the 2 out of 7 we need. If that passes, we clean up for the next release on remove the old define.
Thoughts?
@eddelbuettel - I don't think I follow the logic of that... where are we placing ARMA_CRIPPLED_LAPACK_BANDED
?
For reference, this is what I was going to submit on the Armadillo side: https://github.com/kthohr/armadillo-code/commit/1ac3e5de5c41fd52aabf5fa5e1a36c40fc08cf3a
Are we still keeping the guards around those blocks? Sorry for my confusion...
What I had in mind would simply be less invasive as a start. We simply add a new "or" with the new define in the two spots where banded is used.
In essence, this means that RcppArmadillo will need to depend on an R version greater than or equal to R 3.3.0 (per @kthohr's analysis). If the petition to r-devel is successful for R 3.5.*, then two years from now the dependency could increase to R 3.5.0 (unlikely due to the release schedule).
So I took the version of auxlib_meat.hpp
from @kthor and created a new minor test release in a branch.
That passes on the four default R Hub instances invoked from rhub::check_for_cran()
. In that version, we are down to four uses of the #define
. I think this could go into Armadillo 8.500 as is.
Thanks for taking the time to do that, @eddelbuettel. I'll pass a clean unit test to RcppArmadillo later today.
And if anyone wants to play along, this is now in this RcppArmadillo branch. Might even fold that into master.
(@kthohr I used your school email for the ChangeLog as you list it on you GH profile. We can adjust if you'd prefer a different one.)
Thanks, Dirk! I created a PR to push a unit test into that branch.
(I've also changed my e-mail on my GH profile.)
This is now merged in the master branch (as well as upstream for the actual change), so we could close this ticket -- or keep it open as a reminder of the four remaining #if defined
tests.
Thoughts?
Great! I think we can close the ticket now.
I think you are right -- this was after all about reflecting what R 3.3.0 brought us -- so thanks again for all your work on this.
Need to add an R depends on R 3.3.0
MIA from:
https://github.com/RcppCore/RcppArmadillo/pull/211 https://github.com/RcppCore/RcppArmadillo/pull/212
Sure, but we haven't released to CRAN (or even a drat repo...) either. On the TODO list.
Greetings,
A lot of Lapack routines were added with R version 3.3.0 (zgges, etc.), and the package configure script for Unix-alike systems correctly identifies this. However, for Windows systems, the assumption in RcppArmadilloConfig.h is that these routines are still missing. With R v3.5 being released in the next month or so, is now a good time to (safely) change this default behavior and assume most users have updated?
Best, Keith