RcppCore / RcppArmadillo

Rcpp integration for the Armadillo templated linear algebra library

arma::randu incorrect output #438

Closed pati-ni closed 7 months ago

pati-ni commented 7 months ago

Not sure why but randu reports a lot of zeros:

I don't have a minimal reproducible example (because of the compile command), but the following snippets highlight the issue.

#define ARMA_64BIT_WORD
#include <iostream>
#include <cstdlib>
#include <RcppArmadillo.h>
int main() {

  int N = 30000;
  arma::vec a(N, arma::fill::randu);
  arma::vec b(N, arma::fill::randu);

  std::cout << arma::accu(a) / N << std::endl;
  for (int i =0; i < N;++i){
    b(i) = rand()/(float)RAND_MAX;
  }

  std::cout << arma::accu(b) / N << std::endl;

  return 0;
}

I compile with the following:

g++ -std=gnu++17 -I"/usr/include/R/"  -I'/home/main/R/x86_64-pc-linux-gnu-library/4.3/Rcpp/include' -I'/home/main/R/x86_64-pc-linux-gnu-library/4.3/RcppArmadillo/include' -I'/home/main/R/x86_64-pc-linux-gnu-library/4.3/RcppProgress/include' -I/usr/local/include   -O3  -fpic  -mtune=generic -pipe -fno-plt -fexceptions -flto=auto -ffat-lto-objects -L/usr/lib64/R/lib -llapack -lblas -lgfortran -lm -lquadmath -L/usr/lib64/R/lib -lR  randu_Rcpp.cpp -lR -o randu

This outputs the following:

$ ./randu
1.16415e-10
0.498628

The same code, with the RcppArmadillo header replaced by <armadillo> and linked against Armadillo (-larmadillo), gives the correct output:

g++ -std=gnu++17 -I"/usr/include/R/"  -I'/home/main/R/x86_64-pc-linux-gnu-library/4.3/Rcpp/include' -I'/home/main/R/x86_64-pc-linux-gnu-library/4.3/RcppArmadillo/include' -I'/home/main/R/x86_64-pc-linux-gnu-library/4.3/RcppProgress/include' -I/usr/local/include   -O3  -fpic  -mtune=generic -pipe -fno-plt -fexceptions -flto=auto -ffat-lto-objects -L/usr/lib64/R/lib -llapack -lblas -lgfortran -lm -lquadmath -L/usr/lib64/R/lib -lR  -larmadillo randu.cpp -lR -o randu

#define ARMA_64BIT_WORD
#include <armadillo>
#include <iostream>
int main() {

  int N = 30000;
  arma::vec a(N, arma::fill::randu);
  arma::vec b(N, arma::fill::randu);

  std::cout << arma::accu(a) / N << std::endl;
  for (int i =0; i < N;++i){
    b(i) = rand()/(float)RAND_MAX;
  }

  std::cout << arma::accu(b) / N << std::endl;
  return 0;
}

$ ./randu
0.500482
0.498628

eddelbuettel commented 7 months ago

Before I dive into this any further: are you aware that we, on purpose and by default, wire RcppArmadillo up with the R RNGs so that seeding from R works as expected? See inst/tinytest/test_rng.R (in the source, installed as tinytest/test_rng.R). There is also a config option to not do this if you'd rather have the default Armadillo behavior; you would need this for your first example.

From glancing at RcppArmadilloForward.h it seems that setting a #define ARMA_USE_CXX11_RNG would do that.

pati-ni commented 7 months ago

I did not know that RNGs from R were used. Is it just the seed or the whole RNG? Regardless, this is not the expected behavior.

eddelbuettel commented 7 months ago

It is the expected and documented behaviour for the R package connecting Armadillo to R.

eddelbuettel commented 7 months ago

Also, your first code example is just wrong. You never use R headers with a main(). If you want plain Armadillo for your own projects, I think you know where to get it. This package is for using Armadillo from R via Rcpp.

> Rcpp::cppFunction("arma::mat outerProd(arma::vec& x) { return x * x.t(); }", depends="RcppArmadillo")
> outerProd(1:3)
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    2    4    6
[3,]    3    6    9
>

pati-ni commented 7 months ago

Sure thing. I noticed this behavior in the Rcpp code I was writing for my package. I am no Rcpp internals expert, and I wanted to see whether it can be reproduced in a plain C++ file, hence the examples I posted.

eddelbuettel commented 7 months ago

For use from R, we connect to R's seeding and RNGs:

> Rcpp::cppFunction("arma::vec myrand(int n) { return arma::vec(n, arma::fill::randu); }", depends="RcppArmadillo")
> set.seed(123); myrand(4)
         [,1]
[1,] 0.287578
[2,] 0.788305
[3,] 0.408977
[4,] 0.883017
> set.seed(123); myrand(4)      # reproducibility from R which is what R users want
         [,1]
[1,] 0.287578
[2,] 0.788305
[3,] 0.408977
[4,] 0.883017
> myrand(4)                      # else random
          [,1]
[1,] 0.9404673
[2,] 0.0455565
[3,] 0.5281055
[4,] 0.8924190
>

pati-ni commented 7 months ago

This works on my system also. So maybe this is something relating to the compile flags?

eddelbuettel commented 7 months ago

Can you please calmly explain what you think your problem is? Your initial post (1st par above) is plain invalid. You cannot use RcppArmadillo that way. End of story.

If you think you have a deficiency in using RcppArmadillo from and with R, please document it here.

eddelbuettel commented 7 months ago

Also:

> Rcpp::cppFunction("arma::vec myrand(int n) { return arma::vec(n, arma::fill::randu); }", depends="RcppArmadillo")
> v <- myrand(1e6)
> summary(v)
       V1          
 Min.   :0.000001  
 1st Qu.:0.249158  
 Median :0.499153  
 Mean   :0.499528  
 3rd Qu.:0.749496  
 Max.   :1.000000  
> 

There are also many far more sophisticated ways to look at the distribution of a sampled vector in a statistical environment such as R ...
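As one concrete check of this kind, the Kolmogorov-Smirnov distance to U(0,1) is near 0 for a well-behaved sample and near 1 when every draw is the same tiny value. In R, ks.test(v, "punif") performs the test directly; the sketch below computes only the distance, in plain C++, and ks_uniform is a hypothetical helper name rather than anything from RcppArmadillo.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Kolmogorov-Smirnov distance between the empirical CDF of `sample`
// and the CDF of U(0,1), which is F(x) = x on [0,1].
// Takes the sample by value because it is sorted internally.
double ks_uniform(std::vector<double> sample) {
    assert(!sample.empty());
    std::sort(sample.begin(), sample.end());
    const double n = static_cast<double>(sample.size());
    double d = 0.0;
    for (std::size_t i = 0; i < sample.size(); ++i) {
        const double x = sample[i];
        const double above = (i + 1) / n - x;  // ECDF just after x, minus F(x)
        const double below = x - i / n;        // F(x) minus ECDF just before x
        d = std::max({d, above, below});
    }
    return d;
}
```

Applied to 1000 copies of 1.16415e-10 the distance is essentially 1, while an evenly spaced grid on (0,1) gives a distance of about 0.5 / n.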

eddelbuettel commented 7 months ago

Also note that you used rand() whose manual page on my Ubuntu machine says not to use it:

NOTES

The versions of rand() and srand() in the Linux C Library use the same random number generator as random(3) and srandom(3), so the lower‐order bits should be as random as the higher‐order bits. However, on older rand() implementations, and on current implementations on different systems, the lower‐order bits are much less random than the higher‐order bits. Do not use this function in applications intended to be portable when good randomness is needed. (Use random(3) instead.)

pati-ni commented 7 months ago

At this point, we need to agree upon a minimum reproducible example. I don't know why the inclusion of the main() makes the behavior invalid, but I am not here to dispute that. Do you want me to try and reproduce that by developing a small Rcpp package?

I came across this behavior while trying to initialize a vector with arma::randu while developing my Rcpp package. There, I did not include a main() call and I was getting the same behavior.

Maybe generating a dynamic library which is loaded during runtime (in R) can reproduce the behavior? (Pure speculation)

eddelbuettel commented 7 months ago

You seem fairly new to Rcpp and its tools so I will give you time to catch up. The Extending R with Rcpp introductory vignette included with the package (and a published paper) is a good start; it even covers simulation via random number generation. You do not need a main() (those build very differently) and you do not need a package (though a package is a good idea in general). I have shown you several one-line examples, and you will find other examples at the Rcpp Gallery. At this point it is most likely that you overlooked or misunderstood a small step somewhere, so I will close this now. When you have an actual reproducible error, by all means come back and reopen. Until then, trust 15 years of RcppArmadillo in widespread use: it is a robust tool.

pati-ni commented 7 months ago

Also note that you used rand() whose manual page on my Ubuntu machine says not to use it:

NOTES The versions of rand() and srand() in the Linux C Library use the same random number generator as random(3) and srandom(3), so the lower‐order bits should be as random as the higher‐order bits. However, on older rand() implementations, and on current implementations on different systems, the lower‐order bits are much less random than the higher‐order bits. Do not use this function in applications intended to be portable when good randomness is needed. (Use random(3) instead.)

There is this great talk that explains this in more detail: https://www.youtube.com/watch?v=LDPMpc-ENqY

I ended up using std::mt19937 from the C++ standard library instead:

https://github.com/pati-ni/harmony/blob/census-integration/src/utils.cpp#L26C8-L26C15
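For reference, a minimal sketch of that approach using only the standard <random> header. The names draw_uniform and sample_mean and the seed argument are illustrative assumptions, not taken from the linked file. An engine like this is independent of R's RNG, so set.seed() in R does not control it and the seed has to be passed in explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Draw n values from U(0,1) with a Mersenne Twister seeded explicitly.
// The same seed yields the same sequence on every call.
std::vector<double> draw_uniform(std::size_t n, unsigned int seed) {
    std::mt19937 engine(seed);
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    std::vector<double> out(n);
    for (double& x : out) {
        x = unif(engine);
    }
    return out;
}

// Arithmetic mean, the same summary the first post printed via arma::accu(a) / N.
double sample_mean(const std::vector<double>& v) {
    assert(!v.empty());
    double sum = 0.0;
    for (double x : v) {
        sum += x;
    }
    return sum / static_cast<double>(v.size());
}
```

Calling sample_mean(draw_uniform(30000, seed)) should land close to 0.5 for any seed, which is the sanity check the original snippets were after.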

I will give myself some time to catch up with Rcpp and will come back with more information, thanks.

eddelbuettel commented 7 months ago

Absolutely. There are also a couple great RNG packages for R such as dqrng.

pati-ni commented 2 months ago

@eddelbuettel I was able to reproduce the behavior. The problem happens when the code at hand is loaded through a module. My guess is that the RNGs are not initialized properly when a module is loaded, so RcppArmadillo's random facilities do not work.

Based on your previous responses, I ignored the behavior because I thought I was misusing or loading something out of order. I just managed to track down the misbehavior:

Steps to build the Rcpp module:

Generate the skeleton:

Rcpp::Rcpp.package.skeleton("armarand", module=TRUE)

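Add the randu_test(int) function below to src/rcpp_module.cpp, and call it from bla():

// Replace the Rcpp.h include with RcppArmadillo.h, which pulls in Rcpp itself
#include <RcppArmadillo.h>

// Draw K batches of 1000 uniforms and print each batch's range to stderr
int randu_test(int K) {
  for (int i = 0; i < K; i++) {  // int index: avoids a signed/unsigned comparison with K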
    arma::vec random_numbers(1000, arma::fill::randu);
    std::cerr << "Min Rand: " << random_numbers.min() << " Max Rand: " << random_numbers.max() << std::endl;
  }
  return 0;
}

void bla() {
    randu_test(100);
    Rprintf("hello\\n");
}

Link to armadillo in DESCRIPTION

LinkingTo: Rcpp, RcppArmadillo

Steps to reproduce the issue:

Install the package:


devtools::install_local("armarand", force=TRUE)
## If you test bla() now, it works
q()

Start a new R session:


library(armarand)
bla()

Min Rand: 1.16415e-10 Max Rand: 1.16415e-10
...
Min Rand: 1.16415e-10 Max Rand: 1.16415e-10

Random workaround

sessionInfo()
library(armarand)
bla()

Min Rand: 0.00142129 Max Rand: 0.999444
...
Min Rand: 0.00437755 Max Rand: 0.997301

Invoking sessionInfo() causes the error to disappear, perhaps by doing what the module's loading omits.
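One detail in the failing output may be worth noting: the repeated value 1.16415e-10 agrees, to the six digits printed, with 0.5 / (2^32 - 1), which is the sort of lower clamp a 32-bit uniform generator applies when its raw output is zero. That reading is an assumption and is not confirmed anywhere in this thread. The sketch below only checks the arithmetic; half_step_32bit and format_default are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Half of one step of a 32-bit generator scaled to the unit interval:
// 0.5 / (2^32 - 1). Assumed here to be the clamp value; not confirmed.
double half_step_32bit() {
    return 0.5 / 4294967295.0;
}

// Format a double like std::cout does by default (six significant digits),
// so the result can be compared with the text in the failing output.
std::string format_default(double x) {
    char buf[64];
    const int written = std::snprintf(buf, sizeof buf, "%g", x);
    assert(written > 0 && written < static_cast<int>(sizeof buf));
    return std::string(buf);
}
```

Here format_default(half_step_32bit()) yields the string 1.16415e-10. If the reading is right, the generator state was all zeros when randu was called, which would fit with sessionInfo() making the symptom go away. R's C API documents GetRNGstate() and PutRNGstate() as the calls that must bracket unif_rand(); whether the module code path skips them here remains to be verified.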

SessionInfo

> sessionInfo()
R version 4.4.1 (2024-06-14)
Platform: x86_64-conda-linux-gnu
Running under: Arch Linux

Matrix products: default
BLAS/LAPACK: /home/main/miniconda3/envs/Renv2024/lib/libopenblasp-r0.3.27.so;  LAPACK version 3.12.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

time zone: America/New_York
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] armarand_1.0

loaded via a namespace (and not attached):
[1] compiler_4.4.1   Rcpp_1.0.13      codetools_0.2-20

But I think this is not version-specific, as I have witnessed this in all kinds of different environments.

Attached is the module that is causing the misbehavior: armarand.zip.

Can you reproduce this?

eddelbuettel commented 2 months ago

Yes, it is possible that you need to add to a module's initialization the code we otherwise use to get Armadillo random numbers to use the R RNG. It's probably just a matter of checking which header files get added -- it's not complicated.

But a much simpler solution, when you desire a high-quality uniform RNG, may be to simply use R's own.

eddelbuettel commented 2 months ago

Also, something must be different in your setup. If I build and install your module (and change it to call only 5 times instead of 100) I get

> library(armarand)
> bla()  # reduced to 5 calls
Min Rand: 1.79007e-05 Max Rand: 0.999588
Min Rand: 0.000218502 Max Rand: 0.997155
Min Rand: 0.000336116 Max Rand: 0.998532
Min Rand: 0.00260145 Max Rand: 0.999044
Min Rand: 0.00113646 Max Rand: 0.999186
hello\nNULL
> 

I had first checked whether a 'normal' function like the one above would work; of course it does:

> set.seed(123)
> myrand(4)
         [,1]
[1,] 0.287578
[2,] 0.788305
[3,] 0.408977
[4,] 0.883017
> 

Note that I didn't call myrand() or set.seed() or anything else before calling bla().

The only other change I made was to remove a few superfluous files (generated by the package generator) from src/ but that should make no difference. I now have

edd@rob:/tmp/r/arma/armarand$ ls src/
Num.cpp  randcall.cpp  RcppExports.cpp  rcpp_hello_world.cpp  rcpp_module.cpp  stdVector.cpp
edd@rob:/tmp/r/arma/armarand$ cat src/randcall.cpp 
#include <RcppArmadillo.h>

// [[Rcpp::export]]
arma::vec myrand(int n) {
  return arma::vec(n, arma::fill::randu);
}
edd@rob:/tmp/r/arma/armarand$ 

PS: The files src/Num.cpp and src/stdVector.cpp can also be removed (with R/zzz.R adjusted), and src/rcpp_hello_world.cpp can be deleted too, for a more minimal solution that still works.

edd@rob:/tmp/r/arma/armarand$ R CMD INSTALL .
* installing to library ‘/usr/local/lib/R/site-library’
* installing *source* package ‘armarand’ ...
** using staged installation
** libs
using C++ compiler: ‘g++ (Ubuntu 13.2.0-23ubuntu4) 13.2.0’
ccache g++ -I"/usr/share/R/include" -DNDEBUG  -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppArmadillo/include'     -fpic  -O3 -Wall -pipe -pedantic -fdiagnostics-color=always -Wformat    -c RcppExports.cpp -o RcppExports.o
ccache g++ -I"/usr/share/R/include" -DNDEBUG  -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppArmadillo/include'     -fpic  -O3 -Wall -pipe -pedantic -fdiagnostics-color=always -Wformat    -c randcall.cpp -o randcall.o
ccache g++ -I"/usr/share/R/include" -DNDEBUG  -I'/usr/local/lib/R/site-library/Rcpp/include' -I'/usr/local/lib/R/site-library/RcppArmadillo/include'     -fpic  -O3 -Wall -pipe -pedantic -fdiagnostics-color=always -Wformat    -c rcpp_module.cpp -o rcpp_module.o
ccache g++ -Wl,-S -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -flto=auto -ffat-lto-objects -Wl,-z,relro -o armarand.so RcppExports.o randcall.o rcpp_module.o -L/usr/lib/R/lib -lR
installing to /usr/local/lib/R/site-library/00LOCK-armarand/00new/armarand/libs
** R
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (armarand)
edd@rob:/tmp/r/arma/armarand$ Rscript -e 'library(armarand); bla()'
Min Rand: 0.000537164 Max Rand: 0.999262
Min Rand: 0.00312221 Max Rand: 0.999029
Min Rand: 0.000333866 Max Rand: 0.997299
Min Rand: 0.00141226 Max Rand: 0.998373
Min Rand: 0.000594264 Max Rand: 0.999548
hello\nNULL
edd@rob:/tmp/r/arma/armarand$

pati-ni commented 2 months ago

Thanks for your reply!

It's probably just a matter of checking which header files get added -- it's not complicated.

Given that you cannot reproduce the error, I don't think that's the issue here. Regardless, can you tell me which header needs to be added to do this properly?

> library(armarand)
> bla()  # reduced to 5 calls
Min Rand: 1.79007e-05 Max Rand: 0.999588
Min Rand: 0.000218502 Max Rand: 0.997155
Min Rand: 0.000336116 Max Rand: 0.998532
Min Rand: 0.00260145 Max Rand: 0.999044
Min Rand: 0.00113646 Max Rand: 0.999186
hello\nNULL

This is odd. Just to confirm, was this snippet the first code you executed after spawning R? On my system, Rscript -e "library(armarand); bla()" also fails to initialize the Armadillo RNGs correctly.

Can I ask for more info about your setup? sessionInfo(), BLAS distribution/configuration, and some hardware information (what processor)?

Finally, could there be some compilation flags that cause these issues? I can see that I have different ones here. One important thing to keep in mind is that I get this behavior only if I use the module skeleton!

R CMD INSTALL armarand
* installing to library ‘/home/main/R/x86_64-pc-linux-gnu-library/4.4’
* installing *source* package ‘armarand’ ...
** using staged installation
** libs
using C++ compiler: ‘g++ (GCC) 14.2.1 20240805’
g++ -std=gnu++17 -I"/usr/include/R/" -DNDEBUG  -I'/home/main/R/x86_64-pc-linux-gnu-library/4.4/Rcpp/include' -I'/home/main/R/x86_64-pc-linux-gnu-library/4.4/RcppArmadillo/include' -I/usr/local/include    -fpic  -mtune=generic -O2 -pipe -fno-plt -fexceptions                  -fstack-clash-protection -fcf-protection  -g -ffile-prefix-map=/build/r/src=/usr/src/debug/r -flto=auto -ffat-lto-objects  -c Num.cpp -o Num.o
g++ -std=gnu++17 -I"/usr/include/R/" -DNDEBUG  -I'/home/main/R/x86_64-pc-linux-gnu-library/4.4/Rcpp/include' -I'/home/main/R/x86_64-pc-linux-gnu-library/4.4/RcppArmadillo/include' -I/usr/local/include    -fpic  -mtune=generic -O2 -pipe -fno-plt -fexceptions                  -fstack-clash-protection -fcf-protection  -g -ffile-prefix-map=/build/r/src=/usr/src/debug/r -flto=auto -ffat-lto-objects  -c RcppExports.cpp -o RcppExports.o
g++ -std=gnu++17 -I"/usr/include/R/" -DNDEBUG  -I'/home/main/R/x86_64-pc-linux-gnu-library/4.4/Rcpp/include' -I'/home/main/R/x86_64-pc-linux-gnu-library/4.4/RcppArmadillo/include' -I/usr/local/include    -fpic  -mtune=generic -O2 -pipe -fno-plt -fexceptions                  -fstack-clash-protection -fcf-protection  -g -ffile-prefix-map=/build/r/src=/usr/src/debug/r -flto=auto -ffat-lto-objects  -c rcpp_hello_world.cpp -o rcpp_hello_world.o
g++ -std=gnu++17 -I"/usr/include/R/" -DNDEBUG  -I'/home/main/R/x86_64-pc-linux-gnu-library/4.4/Rcpp/include' -I'/home/main/R/x86_64-pc-linux-gnu-library/4.4/RcppArmadillo/include' -I/usr/local/include    -fpic  -mtune=generic -O2 -pipe -fno-plt -fexceptions                  -fstack-clash-protection -fcf-protection  -g -ffile-prefix-map=/build/r/src=/usr/src/debug/r -flto=auto -ffat-lto-objects  -c rcpp_module.cpp -o rcpp_module.o
g++ -std=gnu++17 -I"/usr/include/R/" -DNDEBUG  -I'/home/main/R/x86_64-pc-linux-gnu-library/4.4/Rcpp/include' -I'/home/main/R/x86_64-pc-linux-gnu-library/4.4/RcppArmadillo/include' -I/usr/local/include    -fpic  -mtune=generic -O2 -pipe -fno-plt -fexceptions                  -fstack-clash-protection -fcf-protection  -g -ffile-prefix-map=/build/r/src=/usr/src/debug/r -flto=auto -ffat-lto-objects  -c stdVector.cpp -o stdVector.o
g++ -std=gnu++17 -shared -L/usr/lib64/R/lib -Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now -flto=auto -o armarand.so Num.o RcppExports.o rcpp_hello_world.o rcpp_module.o stdVector.o -L/usr/lib64/R/lib -lR
installing to /home/main/R/x86_64-pc-linux-gnu-library/4.4/00LOCK-armarand/00new/armarand/libs
** R
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (armarand)

Can you also share the stripped-down package you crafted? I don't think something will change this, but I'd like to be on the same page.

I have used both the conda distribution of R and my Linux distro's build, and I get the same symptoms with each.

eddelbuettel commented 2 months ago

Regardless, can you tell me which header needs to be added to do this properly?

What I had in mind are the few non-upstream Armadillo files in RcppArmadillo that set this up. You will see a few files in inst/include/RcppArmadillo/, such as Alt_R_RNG.h. It gets included by the other files if a #define is set. This has worked for 10+ years for bazillions of deployments and compilations (including on Arch). But somehow your machine is different. That is something you may have to sort out at your end.

(It's a long shot but if you have Armadillo also installed as libarmadillo-dev (or alike) maybe hide those files just in case. It should not conflict as R takes its headers, ie RcppArmadillo ones here, first.)

(Note that there is also a unit test for RNG use. So this gets tickled on each test and each commit.)

pati-ni commented 2 months ago

(Note that there is also a unit test for RNG use. So this gets tickled on each test and each commit.)

Where is this unit test? Can you show me how I can invoke it from a given environment?

Do you have a conda deployment of R at hand? Conda tends to isolate the OS-level settings from the environment quite well and is also widely used by the scientific community.

I have created this environment using miniconda and this package armarand2.zip (Removed the Rcpp version dependency and reduced the number of iterations):

The script will install and test everything if conda is installed and activated:

conda create -n r_env -y r-essentials r-base
conda activate r_env
Rscript -e "install.packages('RcppArmadillo', repos='https://cloud.r-project.org/')"
rm -r armarand* # optional
wget https://github.com/user-attachments/files/17048904/armarand2.zip
unzip armarand2.zip
R CMD INSTALL armarand
Rscript -e "library(armarand); bla()"

PS: It is not just my Arch Linux machine; I can also reproduce this on the high-performance computing cluster I have access to (CentOS + conda). You should not dismiss this issue, given that conda is the go-to solution for R used by thousands of users. If you can reproduce it, please re-open the issue.

EDIT: Just setting the seed works:

(r_env) 🌐 dn022  ~
31s❯ Rscript -e "library(armarand); bla()"
Min Rand: 1.16415e-10 Max Rand: 1.16415e-10
Min Rand: 1.16415e-10 Max Rand: 1.16415e-10
Min Rand: 1.16415e-10 Max Rand: 1.16415e-10
Min Rand: 1.16415e-10 Max Rand: 1.16415e-10
Min Rand: 1.16415e-10 Max Rand: 1.16415e-10
hello\n
(r_env) 🌐 dn022  ~
❯ Rscript -e "set.seed(0);library(armarand); bla()"
Min Rand: 0.00131466 Max Rand: 0.999931
Min Rand: 0.000605266 Max Rand: 0.998877
Min Rand: 0.000570522 Max Rand: 0.999863
Min Rand: 0.001162 Max Rand: 0.999195
Min Rand: 0.000200381 Max Rand: 0.998751
hello\n
(r_env) 🌐 dn022  ~
❯ Rscript -e "set.seed(0);library(armarand); bla()"

eddelbuettel commented 2 months ago

I am sorry. I do not do Conda, and have no Conda environment. It has tripped other people up. I can't help you with this aspect. I can only stress that you have not provided a reproducible example.

Unit tests are in the standard location for package tinytest, i.e. inst/tinytest/. The C++ files used are one level below; you will see them.

I will have to push this back to you for debugging. You may need to go into (Rcpp)Armadillo and add some print or logging statements and figure which RNG is called and why. As demonstrated, all is fine here.

eddelbuettel commented 2 months ago

EDIT: Just setting the seed works:

That is possibly a Conda thing you can likely fix. You need to trace back where/how the package ensures the seeding is called and work out why it doesn't happen under Conda. If you find something we can possibly adjust the package. (You could also play with just creating a random normal vector with Rcpp, ignoring Conda for a second, to see if the seed is set there.)

pati-ni commented 2 months ago

Thanks for the pointers.

It is a combination of issues and very much a corner case: (1) the package armarand must be a module for this to take place (the unit test is not catching it), and (2) it works on your system but not on Arch Linux or under conda.

Arch Linux is in constant flux, and things may have changed since you last deployed it. The configuration most likely has something funky to it at the moment, and it is affected. So, I am a bit confused because my technical knowledge of all the configuration flags is limited. I can see lots of flags set on the /etc/R/Makeconf + others that may be enabled by default in recent compilers.

When I load a module, which files have an impact on the initialization? I think the cause should be outside the module and environment-specific.

I read on another conda-related issue #264 that you stick to CRAN guidelines. Is your suggestion to build R from source on Arch? In your opinion, what configuration flags and compiler should I use to build R? Is there a CRAN guideline somewhere? And by the way, what R setup have you tested this on, Debian?

eddelbuettel commented 2 months ago

See https://cloud.r-project.org/web/checks/check_results_RcppArmadillo.html -- it is Debian and Fedora and macOS (both arm64 and x86_64) and Windows, with different compilers and different versions. That is our benchmark. We have matched it for well over a decade. RcppArmadillo earned that trust.

It is your setup where things deviate, and you now need to debug and drill down to where things differ. The seeding issue is an interesting starting point. I had a quick look, and I don't think we explicitly seed within the package. R can be assumed to be good:

Random draws without a seed, draws are valid on U(0,1)

edd@rob:~$ Rscript -e 'runif(2)'
[1] 0.5402406 0.0893093
edd@rob:~$ Rscript -e 'runif(2)'
[1] 0.0134442 0.6901230
edd@rob:~$
edd@rob:~$ Rscript -e 'Rcpp::cppFunction("NumericVector foo() { return(Rcpp::runif(2, 0, 1)); }"); foo()'
[1] 0.418930 0.627817
edd@rob:~$ Rscript -e 'Rcpp::cppFunction("NumericVector foo() { return(Rcpp::runif(2, 0, 1)); }"); foo()'
[1] 0.758659 0.369780
edd@rob:~$

Predictably Repeating with Seed

edd@rob:~$ Rscript -e 'set.seed(123); runif(2)'
[1] 0.287578 0.788305
edd@rob:~$ Rscript -e 'set.seed(123); runif(2)'
[1] 0.287578 0.788305
edd@rob:~$ Rscript -e 'Rcpp::cppFunction("NumericVector foo() { return(Rcpp::runif(2, 0, 1)); }"); set.seed(123); foo()'
[1] 0.287578 0.788305
edd@rob:~$ Rscript -e 'Rcpp::cppFunction("NumericVector foo() { return(Rcpp::runif(2, 0, 1)); }"); set.seed(123); foo()'
[1] 0.287578 0.788305
edd@rob:~$ 

You can get yourself free CPU minutes on other systems. GitHub Actions runs Ubuntu, macOS and Windows, and on Ubuntu you can use containers for Arch and whatnot. You could try posit.cloud and Google Colab; both give 15 or so free hours per month. I also do not have reason to believe Arch is at fault: it is widely enough used.

I understand that this may be frustrating to you, but in the absence of reproducible issues on a CRAN-relevant system my ability to help is limited. I am sure you understand.