Closed xiangpin closed 6 months ago
The fundamental problem is that arma::shuffle
makes use of R's RNG when used via RcppArmadillo
, which must not be used from multi-threaded code. Here, dqrng
can indeed help. You can make use of the second example on https://cran.r-project.org/web/packages/dqrng/vignettes/parallel.html. The idea is the following:
std::shuffle
// [[Rcpp::depends(RcppArmadillo,dqrng)]]
#include <algorithm>
#include <RcppArmadillo.h>
#include <xoshiro.h>
#include <convert_seed.h>
#include <R_randgen.h>
#include <omp.h>
using namespace arma;
arma::rowvec calkld(arma::rowvec z, arma::rowvec bg){
arma::rowvec x = z % log(z / bg);
return x;
}
arma::rowvec calmeankld(arma::rowvec x, arma::rowvec bg, dqrng::xoshiro256plus rng, int permutation = 100){
arma::vec bootkld(permutation);
for (int j = 0; j < permutation; j++){
arma::rowvec z = x;
std::shuffle(std::begin(z), std::end(z), rng);
arma::rowvec k = calkld(z, bg);
bootkld[j] = log(sum(k));
}
double bmean = arma::mean(bootkld);
double bsd = arma::stddev(bootkld);
arma::rowvec res = {bmean, bsd};
return res;
}
//[[Rcpp::export]]
arma::mat testparallel(arma::mat m, int permutation = 100){
m = m + 1e-300;
int n = m.n_rows;
arma::mat out(m.n_rows, 2);
arma::rowvec bg = mean(m, 0);
bg = bg + 1e-300;
// seed RNG from R's RNG state
Rcpp::IntegerVector seed(2, dqrng::R_random_int);
dqrng::xoshiro256plus rng(dqrng::convert_seed<uint64_t>(seed));
// Parallelize the Loop
#pragma omp parallel for schedule(static)
for (unsigned int i = 0; i < n; i++){
dqrng::xoshiro256plus lrng(rng); // make thread local copy of rng
lrng.long_jump(omp_get_thread_num() + 1); // advance rng by 1 ... jumps
out.row(i) = calmeankld(m.row(i), bg, lrng, permutation);
}
return out;
}
/*** R
da <- matrix(abs(rnorm(200)), ncol=10)
withr::with_seed(123, testparallel(da, permutation=100)) -> t1
withr::with_seed(123, testparallel(da, permutation=100)) -> t2
identical(t1, t2)
*/
Note: I am restricting myself to functionality that is available in the current CRAN release of dqrng
.
Thank you very much, I believe I have a solution to this problem.
Great! Can you tell me what kid of documentation would have helped you finding this?
The codes which you modified and the idea greatly assisted me. But I want to using dqrng
in an R package, how should I using it. I have added it to Linking to
in DESCRIPTION
, and added #<dqrng.h>
in the cpp
file, but it does not work.
error: ‘dqrng’ has not been declared
The code I posted should work unchanged in a package with LinkingTo: RcppArmadillo, dqrng
. With the current CRAN of dqrng version, there is no point to include dqrng.h
for parallel work, but if you still want to do it, you have to use #include <dqrng.h>
.
I found the error might be introduced by exporting the function containing dqrng::xoshiro256plus rng
argument.
for the example code
// [[Rcpp::export]]
arma::rowvec calmeankld(arma::rowvec x, arma::rowvec bg, dqrng::xoshiro256plus rng, int permutation = 100)
I removed the // [[Rcpp::export]]
, then it works
That makes sense. In your original code calmeankld()
is meant to be called within the parallel code. It does not make sense to make that callable from R via // [[Rcpp::export]]
.
I am closing this issue now since questions seem to be answered.
Good Day
Thanks for developing the package, I using
shuffle
ofArmadillo
withRcppParallel
orOpenMP
, but the result is not repeatable. I've been searching on Google for some time yet haven't come across a solution. someone suggest this package, but I can't find the solution. Could you help me? thank you.