Open mpadge opened 6 years ago
I wonder whether it would be worth considering the speed improvements from RcppParallel. In this example here, where they create a distance matrix, they get a 5.5x speedup over regular Rcpp code, which is pretty nice.
I'm happy to look into using RcppParallel if that would be helpful?
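For a rough idea of the pattern, here is a minimal sketch of a row-parallel distance matrix. It uses plain C++ with `std::thread` rather than RcppParallel itself, so it only shows how the work splits. The function names (`row_dist`, `fill_rows`, `dist_matrix_parallel`) and the row-major layout are my own assumptions, not anything from the current code or the linked example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Euclidean distance between rows i and j of a row-major n x d matrix.
// (Hypothetical helper, for illustration only.)
static double row_dist(const std::vector<double>& x, std::size_t d,
                       std::size_t i, std::size_t j) {
    double s = 0.0;
    for (std::size_t k = 0; k < d; ++k) {
        const double diff = x[i * d + k] - x[j * d + k];
        s += diff * diff;
    }
    return std::sqrt(s);
}

// Fill output rows [begin, end). Each thread owns a disjoint block of rows,
// so no locking is needed on the shared output vector.
static void fill_rows(const std::vector<double>& x, std::size_t n,
                      std::size_t d, std::vector<double>& out,
                      std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i)
        for (std::size_t j = 0; j < n; ++j)
            out[i * n + j] = row_dist(x, d, i, j);
}

// Returns the n x n distance matrix (row-major) for n points of dimension d.
std::vector<double> dist_matrix_parallel(const std::vector<double>& x,
                                         std::size_t n, std::size_t d,
                                         unsigned n_threads) {
    assert(x.size() == n * d);
    if (n_threads == 0) n_threads = 1;
    std::vector<double> out(n * n, 0.0);
    std::vector<std::thread> workers;
    // Ceiling division so every row is assigned to some thread.
    const std::size_t chunk = (n + n_threads - 1) / n_threads;
    for (unsigned t = 0; t < n_threads; ++t) {
        const std::size_t begin = t * chunk;
        if (begin >= n) break;
        const std::size_t end = std::min(n, begin + chunk);
        workers.emplace_back(fill_rows, std::cref(x), n, d,
                             std::ref(out), begin, end);
    }
    for (auto& w : workers) w.join();
    return out;
}
```

As far as I understand it, an RcppParallel version would put the body of `fill_rows` into a `Worker` and let `parallelFor` choose the row ranges, but the split into independent row blocks would be the same idea.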
This would require a pretty extensive restructure of the current code, but the kind of unrolling I discovered via @njtierney here, and which is explained in this blog post, might offer even more speed improvements.
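To make "unrolling" concrete, here is a small sketch, assuming it means manual loop unrolling with several independent accumulators (the blog post may describe a different variant). The function name `sq_dist_unrolled` is hypothetical. It computes a squared Euclidean distance four elements at a time and handles any leftover elements in a tail loop.

```cpp
#include <cstddef>
#include <vector>

// Squared Euclidean distance, unrolled by a factor of 4.
// Four separate accumulators break the dependency chain on a single sum,
// which can help the compiler and CPU overlap the additions.
double sq_dist_unrolled(const std::vector<double>& a,
                        const std::vector<double>& b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    std::size_t k = 0;
    for (; k + 4 <= n; k += 4) {
        const double d0 = a[k]     - b[k];
        const double d1 = a[k + 1] - b[k + 1];
        const double d2 = a[k + 2] - b[k + 2];
        const double d3 = a[k + 3] - b[k + 3];
        s0 += d0 * d0;
        s1 += d1 * d1;
        s2 += d2 * d2;
        s3 += d3 * d3;
    }
    // Tail loop for the remaining 0 to 3 elements.
    for (; k < n; ++k) {
        const double diff = a[k] - b[k];
        s0 += diff * diff;
    }
    return (s0 + s1) + (s2 + s3);
}
```

Because the additions happen in a different order than in a simple loop, the result can differ from the non-unrolled version in the last few bits, and whether it is actually faster would need benchmarking.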