Closed randy3k closed 6 years ago
Hey @randy3k
First off, thanks for the kind words. It is difficult to know sometimes if what you have produced is useful or not. Of course, we can look at number of downloads and things like that, but nothing eclipses a word of approval from a true expert such as yourself.
Now about exposing the nth result, you are correct, there is a bit of overhead when retrieving the nth result in the manner you present.
This has already been implemented in the sampling functions by making use of the sampleVec
argument. See the example code at the bottom.
Before I released version 2.0.0, I initially had an nthCombination
and an nthPermutation
function precisely for retrieving specific nth results. However, as development evolved and the sampling functions emerged, it made more sense to me to incorporate this capability into the sampling functions.
Underneath these sampling functions, we generate a vector of random indices using the sample
function, passing the total number of combinations/permutations to the x
parameter. This vector of random integers representing specific nth results is then used to retrieve those specific results. The input is sanitized once and passed to the SampleResults
function that retrieves each nth result unobstructed (whereas with the General
functions (i.e. combo/permuteGeneral
), you would have to sanitize the input for every nth result requested).
This naturally extends to simply passing a vector of indices (without relying on sample
) and thus exposing the nth results. Hopefully this makes sense and please do not hesitate to reply if further explanation is needed. Here is an example from the README file of how it can be used:
library(RcppAlgos)
## Use sampleVec to generate specific results
## E.g. the below generates the 1st, 5th, 25th, 125th, and
## 625th lexicographical combinations
comboSample(10, 8, TRUE, sampleVec = c(1, 5, 25, 125, 625))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 1 1 1 1 1 1 1 1
[2,] 1 1 1 1 1 1 1 5
[3,] 1 1 1 1 1 1 3 8
[4,] 1 1 1 1 1 3 6 9
[5,] 1 1 1 1 5 6 10 10
## Is the same as:
comboGeneral(10, 8, TRUE)[c(1, 5, 25, 125, 625), ]
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 1 1 1 1 1 1 1 1
[2,] 1 1 1 1 1 1 1 5
[3,] 1 1 1 1 1 1 3 8
[4,] 1 1 1 1 1 3 6 9
[5,] 1 1 1 1 5 6 10 10
## Is the same as:
do.call(rbind, lapply(c(1, 5, 25, 125, 625), function(x) {
comboGeneral(10, 8, TRUE, lower = x, upper = x)
}))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 1 1 1 1 1 1 1 1
[2,] 1 1 1 1 1 1 1 5
[3,] 1 1 1 1 1 1 3 8
[4,] 1 1 1 1 1 3 6 9
[5,] 1 1 1 1 5 6 10 10
Below we have a benchmark that illustrates the overhead you point out when obtaining the nth results through the General
functions:
library(microbenchmark)
options(scipen = 999)
set.seed(101)
mySamp <- sample(comboCount(50, 30), 10000)
a <- comboSample(50, 30, sampleVec = mySamp)
b <- lapply(mySamp, function(x) comboGeneral(50, 30, lower = x, upper = x))
all.equal(a, do.call(rbind, b))
[1] TRUE
microbenchmark(t1 = lapply(mySamp, function(x) comboGeneral(50, 30, lower = x, upper = x)),
t2 = comboSample(50, 30, sampleVec = mySamp))
Unit: milliseconds
expr min lq mean median uq max neval
t1 95.86454 136.75282 154.9306 155.53314 171.89110 233.15580 100
t2 13.66146 14.50935 15.2127 15.04008 15.86957 18.27242 100
We have more than an order improvement (i.e. 10x faster)!
Any further improvements in efficiency should be mostly focused on improving the algorithms in the NthResults
source file. I am sure there is room for improvement as I developed these rather organically.
Thanks again! Joseph
Oh, thanks. I totally missed the sampleVec
argument.
By the way, thanks for the benchmark.! 👍
Hi @jwood000,
Thank you for your excellent job. I love the sampling functions
comboSample
andpermSample
. Those are the things I wanted to implement for a long time. Though I don't want to admit, your package is far more "useful" thanarrangements
👍. Now, the only time I need to callarrangements
is when I need to compute the numbers of permutations and combinations with gmp big integer (it may happens ifn
is large).Do you have any plans to also expose the nth results to R ? I believe there is a significant overhead to call
RcppAlgos::comboGeneral(10, 3, lower = 3, upper = 3)
when we just need nthCombination.