Open vgherard opened 3 years ago
Looking a little bit closer, Gumbel-Max is simply a reformulation of the _rank()
algorithm in log-probability space.
I'm surprised to see that the R version is so fast (and, lovely, to be sure!). This is probably due to inefficient C++ code, since the R version has time complexity O(n log(n) + size), whereas my C++ version has complexity O(n + size log(n))).
Thanks. This is the first time I hear the name "Gumbel-Max", at the very least it's worth mentioning in the DESCRIPTION
and across the documentation. I don't remember taking the logarithm twice, though.
I'm not spending much time here these days. I'm more than happy to give you access to the repository so that you can make the changes.
I have a draft paper that discusses performance in Section 5.
Thank you, I will be definitely happy to add some references in the documentation!
The "Gumbel-Max" uses the same algorithm as the *_rank()
functions (except perhaps that I'm using partial sort, as you explain in the paper, while at least the R function does total sort). It uses g[k] = log(prob[k]) - log ( - log(unif(0,1) )
as order variable, the -log(-log(U))
are standard Gumbel variables. Thus, on a second thought, I'm not sure it is worth adding another function, I need to look a little bit closer to the C++ implementations.
Oh, thanks for the draft, it is interesting.
Valerio
Hi Kirill,
I've implemented an algorithm, call it
gsample()
, for sampling w/o replacement using the "Gumbel-Max trick" (more or less along the ideas presented in Sections 2.3 and 2.4 of this paper).I tried to benchmark
gsample()
against the functions inwrswoR
and it appears to be competitive with some of them, both from the space and time point of view (it beatsccrank()
,expjs()
andrej()
in several benchmark points, but is probably always outperformed by, e.g.,expj()
).I was wondering whether you might be interested in adding this algorithm to
wrswoR
, in which case I could provide a pull request trying to adapt my present code (which is inRcpp
). Actually, if you want to have a look, here's the repo.Cheers,
Valerio.