TensorFlow uses the multinomial operator for sampling inside the computational graph. It is useful for scheduled-sampling training of RNNs, and recent work on generating sentences with GANs (SeqGAN) also relies on tf.multinomial in its implementation. I suppose that without such an operator it is troublesome to integrate sampling into the computation flow during the training phase, so I tried to implement it myself. However, after some research I found no way to perform discrete-distribution sampling with the CUDA API. Here is an unsuccessful trial: (Generating sample from a non uniform discrete distribution.)
Of course I can use mx.operator.CustomOp to write a NumPy version, but that is my last resort.
@WarBean You can first sample a value from Uniform(0, 1) and then find which region of the CDF contains it. For example, if your probability distribution is prob, you can do the sampling via numpy.searchsorted(numpy.cumsum(prob), rng.rand()); see the sketch below. It would be great if you could implement such a sampler for MXNet.
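Here is a minimal NumPy sketch of that inverse-CDF idea (the function name and the batching over several draws are just for illustration):

```python
import numpy as np

def sample_multinomial(prob, num_samples, rng):
    """Draw indices from a discrete distribution via inverse-CDF sampling.

    prob is a 1-D array of event probabilities summing to 1. Each uniform
    draw u is mapped to the first index whose cumulative probability
    reaches u, so index i is selected with probability prob[i].
    """
    cdf = np.cumsum(prob)
    u = rng.rand(num_samples)
    return np.searchsorted(cdf, u)

rng = np.random.RandomState(0)
print(sample_multinomial(np.array([0.1, 0.6, 0.3]), 5, rng))
```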
You can find the code for the supported sampling ops here: https://github.com/dmlc/mxnet/blob/nnvm/src/operator/tensor/sample_op.h. You may also need to look at the code on the mshadow side, which contains the real implementation.
We should be able to write all the common sampling functions (Gamma, T-distribution, Dirichlet, etc.) from the basic ones provided by cuRAND, such as Uniform, Normal, LogNormal, and Poisson; a rough sketch of the composition idea follows below.
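To make that concrete, here is a rough NumPy sketch, restricted to integer Gamma shapes for brevity (an actual implementation would use a rejection sampler such as Marsaglia-Tsang for general shapes; all function names here are illustrative):

```python
import numpy as np

def sample_exponential(rate, n, rng):
    # Inverse transform: if U ~ Uniform(0, 1), then -log(1 - U) / rate
    # follows Exponential(rate).
    return -np.log1p(-rng.uniform(size=n)) / rate

def sample_gamma_int(k, n, rng):
    # Gamma(k, 1) with integer shape k (the Erlang distribution) is a
    # sum of k independent Exponential(1) variables.
    return sum(sample_exponential(1.0, n, rng) for _ in range(k))

def sample_dirichlet(alphas, rng):
    # Dirichlet(alphas): draw independent Gamma(alpha_i, 1) variables
    # and normalize them so they sum to 1.
    g = np.array([sample_gamma_int(a, 1, rng)[0] for a in alphas])
    return g / g.sum()

rng = np.random.RandomState(0)
print(sample_dirichlet([2, 3, 5], rng))
```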
@sxjscience After reading sample_op.h, I have some idea of how to proceed with a CPU version (logic like the numpy.searchsorted(numpy.cumsum(prob), rng.rand()) you mentioned), but I'm not sure how to proceed with the GPU version. More precisely, I don't know how to use the CUDA kernel API to sample in parallel within a mini-batch. Could you offer some direction on how to call a CUDA kernel (a function with the __device__ qualifier) directly from the Forward and Backward functions? Or is there some other approach?
@WarBean You can call CUDA kernels directly in the forward and backward functions; https://github.com/dmlc/mxnet/blob/master/src/operator/roi_pooling.cu is one example.
@sxjscience That's exactly what I need. I will try it.
MXNet now (since a week ago) has a lot of additional sampling operators; see sample_op.h/cc and multisample-op.h/cc. In the latter, the parameters of the distributions are input tensors (a usage sketch is below). So far we have implemented exponential, gamma, Poisson, and negative binomial, and only for CPUs. We will look at whether we can easily add CPU support for multinomial; getting them all onto GPUs is also something we will look at.
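For reference, here is a hedged sketch of how the tensor-parameterized variants look from Python, assuming the op names match the files above (check the actual API for exact signatures and defaults):

```python
import mxnet as mx

# One parameter per row: row i of the output holds draws from the
# distribution parameterized by position i of the input tensor(s).
lam = mx.nd.array([1.0, 8.5])
exp_samples = mx.nd.sample_exponential(lam, shape=10)      # shape (2, 10)

alpha = mx.nd.array([0.5, 2.0])
beta = mx.nd.array([1.0, 0.7])
gamma_samples = mx.nd.sample_gamma(alpha, beta, shape=10)  # shape (2, 10)
print(exp_samples.asnumpy(), gamma_samples.asnumpy())
```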
@asmushetzel What's the current progress on the GPU support?
Almost done; I hope we can get the pull request out by next week. This will also change the CPU implementations, as we are using generic code that is equivalent on GPU and CPU. One thing: this thread is specifically about the multinomial distribution, which was brought in independently by Eric some time ago and already supports CPU/GPU. So I'm not sure why this issue is still open. Shouldn't we close it?
OK, do we have a separate issue for the general sampling ops? I'm going to close this one, as it is only about multinomial, which is now supported: https://mxnet.incubator.apache.org/api/python/symbol.html#mxnet.symbol.sample_multinomial. A short usage sketch is below.
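For anyone landing here later, a quick usage sketch of the linked operator via the NDArray API, which mirrors the symbol API (exact defaults may differ across versions):

```python
import mxnet as mx

# Two distributions over three events; each row must sum to 1.
probs = mx.nd.array([[0.2, 0.5, 0.3],
                     [0.6, 0.1, 0.3]])

# Draw 5 samples per distribution: the result has shape (2, 5) and
# holds integer event indices.
samples = mx.nd.sample_multinomial(probs, shape=5)
print(samples.asnumpy())
```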
We don't have a separate issue for general sampling. Feel free to open one; otherwise I will cc you on the upcoming pull request.
We already have multinomial, and it works on both CPU and GPU.