leandrolcampos opened this issue 4 years ago
@xidulu @szhengac
Hi @leandrolcampos
Currently, the pathwise gradient is only implemented for mx.np.random.{normal, gumbel, logistic, weibull, exponential, pareto} in the backend.
We are planning to implement (in the C++ backend) implicit reparameterization gradients for Gamma-related distributions in the future, which, as you pointed out, would be extremely useful in scenarios like BBVI for LDA.
Another possible solution is to wrap the sampling op as a CustomOp, which allows you to manually define the backward computation in Python (a rough sketch follows below): https://mxnet.apache.org/api/python/docs/tutorials/extend/customop.html
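For illustration, here is a minimal sketch of that CustomOp idea, assuming a unit scale and an elementwise shape parameter. The op name `implicit_gamma_sample`, the class names, and the finite-difference approximation of the CDF derivative are my own choices for the example, not anything that ships with MXNet:

```python
import mxnet as mx
import numpy as np
from scipy import special


class ImplicitGammaSample(mx.operator.CustomOp):
    """Samples z ~ Gamma(alpha, 1) and backpropagates the implicit
    reparameterization gradient dz/dalpha = -dF(z; alpha)/dalpha / p(z; alpha)."""

    def forward(self, is_train, req, in_data, out_data, aux):
        alpha = in_data[0].asnumpy()
        z = np.random.gamma(shape=alpha)
        self.assign(out_data[0], req[0], mx.nd.array(z, ctx=in_data[0].context))

    def backward(self, req, out_grad, in_data, out_data, in_grad, aux):
        alpha = in_data[0].asnumpy()
        z = out_data[0].asnumpy()
        # dF/dalpha approximated by central finite differences of the
        # regularized lower incomplete gamma function P(alpha, z) = F(z; alpha).
        eps = 1e-4
        dF_dalpha = (special.gammainc(alpha + eps, z)
                     - special.gammainc(alpha - eps, z)) / (2.0 * eps)
        # Gamma(alpha, 1) log-density evaluated at the forward sample.
        log_pdf = (alpha - 1.0) * np.log(z) - z - special.gammaln(alpha)
        dz_dalpha = -dF_dalpha / np.exp(log_pdf)
        grad = out_grad[0].asnumpy() * dz_dalpha
        self.assign(in_grad[0], req[0], mx.nd.array(grad, ctx=in_data[0].context))


@mx.operator.register("implicit_gamma_sample")  # hypothetical op name
class ImplicitGammaSampleProp(mx.operator.CustomOpProp):
    def __init__(self):
        super(ImplicitGammaSampleProp, self).__init__(need_top_grad=True)

    def list_arguments(self):
        return ['alpha']

    def list_outputs(self):
        return ['sample']

    def infer_shape(self, in_shape):
        return [in_shape[0]], [in_shape[0]], []

    def create_operator(self, ctx, shapes, dtypes):
        return ImplicitGammaSample()
```

It can then be called like any other op, e.g. `z = mx.nd.Custom(alpha, op_type='implicit_gamma_sample')` inside `mx.autograd.record()`, after which `z.backward()` populates `alpha.grad` with the implicit gradient. The finite-difference step is only a stand-in; a proper implementation would compute the derivative of the incomplete gamma function analytically, as the paper does.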
Hi @xidulu,
Thanks for your suggestion. I'll follow it. But, for performance reasons, I also look forward to your C++ backend implementation of implicit reparameterization gradients for Gamma-related distributions.
Description
I'd like to suggest the implementation of implicit reparameterization gradients, as described in the paper [1], for the Gamma distribution: ndarray.sample_gamma and symbol.sample_gamma.
This would allow this distribution, and others that depend on it, such as the Beta, Dirichlet and Student's t distributions, to be used as easily as the Normal distribution in stochastic computation graphs.
Stochastic computation graphs are necessary for variational autoencoders (VAEs), automatic variational inference, Bayesian learning in neural networks, and principled regularization in deep networks.
The approach proposed in [1] is the same one used in TensorFlow's tf.random.gamma, as we can see in [2].
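For context, my understanding of the core idea in [1] is to differentiate through the sampler implicitly rather than inverting the CDF. If $z \sim \mathrm{Gamma}(\alpha, 1)$ and $F(z; \alpha)$ is its CDF (the regularized lower incomplete gamma function $P(\alpha, z)$), implicitly differentiating $F(z; \alpha) = u$ with the noise $u$ held fixed gives

$$\nabla_\alpha z = -\frac{\nabla_\alpha F(z; \alpha)}{p(z; \alpha)},$$

where $p(z; \alpha)$ is the Gamma density. This only requires the CDF and its derivative with respect to the parameters, not the inverse CDF, which has no closed form for the Gamma distribution.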
Thanks for the opportunity to request this feature.
References