Closed turmeric-blend closed 3 years ago
Hi! Thanks for the great question!!
Regarding the code example you wrote, yes, that is the intended usage of the `Resample` layer. The `covmat` and `rets` uniquely determine the multivariate normal distribution, and each forward pass just samples from it multiple times and runs the underlying allocator.
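As a rough illustration of that forward pass, here is a minimal sketch in plain PyTorch. It is not the deepdow implementation: `resample_allocate` and `softmax_allocator` are hypothetical names, the batch dimension is omitted, and each draw is fed straight to the allocator as a simplification. It uses `rsample` so that the draws stay differentiable with respect to `rets` and `covmat`.

```python
import torch
from torch.distributions import MultivariateNormal


def softmax_allocator(expected_rets):
    # Hypothetical stand-in for the underlying allocator:
    # maps a vector of returns to weights that sum to 1.
    return torch.softmax(expected_rets, dim=0)


def resample_allocate(rets, covmat, allocator, n_draws=10):
    # rets: (n_assets,), covmat: (n_assets, n_assets)
    dist = MultivariateNormal(loc=rets, covariance_matrix=covmat)
    draws = dist.rsample((n_draws,))  # (n_draws, n_assets)
    # Run the allocator once per draw and average the resulting weights.
    weights = torch.stack([allocator(d) for d in draws])
    return weights.mean(dim=0)


torch.manual_seed(0)
rets = torch.tensor([0.01, 0.02, 0.03])
covmat = 0.01 * torch.eye(3)
weights = resample_allocate(rets, covmat, softmax_allocator, n_draws=5)
```

Averaging over draws keeps the output a valid allocation here, because every per-draw weight vector already sums to 1.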
Regarding the `sample` vs `rsample` method, I need to read up on this and will get back to you ASAP. When I was implementing it, I assumed `sample` would not break the computational graph and that one could backpropagate without any issues. There were never any problems when running tests and experiments. However, I might be wrong here.
I dug into it and you were absolutely right. The gradients were not able to flow through the `sample` operation. I was kind of surprised that `loss.backward()` goes through anyway without raising an exception. However, `tensor.grad` is `None` afterwards, and a gradient descent step seems to treat it as if the gradients were zero.
Thank you very much for pointing this out!
Hi, I want to ask if the below is the correct implementation of the Resample Allocator Layer? In this case `covmat = covariance_layer(Q)` and `rets = transform_layer(x)` are used to initialize the `MultivariateNormal`.
EDIT: On another note, for the sampling part of the Resample Allocator, https://github.com/jankrepl/deepdow/blob/eb3b02d0f5d9660035845105ffaa126a170dce6e/deepdow/layers/allocate.py#L335 might want to consider the reparameterization trick as well: just change to `dist.rsample((n_draws,))`. It allows for backprop through the random node and is used in reinforcement learning's SAC as well, although I am not sure how it affects performance.
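For reference, the reparameterization trick for a multivariate normal can be written out by hand, which shows why gradients can pass through the random node. This is only a sketch: `reparam_draws` and the matrix `A` used to build a positive definite `covmat` are made up for this example and are not part of deepdow.

```python
import torch


def reparam_draws(rets, covmat, n_draws):
    # Factor covmat = L @ L.T, then push parameter-free noise through
    # a deterministic, differentiable transform: x = rets + L @ eps.
    L = torch.linalg.cholesky(covmat)
    eps = torch.randn(n_draws, rets.shape[0])
    return rets + eps @ L.T  # (n_draws, n_assets)


rets = torch.tensor([0.01, 0.02], requires_grad=True)
A = torch.tensor([[0.2, 0.0], [0.05, 0.1]], requires_grad=True)
covmat = A @ A.T + 1e-4 * torch.eye(2)  # positive definite by construction

draws = reparam_draws(rets, covmat, n_draws=1000)
draws.pow(2).mean().backward()  # gradients reach both rets and A
```

All the randomness lives in `eps`, which does not depend on any parameter, so autograd can differentiate the draws with respect to both `rets` and `covmat`.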