nicolasch96 closed this issue 3 years ago
I found a solution by using:
import torch.distributions as dist

def kl_divergence_approx(mean_1, std_1, mean_2, std_2):
    # build diagonal Gaussians from the means and standard deviations
    normal0 = dist.Normal(mean_1, std_1)
    normal = dist.Normal(mean_2, std_2)
    # sum the (1 + 2*KL) term over all elements
    total = (1 + 2 * dist.kl.kl_divergence(normal0, normal)).sum()
    return total
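For context, a minimal usage sketch of the function above; the shapes and values are purely illustrative, not taken from the repository:

import torch

# illustrative 80x80 tensors of means and standard deviations
mean_1, std_1 = torch.zeros(80, 80), torch.ones(80, 80)
mean_2, std_2 = torch.randn(80, 80), torch.ones(80, 80) * 0.5

total = kl_divergence_approx(mean_1, std_1, mean_2, std_2)
print(total.item())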
Hi @nicolasch96,
Thanks a lot for investigating and contributing!
The code will be updated!
Hello, thanks for taking my comment into consideration. Please make sure to use:
kl_divergence_approx(normal0.mean, normal0.scale, normal.mean[cc], normal.scale[cc])
instead of:
kl_divergence_approx(normal0.mean, normal0.variance, normal.mean[cc], normal.variance[cc])
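The distinction matters because torch.distributions.Normal takes the standard deviation (the scale attribute) as its second argument, not the variance. A minimal sketch, assuming normal0, normal, and cc are defined as in the repository code:

# Normal(loc, scale) expects the standard deviation
kl = kl_divergence_approx(normal0.mean, normal0.scale,
                          normal.mean[cc], normal.scale[cc])

# if only the variance is at hand, take its square root first
kl = kl_divergence_approx(normal0.mean, normal0.variance.sqrt(),
                          normal.mean[cc], normal.variance[cc].sqrt())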
Hello again :)
Thank you very much for the code, it is really useful! I am trying to reproduce the results of the paper with Monte Carlo simulations and was testing the computational time of the GPU code. I noticed that the KL-divergence step is much slower with cuda=True than with cuda=False: for an 80x80 matrix it takes around 19.16 seconds with cuda=True versus 5.16 seconds with cuda=False. I am using an RTX 2060 Super, and I think this step is the bottleneck that makes the GPU code no faster than the CPU code overall (the batch time for N=80 is around 20 seconds whether cuda=True or cuda=False).
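For what it's worth, two common causes of this pattern are per-call overhead when the KL function is invoked once per index instead of in one batched call (small GPU workloads can then be slower than the CPU), and timing asynchronous CUDA work without an explicit synchronize, which can make measurements misleading. Below is a minimal timing sketch under those assumptions; the sizes, the cuda flag, and the helper name are illustrative:

import time
import torch

def time_kldivs(n=80, cuda=False):
    device = torch.device("cuda" if cuda and torch.cuda.is_available() else "cpu")
    # illustrative inputs; one batched call covers the whole n x n matrix
    mean_1, std_1 = torch.zeros(n, n, device=device), torch.ones(n, n, device=device)
    mean_2, std_2 = torch.randn(n, n, device=device), torch.ones(n, n, device=device)

    if device.type == "cuda":
        torch.cuda.synchronize()  # let setup kernels finish before starting the clock
    start = time.time()
    total = kl_divergence_approx(mean_1, std_1, mean_2, std_2)
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait for the asynchronous GPU work to complete
    return total.item(), time.time() - start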
Thank you again!
Nicolas