KL balancing is implemented as a weighted average of two terms: the KL with a stop-gradient on the prior and the KL with a stop-gradient on the posterior. The value you found in the code is only used for logging; it is not what the gradient is computed from.
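
For reference, a minimal sketch of what that weighted average looks like. The Gaussian parameterization, the helper names, and the `mix` value are illustrative assumptions, not the repository's exact code (DreamerV2 uses categorical latents by default):

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

def balanced_kl_loss(post, prior, mix=0.8):
    # `post` and `prior` are dicts of distribution parameters, e.g.
    # {'mean': ..., 'std': ...} (an assumed parameterization).
    sg = lambda x: tf.nest.map_structure(tf.stop_gradient, x)
    dist = lambda p: tfd.Independent(tfd.Normal(p['mean'], p['std']), 1)
    # KL with a stop-gradient on the posterior: trains the prior
    # toward the fixed posterior.
    prior_term = tfd.kl_divergence(dist(sg(post)), dist(prior))
    # KL with a stop-gradient on the prior: regularizes the posterior
    # toward the fixed prior.
    post_term = tfd.kl_divergence(dist(post), dist(sg(prior)))
    return mix * prior_term + (1 - mix) * post_term
```

Only this weighted sum is differentiated; a plain `kld(dist(post), dist(prior))` can still be computed alongside it purely for logging.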
The algorithm for KL balancing in the paper gives the posterior and prior terms as `kl_loss = compute_kl(stop_grad(posterior), prior)`, so I had assumed the code would compute the loss as `value = kld(dist(sg(post)), dist(prior))`. But instead the code has the terms reversed, formulating the KL loss as `value = kld(dist(prior), dist(sg(post)))` (in networks.py, line 168). Does that have something to do with the implementation of the KL divergence function in tensorflow_probability?
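
For context on why the argument order matters here: `tfd.kl_divergence(p, q)` in tensorflow_probability computes KL(p || q), with the first argument on the left, and KL is asymmetric, so swapping the arguments generally changes the value. A quick check:

```python
import tensorflow_probability as tfp

tfd = tfp.distributions

p = tfd.Normal(0.0, 1.0)
q = tfd.Normal(1.0, 2.0)

print(tfd.kl_divergence(p, q).numpy())  # KL(p || q)
print(tfd.kl_divergence(q, p).numpy())  # KL(q || p), a different value
```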