Devin-Taylor closed 4 years ago
Initially this was to investigate using the kl_loss from the authors' code (which is wrong), but it did not seem to change anything. I think it is useful to log kl_loss though, so I am just adding that in.