Open gabrieledcjr opened 5 years ago
The SIL policy loss equation in the paper has no entropy term, so why does the code include one?
self.loss = self.pg_loss - entropy * self.w_entropy
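To make the difference concrete, here is a minimal single-sample sketch of the two variants as I understand them. It assumes the paper's policy loss is the negative log-probability of the action weighted by the clipped advantage (R - V)_+. The function name `sil_policy_loss`, its arguments, and the example numbers are hypothetical and are not taken from the repository.

```python
import math

def sil_policy_loss(probs, action, ret, value, w_entropy=0.0):
    """Hypothetical single-sample sketch, not the repository's implementation."""
    # Clipped advantage (R - V)_+: only better-than-expected returns contribute.
    advantage = max(ret - value, 0.0)
    # Policy-gradient term: -log pi(a|s) * (R - V)_+
    pg_loss = -math.log(probs[action]) * advantage
    # Entropy of the action distribution (skip zero-probability actions).
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    # With w_entropy = 0 this is the paper-style loss as assumed above;
    # with w_entropy > 0 it mirrors the line `pg_loss - entropy * w_entropy`.
    return pg_loss - w_entropy * entropy

probs = [0.7, 0.2, 0.1]
paper_style = sil_policy_loss(probs, action=0, ret=1.0, value=0.5)
code_style = sil_policy_loss(probs, action=0, ret=1.0, value=0.5, w_entropy=0.01)
print(paper_style, code_style)
```

In this sketch the two losses differ only by `w_entropy * entropy`, so setting the weight to zero gives the form I expected from the paper.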