keiohta / tf2rl

TensorFlow2 Reinforcement Learning

MIT License

Fix categorical policy #140

Open keiohta opened 2 years ago

keiohta commented 2 years ago

The current implementation of CategoricalActor has some bugs that should be fixed. So far I have found the following:

- The input to tfp.distributions.Categorical is wrong.
- The log probability is computed incorrectly in call.
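
The issue does not say exactly what is wrong with the input or the log probability, so the following is only a sketch of one common pitfall, not a diagnosis of the tf2rl code. The first positional argument of tfp.distributions.Categorical is logits, so probabilities passed there (for example the output of a softmax layer) are normalized a second time. The plain-Python example below uses hypothetical values and helper names to show what the log probability of a discrete action should be, and how a double softmax distorts it.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of unnormalized scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(x - m) for x in scores))
    return [x - log_z for x in scores]


def categorical_log_prob(logits, action):
    """Log-probability of one discrete action under softmax(logits)."""
    return log_softmax(logits)[action]


# Hypothetical network output for a single state with three actions.
logits = [2.0, 0.5, -1.0]
probs = [math.exp(lp) for lp in log_softmax(logits)]

# Intended behaviour: the raw scores are interpreted as logits.
correct = [categorical_log_prob(logits, a) for a in range(3)]

# Pitfall: probabilities are passed where logits are expected, so they go
# through a second softmax and the distribution becomes much flatter.
double_softmax = [categorical_log_prob(probs, a) for a in range(3)]

# Passing log-probabilities as logits recovers the intended distribution.
recovered = [categorical_log_prob([math.log(p) for p in probs], a)
             for a in range(3)]

print("correct       :", correct)
print("double softmax:", double_softmax)
print("recovered     :", recovered)
```

With TensorFlow Probability, the equivalent of the intended behaviour would be to construct the distribution with the explicit keyword (logits=... or probs=...) and to obtain the log probability of the sampled actions from the distribution's log_prob method.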

There could be other issues, so the fix needs to be evaluated on several discrete environments.