gaes = (gaes - gaes.mean()) / gaes.std()

uidilr / gail_ppo_tf

TensorFlow implementation of Generative Adversarial Imitation Learning (GAIL) with discrete actions

MIT License

Closed: Joll123 closed this issue 4 years ago

Joll123 commented 4 years ago

What does this formula mean in PPO? Thanks.

uidilr commented 4 years ago

I can't remember clearly, but I think it is for stability. A similar question and answer can be found in the link below. Hope it answers your question!
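
For anyone reading later: the line in the title standardizes the GAE advantage estimates so that each batch has roughly zero mean and unit standard deviation. Below is a minimal sketch of what it does, assuming `gaes` is a NumPy array. The function name `normalize_advantages`, the `eps` term, and the sample values are illustrative assumptions and are not taken from the repository.

```python
import numpy as np

def normalize_advantages(gaes, eps=1e-8):
    """Standardize advantage estimates to zero mean and unit standard deviation."""
    gaes = np.asarray(gaes, dtype=np.float64)
    # eps is an added assumption (not in the original line): it avoids
    # division by zero when every advantage in the batch is identical.
    return (gaes - gaes.mean()) / (gaes.std() + eps)

# Hypothetical advantages with one large outlier.
raw = np.array([10.0, 12.0, 8.0, 30.0])
normalized = normalize_advantages(raw)
print(normalized.mean(), normalized.std())
```

PPO's surrogate loss multiplies the probability ratio by the advantage, so rescaling the advantages this way likely keeps the size of the policy gradient comparable from batch to batch, whatever the reward scale. That is probably the "stability" effect mentioned above. Note that it also shifts the advantages, so below-average actions in a batch end up with negative values.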