Why do we reverse rewards?

ikostrikov / pytorch-a3c

PyTorch implementation of Asynchronous Advantage Actor Critic (A3C) from "Asynchronous Methods for Deep Reinforcement Learning".

MIT License

1.23k stars, 279 forks

Why do we reverse rewards? #72

Open npitsillos opened 4 years ago

npitsillos commented 4 years ago

I apologise if this is not the correct place, but I didn't find anything elsewhere. Why are the rewards reversed, and why do we append R to the values at the end?
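For readers following along, the pattern being asked about can be sketched in plain Python. This is a minimal illustration, not the repository's code: the function names `discounted_returns` and `td_residuals` and the argument `bootstrap_value` are hypothetical, and the sketch assumes that R holds the critic's value estimate for the state after the last step (or 0 if the episode ended). Under that assumption, walking the rewards backwards lets each return be built from the next one in a single pass, and appending R to the values means `values[i + 1]` exists even for the final step.

```python
def discounted_returns(rewards, bootstrap_value, gamma):
    """Compute R_t = r_t + gamma * R_{t+1} for every step of a rollout.

    The recursion needs R_{t+1} before R_t, so the loop runs from the
    last step to the first. bootstrap_value (hypothetical name) seeds
    the recursion: 0.0 for a finished episode, otherwise an estimate of
    the value of the state that follows the last step.
    """
    R = bootstrap_value
    returns = [0.0] * len(rewards)
    for i in reversed(range(len(rewards))):
        R = rewards[i] + gamma * R
        returns[i] = R
    return returns


def td_residuals(rewards, values, bootstrap_value, gamma):
    """Compute delta_t = r_t + gamma * V(s_{t+1}) - V(s_t) for every step.

    Appending the bootstrap value gives the list one more entry than
    there are rewards, so padded[i + 1] is defined for the last index.
    """
    padded = list(values) + [bootstrap_value]
    return [
        rewards[i] + gamma * padded[i + 1] - padded[i]
        for i in range(len(rewards))
    ]


if __name__ == "__main__":
    rewards = [1.0, 0.0, 2.0]
    print(discounted_returns(rewards, bootstrap_value=0.0, gamma=0.5))
    print(td_residuals(rewards, [0.5, 0.5, 0.5], bootstrap_value=1.0, gamma=0.5))
```

A forward loop could produce the same returns, but it would have to re-sum the remaining rewards at every step, whereas the backward pass reuses the running value of R.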