Epsilon Decay is done every step instead of episode

shivaverma / Orbit

Open source collection of Reinforcement Learning Environments.

MIT License

75 stars, 31 forks

Issue #10 (open), opened by notBradPitt 2 years ago

notBradPitt commented 2 years ago

The epsilon decay in the code lives inside agent.replay(), which is called on every step, so epsilon drops rapidly during the first episode. I don't know whether this was the intended behavior, but I've gotten better results by moving the epsilon decay into a separate method and calling it once at the end of each episode.
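To make the difference concrete, here is a minimal sketch comparing the two schedules. It is not the repository's code: `EpsilonSchedule`, `run`, the starting epsilon of 1.0, the floor of 0.01, the decay factor of 0.995, and the 1000 steps per episode are all assumptions chosen for illustration. The training step itself is left as a comment.

```python
class EpsilonSchedule:
    """Hypothetical helper that owns epsilon so it can be decayed on demand."""

    def __init__(self, epsilon=1.0, epsilon_min=0.01, epsilon_decay=0.995):
        # All three defaults are illustrative assumptions, not values from the repo.
        self.epsilon = epsilon
        self.epsilon_min = epsilon_min
        self.epsilon_decay = epsilon_decay

    def decay(self):
        # Multiplicative decay, clipped so epsilon never falls below the floor.
        self.epsilon = max(self.epsilon_min, self.epsilon * self.epsilon_decay)
        return self.epsilon


def run(num_episodes, steps_per_episode, decay_every_step):
    """Return epsilon at the end of each episode under the chosen schedule."""
    schedule = EpsilonSchedule()
    history = []
    for _ in range(num_episodes):
        for _ in range(steps_per_episode):
            # The agent would act and call replay() here.
            if decay_every_step:
                schedule.decay()  # behavior described in the issue
        if not decay_every_step:
            schedule.decay()  # proposed change: decay once per episode
        history.append(schedule.epsilon)
    return history


per_step = run(num_episodes=3, steps_per_episode=1000, decay_every_step=True)
per_episode = run(num_episodes=3, steps_per_episode=1000, decay_every_step=False)
print("per-step decay:   ", per_step)
print("per-episode decay:", per_episode)
```

Under these assumed values, per-step decay hits the 0.01 floor before the first episode ends, while per-episode decay has only moved epsilon to 0.995 by that point. Whether per-episode decay is actually preferable depends on the episode length and the decay factor used, so treat the numbers as a demonstration of the mechanism rather than a tuning recommendation.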