MJeremy2017 / reinforcement-learning-implementation

Reinforcement Learning examples implementation and explanation
MIT License

Potential Error in Class: Dyna-Q+: should not reset self.time to zero #17

Open MariosGkMeng opened 1 year ago

MariosGkMeng commented 1 year ago

I think the reset method of the Dyna-Q+ class should not include the statement self.time = 0. Resetting the counter while the model still holds time stamps from earlier episodes makes self.time - _time negative, and that difference is then passed to a square root, which produces NaN values in the Q-function.

The line in question: https://github.com/MJeremy2017/reinforcement-learning-implementation/blob/0fecb49bc674f7269e5456cd0d978588e3199761/DynaMaze/DynaQ%2B.py#L116
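
A minimal sketch of the failure, not the repository's actual code: the Clock class, the bonus helper, and KAPPA are hypothetical names made up for illustration. It assumes the exploration bonus has the usual Dyna-Q+ form kappa * sqrt(tau), where tau is the time since the last real visit, and that the square root behaves like NumPy's (NaN for a negative argument; math.sqrt would raise ValueError instead).

```python
import math

KAPPA = 1e-3  # hypothetical weight of the exploration bonus


def bonus(current_time, last_visit_time):
    """Dyna-Q+ style bonus kappa * sqrt(tau), tau = time since last visit.

    Returns NaN for negative tau to mimic NumPy's sqrt behaviour.
    """
    tau = current_time - last_visit_time
    return KAPPA * math.sqrt(tau) if tau >= 0 else float("nan")


class Clock:
    """Hypothetical stand-in for the agent's time bookkeeping."""

    def __init__(self, reset_time):
        self.time = 0
        self.reset_time = reset_time
        self.last_visit = {}  # (state, action) -> time step of last real visit

    def step(self, state, action):
        self.time += 1
        self.last_visit[(state, action)] = self.time

    def reset(self):
        # The stored time stamps survive across episodes;
        # only the clock is (optionally) set back to zero.
        if self.reset_time:
            self.time = 0


def bonus_after_reset(reset_time):
    clock = Clock(reset_time)
    for _ in range(5):          # first episode: five real steps
        clock.step("s0", "up")
    clock.reset()               # episode boundary
    clock.step("s1", "up")      # one step into the next episode
    return bonus(clock.time, clock.last_visit[("s0", "up")])


buggy = bonus_after_reset(reset_time=True)   # tau = 1 - 5 = -4
fixed = bonus_after_reset(reset_time=False)  # tau = 6 - 5 = 1
print(buggy, fixed)
```

With the clock reset, the stored time stamp (5) is ahead of the current time (1), so the bonus is NaN. Letting the clock keep running across episodes keeps tau non-negative. Clearing the stored time stamps in reset would also avoid the negative difference, but it would discard the information the bonus is built on.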