Closed imansaj closed 5 years ago
This is an issue with prioritised experience replay. You either have to increase the number of timesteps before learning starts, or reduce the priority exponent ω (setting it to 0 is equivalent to sampling uniformly).
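To illustrate the point about ω, here is a minimal sketch, assuming the usual proportional prioritisation rule where transition i is sampled with probability p_i^ω / Σ_k p_k^ω. The function name `sampling_probabilities` and the example priorities are hypothetical and are not taken from the memory file in question.

```python
def sampling_probabilities(priorities, omega):
    """Return p_i**omega / sum_k(p_k**omega) for each stored priority.

    Assumption: proportional prioritisation; this is an illustration,
    not the repository's actual sampling code.
    """
    scaled = [p ** omega for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


# Hypothetical priorities for four stored transitions.
priorities = [0.1, 1.0, 4.0, 0.5]

# With omega = 0 every priority is raised to the power 0, so each
# transition gets the same weight and sampling becomes uniform.
uniform = sampling_probabilities(priorities, omega=0.0)

# With omega > 0 higher-priority transitions are sampled more often.
skewed = sampling_probabilities(priorities, omega=0.5)

print(uniform)
print(skewed)
```

Lowering ω flattens the distribution towards uniform, which is why it can help when only a few transitions have been stored with meaningful priorities.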
Hello. In the memory file (line 101), the code gets stuck in the while loop when the learn_start parameter is low (e.g. 200).