Open Tendimension opened 7 years ago
@Tendimension Did you find the cause? I have the same problem: the average reward stays at 2.0 with a standard deviation of 0.0 through 20 million frames. Does the reward start to rise after some point?
@kkjh0723 I do not know what the reason is.
@Tendimension @kkjh0723 I also ran into this bug. I will look into it soon.
@yao62995 Thanks a million!
@yao62995 Do you have any updates on this problem?
I see the same issue. The average reward is still 0.0 after 1 million steps.
I have run 20 million frames (time steps) in the Breakout environment, but the average reward has not changed. In Asynchronous Methods for Deep Reinforcement Learning, the average reward starts to change after about 17 million steps. I do not know where the problem is.