The code produces inconsistent results across multiple runs with the same seed because of seeding issues. I have found two causes, and after fixing both I get consistent results for a fixed seed value.
First, ReplayMemory uses the Python random package, but the seed for it is never set in main.py.
Second, the seed for the environment's action_space needs to be set explicitly with env.action_space.seed(args.seed), because env.seed(seed) does not do that. Without it, the action samples drawn in the initial exploration phase of the algorithm differ from run to run. I am using Gym version 0.17.2.
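A minimal sketch of both fixes, under a few assumptions: the helper name `seed_everything` and the `TinyReplayMemory` class are hypothetical stand-ins (not code from the repository), the real ReplayMemory is assumed to sample through the global `random` module, and the `env.seed` / `env.action_space.seed` calls are the Gym 0.17.2 ones mentioned above.

```python
import random


class TinyReplayMemory:
    """Hypothetical stand-in for ReplayMemory; samples via the global `random` module."""

    def __init__(self):
        self.buffer = []

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Reproducible only if random.seed(...) was called beforehand.
        return random.sample(self.buffer, batch_size)


def seed_everything(seed, env=None):
    """Seed the RNGs discussed in this issue (hypothetical helper)."""
    random.seed(seed)  # fix 1: Python's random, used for replay sampling
    if env is not None:
        env.seed(seed)               # environment seed (Gym 0.17.2 API)
        env.action_space.seed(seed)  # fix 2: not covered by env.seed(seed)


def sample_batch(seed, batch_size=4):
    """Fill a small buffer and draw one batch after seeding."""
    seed_everything(seed)
    memory = TinyReplayMemory()
    for step in range(100):
        memory.push(step)
    return memory.sample(batch_size)
```

In main.py this would presumably be a single call such as `seed_everything(args.seed, env)` right after the environment is created, before any call to `env.action_space.sample()` or to the replay memory's sampling method.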
Hope this helps!