pranz24 / pytorch-soft-actor-critic

PyTorch implementation of soft actor critic
MIT License

Inconsistent seeding #32

Closed: mohakbhardwaj closed this issue 4 years ago

mohakbhardwaj commented 4 years ago

Hi,

I just wanted to point out that the code produces inconsistent results across multiple runs because of seeding issues. I found two causes; after fixing both, I get consistent results for a fixed seed value.

  1. The Python random package is used in ReplayMemory, but its seed is not set in main.py.
  2. The seed for the environment's action space needs to be set explicitly with env.action_space.seed(args.seed), because env.seed(seed) does not do that. Without it, the action samples in the initial exploration phase of the algorithm differ between runs. I am using Gym version 0.17.2.
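
For anyone who wants to try this, here is a minimal sketch of the two fixes combined. The helper name `seed_everything` is hypothetical and not part of the repository. It assumes only what is described above: that `env` exposes `seed()` and `action_space.seed()` as in Gym 0.17.2. Any other seeding the script already does is left untouched.

```python
import random


def seed_everything(seed, env=None):
    """Seed the RNGs discussed in this issue (hypothetical helper).

    Returns the names of the generators that were seeded.
    """
    seeded = []

    # Point 1: ReplayMemory samples with Python's `random` module,
    # so that module needs its own seed.
    random.seed(seed)
    seeded.append("python.random")

    if env is not None:
        # Point 2 (assumes the Gym 0.17.2 API reported above):
        # env.seed() does not seed the action space, so seed it explicitly.
        env.seed(seed)
        seeded.append("env")
        env.action_space.seed(seed)
        seeded.append("env.action_space")

    return seeded


# Quick check that Python's `random` is reproducible after seeding.
seed_everything(123456)
first = random.sample(range(1000), 5)
seed_everything(123456)
second = random.sample(range(1000), 5)
assert first == second
```

In main.py this would presumably be called once, as `seed_everything(args.seed, env)`, right after the environment is created and before any action is sampled.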

Hope this helps!

pranz24 commented 4 years ago

Thank you! :raised_hands:

I wasn't aware of the 2nd point. I'll fix the issue this week.

pranz24 commented 4 years ago

Fix inconsistent seeding & clean up