Closed zhouzypaul closed 3 years ago
Other scripts in examples/ don't suffer from this issue because they use functools.partial.
Oh, this is definitely a bug that affects performance. Thanks for pointing it out! I can reproduce:
diff --git a/examples/atari/train_ppo_ale.py b/examples/atari/train_ppo_ale.py
index aa6e107..11d8a17 100644
--- a/examples/atari/train_ppo_ale.py
+++ b/examples/atari/train_ppo_ale.py
@@ -174,6 +174,7 @@ def main():
     print("Output files are saved in {}".format(args.outdir))

     def make_env(idx, test):
+        print(f"make_env called with idx: {idx}")
         # Use different random seeds for train and test envs
         process_seed = int(process_seeds[idx])
         env_seed = 2 ** 32 - 1 - process_seed if test else process_seed
python examples/atari/train_ppo_ale.py --gpu -1
Output files are saved in results/13eb97e0e87096517935edba7673b448b2bec64f-2da9143a-67bc6343
make_env called with idx: 7
make_env called with idx: 7
make_env called with idx: 7
make_env called with idx: 7
make_env called with idx: 7
make_env called with idx: 7
make_env called with idx: 7
make_env called with idx: 7
Observation space Box(4, 84, 84)
Action space Discrete(4)
make_env called with idx: 7
make_env called with idx: 7
make_env called with idx: 7
make_env called with idx: 7
make_env called with idx: 7
make_env called with idx: 7
make_env called with idx: 7
make_env called with idx: 7
make_env called with idx: 7
make_env called with idx: 7
make_env called with idx: 7
make_env called with idx: 7
make_env called with idx: 7
make_env called with idx: 7
make_env called with idx: 7
make_env called with idx: 7
Resolved by https://github.com/pfnet/pfrl/pull/157
https://github.com/pfnet/pfrl/blob/7b0c7e938ba2c0c56a941c766c68635d0dad43c8/examples/atari/train_ppo_ale.py#L199-L200
On the lines above, there is a lazy evaluation issue with lambda. When it is used like this inside a list comprehension, the body of (lambda: make_env(idx, test)) is not evaluated until the lambda is called. By the time it is called in pfrl.envs.MultiprocessVectorEnv, the loop has already finished, so idx == args.num_envs - 1 for all the lambdas in the list. As a result, all the envs in the MultiprocessVectorEnv are seeded with the same seed.
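For anyone who wants to see the behaviour in isolation, here is a minimal sketch that doesn't need pfrl at all. The make_env below is a hypothetical stand-in that just returns the index it was given, and num_envs = 8 is an assumed value chosen to match the idx 7 in the log above. It compares the lambda version with one that uses functools.partial, which binds idx when the callable is created.

```python
import functools


def make_env(idx, test):
    # Hypothetical stand-in for the real make_env: it only reports
    # which index (and therefore which seed) it was called with.
    return idx


num_envs = 8  # assumed value, chosen to match the idx 7 seen in the log

# Late binding: each lambda looks up idx only when it is called, which
# happens after the comprehension has finished, so all see the last value.
lazy_fns = [(lambda: make_env(idx, False)) for idx in range(num_envs)]
lazy_result = [fn() for fn in lazy_fns]

# functools.partial captures the current value of idx at creation time.
bound_fns = [functools.partial(make_env, idx, False) for idx in range(num_envs)]
bound_result = [fn() for fn in bound_fns]

print(lazy_result)   # [7, 7, 7, 7, 7, 7, 7, 7]
print(bound_result)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

A default argument (lambda idx=idx: make_env(idx, test)) would also bind the value early, but functools.partial is what the other example scripts already use.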