ikostrikov / pytorch-a2c-ppo-acktr-gail

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).
MIT License

log showing making new env twice #37

Closed: CarloLucibello closed this issue 6 years ago

CarloLucibello commented 6 years ago

Probably an upstream issue, but it puzzles me that the log suggests make_env is called twice for each process rather than once (which is what actually happens):

```
[carlo@x1 pytorch-a2c-ppo-acktr]$ python main.py --env-name "BreakoutNoFrameskip-v0" --num-processes 2 --num-frames 100
#######
WARNING: All rewards are clipped or normalized so you need to use a monitor (see envs.py) or visdom plot to get true rewards
#######
[2017-12-05 17:23:30,700] Making new env: BreakoutNoFrameskip-v0
[2017-12-05 17:23:30,702] Making new env: BreakoutNoFrameskip-v0
[2017-12-05 17:23:30,947] Making new env: BreakoutNoFrameskip-v0
[2017-12-05 17:23:30,948] Making new env: BreakoutNoFrameskip-v0
Updates 0, num timesteps 10, FPS 41, mean/median reward 0.0/0.0, min/max reward 0.0/0.0, entropy 1.38142, value loss 0.00155, policy loss 0.02737
```
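
For anyone who wants to rule out a genuine double construction, one option is to print from the env factory itself instead of relying on gym's logger. A rough, self-contained sketch of that check (the `make_env`/`worker` names here are illustrative, not the repo's envs.py, and it assumes the same Atari dependencies main.py needs):

```python
import multiprocessing as mp

import gym


def make_env(env_id, rank):
    """Return a thunk that builds the env lazily, as vectorized-env setups usually do."""
    def _thunk():
        env = gym.make(env_id)
        print(f"worker {rank}: constructed {env_id}")
        return env
    return _thunk


def worker(env_fn):
    # Each worker calls its thunk exactly once, so one "constructed" line per
    # process means the env really is created only once there, regardless of
    # how many "Making new env" lines gym's logger prints.
    env = env_fn()
    env.reset()
    env.close()


if __name__ == "__main__":
    env_fns = [make_env("BreakoutNoFrameskip-v0", rank) for rank in range(2)]
    procs = [mp.Process(target=worker, args=(fn,)) for fn in env_fns]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```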
ikostrikov commented 6 years ago

I think it happened in an older version of gym.

They fixed it recently.
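
If the duplicate line still shows up, it may be worth checking which gym version is actually installed before upgrading; a small sketch (the importlib fallback is only there in case the attribute is missing on an old release):

```python
import gym
from importlib.metadata import version  # Python 3.8+

# Print the installed gym version.
print(getattr(gym, "__version__", None) or version("gym"))
```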