Khrylx / PyTorch-RL

PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.
MIT License

Implementation problem #27

Closed: pengzhi1998 closed this issue 3 years ago

pengzhi1998 commented 3 years ago

Hi Dr. Yuan, firstly, thank you for your great work! However, when I run PPO, multiprocessing has a problem on my desktop. I'm using Python 3.6 with PyTorch 1.7.1 on Ubuntu 20.04 with CartPole-v0. In the first i_iter loop of main_loop in ppo_gym.py, the code gets stuck when evaluating the trained model. After some testing, I found that in process-4 (one of the processes created there), the action cannot be computed, which is weird. I didn't change anything, and I don't think multiprocessing should affect the computation with the network. Do you have any idea why this happens?

pengzhi1998 commented 3 years ago

It seems that the network computation in the worker processes gets stuck for whatever reason. Could it be caused by the PyTorch version? (However, I tried torch 0.4.0 and the same problem happens.)

Khrylx commented 3 years ago

Have you set the env variable as mentioned in README?

export OMP_NUM_THREADS=1

pengzhi1998 commented 3 years ago

Wow, it works this time, thanks! So will this command affect training on both CPU and GPU?

Khrylx commented 3 years ago

It will only affect CPU-based training. I have added it in code: https://github.com/Khrylx/PyTorch-RL/blob/72069237b4d86bcb9675b899ea94228019a4f003/core/agent.py#L7

So you won't need to set it manually, and it won't affect other programs.
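For reference, the in-code fix is roughly the following sketch (the exact statement is at the linked line in core/agent.py; the key point is that the variable must be set before torch is imported):

```python
import os

# Pin OpenMP to a single thread per worker process. This has to happen
# before torch builds its thread pool, so it sits at the top of the module.
os.environ['OMP_NUM_THREADS'] = '1'

import torch  # noqa: E402  (intentionally imported after the env var is set)
```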

pengzhi1998 commented 3 years ago

Thank you!!!

pengzhi1998 commented 3 years ago

Sorry to bother you, Dr. Yuan, but I have one more question: I didn't find any termination or join operation for the processes in the code. Is there a need to join the workers after they have been used?
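For context, the usual pattern when workers should not outlive the sampling step is to drain their results and then join them. Below is a minimal illustrative sketch of that pattern (hypothetical names, not the repo's actual agent.py code):

```python
import multiprocessing


def worker(pid, queue):
    # Placeholder for per-process rollout collection.
    queue.put((pid, f"result from worker {pid}"))


if __name__ == "__main__":
    queue = multiprocessing.Queue()
    workers = [multiprocessing.Process(target=worker, args=(i, queue)) for i in range(4)]
    for p in workers:
        p.start()
    # Drain one result per worker before joining, so no worker blocks
    # while waiting to flush its queue buffer.
    results = [queue.get() for _ in workers]
    for p in workers:
        p.join()  # reap each child process once its result has been collected
    print(results)
```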