PKU-Alignment / Safe-Policy-Optimization

NeurIPS 2023: Safe Policy Optimization: A benchmark repository for safe reinforcement learning algorithms
https://safe-policy-optimization.readthedocs.io/en/latest/index.html
Apache License 2.0

why? #19

Closed: Yifei-Bi closed this issue 2 years ago

Yifei-Bi commented 2 years ago

Traceback (most recent call last):
  File "train.py", line 54, in <module>
    if mpi_tools.mpi_fork(args.cores,use_number_of_threads=use_number_of_threads):
  File "/content/drive/MyDrive/save/Safe-Policy-Optimization/safepo/common/mpi_tools.py", line 97, in mpi_fork
    subprocess.check_call(args, env=env)
  File "/usr/lib/python3.7/subprocess.py", line 363, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['mpirun', '-np', '2', '--use-hwthread-cpus', '/usr/bin/python3', 'train.py', '--env-id', 'Safexp-PointGoal1-v0', '--algo', 'ppo-lag', '--cores', '2', '--seed', '0']' returned non-zero exit status 1.
[5e506354ec7b:06923] Process received signal

Why am I getting this error? Please help me, thank you.

zmsn-2077 commented 2 years ago

From your terminal output, it seems that mpi4py is having problems with the multi-threaded part. Have you tried running the following?

python train.py --env-id Safexp-PointGoal1-v0 --algo ppo-lag --cores 1 --seed 0

Does this command work?

Yifei-Bi commented 2 years ago

From your terminal info, it seems that mpi4py is having problems with the multi-threaded part, have you tried running:

python train.py --env-id Safexp-PointGoal1-v0 --algo ppo-lag --cores 2 --seed 0

Will this command work?

I ran the command above, and then got the error shown above.

zmsn-2077 commented 2 years ago

I'm sorry for the trouble you have run into. Please allow me to answer your question in more detail.

First, the command I posted above sets cores to 1, to check whether cores=1 works on your machine:

python train.py --env-id Safexp-PointGoal1-v0 --algo ppo-lag --cores 1 --seed 0

Second, can you run the following code to print the number of physical CPU cores on your local machine?

import psutil
physical_cores = psutil.cpu_count(logical=False)
print(physical_cores)

Because of how the mpi4py module handles parallelism, the number of physical CPU cores should be larger than the value of the --cores parameter.
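The check described above can be sketched in a few lines. This is only a minimal illustration of the rule as stated in this comment (physical cores larger than --cores); the helper names `cores_setting_ok` and `detect_physical_cores` are made up for this example and are not part of SafePO. psutil is treated as optional, with a fallback to `os.cpu_count()`, which counts logical cores and may therefore overestimate.

```python
import os


def cores_setting_ok(requested_cores, physical_cores):
    """Apply the rule from this thread: physical cores must exceed --cores."""
    if requested_cores < 1:
        raise ValueError("requested_cores must be at least 1")
    return physical_cores > requested_cores


def detect_physical_cores():
    """Return the physical core count, or a logical-core fallback (hypothetical helper)."""
    try:
        import psutil  # third-party and optional in this sketch
        count = psutil.cpu_count(logical=False)
    except ImportError:
        count = None
    # psutil can return None on some platforms; os.cpu_count() counts logical cores.
    return count or os.cpu_count() or 1


if __name__ == "__main__":
    physical = detect_physical_cores()
    for requested in (1, 2):
        verdict = "ok" if cores_setting_ok(requested, physical) else "too many"
        print(f"--cores {requested} with {physical} detected cores: {verdict}")
```

If the check reports "too many" for the value you pass to --cores, try a smaller value first.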

Finally, I just reinstalled the SafePO environment on a pristine Ubuntu 20.04 machine and tested it without any problems. Can you provide more details about your machine and your software environment? That will help us locate your problem faster. The terminal output above is not enough for us to pinpoint where the problem is.
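If it helps, here is a rough sketch of a script that gathers the kind of details asked for above. It uses only the Python standard library; the function name `collect_env_report` and the list of packages it checks are assumptions for illustration, not something SafePO provides.

```python
import platform
import shutil
import sys


def collect_env_report():
    """Gather basic machine and software details for a bug report."""
    report = {
        "os": platform.platform(),
        "machine": platform.machine(),
        "python": sys.version.split()[0],
        "mpirun_path": shutil.which("mpirun"),  # None if mpirun is not on PATH
    }
    # Assumed list of relevant packages; record a placeholder instead of failing.
    for name in ("mpi4py", "psutil"):
        try:
            module = __import__(name)
            report[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in collect_env_report().items():
        print(f"{key}: {value}")
```

Pasting the printed lines into the issue, together with the full terminal output, should give us more to work with.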

Hope this can help you.