PaddlePaddle / PARL

A high-performance distributed training framework for Reinforcement Learning
https://parl.readthedocs.io/
Apache License 2.0

How to run `benchmark/torch/AlphaZero`? #997

Open rydeveraumn opened 1 year ago

rydeveraumn commented 1 year ago

Hey all,

I am wondering how I can run the benchmark/torch/AlphaZero code. When I follow the instructions, the code does not run out of the box. I would like to train a model and submit it to Kaggle. I will add everything I am seeing so far when I try to run it.

TomorrowIsAnOtherDay commented 1 year ago

Hello. Have you run all three steps in the instructions?

rydeveraumn commented 1 year ago

Hey @TomorrowIsAnOtherDay, I sure have. There are a number of different errors happening for me:

I haven't gotten farther than this. I was able to run examples/AlphaZero, but then gen_submission.py had errors on Kaggle. Not sure what the next steps should be.

Thanks for the reply!

BTW:

parl.__version__ = 2.0.5
paddlepaddle.__version__ = 2.3.2
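
For reference, those versions can be printed like this (note that the paddlepaddle package imports as `paddle`):

```python
# Print the installed PARL and PaddlePaddle versions.
import parl
import paddle  # the paddlepaddle package imports as `paddle`

print(parl.__version__)    # 2.0.5 here
print(paddle.__version__)  # 2.3.2 here
```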

Steps:

1. Clone the develop repository
2. `xparl start --port 8010 --cpu_num 5` -- everything works fine
3. `cd benchmark/torch/AlphaZero`
4. `python main.py`

TomorrowIsAnOtherDay commented 1 year ago

It seems that you have installed two deep learning frameworks (torch & paddle). PARL supports these two frameworks, but it imports paddle by default. To specify torch as the backend framework, try exporting the following environment variable:

xparl stop
export PARL_BACKEND=torch  # no spaces around '='

and then run the instructions again.
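
Putting it together, the corrected sequence looks like this (port and CPU count taken from the steps above; after `xparl stop` the cluster has to be relaunched before re-running main.py, and the variable must be exported in the same shell):

```bash
xparl stop                             # shut down the running cluster
export PARL_BACKEND=torch              # select torch instead of the default paddle
xparl start --port 8010 --cpu_num 5    # relaunch the CPU cluster
cd benchmark/torch/AlphaZero
python main.py
```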

rydeveraumn commented 1 year ago

Okay let me give it a shot! Will report back soon

rydeveraumn commented 1 year ago

Now I am getting a ton of "No vacant CPU resources at the moment" errors (screenshots attached).

rydeveraumn commented 1 year ago

@TomorrowIsAnOtherDay it does show that I have 5 vacant CPUs (screenshot attached).

rydeveraumn commented 1 year ago

Okay I think I figured out what was happening:

When setting a different number of CPUs than the one suggested in the README, I also had to modify numActors in the main.py script, and depending on the value chosen there you may also need to modify arenaCompare. If you want, I can add a small PR in case anyone is interested in using this. A sketch of the relationship is below.
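
A minimal sketch of that relationship, assuming the field names used in this thread (numActors, arenaCompare) and assuming arena games are split evenly across actors; check your copy of main.py for the exact names and semantics:

```python
# Hypothetical sketch of the knobs discussed above; the names and exact
# coupling in benchmark/torch/AlphaZero/main.py may differ.
CPU_NUM = 5                    # value passed to `xparl start --cpu_num`

numActors = CPU_NUM            # actors beyond the vacant CPU count just
                               # wait, producing "No vacant CPU resources"
arenaCompare = 10 * numActors  # assumption: keep the arena game count a
                               # multiple of numActors so each actor gets
                               # an equal share of comparison games

assert numActors <= CPU_NUM
assert arenaCompare % numActors == 0
print(f"{numActors} actors, {arenaCompare} arena games")
```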

TomorrowIsAnOtherDay commented 1 year ago

Exactly. The commands related to xparl only launch a CPU cluster, and they will not change the number of actors used for training. Users must modify the num_actors in main.py to change the number of actors.
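
For anyone hitting the same mismatch, a quick sanity check (a sketch; `xparl status` is PARL's cluster inspection command):

```bash
# Report cluster status, including how many CPUs are still vacant.
xparl status
# The actor count requested in main.py must not exceed the vacant CPU
# count, or actors will block waiting for resources.
```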