PaddlePaddle / PARL

A high-performance distributed training framework for Reinforcement Learning
https://parl.readthedocs.io/
Apache License 2.0

How to run `benchmark/torch/AlphaZero`? #997

Open rydeveraumn opened 1 year ago

rydeveraumn commented 1 year ago

Hey all,

I am wondering how I can run the benchmark/torch/AlphaZero code. When I follow the instructions, the code does not run out of the box. I would like to train a model and submit it to Kaggle. I will add everything I am seeing so far when I try to run it.

TomorrowIsAnOtherDay commented 1 year ago

Hello. Have you run all three steps in the instructions?

rydeveraumn commented 1 year ago

Hey @TomorrowIsAnOtherDay, I sure have. There are a number of different errors happening for me:

I haven't gotten farther than this. I was able to run examples/AlphaZero, but then gen_submission.py had errors on Kaggle. Not sure what the next steps should be.

Thanks for the reply!

BTW:

parl.__version__ = 2.0.5
paddlepaddle.__version__ = 2.3.2
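
For reference, those versions can be printed like this (note that the paddlepaddle package imports as `paddle`):

```python
# Print the installed PARL and PaddlePaddle versions.
import parl
import paddle  # the paddlepaddle package imports as `paddle`

print(parl.__version__)    # 2.0.5 here
print(paddle.__version__)  # 2.3.2 here
```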

Steps:

1. Clone the develop repository
2. `xparl start --port 8010 --cpu_num 5` -- everything works fine
3. `cd benchmark/torch/AlphaZero`
4. `python main.py`

TomorrowIsAnOtherDay commented 1 year ago

It seems that you have installed two deep learning frameworks (torch & paddle). PARL supports these two frameworks, but it imports paddle by default. To specify torch as the backend framework, try exporting the following environment variable:

xparl stop
export PARL_BACKEND=torch  # no spaces around '='

and then run the instructions again.
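
Putting it together, the corrected sequence looks like this (port and CPU count taken from the steps above; after `xparl stop` the cluster has to be relaunched before re-running main.py, and the variable must be exported in the same shell):

```bash
xparl stop                             # shut down the running cluster
export PARL_BACKEND=torch              # select torch instead of the default paddle
xparl start --port 8010 --cpu_num 5    # relaunch the CPU cluster
cd benchmark/torch/AlphaZero
python main.py
```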

rydeveraumn commented 1 year ago

Okay let me give it a shot! Will report back soon

rydeveraumn commented 1 year ago

Now I am getting a ton of "No vacant CPU resources at the moment" errors (screenshots attached).

rydeveraumn commented 1 year ago

@TomorrowIsAnOtherDay it does show that I have 5 vacant CPUs (screenshot attached).

rydeveraumn commented 1 year ago

Okay I think I figured out what was happening:

When setting a different number of CPUs than the one suggested in the README, I also had to modify numActors in the main.py script, and depending on the value chosen there you may also need to modify arenaCompare. If you want, I can add a small PR in case anyone is interested in using this. A sketch of the relationship is below.
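
A minimal sketch of that relationship, assuming the field names used in this thread (numActors, arenaCompare) and assuming arena games are split evenly across actors; check your copy of main.py for the exact names and semantics:

```python
# Hypothetical sketch of the knobs discussed above; the names and exact
# coupling in benchmark/torch/AlphaZero/main.py may differ.
CPU_NUM = 5                    # value passed to `xparl start --cpu_num`

numActors = CPU_NUM            # actors beyond the vacant CPU count just
                               # wait, producing "No vacant CPU resources"
arenaCompare = 10 * numActors  # assumption: keep the arena game count a
                               # multiple of numActors so each actor gets
                               # an equal share of comparison games

assert numActors <= CPU_NUM
assert arenaCompare % numActors == 0
print(f"{numActors} actors, {arenaCompare} arena games")
```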

TomorrowIsAnOtherDay commented 1 year ago

Exactly. The commands related to xparl only launch a CPU cluster, and they will not change the number of actors used for training. Users must modify the num_actors in main.py to change the number of actors.
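
For anyone hitting the same mismatch, a quick sanity check (a sketch; `xparl status` is PARL's cluster inspection command):

```bash
# Report cluster status, including how many CPUs are still vacant.
xparl status
# The actor count requested in main.py must not exceed the vacant CPU
# count, or actors will block waiting for resources.
```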