LucasAlegre / sumo-rl

Reinforcement Learning environments for Traffic Signal Control with SUMO. Compatible with Gymnasium, PettingZoo, and popular RL libraries.
https://lucasalegre.github.io/sumo-rl
MIT License

Test already trained A3C policy on 4x4 grid #137

Closed: sigmarco3 closed this issue 1 year ago

sigmarco3 commented 1 year ago

Hi, I want to test a policy that was already trained on a single intersection with the DQN algorithm from stable_baselines3, in a multi-agent environment as in the a3c_4x4grid example, but without calling the train() method, because training has already been done. How could this be done?

LucasAlegre commented 1 year ago

Hi,

If you need to check how to save, load, and evaluate an agent in stable-baselines3, see their documentation: https://stable-baselines3.readthedocs.io/en/master/

Please open an issue in their GitHub repository, as this is not related to sumo-rl.

smarianimore commented 1 year ago

If I understand the OP's question correctly, I have a similar issue: what is the correct way to use an already trained policy within a SUMO-RL environment? I know how to save and load a model in stable-baselines3 and Ray RLlib, for instance, but I'm uncertain about what the code in a SUMO-RL environment should look like.

Take, for instance, the example at https://github.com/LucasAlegre/sumo-rl/blob/master/experiments/a3c_4x4grid.py: how would lines 39--51 change if I already have a trained policy that I do not want to retrain?

Should they be changed to look like lines 73--75 in the example at https://github.com/LucasAlegre/sumo-rl/blob/master/experiments/sb3_grid4x4.py?

LucasAlegre commented 1 year ago

Hi!

The way to evaluate a policy in SUMO-RL is the same as in any Gymnasium/PettingZoo environment. There is nothing particularly different for SUMO-RL.
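As a rough illustration of that point, here is a minimal sketch of an evaluation loop for a PettingZoo parallel environment, in which one already trained policy is applied to every agent and nothing is trained. The function name `evaluate_shared_policy` is hypothetical. It assumes that `model.predict(obs, deterministic=True)` returns an `(action, state)` pair, as stable-baselines3 models do, and that every agent has the same observation and action space as the environment the policy was trained on. If that does not hold for your network, a single-intersection policy cannot be reused this way.

```python
def evaluate_shared_policy(env, model, n_episodes=1):
    """Roll out one trained policy on every agent of a PettingZoo parallel env.

    Assumptions (not taken from the sumo-rl examples):
    - `env` follows the PettingZoo parallel API (reset/step return dicts).
    - `model.predict(obs, deterministic=True)` returns (action, state).
    No learning or training call is made anywhere in this function.
    """
    episode_returns = []
    for _ in range(n_episodes):
        observations, infos = env.reset()
        totals = {agent: 0.0 for agent in env.agents}
        # In the parallel API, env.agents becomes empty once the episode ends.
        while env.agents:
            # Query the same policy once per agent. Depending on the action
            # space, the returned action may need a cast (e.g. int(action)).
            actions = {
                agent: model.predict(observations[agent], deterministic=True)[0]
                for agent in env.agents
            }
            observations, rewards, terminations, truncations, infos = env.step(actions)
            for agent, reward in rewards.items():
                totals[agent] = totals.get(agent, 0.0) + reward
        episode_returns.append(totals)
    return episode_returns
```

The function returns one dictionary of per-agent returns for each episode. The environment and the loaded model are passed in by the caller, so the same loop should work with any object that exposes the two interfaces assumed above.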

For instance, you can use the evaluate_policy function from stable-baselines3 (from stable_baselines3.common.evaluation import evaluate_policy).
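A minimal sketch of how that could look for a saved DQN model follows. The wrapper name `evaluate_saved_dqn` and its parameters are hypothetical. It assumes `env` is a single-agent Gymnasium environment (for example, a sumo-rl environment created in single-agent mode) whose observation and action spaces match those used during training. The multi-agent PettingZoo case is not covered by this function.

```python
def evaluate_saved_dqn(model_path, env, n_eval_episodes=5):
    """Load a saved DQN model and evaluate it without any further training.

    `model_path` and `env` are supplied by the caller; this wrapper is an
    illustrative helper, not part of sumo-rl or stable-baselines3.
    """
    # Imported inside the function so the module can be imported even when
    # stable-baselines3 is not installed.
    from stable_baselines3 import DQN
    from stable_baselines3.common.evaluation import evaluate_policy

    # Restores the trained parameters; learn() is never called here.
    model = DQN.load(model_path)

    # Runs the policy for n_eval_episodes and returns the mean and standard
    # deviation of the episode rewards.
    mean_reward, std_reward = evaluate_policy(
        model, env, n_eval_episodes=n_eval_episodes, deterministic=True
    )
    return mean_reward, std_reward
```

Calling it would look like `mean_reward, std_reward = evaluate_saved_dqn(path_to_saved_model, env)`, where both arguments are whatever you used when saving the model and building the environment.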