LucasAlegre / sumo-rl

Reinforcement Learning environments for Traffic Signal Control with SUMO. Compatible with Gymnasium, PettingZoo, and popular RL libraries.
MIT License
746 stars, 201 forks

Test already trained A3C policy on 4x4 grid #137

Closed sigmarco3 closed 1 year ago

sigmarco3 commented 1 year ago

Hi, I want to test a policy that was already trained on a single intersection with the DQN algorithm from stable_baselines3, in a multi-agent environment as in the a3c_4x4grid example, but without calling the train() method, because training has already been done. How could this be done?

LucasAlegre commented 1 year ago


If you need to check how to save, load and evaluate an agent in stable-baselines3:

Please open an issue in their GitHub repository, as this is not related to sumo-rl.

smarianimore commented 1 year ago

If I understand the OP's question correctly, I have a similar issue: what is the correct way to use an already trained policy within a SUMO-RL environment? I know how to save and load a model in stable-baselines3 and Ray RLlib, for instance, but I'm uncertain about what the code for a SUMO-RL environment should look like.

Let's take one of the examples, for instance: how would lines 39--51 change if I already have a trained policy that I do not want to retrain?

Should they be changed to look like lines 73--75 in example

LucasAlegre commented 1 year ago


The way to evaluate a policy in SUMO-RL is the same as in any Gymnasium/PettingZoo environment. There is nothing particularly different for SUMO-RL.

For instance, with stable-baselines3 you can use the evaluate_policy function (from stable_baselines3.common.evaluation import evaluate_policy).
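For the multi-agent case asked about at the top of the thread, a minimal sketch of an evaluation-only rollout might look like the following. It assumes the recent PettingZoo Parallel API, where reset() returns (observations, infos) and step() returns five dictionaries, and it assumes every intersection has the same observation and action space as the one the policy was trained on, which may not hold between a single-intersection network and the 4x4 grid. The names evaluate_shared_policy and ToyParallelEnv are hypothetical and are not part of sumo-rl; the toy environment only stands in for a real one so the loop can run without SUMO.

```python
from typing import Callable, Dict, Hashable


def evaluate_shared_policy(env, predict: Callable, n_episodes: int = 1) -> Dict[Hashable, float]:
    """Roll out one shared, already trained policy for every agent.

    `env` is assumed to follow the PettingZoo Parallel API.
    `predict` maps a single agent's observation to an action.
    Returns the mean episode return per agent; nothing is trained.
    """
    totals: Dict[Hashable, float] = {}
    for _ in range(n_episodes):
        observations, infos = env.reset()
        while env.agents:  # the agent list is empty once the episode is over
            actions = {agent: predict(observations[agent]) for agent in env.agents}
            observations, rewards, terminations, truncations, infos = env.step(actions)
            for agent, reward in rewards.items():
                totals[agent] = totals.get(agent, 0.0) + reward
    return {agent: total / n_episodes for agent, total in totals.items()}


class ToyParallelEnv:
    """Hypothetical stand-in for a parallel multi-agent env (not sumo-rl)."""

    def __init__(self, n_steps: int = 3):
        self.possible_agents = ["tl_0", "tl_1"]
        self.agents = []
        self.n_steps = n_steps
        self._t = 0

    def reset(self, seed=None, options=None):
        self.agents = list(self.possible_agents)
        self._t = 0
        observations = {agent: 0 for agent in self.agents}
        infos = {agent: {} for agent in self.agents}
        return observations, infos

    def step(self, actions):
        self._t += 1
        acting = list(self.agents)
        done = self._t >= self.n_steps
        observations = {agent: self._t for agent in acting}
        # Toy reward: simply echo the chosen action as a float.
        rewards = {agent: float(actions[agent]) for agent in acting}
        terminations = {agent: False for agent in acting}
        truncations = {agent: done for agent in acting}
        infos = {agent: {} for agent in acting}
        if done:
            self.agents = []
        return observations, rewards, terminations, truncations, infos


if __name__ == "__main__":
    # A constant "policy" stands in for a loaded model here.
    print(evaluate_shared_policy(ToyParallelEnv(), lambda obs: 1, n_episodes=2))
```

With stable-baselines3, predict could plausibly be built from a loaded model, for example model = DQN.load(...) followed by lambda obs: model.predict(obs, deterministic=True)[0], with the result converted to int if the environment requires it, and env could come from sumo_rl.parallel_env(...) with your own network and route files. Older sumo-rl and PettingZoo versions return only the observations from reset(), so the unpacking may need adjusting.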