Continuous action control

LucasAlegre / sumo-rl

Reinforcement Learning environments for Traffic Signal Control with SUMO. Compatible with Gymnasium, PettingZoo, and popular RL libraries.

MIT License


Discrete actions perform very well with PPO, but on-policy algorithms still suffer from low sample efficiency, and it is difficult to find an off-policy method suited to discrete actions (only DQN, and its results are not good).

Is there any plan to provide a continuous-action control scheme, for example one where the agent sets how long a given phase is held? The environment would keep that phase for the chosen duration, and only when it ends would it return the next state so the agent can make its next decision.
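To make the request concrete, here is a minimal sketch of the interface I have in mind. It does not use the sumo-rl API: `ToyIntersection` is a made-up stand-in for the simulator, and `DurationControlEnv`, `min_green`, `max_green`, and the queue-based reward are all hypothetical names and choices for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class ToyIntersection:
    """Hypothetical stand-in for a traffic simulator (not sumo-rl or SUMO)."""
    arrival_rates: tuple = (0.3, 0.2)   # vehicles added per second, per approach
    discharge_rate: float = 1.0         # vehicles served per second on the green approach
    queues: list = field(default_factory=lambda: [0.0, 0.0])
    phase: int = 0                      # index of the approach that currently has green
    clock: int = 0                      # elapsed simulated seconds

    def tick(self):
        """Advance the toy simulation by one second."""
        for i, rate in enumerate(self.arrival_rates):
            self.queues[i] += rate
        served = self.queues[self.phase] - self.discharge_rate
        self.queues[self.phase] = max(0.0, served)
        self.clock += 1


class DurationControlEnv:
    """One agent decision = one full phase of agent-chosen length."""

    def __init__(self, sim, min_green=5.0, max_green=60.0):
        self.sim = sim
        self.min_green = min_green
        self.max_green = max_green

    def step(self, action):
        # Map a continuous action in [-1, 1] to a green time in seconds.
        a = min(1.0, max(-1.0, float(action)))
        green = self.min_green + 0.5 * (a + 1.0) * (self.max_green - self.min_green)
        seconds = int(round(green))

        # Hold the current phase for the whole duration, accumulating reward.
        reward = 0.0
        for _ in range(seconds):
            self.sim.tick()
            reward -= sum(self.sim.queues)  # assumed reward: negative total queue

        # Only now switch phase and hand the next state back to the agent.
        self.sim.phase = (self.sim.phase + 1) % len(self.sim.queues)
        obs = (self.sim.phase, *self.sim.queues)
        return obs, reward, seconds


env = DurationControlEnv(ToyIntersection())
obs, reward, held = env.step(-1.0)  # shortest allowed green
```

With an interface like this, each decision spans a variable number of simulation seconds, and the action is a single bounded real number, which is the kind of action space that off-policy algorithms such as SAC or TD3 expect.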

Continuous action control #201