AndreaVidali / Deep-QLearning-Agent-for-Traffic-Signal-Control

A framework where a deep Q-Learning Reinforcement Learning agent tries to choose the correct traffic light phase at an intersection to maximize traffic efficiency.
MIT License
405 stars, 146 forks

Mean waiting time of an episode suddenly drops really low #36

Open TheMedicineSeller opened 1 year ago

TheMedicineSeller commented 1 year ago

I am training a variant of the system developed in this project, with two separate traffic lights, each controlled by its own agent with its own state and reward. I set N_EPOCHS to 100 and increased the number of episodes to 300. After about 60 episodes, the mean waiting time of both agents drops drastically, from roughly -2000 to -3000 down to about -120000, which is really weird. It also stops improving, and I don't see any sign of convergence. What are some possible causes for this drop in performance? I also noticed that vehicles started teleporting (because they had been waiting too long) exactly after the 61st episode, which seems suspicious.

[Image: rl_train]
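
To pin down the exact episode where the collapse starts, a small helper like the one below could be run over the saved per-episode values. This is only a sketch: `first_collapse`, `window`, and `factor` are hypothetical names that are not part of the project, and the sample numbers are made up to mimic the shape described above, not taken from the actual run.

```python
from statistics import median


def first_collapse(values, window=10, factor=5.0):
    """Return the 1-based episode where the metric first collapses, or None.

    An episode counts as a collapse when its magnitude exceeds `factor`
    times the median magnitude of the preceding `window` episodes.
    """
    for i in range(window, len(values)):
        baseline = median(abs(v) for v in values[i - window:i])
        if baseline > 0 and abs(values[i]) > factor * baseline:
            return i + 1  # episodes are numbered from 1
    return None


# Made-up history: 60 stable episodes followed by a sudden collapse.
history = [-2500.0] * 60 + [-120000.0] * 5
print(first_collapse(history))  # prints 61
```

If the flagged episode lines up with the first teleport warnings, SUMO's `--time-to-teleport` option, which sets how long a vehicle may stand still before it is teleported, may be worth checking, since teleports would point to gridlock in the simulation rather than a bug in the plotting.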