multi-commander / Multi-Commander

Multi & Single Agent Reinforcement Learning for Traffic Signal Control Problem
Apache License 2.0

Explanations of hyperparameters #4

Closed ThisIsIsaac closed 5 years ago

ThisIsIsaac commented 5 years ago

Could you explain how you chose the values for:

  1. num_step
  2. phase_step

for the single-agent Q-based RLs?

xiawenwen49 commented 5 years ago
  1. num_step is based on the traffic flow data and the CityFlow simulator (e.g. if the traffic flow is light, num_step should be correspondingly small).
  2. phase_step is the duration for which one signal phase is held, so it is set by experience (e.g. 15 s or 30 s).

Thank you for your questions.
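To make the relationship between the two settings concrete, here is a hypothetical sketch of a training loop in which one episode lasts num_step simulator seconds and each chosen signal phase is held for phase_step seconds. The `env` and `agent` objects and their method names are illustrative stand-ins, not the repository's actual API.

```python
# Hypothetical sketch: how num_step and phase_step structure one episode.
# `env` stands in for a CityFlow-backed environment; `agent` for a Q-based
# controller. Names and signatures are illustrative assumptions.

def run_episode(env, agent, num_step=3600, phase_step=15):
    """Run one episode of num_step simulator seconds, re-deciding the
    signal phase only every phase_step seconds."""
    obs = env.reset()
    total_reward = 0.0
    for t in range(0, num_step, phase_step):
        phase = agent.choose_phase(obs)   # Q-based action selection
        for _ in range(phase_step):       # hold the phase for phase_step s
            obs, reward, done = env.step(phase)
            total_reward += reward
            if done:                      # early termination is possible
                return total_reward
    return total_reward
```

With num_step = 3600 and phase_step = 15 the agent would make 240 phase decisions per episode, which is why a lighter traffic flow file pairs naturally with a smaller num_step.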
ThisIsIsaac commented 5 years ago
> 1. num_step is based on the traffic flow data and the CityFlow simulator (e.g. if the traffic flow is light, num_step should be correspondingly small).

So I assume changing num_step didn't have much impact on the performance of the agents.

> 2. phase_step is the duration for which one signal phase is held, so it is set by experience (e.g. 15 s or 30 s).

Oh, so phase_step is how long a single signal phase lasts?

xiawenwen49 commented 5 years ago

1. Yes; in practice we use a dynamic num_step (i.e. we end an episode early if any lane becomes congested, judged by a threshold).
2. Yes.
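The "dynamic num_step" idea described above can be sketched as a termination check: the episode ends either when the step budget is exhausted or when any lane's queue exceeds a congestion threshold. The function name, the queue representation, and the threshold value here are all illustrative assumptions, not taken from the repository; in CityFlow the per-lane queues could come from a query such as `get_lane_waiting_vehicle_count()`.

```python
# Hypothetical sketch of dynamic episode termination on congestion.
# The threshold value is illustrative, not from the repository.
CONGESTION_THRESHOLD = 40

def episode_done(lane_queues, max_step, t, threshold=CONGESTION_THRESHOLD):
    """Return True if the episode should end: either the step budget
    max_step is spent, or some lane's waiting-vehicle count exceeds
    the congestion threshold."""
    if t >= max_step:
        return True
    return any(q > threshold for q in lane_queues.values())
```

This keeps episodes short under bad policies (congestion appears quickly) while letting good policies run for the full budget, which matches the maintainer's point that num_step effectively adapts to the traffic flow.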