multi-commander / Multi-Commander

Multi & Single Agent Reinforcement Learning for Traffic Signal Control Problem
Apache License 2.0

Explanations of hyperparameters #4

Closed ThisIsIsaac closed 5 years ago

ThisIsIsaac commented 5 years ago

Could you explain how you chose the values for:

  1. num_step
  2. phase_step

for the single-agent Q-based RLs?

xiawenwen49 commented 5 years ago
  1. num_step is based on the traffic flow data and the CityFlow simulator (e.g. if the traffic flow is light, num_step should be correspondingly small).
  2. phase_step is the duration for which one signal phase is held, so it is set by experience (e.g. 15 s or 30 s).

Thank you for your questions.
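To make the relationship between the two settings concrete, here is a hypothetical sketch of a training loop in which one episode lasts num_step simulator seconds and each chosen signal phase is held for phase_step seconds. The `env` and `agent` objects and their method names are illustrative stand-ins, not the repository's actual API.

```python
# Hypothetical sketch: how num_step and phase_step structure one episode.
# `env` stands in for a CityFlow-backed environment; `agent` for a Q-based
# controller. Names and signatures are illustrative assumptions.

def run_episode(env, agent, num_step=3600, phase_step=15):
    """Run one episode of num_step simulator seconds, re-deciding the
    signal phase only every phase_step seconds."""
    obs = env.reset()
    total_reward = 0.0
    for t in range(0, num_step, phase_step):
        phase = agent.choose_phase(obs)   # Q-based action selection
        for _ in range(phase_step):       # hold the phase for phase_step s
            obs, reward, done = env.step(phase)
            total_reward += reward
            if done:                      # early termination is possible
                return total_reward
    return total_reward
```

With num_step = 3600 and phase_step = 15 the agent would make 240 phase decisions per episode, which is why a lighter traffic flow file pairs naturally with a smaller num_step.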
ThisIsIsaac commented 5 years ago
> 1. num_step is based on the traffic flow data and the CityFlow simulator (e.g. if the traffic flow is light, num_step should be correspondingly small).

So I assume changing num_step didn't have much impact on the performance of the agents.

> 2. phase_step is the duration for which one signal phase is held, so it is set by experience (e.g. 15 s or 30 s).

Oh, so phase_step is how long a single signal phase lasts?

xiawenwen49 commented 5 years ago

1. Yes; in practice we use a dynamic num_step (i.e. we end an episode early if any lane becomes congested, judged by a threshold).
2. Yes.
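The "dynamic num_step" idea described above can be sketched as a termination check: the episode ends either when the step budget is exhausted or when any lane's queue exceeds a congestion threshold. The function name, the queue representation, and the threshold value here are all illustrative assumptions, not taken from the repository; in CityFlow the per-lane queues could come from a query such as `get_lane_waiting_vehicle_count()`.

```python
# Hypothetical sketch of dynamic episode termination on congestion.
# The threshold value is illustrative, not from the repository.
CONGESTION_THRESHOLD = 40

def episode_done(lane_queues, max_step, t, threshold=CONGESTION_THRESHOLD):
    """Return True if the episode should end: either the step budget
    max_step is spent, or some lane's waiting-vehicle count exceeds
    the congestion threshold."""
    if t >= max_step:
        return True
    return any(q > threshold for q in lane_queues.values())
```

This keeps episodes short under bad policies (congestion appears quickly) while letting good policies run for the full budget, which matches the maintainer's point that num_step effectively adapts to the traffic flow.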