LiangZhang1996 / DataLight-old

code for "Data Might be Enough: Bridge Real-World Traffic Signal Control Using Offline Reinforcement Learning"
GNU General Public License v3.0

why action_dim is set to 4 in model architecture #4

Closed sallyqiansun closed 1 year ago

sallyqiansun commented 1 year ago

Hi, sorry to bother you again. While reading your code in more detail, I noticed that you use 4 as the action dimension when predicting Q-values, as well as in the fixedtime and random agents. Since the Jinan and Hangzhou roadnets both have 8 phases, could you explain why you chose 4 instead of 8? Did I misunderstand the roadnet configuration? Thank you in advance!

LiangZhang1996 commented 1 year ago

I used 4 phases rather than 8 phases for the following reasons:

  1. The 4-phase configuration is the most commonly used in the real world.
  2. Compared to 8 phases, 4 phases is more reasonable; the 8-phase configuration wastes many resources.
  3. The 4-phase configuration is also the one mostly used in recent articles.
  4. We can also run 8-phase configurations if needed.

sallyqiansun commented 1 year ago

Thanks. Does this mean that you changed the roadnet data to include only four phases in total? The existing files seem to have eight, so is further processing needed?

For example, I extracted one lightphases entry from roadnet_4x4.json, and there seem to be 8 phases with time=30:

    "lightphases": [
      { "time": 5, "availableRoadLinks": [ 10, 2, 3, 6 ] },
      { "time": 30, "availableRoadLinks": [ 0, 2, 3, 6, 7, 10 ] },
      { "time": 30, "availableRoadLinks": [ 2, 3, 4, 6, 10, 11 ] },
      { "time": 30, "availableRoadLinks": [ 1, 2, 3, 6, 8, 10 ] },
      { "time": 30, "availableRoadLinks": [ 2, 3, 5, 6, 9, 10 ] },
      { "time": 30, "availableRoadLinks": [ 0, 1, 2, 3, 6, 10 ] },
      { "time": 30, "availableRoadLinks": [ 2, 3, 6, 7, 8, 10 ] },
      { "time": 30, "availableRoadLinks": [ 2, 3, 4, 5, 6, 10 ] },
      { "time": 30, "availableRoadLinks": [ 2, 3, 6, 9, 10, 11 ] }

LiangZhang1996 commented 1 year ago

There is no need to change the roadnet data, because in the light phase configuration:

  1. The 1st phase represents the red light.
  2. The 2nd to 5th phases represent the 4-phase configuration.
  3. The 6th to 9th phases are the remaining 4 phases of the 8-phase configuration.

So, if you want to use the 4-phase configuration, use the 1st to 5th phases; if you want to use the 8-phase configuration, use all the phases.

That means, for the 4-phase configuration the action space is <1,2,3,4>, while for the 8-phase design, the action space is <1,2,3,4,5,6,7,8>.

When the action space is <1,2,3,4>, the extra phases of the 8-phase configuration have no effect (because they can never be activated), so we do not need to change the roadnet file.
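As a rough illustration of the mapping described above, here is a minimal sketch in Python. It is not taken from the repository: the helper names `action_to_phase_index` and `reachable_phases` are hypothetical, and it assumes the agent emits zero-based action indices that are shifted by one to skip the red phase at list position 0. The phase list is the one quoted from roadnet_4x4.json earlier in this thread.

```python
# Phase list as quoted from roadnet_4x4.json in this thread.
# Position 0 is the red phase; positions 1-4 form the 4-phase
# configuration; positions 5-8 are the extra phases of the 8-phase one.
LIGHTPHASES = [
    {"time": 5, "availableRoadLinks": [10, 2, 3, 6]},
    {"time": 30, "availableRoadLinks": [0, 2, 3, 6, 7, 10]},
    {"time": 30, "availableRoadLinks": [2, 3, 4, 6, 10, 11]},
    {"time": 30, "availableRoadLinks": [1, 2, 3, 6, 8, 10]},
    {"time": 30, "availableRoadLinks": [2, 3, 5, 6, 9, 10]},
    {"time": 30, "availableRoadLinks": [0, 1, 2, 3, 6, 10]},
    {"time": 30, "availableRoadLinks": [2, 3, 6, 7, 8, 10]},
    {"time": 30, "availableRoadLinks": [2, 3, 4, 5, 6, 10]},
    {"time": 30, "availableRoadLinks": [2, 3, 6, 9, 10, 11]},
]


def action_to_phase_index(action: int, action_dim: int) -> int:
    """Map a zero-based action (assumed convention) to a lightphases index.

    The +1 offset skips the red phase stored at position 0.
    """
    if action_dim not in (4, 8):
        raise ValueError("action_dim must be 4 or 8")
    if not 0 <= action < action_dim:
        raise ValueError("action out of range for this action_dim")
    return action + 1


def reachable_phases(lightphases, action_dim):
    """Return the phases an agent with the given action_dim can activate."""
    return [
        lightphases[action_to_phase_index(a, action_dim)]
        for a in range(action_dim)
    ]


if __name__ == "__main__":
    # With action_dim=4 only positions 1-4 are ever selected, so the
    # roadnet file can keep all nine entries unchanged.
    for phase in reachable_phases(LIGHTPHASES, 4):
        print(phase["availableRoadLinks"])
```

Under these assumptions, switching between the two setups only changes `action_dim`; the roadnet file itself stays the same.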

LiangZhang1996 commented 1 year ago

Besides, the time=30 setting has no effect when we use RL to control the traffic; the phases only run as a fixed cycle with those durations when there are no RL agents.
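To make the role of the `time` field concrete, here is a small sketch of what running the phases as a cycle could look like when no agent is in control. The function `fixed_cycle_schedule` is hypothetical and only illustrates the idea; it is not the simulator's or the repository's implementation. With an RL agent, the agent picks the phase at each decision step, so these durations are never consulted.

```python
from itertools import cycle


def fixed_cycle_schedule(lightphases, horizon):
    """Return the phase index active at each second over `horizon` seconds.

    Phases are visited in list order and each one is held for its own
    `time` value, then the cycle restarts. Assumes at least one phase
    has a positive duration.
    """
    schedule = []
    for idx in cycle(range(len(lightphases))):
        for _ in range(lightphases[idx]["time"]):
            if len(schedule) == horizon:
                return schedule
            schedule.append(idx)


if __name__ == "__main__":
    # First two entries of the lightphases list quoted in this thread.
    phases = [
        {"time": 5, "availableRoadLinks": [10, 2, 3, 6]},
        {"time": 30, "availableRoadLinks": [0, 2, 3, 6, 7, 10]},
    ]
    print(fixed_cycle_schedule(phases, 40))
```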

sallyqiansun commented 1 year ago

Got it, thank you for the explanation. Have you compared the 4-action and 8-action scenarios? Would the choice between 4 and 8 affect the final results in terms of travel time, queue length, pressure, etc.?

LiangZhang1996 commented 1 year ago

In our previous articles, we have discussed how the final results differ under different phase configurations. You can refer to https://arxiv.org/pdf/2201.00006.pdf and https://arxiv.org/pdf/2112.02336.pdf.

sallyqiansun commented 1 year ago

Thank you.