Open FeignDeath opened 5 days ago
Thanks for bringing this up. I want to make sure I understand this correctly. You're saying two trains that are moving in opposite directions, correct? Basically that the starting station of one is the ending station of another? I also interpreted this as two trains starting at the same place and sharing the same optimal path to the same station. In such a case, the trains could simply follow one another with no issue.
But could you help me understand this situation a bit more clearly?
Because in this case—unless I've counted incorrectly—it looks like the alternative path is exactly 35 steps. Are you saying this is technically unsatisfiable because the time required to spawn and start moving actually means it will take longer than the 35 steps shown in the image?
In any case, I know we have an ASP encoding for reachability, but that's not necessarily the problem here (because all cells in Flatland are reachable from any cell). I wonder whether it could be adapted to be reachable within a certain number of time steps. Still, I'd like to know how clearly we could define what that number of time steps is.
Okay. I picked an encoding which is closer to solvable than I hoped and my explanation was a little bit wrong.
Lets look at the .lp in more detail: train(0) is bottom rigt, while train(1) is top left
train(0). start(0,(24,13),0,n). end(0,(6,9),35).
train(1). start(1,(6,8),1,s). end(1,(24,12),36).
Due to flatland dissallowing actions at 0, this correlates to both accepting actions from 1 onwards. Also as you can see the spawn positions are next to the goals not on the goals. Trains in flatland are never spawned on stations. In this case they are spawned one cell next to the station (so on the parallel rail, which is also one of the middle two of the four parallel rails) (so if we are talking about the train in the top left, it spawns one cell to the right of the station).
This position change leads to the alternatives having a length of 36 cells instead of 35.
Now I don't really know how flatland evaluates the arrival times. I would guess if a plan solves the problem via having train(1) take the roundabout, it would need two additional timesteps. My assumption would be shown by the formula below arrival_time = path_length + earliest_departure_time + spawn_action 38 = 36 + 1 + 1 Thus, requiring two additional timesteps.
This assumption is based on what the visualizer shows. Below I give a small example what the visualizer would show on a very small map, which just consists of start, a cell in between and the goal. The train has a earliest departure value of 0 (so practically 1). | Timestep | Position | State | Action |
---|---|---|---|---|
1 | - | - | move_forward | |
2 | start | waiting | - | |
3 | start | moving | - | |
4 | (next cell) actually despawned as it would reach the goal in the next timestep | - | - | |
5 | (goal) despawned | - | - |
So in this example it reaches the goal at timestep 4, with a path_length of 2 and a earliest_departure_time of practically 1. This checks out with my formula. 4 = 2 + 1 + 1
What I actually wanted to show is that not all encodings are perfectly solvable and that is due to the generation of the latest arrivals and the nature of flatland. The latest arrival times are generated by the timetable_generator in flatland/envs/timetable_generators.py. Summarized the formula for generating latest_arrival_times is:
latest_arrival_time = earliest_departure + shortest_path * travel_buffer_multiplier + mean_shortest_path * mean_shortest_path_multiplier
replacing the constants one gets:
latest_arrival_time = earliest_departure + shortest_path * 1.3 + mean_shortest_path * 0.2
To check it for train(0) we get:
latest_arrival_time = 0 + 23*1.3 + ((23+23)/2)*0.2
latest_arrival_time = 0 + 30 + 5
They seem to take spawning as a part of the path. Thus the shortest paths are both 23 steps long. I also removed the ceil operation and int parsing to make it more readable. Under these assumptions it fits with the actual arrival times.
This illustrates the problem with the personal horizons. The arrival times do not change with the number of agents. Thus, an increase in agents makes more instances unsolvable, while an increase in map size mitigates the effect.
It might be a solution to use the global horizon from flatland instead, which is calculated in the timetable_generator as well and returned as max_episode_steps. This seems to factor in the number of agents, but I did not yet understand how it works.
Latest arrival times are definitely designed to be broken and to then reduce the violation thereoff.
Max episodes steps, might be a fix (though I remember some problems in the back of my head with this approach as well, but I can't prove it right now). To that end, it would need to be passed to the encoding as a predicate.
I did some testing and want to present an unsolvable instance.
The following instance requires both agents to reach the other station in 35 timesteps, the optimal path for both is 23 steps. Both shortest paths share the red line. Waiting for the other agent to pass the red line first takes 15-17 additional timesteps. All other plans would add even more timesteps.
Since I can't add the files directly, here is a link to all files for the environment: https://my.hidrive.com/share/9c71tbibj8
This illustrates that especially personal horizons are not always satisfiable. There are different options to handle this. If there is an efficient encoding to prove solvability (not that I know) one could filter the instances at generation. One could also have encodings which accept a variable which extends the personal horizons that could be given to the encodings, which would be run again on a failure with bigger personal horizons. Lastly, one could just make a note of this problem and ignore it (most practical as I see it).