Optimal Path (Imitation Learning)

kojimano commented 6 years ago

Hi, it looks like the agent computes the shortest path to the goal each time it performs an action, but I am slightly confused by the calculation. (I am hoping to use the shortest path for imitation learning.) Every time the agent performs an action such as "TurnLeft" or "TurnRight", the direction of the optimal path changes (I am not sure why this happens, since the agent's location is not changing as far as I can observe). Since the agent's orientation is also changing, the agent is never able to align itself with the optimal path direction.

msavva commented 6 years ago

Hi @kojimano ,

The shortest path direction is given in the agent's frame of reference, and so it will indeed change on turn actions as well (since the agent's orientation has changed on a turn).

This shortest path computation is not necessarily the "optimal path" for the agent's action space because it is computed on a discretized grid given the agent's current position and the goal position. The relative orientation of the agent and the direction to the next grid cell on the path are then used to compute the direction of the shortest path.
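To illustrate why the reported direction changes on a turn even though the position does not, here is a minimal sketch of expressing a world-frame direction in the agent's frame. It is not the MINOS implementation. The function name `world_to_agent_frame` is hypothetical, and the sketch assumes a right-handed world frame with +y up and an agent frame with +x right and +z forward.

```python
import math

def world_to_agent_frame(direction_world, forward_world):
    """Express a world-frame direction in the agent's frame.

    Assumed convention (not confirmed by the MINOS source):
    right-handed axes, +y up, agent frame +x right and +z forward.
    """
    dx, dy, dz = direction_world
    fx, _, fz = forward_world
    # Normalize the agent's forward vector on the horizontal plane.
    norm = math.hypot(fx, fz)
    fx, fz = fx / norm, fz / norm
    # right = up x forward for a right-handed frame with +y up.
    rx, rz = fz, -fx
    # Project the direction onto the agent's right and forward axes.
    return (dx * rx + dz * rz, dy, dx * fx + dz * fz)

# Same target direction, two different agent headings:
ahead = world_to_agent_frame((0, 0, 1), (0, 0, 1))   # target is straight ahead
turned = world_to_agent_frame((0, 0, 1), (1, 0, 0))  # same target after a turn
```

With the agent facing the target the result points along +z. After the turn, the same world direction comes out along -x (to the agent's left), although the agent has not moved.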

If you are observing a specific issue, please send us a more detailed description.

kojimano commented 6 years ago

Thanks! Could you point me to the script where the shortest path computation is done? (I couldn't trace it back.)

Now I understand that the agent's orientation is given in an absolute frame, and the shortest path is given in the agent's frame.

It makes sense that the shortest path computed on the grid is not necessarily an optimal path for the agent's action space (since the agent can move in any of 50 directions). However, my question is whether the agent can reach the goal simply by following the computed path. My biggest problem is that, when the agent tries to follow the path indicated by the orientation, it sometimes collides with an object. Is there any guarantee that the agent can reach the next grid cell without colliding with objects, using only a single Forward and multiple TurnLeft or TurnRight actions? (This does not always seem to be the case when the agent is slightly off the center of the grid cell, and given the size of the agent.)

Essentially, my goal here is to come up with a good heuristic for the optimal (or near-optimal) next action to take (TurnLeft, TurnRight or Forward), given the current location and orientation (and taking the obstacles into account as well).

Thank you for your response in advance.

kojimano commented 6 years ago

Also, could you verify that my understanding of these measurements is correct?

   observations['observation']['measurements']['shortest_path_to_goal']['direction']

   observations['observation']['measurements']['direction_to_goal']

   observations['info']['agent_state']['orientation']

All of these measurements return a tuple of 3 numbers (corresponding to x, y, z in absolute or relative coordinates). For the first two measurements, we want to make the x coordinate close to 1 so that the agent directs itself toward the next grid cell or the goal. For the third measurement, the returned value is more of a coordinate on the absolute axes. This still confuses me slightly, even after numerous experiments trying to figure it out.

angelxuanchang commented 6 years ago

The shortest path is computed as part of the update step here: https://github.com/smartscenes/sstk/blob/660cc7b17ccc2745b3f0453083405adecac03e7a/client/js/lib/nav/NavScene.js#L742

For continuous control, the agent step size may also be larger than a grid cell (depending on the agent velocity). This is another way in which the discretized shortest path may not match the optimal trajectory for the agent.
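As a generic illustration of a shortest path on a discretized grid, here is a minimal breadth-first search sketch. It is not the code from the linked NavScene.js, which may use a different algorithm and cell connectivity. The function name `grid_shortest_path` and the occupancy-grid encoding (0 free, 1 blocked) are assumptions made for this example.

```python
from collections import deque

def grid_shortest_path(grid, start, goal):
    """Breadth-first search on a 4-connected occupancy grid.

    grid[r][c] == 1 marks a blocked cell (assumed encoding).
    Returns the list of cells from start to goal, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None

# A wall in the middle column forces a detour through the bottom row.
grid = [[0, 1, 0],
        [0, 1, 0],
        [0, 0, 0]]
path = grid_shortest_path(grid, (0, 0), (0, 2))
```

A path like this treats the agent as a point at a cell center. It says nothing about the agent's radius, its offset within a cell, or its step size, which is one reason following it literally can still lead to collisions.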

As for the agent coordinate frame: +z is forward, so you want to get the z coordinate close to 1 so that the agent is oriented correctly. +y is the upward direction and +x is the right direction. This is true for observations['observation']['measurements']['direction_to_goal'].
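Given this convention (+z forward, +x right), a simple next-action heuristic can be sketched as follows. This is only an illustration, not part of MINOS: the function name `choose_action` and the 10-degree alignment threshold are hypothetical, the action names are the ones used in this thread, and the sketch ignores obstacles entirely.

```python
import math

def choose_action(direction, align_threshold_rad=math.radians(10)):
    """Pick TurnLeft, TurnRight or Forward from an agent-frame direction.

    direction is (x, y, z) with +x right, +y up, +z forward.
    align_threshold_rad is a hypothetical tolerance, not a MINOS setting.
    """
    x, _, z = direction
    # Signed heading error: 0 when the target is straight ahead,
    # positive when it lies to the agent's right.
    angle = math.atan2(x, z)
    if abs(angle) <= align_threshold_rad:
        return "Forward"
    return "TurnRight" if angle > 0 else "TurnLeft"

# Example: feed it the agent-frame direction to the goal, e.g.
# observations['observation']['measurements']['direction_to_goal']
action = choose_action((0.0, 0.0, 1.0))
```

Because this only aligns the heading before moving, it does not guarantee a collision-free step. It is a starting point for generating imitation targets, to be combined with a collision check.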

I think you are right that the values for observations['observation']['measurements']['shortest_path_to_goal']['direction'] are not quite following that convention and we are debugging more to see what the problem is.

You can check the measurements by debugging using python3 -m tools.pygame_client -s map --navmap

(the navmap option was accidentally removed at some point, so please pull to use the --navmap option)

kojimano commented 6 years ago

Thank you for your response. I will take a closer look at this issue. It does seem like

observations['observation']['measurements']['direction_to_goal']

is quite random, in the sense that the direction sometimes matches, and sometimes is opposite to, the optimal path as judged by eye.

kojimano commented 6 years ago

python3 -m tools.pygame_client -s map --navmap shows the correct path, but the angle to the next grid cell is not very precise. The error seems to be in the function that computes the relative orientation. This looks a bit tricky to fix (as I don't know much of the code structure), but if it will take some time to fix, I would also try fixing it myself given my time constraints. So I would really appreciate it if you could let me know the approximate date by which this can be debugged.

Thank you for all your work trying to fix this issue.

kojimano commented 6 years ago

I made a temporary fix myself by returning the optimal path instead of angles etc., so I will close this issue.

minosworld / minos

Optimal Path (Imitation Learning) #19