jzilly closed this issue 5 years ago
So, the simulator uses differential drive commands (which I assume is what you mean) internally. There is a wrapper on top to take velocity+omega as input. I personally prefer the differential drive mode, but @liampaull has said that velocity+omega was the way to go for the competition. I can remove the velocity+omega wrapper, and wouldn't mind doing so, it's just that we kind of need to agree on what input the simulator and duckiebot should take.
Using differential drive mode, i.e. driving without IK, is unfair to people not using the ROS stack, because the real Duckiebots have dynamics that are captured in the robot calibration files but not in the simulator. Therefore we agreed that it's more unifying to just apply the IK of each known robot before handing over control to the agent. That, and also real cars steer with vel+omega.
Not sure I follow. The real Duckiebots have access only to wheel commands, not to true vel+omega. In the simulator, the Duckiebot can work with either wheel commands or vel+omega, so there is no limitation that I currently see; the simulator is quite flexible. The calibration file of a robot is assumed to be known in the simulator and stored in some environment parameters, which is what the inverse kinematics in the "step" function (https://github.com/duckietown/gym-duckietown/blob/master/gym_duckietown/envs/duckietown_env.py) uses. Additionally, the calibration uncertainty is actually an important part of training that does not need to be hidden from the agent.

The way I ran into this was that during imitation learning from logs, the actual input and output points of the pipeline are images as inputs and wheel commands as outputs. There are ways of shortening that pipeline to vel+omega, but that adds a module that may or may not be used. The goal for me is to make sure that the simulator matches the reality of Duckiebots: we cannot assume a Duckiebot always has a topic for velocity+omega, but we certainly will always have wheel commands for any meaningful activity.
So, what is it exactly that you would change in the simulator?
Good question. All I would do is move the inverse kinematics in the "step" function (https://github.com/duckietown/gym-duckietown/blob/master/gym_duckietown/envs/duckietown_env.py) out of "step" and into a separate function, except perhaps for the maximum/minimum limiting of the commands. The existing "vel-omega" code would run exactly the same way, with the difference that this new inverse-kinematics function would be called before passing the actions to "step". That way anyone who wants to use this function can, but is not required to do so.
So, what you want is to have a "mode", for the simulator to accept both kinds of actions.
I guess yes and no. The actions ideally would be commands to the wheels. However we would provide a parameterized (with self.k, baseline, etc.) function that can transform vel-omega to those wheel commands. If one wants to learn a pipeline from image to vel-omega one would then also call this "inverse-kinematics" function to map this to wheel commands.
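For concreteness, here is a minimal sketch of what such a standalone inverse-kinematics function could look like. The parameter names (gain, trim, k, radius, baseline, limit) are modeled on a Duckiebot kinematics calibration file, but the defaults and exact form here are illustrative assumptions, not the repository's actual code:

```python
import numpy as np

def inverse_kinematics(vel, omega, gain=1.0, trim=0.0, k=27.0,
                       radius=0.0318, baseline=0.102, limit=1.0):
    """Sketch: convert a (velocity, omega) action into left/right wheel
    commands using per-robot calibration parameters (values assumed)."""
    # Motor constant adjusted by the calibrated gain and trim
    k_r_inv = (gain + trim) / k
    k_l_inv = (gain - trim) / k

    # Differential-drive wheel rotation rates for the requested motion
    omega_r = (vel + 0.5 * omega * baseline) / radius
    omega_l = (vel - 0.5 * omega * baseline) / radius

    # Convert rotation rates to commands, clamped to the motor limits
    u_r = np.clip(omega_r * k_r_inv, -limit, limit)
    u_l = np.clip(omega_l * k_l_inv, -limit, limit)
    return np.array([u_l, u_r])
```

Anyone learning image-to-vel-omega would call this before "step"; anyone learning image-to-commands would skip it entirely.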
I used to have the conversion to velocity+omega as an OpenAI Gym wrapper. We could go back to that design.
Sounds like an elegant solution to me.
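To illustrate the wrapper design being discussed: a rough sketch (plain Python rather than an actual `gym.ActionWrapper` subclass, with assumed calibration defaults) in which the environment's native actions are wheel commands and the wrapper translates each (vel, omega) action before forwarding it:

```python
import numpy as np

class VelOmegaWrapper:
    """Sketch of the wrapper idea: the wrapped env's native actions are
    wheel commands [u_left, u_right]; this wrapper accepts (vel, omega).
    Calibration parameters are illustrative defaults, not real values."""

    def __init__(self, env, gain=1.0, trim=0.0, k=27.0,
                 radius=0.0318, baseline=0.102, limit=1.0):
        self.env = env
        self.gain, self.trim, self.k = gain, trim, k
        self.radius, self.baseline, self.limit = radius, baseline, limit

    def step(self, action):
        # Translate, then forward to the wheel-command environment.
        return self.env.step(self._to_wheel_commands(action))

    def _to_wheel_commands(self, action):
        vel, omega = action
        # Per-motor constants adjusted by gain and trim
        k_r_inv = (self.gain + self.trim) / self.k
        k_l_inv = (self.gain - self.trim) / self.k
        # Differential-drive wheel rotation rates
        omega_r = (vel + 0.5 * omega * self.baseline) / self.radius
        omega_l = (vel - 0.5 * omega * self.baseline) / self.radius
        # Commands, clamped to the motor limits
        u_r = np.clip(omega_r * k_r_inv, -self.limit, self.limit)
        u_l = np.clip(omega_l * k_l_inv, -self.limit, self.limit)
        return np.array([u_l, u_r])
```

With this design, an agent trained on wheel commands uses the env directly, while a vel+omega pipeline just wraps it, so neither imposes its action space on the other.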
Related to #90
@bhairavmehta95 for this one you should revert back to having the conversion as a Gym wrapper instead of a derived class. It's a fairly small change.
Yeah - just trying to get something working with the current setup, as soon as I have a demo video working I will change it (should be < 2 days)
Don't we have this wrapper already in DuckietownEnv?
It was - I am in the process of adding it back, but it's requiring quite a few changes, since a lot of the docs / code samples assume a [velocity, heading] action. I've sent the updates to Julian and Panos, so this shouldn't be a roadblock. I just want to make sure all of our old code works with the new wrapper before I make this change.
@takeitallsource can you confirm this? I thought we were already using the DuckietownEnv
We are using DuckietownEnv right now.
@PanagiotisBountouris Whenever your experiments finish, let me know if the changes we discussed work okay. If they do, I will update the entire repository.
We are now using Simulator as of yesterday.
Currently the gym implements actions as (velocity, rotational speed omega) and hides the inverse kinematics that computes the wheel velocities/commands from the user. All real logs, however, record commands and not velocities, which makes the simulator unrealistic.
To make the simulator more compatible with real logs, I would propose the small change of making velocity commands to the wheels the actions. In "envs/duckietown_env.py" the "step" function could be adapted.