jzilly closed this issue 5 years ago
So, the simulator uses differential drive commands (which I assume is what you mean) internally. There is a wrapper on top to take velocity+omega as input. I personally prefer the differential drive mode, but @liampaull has said that velocity+omega was the way to go for the competition. I can remove the velocity+omega wrapper, and wouldn't mind doing so, it's just that we kind of need to agree on what input the simulator and duckiebot should take.
Using differential drive mode, i.e. driving without IK, is unfair to people not using the ROS stack, because the real Duckiebots have dynamics that are captured in the robot calibration files but not in the simulator. Therefore we agreed that it's more unifying to just apply the IK of each known robot before handing over control to the agent. That, and also real cars steer with vel+omega.
Not sure I follow. The real Duckiebots have access only to wheel commands, not to true vel+omega. In the simulator, the Duckiebot can work with either wheel commands or vel+omega, so there is no limitation that I currently see; the simulator is quite flexible. The calibration file of a robot is assumed to be known in the simulator and stored in some environment parameters, which is what the inverse kinematics in the "step" function (https://github.com/duckietown/gym-duckietown/blob/master/gym_duckietown/envs/duckietown_env.py) uses. Additionally, the calibration uncertainty is actually an important part of training that does not need to be hidden from the agent.

The way I ran into this was that during imitation learning from logs, the actual input and output points of the pipeline are images as inputs and wheel commands as outputs. There are ways of shortening that pipeline to vel+omega, but that adds a module that may or may not be used. The goal for me is to make sure that the simulator matches the reality of Duckiebots: we cannot assume a Duckiebot always has a topic for velocity+omega, but we certainly will always have wheel commands for any meaningful activity.
So, what is it exactly that you would change in the simulator?
Good question. All I would do is move the inverse kinematics in the "step" function (https://github.com/duckietown/gym-duckietown/blob/master/gym_duckietown/envs/duckietown_env.py) out of "step" and into a separate function, except perhaps for the maximum/minimum limiting of the commands. The existing "vel-omega" code would run exactly the same way, with the difference that this new inverse-kinematics function would be called before passing the actions to "step". That way anyone who wants to use this function can, but is not required to do so.
So, what you want is to have a "mode", for the simulator to accept both kinds of actions.
I guess yes and no. The actions ideally would be commands to the wheels. However we would provide a parameterized (with self.k, baseline, etc.) function that can transform vel-omega to those wheel commands. If one wants to learn a pipeline from image to vel-omega one would then also call this "inverse-kinematics" function to map this to wheel commands.
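For concreteness, here is a minimal sketch of what such a standalone inverse-kinematics function could look like. The parameter names (gain, trim, k, radius, baseline, limit) are modeled on a Duckiebot kinematics calibration file, but the defaults and exact form here are illustrative assumptions, not the repository's actual code:

```python
import numpy as np

def inverse_kinematics(vel, omega, gain=1.0, trim=0.0, k=27.0,
                       radius=0.0318, baseline=0.102, limit=1.0):
    """Sketch: convert a (velocity, omega) action into left/right wheel
    commands using per-robot calibration parameters (values assumed)."""
    # Motor constant adjusted by the calibrated gain and trim
    k_r_inv = (gain + trim) / k
    k_l_inv = (gain - trim) / k

    # Differential-drive wheel rotation rates for the requested motion
    omega_r = (vel + 0.5 * omega * baseline) / radius
    omega_l = (vel - 0.5 * omega * baseline) / radius

    # Convert rotation rates to commands, clamped to the motor limits
    u_r = np.clip(omega_r * k_r_inv, -limit, limit)
    u_l = np.clip(omega_l * k_l_inv, -limit, limit)
    return np.array([u_l, u_r])
```

Anyone learning image-to-vel-omega would call this before "step"; anyone learning image-to-commands would skip it entirely.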
I used to have the conversion to velocity+omega as an OpenAI Gym wrapper. We could go back to that design.
Sounds like an elegant solution to me.
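To illustrate the wrapper design being discussed: a rough sketch (plain Python rather than an actual `gym.ActionWrapper` subclass, with assumed calibration defaults) in which the environment's native actions are wheel commands and the wrapper translates each (vel, omega) action before forwarding it:

```python
import numpy as np

class VelOmegaWrapper:
    """Sketch of the wrapper idea: the wrapped env's native actions are
    wheel commands [u_left, u_right]; this wrapper accepts (vel, omega).
    Calibration parameters are illustrative defaults, not real values."""

    def __init__(self, env, gain=1.0, trim=0.0, k=27.0,
                 radius=0.0318, baseline=0.102, limit=1.0):
        self.env = env
        self.gain, self.trim, self.k = gain, trim, k
        self.radius, self.baseline, self.limit = radius, baseline, limit

    def step(self, action):
        # Translate, then forward to the wheel-command environment.
        return self.env.step(self._to_wheel_commands(action))

    def _to_wheel_commands(self, action):
        vel, omega = action
        # Per-motor constants adjusted by gain and trim
        k_r_inv = (self.gain + self.trim) / self.k
        k_l_inv = (self.gain - self.trim) / self.k
        # Differential-drive wheel rotation rates
        omega_r = (vel + 0.5 * omega * self.baseline) / self.radius
        omega_l = (vel - 0.5 * omega * self.baseline) / self.radius
        # Commands, clamped to the motor limits
        u_r = np.clip(omega_r * k_r_inv, -self.limit, self.limit)
        u_l = np.clip(omega_l * k_l_inv, -self.limit, self.limit)
        return np.array([u_l, u_r])
```

With this design, an agent trained on wheel commands uses the env directly, while a vel+omega pipeline just wraps it, so neither imposes its action space on the other.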
Related to #90
@bhairavmehta95 for this one you should revert back to having the conversion as a Gym wrapper instead of a derived class. It's a fairly small change.
Yeah - just trying to get something working with the current setup, as soon as I have a demo video working I will change it (should be < 2 days)
Don't we have this wrapper already in DuckietownEnv?
It was - I am in the process of adding it back, but it's requiring quite a few changes, since a lot of the docs / code samples assume a [velocity, heading] action. I've sent the updates to Julian and Panos, so this shouldn't be a roadblock. I just want to make sure all of our old code works with the new wrapper before I make this change.
@takeitallsource can you confirm this? I thought we were already using the DuckietownEnv
We are using DuckietownEnv right now.
@PanagiotisBountouris Whenever your experiments finish, let me know if the changes we discussed work okay. If they do, I will update the entire repository.
We are now using Simulator as of yesterday.
Currently the gym implements actions as (velocity, rotational speed omega) and hides the inverse kinematics that computes the wheel velocities/commands from the user. All real logs, however, record commands and not velocities, which makes the simulator unrealistic.
To make the simulator more compatible with real logs, I would propose the small change of making velocity commands to the wheels the actions. In "envs/duckietown_env.py" the "step" function could be adapted.