Open osrf-migration opened 4 years ago
I believe we can add a lock-step mode for local testing and development.
Because of the way physics and sensor measurements are generated in ignition-gazebo, we are already more "lock-step" than the earlier Gazebo9 and Gazebo11. Physics (and the rest of simulation) may be halted to guarantee that sensor data is generated from the correct physical state in the world.
The issue arises once ROS is involved. Currently, the bridge converts the ignition messages to ROS messages, but does not block the simulation while waiting for a response. I believe that the current best approach to accomplish a lock-step with a controller would be to either:
Do you think either of these routes would be sufficient for your testing/analysis?
Original report (archived issue) by Zbyněk Winkler (Bitbucket: Zbyněk Winkler (robotika)).
Our controller is strictly deterministic. When we have our binary log file, we can replay what it did and get the exact same results (verified against the log). However, the simulation environment is not deterministic, which adds yet another challenge. The amount of non-determinism is significant, to the point that it affects the final score. Take the final leaderboard for Tunnel Circuit from here: https://www.darpa.mil/news-events/2019-10-30. Only on worlds B and E did we get the expected score distribution.
One step towards being able to create and run deterministic tests would be the creation of something called lockstep mode, where the simulation and the controller wait for each other. The PX4 team implemented this for gazebo 9 and made it the default for their tests. There are many advantages to this:
My proposal is to implement this at least for local runs to ease development. Even better would be to implement this for the cloud simulation as well. This way we could sidestep the troubles with the speed of the AWS machines and make those runs deterministic as well.
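
To make the idea concrete, here is a minimal sketch of lockstep in Python, with two threads standing in for the simulator and the controller. Everything in it is hypothetical and for illustration only: `run_lockstep`, the two queues, the toy 1-D "physics" and the proportional controller are assumptions, not the PX4 or ignition-gazebo implementation. The point is only that the simulation blocks on each step until the controller has answered, so the result no longer depends on how fast the controller machine is.

```python
import queue
import threading
import time


def run_lockstep(steps, controller_delay_s):
    """Run a toy 1-D simulation in lockstep with a controller thread.

    Hypothetical illustration: the simulation publishes one sensor sample
    per step and does not advance until the controller replies.
    """
    sensor_q = queue.Queue(maxsize=1)   # simulation -> controller
    command_q = queue.Queue(maxsize=1)  # controller -> simulation

    def controller():
        while True:
            msg = sensor_q.get()
            if msg is None:  # shutdown sentinel sent by the simulation
                return
            _sim_time, position = msg
            # Emulate a controller running on slower or faster hardware.
            time.sleep(controller_delay_s)
            # Toy proportional controller driving the position towards 1.0.
            command_q.put(0.5 * (1.0 - position))

    thread = threading.Thread(target=controller)
    thread.start()

    dt = 0.01  # simulated seconds per step
    position = 0.0
    trajectory = []
    for step in range(steps):
        sim_time = step * dt
        sensor_q.put((sim_time, position))  # publish sensor data for this step
        velocity = command_q.get()          # block until the controller answers
        position += velocity * dt           # advance the "physics" by one step
        trajectory.append(position)

    sensor_q.put(None)  # tell the controller thread to exit
    thread.join()
    return trajectory


if __name__ == "__main__":
    fast = run_lockstep(20, controller_delay_s=0.0)
    slow = run_lockstep(20, controller_delay_s=0.002)
    # Wall-clock speed of the controller differs, the trajectory does not.
    print(fast == slow)
```

In this sketch simulated time only advances after a command arrives, so a slow controller makes the run take longer in wall-clock time but does not change the trajectory. A real integration would need the same blocking handshake somewhere between the simulator and the controller, which this toy example does not attempt to model.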