gazebosim / gz-sim

Open source robotics simulator. The latest version of Gazebo.
https://gazebosim.org
Apache License 2.0
669 stars 262 forks source link

Simulation Reset API #1107

Open mjcarroll opened 2 years ago

mjcarroll commented 2 years ago

Desired behavior

Add the ability for the state of simulation to be reset to what it was at t=0. Reset can be called multiple times on the same simulation. A simulation can be stepped or otherwise run after being reset.

This feature should provide a relatively-optimized way of resetting simulation to a particular state without incurring unnecessary overhead of reloading and re-parsing assets.

Alternatives considered

  1. Fully stopping and starting a running simulation executable.
    • This has the highest overhead, as all assets, plugins, libraries and executables must be unloaded and reloaded with a completely new executable.

Implementation Outline

The implementation should happen in the following general phases:

  1. The ECM and SimulationRunner are given an appropriate API to surface the functionality from both C++ and ign-transport
  2. Systems are given an additional API that allows system writers to implement optimized Reset functionality.
  3. Systems included with ign-gazebo are updated to take advantage of the new Reset interface for systems.

The Reset interface shall be opt-in and not break any existing System plugin implementations. In the case that the Reset interface isn't implemented, the entire system should be destructed and constructed during the Reset phase.

chapulina commented 2 years ago
  1. The ECM and SimulationRunner are given an appropriate API to surface the functionality from both C++ and ign-transport

Is there something missing from the transport interface that is already there? See https://github.com/ignitionrobotics/ign-gazebo/issues/203#issuecomment-642138544.

  1. Systems are given an additional API that allows system writers to implement optimized Reset functionality.

As mentioned on https://github.com/ignitionrobotics/ign-gazebo/issues/203#issuecomment-642138544, systems can already detect resets from UpdateInfo. If that's not enough, what's missing?

  1. Systems included with ign-gazebo are updated to take advantage of the new Reset interface for systems.

+1

In the case that the Reset interface isn't implemented, the entire system should be destructed and constructed during the Reset phase.

That's an interesting idea. As a reference, Gazebo classic offers an optional Reset callback and if it isn't implemented, it's simply not called, and nothing else happens. I wonder if forcebly resetting plugins may have unintended consequences, but I can't think of any right now.

scpeters commented 2 years ago

the most common issue I've seen with osrf/gazebo plugins is storing the most recent time that simulation was run in order to throttle its update rate, then failing to reset that along with the rest of simulation. This can result in plugins failing to update until simulation time reaches the point from which it had previously been reset.

mjcarroll commented 2 years ago

Is there something missing from the transport interface that is already there? See #203 (comment).

No, this was just for completeness in documenting the feature. I think that transport is already good to go, but as with @diegoferigo's comment in #203, it makes sense to also have this as part of the C++ API. I would envision this propagating through SimulationRunner up to Server so that it is accessible there.

As mentioned on #203 (comment), systems can already detect resets from UpdateInfo. If that's not enough, what's missing?

I have two primary concerns with relying entirely on UpdateInfo.

  1. That not many people are aware of this or handling it. It's true in many of our internal system plugins, and I'm not sure of what the landscape looks like for plugins that are outside of this repository. I don't think that this is anyone's fault, it just hasn't been considered in several of the implementations.
  2. I'm concerned with a lot of duplicated logic between PreUpdate, Update, or PostUpdate. For the sake of completeness, you would need to detect a potential reset in all of those methods if you implement all three. It makes each of the functions larger as well.

I think adding an interface that system developers could opt-in to would be a safer approach. In terms of (1) it would make it so that there won't be any behavior change if no additional code is added. In the case of (2), it provides an explicit method that will be called in the case of a Reset. It should be very clear to a developer what the expectations are.

I wonder if forcebly resetting plugins may have unintended consequences, but I can't think of any right now.

I couldn't think of any. In an ideal world, those plugins wouldn't have any internal state (instead storing it in the ECM). In this case, though, I think it's the cleanest way to guarantee that the internal state is reset to some sane initial condition.

chapulina commented 2 years ago

it makes sense to also have this as part of the C++ API

+1

For the sake of completeness, you would need to detect a potential reset in all of those methods if you implement all three

Does that complexity go away if you add a Reset callback though? My concern is that the Reset callback would be adding to the complexity by adding yet another way of reacting to a reset.

Looking at it from another angle, we currently give developers the flexibility of reacting to a reset at any point in the update loop. A system that updates their internal state in PostUpdate may choose to react to resets there and ignore resets on PreUpdate.

not many people are aware of this or handling it

Is your point that the Reset callback would be easier to discover than changes to UpdateInfo?


Assuming we're going with a reset callback, I have some follow up questions:

azeey commented 2 years ago

We currently process Reset requests via ign-transport after the *Update cycle. This means all systems get the updated UpdateInfo at their next PreUpdate. I think we should keep this scheme if we add a Reset API. I think this would keep things deterministic. Otherwise, if we do it asynchronously, we may end up having the reset happen after some systems have finished part of their *Update cycle and so would fail to reset their state properly.

For me the argument for a Reset interface are:

  1. As @mjcarroll suggested, a Reset interface is a lot more discoverable for developers than the logic for detecting a reset. And probably less error prone. For example, it's not enough just to check simTime == 0 because by default simulation starts in a paused state and simTime is 0 until the user hits the play button.
  2. If we allow a system to make changes to the ECM in its Reset function and if its the first system in the list, this can be used as a means of resetting the world with some variations. This can be really useful for generating test scenes with small changes between test runs.
  3. It would give us a clear way to know if a system supports Reset. If it doesn't we can either warn the user or reload the plugin.

The downside is that we're making a distinction between Reset and Rewind/Time seek. If we ever get to a point where we can go back in time and start the simulation from any arbitrary time instance, then having a Reset function would be redundant.

chapulina commented 2 years ago

The downside is that we're making a distinction between Reset and Rewind/Time seek.

Ah this was a question that's been in my mind, whether the Reset API would leave room for implementing simulation rewind in the future. I personally like using UpdateInfo because it's flexible enough to support all these use cases.


Another question that's been in my mind is whether you're planning to make the distinction between "time" and "model" resets like Gazebo classic does

Kakcalu13 commented 2 years ago

Coming from #1489,

My issue feels somewhat duplicated yet it focus on entitymanager mostly.

I'm not sure about UpdateInfo since I couldn't find the documentation for it.

From what I see, reset is highly possible without the sensor plugins while time is still running.

See here,

https://user-images.githubusercontent.com/65916520/169037256-5ca43ed4-c9a5-49fa-b181-23f14e280b75.mp4

This was due to no sensor involves. I can do it all day. The problem is that once you add the sensor in the robot, it automatically adds camera RGB or any sensor that comes first (in this video, camera RGB is the first on the sdf). Once you remove the robot, the sensor is still stored in entity manager. I don't know if UpdateInfo can do this but I'm pretty sure reset can remove the registered RGB camera on the entity manager.

See here when you remove the robot, it spammed the error.

``` [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [63], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [64], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [65], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [67], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [68], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [62], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [63], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [64], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [65], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [67], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [68], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [62], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [63], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [64], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [65], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [67], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [68], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [62], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [63], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [64], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [65], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [67], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [68], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [62], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [63], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [64], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [65], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [67], but this entity does not exist. This create component request will be ignored. [Err] [EntityComponentManager.cc:486] Trying to create a component of type [9436740692853731264] attached to entity [68], but this entity does not exist. This create component request will be ignored. ```

This is because it stores the previous RGB camera from the previous robot. So when I load the second robot, it crash.

So the question is, which would solve to reset, respawn and possibly UpdateInfo?, why is it still there when the parent of plugin is not in the world? Shouldn't it be disappear by the time the model isn't in the world?

mjcarroll commented 2 years ago

@Kakcalu13 There is a follow up to the first two reset PRs: https://github.com/gazebosim/gz-sim/pull/1355

This adds the reset capability to sensors and the scenebroadcaster, and may do what you are looking for?

Kakcalu13 commented 2 years ago

Oh my! God bless you! Let me try that one!