gazebosim / gz-sim

Open source robotics simulator. The latest version of Gazebo.
https://gazebosim.org
Apache License 2.0

Optical tactile plugin demo world segfaults on start #2118

Open jrutgeer opened 1 year ago

jrutgeer commented 1 year ago

Environment

$ gcc --version
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

Description

Run the 'optical tactile plugin' demo world:

  $ gz sim optical_tactile_sensor_plugin.sdf 

Start the simulation.

Instant crash.

#6    Object "/opt/gazebo_garden/install/lib/libgz-sim7.so.7", at 0x7f0a37428d62, in gz::sim::v7::SimulationRunner::Run(unsigned long)
#5    Object "/opt/gazebo_garden/install/lib/libgz-sim7.so.7", at 0x7f0a37428600, in gz::sim::v7::SimulationRunner::Step(gz::sim::v7::UpdateInfo const&)
#4    Object "/opt/gazebo_garden/install/lib/libgz-sim7.so.7", at 0x7f0a3741f511, in gz::sim::v7::SimulationRunner::UpdateSystems()
#3    Object "/opt/gazebo_garden/install/lib/gz-sim-7/plugins/libgz-sim-opticaltactileplugin-system.so", at 0x7f0936257307, in gz::sim::v7::systems::OpticalTactilePlugin::PreUpdate(gz::sim::v7::UpdateInfo const&, gz::sim::v7::EntityComponentManager&)
#2    Object "/opt/gazebo_garden/install/lib/gz-sim-7/plugins/libgz-sim-opticaltactileplugin-system.so", at 0x7f0936256912, in gz::sim::v7::systems::OpticalTactilePluginPrivate::Load(gz::sim::v7::EntityComponentManager const&)
#1    Object "/opt/gazebo_garden/install/lib/libsdformat13.so.13", at 0x7f0a36eb899c, in sdf::v13::Element::HasElement(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const
#0    Object "/opt/gazebo_garden/install/lib/libsdformat13.so.13", at 0x7f0a36eb88b1, in sdf::v13::Element::GetElementImpl(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const
Segmentation fault (Address not mapped to object [0x18])

I checked the SDF for missing or empty fields (e.g. an empty string such as <namespace></namespace>), but I don't see anything incorrect.

I also removed the build and install directories and recompiled, but that did not resolve the issue either.

azeey commented 12 months ago

I'm able to reproduce this, but not if I just run the server (gz sim -s -r optical_tactile_sensor_plugin.sdf). It seems to be crashing when accessing depthCameraSdf.Element() in https://github.com/gazebosim/gz-sim/blob/e007fa2c0b2e795ab1b74c75e48962b8dfacedba/src/systems/optical_tactile_plugin/OpticalTactilePlugin.cc#L561 but it's not clear to me how running the GUI causes the Element() function to return nullptr.

jrutgeer commented 11 months ago

I noticed that it is not related to 'server only' but rather to the -r flag:

This works:

 $ gz sim -r optical_tactile_sensor_plugin.sdf 

But without the -r flag, it crashes as soon as the start icon is clicked.

azeey commented 11 months ago

From what I've understood so far, Element() returns nullptr because the DepthCamera has undergone a roundtrip serialization/deserialization process. This process cannot retain the original ElementPtr associated with the DepthCamera SDFormat object, because deserialization uses an SDFormat DOM object constructed programmatically (i.e., not loaded from a file or string). The whole process is triggered when the play button is pressed: https://github.com/gazebosim/gz-sim/blob/f55212eb271a2f2b4d726da45e3aaeebede3b1a2/src/gui/GuiRunner.cc#L181-L198.
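To illustrate the mechanism described above, here is a toy model of the roundtrip. It does not use the real sdformat or gz-sim types: ToyElement, ToyCamera, ToyCameraMsg, LoadFromText, Serialize and Deserialize are all hypothetical stand-ins, and the numeric values are arbitrary. It is only meant to show how values can survive a roundtrip while the element pointer does not.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for a parsed SDFormat element tree.
struct ToyElement
{
  std::string sourceText;
};

// Hypothetical stand-in for a DOM object such as a camera description.
struct ToyCamera
{
  double nearClip = 0.0;
  double farClip = 0.0;
  // Only populated when the object was loaded from a file or string.
  std::shared_ptr<ToyElement> element;
};

// Hypothetical wire message: it carries values only, no element tree.
struct ToyCameraMsg
{
  double nearClip;
  double farClip;
};

// Loading from text keeps a pointer to the element the values came from.
ToyCamera LoadFromText(const std::string &text, double nearClip, double farClip)
{
  ToyCamera cam;
  cam.nearClip = nearClip;
  cam.farClip = farClip;
  cam.element = std::make_shared<ToyElement>(ToyElement{text});
  return cam;
}

// Serialization copies the values and drops the element pointer.
ToyCameraMsg Serialize(const ToyCamera &cam)
{
  return ToyCameraMsg{cam.nearClip, cam.farClip};
}

// Deserialization builds the object programmatically, so `element` stays null.
ToyCamera Deserialize(const ToyCameraMsg &msg)
{
  ToyCamera cam;
  cam.nearClip = msg.nearClip;
  cam.farClip = msg.farClip;
  return cam;
}
```

In this model, Deserialize(Serialize(cam)) returns an object whose clip values match the original but whose element member is null, so any code that dereferences it unconditionally would crash.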

I think the best solution is to remove any use of the ElementPtr of the DepthCamera component. It's only used for checking if certain SDFormat parameters have been manually set by the user. The actual values of the parameters can be obtained without using the ElementPtr, so we won't be sacrificing functionality, but we'll lose nice error messages.
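A rough sketch of that trade-off, again using hypothetical stand-in types rather than the real sdformat API (ElementStub, CameraDom, UserSet and ClipRange are invented for illustration, and this is not the actual patch): the parameter values are read directly from the DOM object, while the "did the user set this?" check can only answer when an element tree is present.

```cpp
#include <memory>
#include <optional>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for an element tree: just the child names the user wrote.
struct ElementStub
{
  std::set<std::string> children;
};

// Hypothetical stand-in for the depth camera DOM object.
struct CameraDom
{
  double nearClip = 0.1;   // placeholder default
  double farClip = 100.0;  // placeholder default
  std::shared_ptr<ElementStub> element;  // may be null after a roundtrip
};

// Reports whether the user explicitly wrote <name>. Returns std::nullopt when
// no element tree is available, instead of dereferencing a null pointer.
std::optional<bool> UserSet(const CameraDom &cam, const std::string &name)
{
  if (!cam.element)
    return std::nullopt;
  return cam.element->children.count(name) > 0;
}

// The values themselves never need the element tree.
std::pair<double, double> ClipRange(const CameraDom &cam)
{
  return {cam.nearClip, cam.farClip};
}
```

With this shape, a caller can always use ClipRange, and can emit the friendlier "parameter not set" message only when UserSet returns a value. Dropping UserSet entirely corresponds to the proposal above: same functionality, fewer diagnostics.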