gazebosim / gz-rendering

C++ library designed to provide an abstraction for different rendering engines. It offers unified APIs for creating 3D graphics applications.
https://gazebosim.org
Apache License 2.0

`INTEGRATION_camera` and `INTEGRATION_thermal_camera` tests segfault on MacOS, Fortress #654

Open Blast545 opened 2 years ago

Blast545 commented 2 years ago

Environment

```
97: [----------] 8 tests from ThermalCamera/ThermalCameraTest
97: [ RUN ] ThermalCamera/ThermalCameraTest.ThermalCameraBoxesUniformTemp/ogre
97/102 Test #97: INTEGRATION_thermal_camera ..............***Exception: SegFault 0.35 sec
test 98
Start 98: check_INTEGRATION_thermal_camera
...
85: [ RUN ] Camera/CameraTest.VisualAt/optix
85: [Dbg] [camera.cc:209] VisualAt not supported yet in rendering engine: optix
85: [ OK ] Camera/CameraTest.VisualAt/optix (0 ms)
85: [ RUN ] Camera/CameraTest.ShaderSelection/ogre
85/102 Test #85: INTEGRATION_camera ......................***Exception: SegFault 1.22 sec
```

Description

Steps to reproduce

  1. Run a rendering build on the buildfarm here: https://build.osrfoundation.org/job/ignition_rendering-ci-ign-rendering6-homebrew-amd64/
  2. See it failing in these two tests.

Output

Reference failures:

First time appearing in the buildfarm: https://build.osrfoundation.org/job/ignition_rendering-ci-ign-rendering6-homebrew-amd64/50/

Most recent: https://build.osrfoundation.org/job/ignition_rendering-ci-ign-rendering6-homebrew-amd64/59/

This is clearly related to the changes introduced in #617 and #623. @iche033, could you take a look?

I think the test regression appearing in gz-sensors is also related to this issue: https://build.osrfoundation.org/job/ignition_sensors-ci-ign-sensors6-homebrew-amd64/23/console

darksylinc commented 2 years ago

Hunch: the changes blamed for the crash merely fixed improper shutdown (i.e. memory leaks) in gz-rendering, so they may have exposed an existing bug rather than introduced a new one.

A common cause of bugs is having live shared_ptrs after their manager has been shut down, as in the following example:

```cpp
Ogre::MaterialPtr material = ...;
Ogre::Root::getSingleton().shutdown( ... );
material.reset(); // Ooops: the material is released after its manager is gone
```

This can also happen if the MaterialPtr (or any other shared_ptr) is part of a class:

```cpp
class Foo
{
    Ogre::MaterialPtr material;
};

void myFunction()
{
    Foo foo;
    foo.material = ...;
    Ogre::Root::getSingleton().shutdown( ... );
    // Ooops: at the end of myFunction, foo.~Foo() will be called,
    // which calls material.reset();
}
```

This may silently work fine in one RenderSystem but cause trouble with the Metal RenderSystem (or perhaps Metal is shut down too early).
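To make the ordering problem concrete without depending on Ogre, here is a minimal sketch using plain `std::shared_ptr`. `Material`, `MaterialManager`, `liveHandles` and the two `handlesAtShutdown...` functions are hypothetical stand-ins invented for illustration; they are not Ogre or gz-rendering APIs. The manager only keeps weak references so it can report how many handles are still alive at the point where a shutdown would happen.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a resource such as a material (not an Ogre class).
struct Material
{
  int id;
};

// Hypothetical stand-in for a manager that owns the backend the resources need.
class MaterialManager
{
public:
  // Hands out shared ownership and remembers a weak reference for auditing.
  std::shared_ptr<Material> create(int id)
  {
    auto material = std::make_shared<Material>(Material{id});
    tracked.push_back(material);
    return material;
  }

  // Counts materials that are still owned by someone outside the manager.
  std::size_t liveHandles() const
  {
    std::size_t count = 0;
    for (const auto &weak : tracked)
    {
      if (!weak.expired())
        ++count;
    }
    return count;
  }

private:
  std::vector<std::weak_ptr<Material>> tracked;
};

struct Foo
{
  std::shared_ptr<Material> material;
};

// Mirrors the problematic ordering: foo is still alive at the point where
// shutdown would be called, so one handle is still outstanding.
std::size_t handlesAtShutdownUnsafe()
{
  MaterialManager manager;
  Foo foo;
  foo.material = manager.create(1);
  return manager.liveHandles();
}

// Safer ordering: foo lives in an inner scope and is destroyed before the
// point where shutdown would be called, so no handles are outstanding.
std::size_t handlesAtShutdownSafe()
{
  MaterialManager manager;
  {
    Foo foo;
    foo.material = manager.create(1);
  }
  return manager.liveHandles();
}
```

In this sketch, `handlesAtShutdownUnsafe()` reports one outstanding handle while `handlesAtShutdownSafe()` reports none. A check of this kind, run just before shutting down the real manager, could help locate which objects still hold resources, assuming the real code offers a comparable way to count them.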

A call stack would help diagnose the cause of the crash.