gazebosim / gz-rendering

C++ library designed to provide an abstraction for different rendering engines. It offers unified APIs for creating 3D graphics applications.
https://gazebosim.org
Apache License 2.0

Camera/CameraTest.VisualAt/ogre2 flaky test on Bionic/amd64 #170

Open j-rivero opened 3 years ago

j-rivero commented 3 years ago

I've seen Camera/CameraTest.VisualAt/ogre2 failing in some builds for Bionic/amd64. The changes listed in build 20 seem to be the starting point of the failures.

54: [ RUN      ] Camera/CameraTest.VisualAt/ogre2
54: [Msg] Loading plugin [ignition-rendering-ogre2]
54: /var/lib/jenkins/workspace/ignition_rendering-ci-ign-rendering3-bionic-amd64/ign-rendering/test/integration/camera.cc:243: Failure
54: Expected: (nullptr) != (vis), actual: (nullptr) vs 16-byte object <00-00 00-00 00-00 00-00 00-00 00-00 00-00 00-00>
54: [  FAILED  ] Camera/CameraTest.VisualAt/ogre2, where GetParam() = "ogre2" (224 ms)
chapulina commented 3 years ago

The test is flaky on Citadel:

https://build.osrfoundation.org/job/ignition_rendering-ci-ign-rendering3-bionic-amd64/lastStableBuild/testReport/(root)/Camera_CameraTest/VisualAt_ogre2/history/

But it hasn't failed in Dome yet:

https://build.osrfoundation.org/job/ignition_rendering-ci-ign-rendering4-bionic-amd64/lastStableBuild/testReport/(root)/Camera_CameraTest/VisualAt_ogre2/history/

chapulina commented 3 years ago

On #174, I'm adding more information to the test so we can understand better how it's failing.

chapulina commented 3 years ago

But it hasn't failed in Dome yet:

Now that there are more builds, it's possible to see that the test is flaky on Dome too. And on Edifice:

https://build.osrfoundation.org/job/ignition_rendering-ci-main-bionic-amd64/lastStableBuild/testReport/(root)/Camera_CameraTest/VisualAt_ogre2/history/

#255 added a bit more debug information.

chapulina commented 3 years ago

I suspect this may be related to the issue that #221 is trying to fix, which seems to be caused by a difference in devices.

See a summary of the latest builds:

Jenkins node   Passes   Failures
drogon         0        5
optimus        5        1
r2d2           4        0
The test seems to pass consistently on r2d2 and fail on drogon.

Note that the only failure on optimus happened on April 1st, and all successes happened before March 17th. I believe that optimus recently got an upgrade that may have changed its settings in a way that causes the test to fail.


https://github.com/osrf/buildfarmer/issues/181

jacobperron commented 3 years ago

Another instance of the test failure for Citadel (from July 16th): https://build.osrfoundation.org/job/ignition_rendering-ci-ign-rendering3-bionic-amd64/46

58: [ RUN      ] Camera/CameraTest.VisualAt/ogre2
58: [Msg] Loading plugin [ignition-rendering-ogre2]
58: /var/lib/jenkins/workspace/ignition_rendering-ci-ign-rendering3-bionic-amd64/ign-rendering/test/integration/camera.cc:251: Failure
58: Expected: (nullptr) != (vis), actual: (nullptr) vs 16-byte object <00-00 00-00 00-00 00-00 00-00 00-00 00-00 00-00>
58: X: 300
58: /var/lib/jenkins/workspace/ignition_rendering-ci-ign-rendering3-bionic-amd64/ign-rendering/test/integration/camera.cc:259: Failure
58: Expected equality of these values:
58:   nullptr
58:     Which is: (nullptr)
58:   vis
58:     Which is: 16-byte object <20-0D 64-50 23-56 00-00 00-CC 24-50 23-56 00-00>
58: Found [box] at X [400]
58: /var/lib/jenkins/workspace/ignition_rendering-ci-ign-rendering3-bionic-amd64/ign-rendering/test/integration/camera.cc:265: Failure
58: Expected: (nullptr) != (vis), actual: (nullptr) vs 16-byte object <00-00 00-00 00-00 00-00 00-00 00-00 00-00 00-00>
58: X: 600
58: /var/lib/jenkins/workspace/ignition_rendering-ci-ign-rendering3-bionic-amd64/ign-rendering/test/integration/camera.cc:265: Failure
58: Expected: (nullptr) != (vis), actual: (nullptr) vs 16-byte object <00-00 00-00 00-00 00-00 00-00 00-00 00-00 00-00>
58: X: 700
58: [  FAILED  ] Camera/CameraTest.VisualAt/ogre2, where GetParam() = "ogre2" (387 ms)
jacobperron commented 3 years ago

Another instance: https://build.osrfoundation.org/job/ignition_rendering-ci-ign-rendering3-bionic-amd64/50/testReport/junit/(root)/Camera_CameraTest/VisualAt_ogre2/

Blast545 commented 2 years ago

I took some time to gather metrics on how much this affects the ign-rendering builds across the Ignition buildfarm, counting only the builds with a test report available.

ignition_rendering-ci-ign-rendering3-bionic-amd64 has this error in 14 out of 25 builds. 56.0% flaky
ignition_rendering-ci-ign-rendering4-bionic-amd64 has this error in 13 out of 23 builds. 56.5% flaky
ignition_rendering-ci-ign-rendering5-bionic-amd64 has this error in 10 out of 21 builds. 47.6% flaky
ignition_rendering-ci-main-bionic-amd64 has this error in 9 out of 34 builds. 26.5% flaky

I think it would be a good idea to take a closer look at this one, since the test fails frequently.
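
The percentages above are simply failures divided by builds with a test report. A minimal sketch of that arithmetic follows; the `JobStats` struct and both function names are hypothetical helpers made up for this illustration and are not part of any Gazebo or buildfarm tooling. Only the job names and counts come from the comment.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record type for this sketch (not from any Gazebo tooling).
struct JobStats
{
  std::string job;  // Jenkins job name
  int failures;     // builds where VisualAt/ogre2 failed
  int builds;       // builds with a test report available
};

// Share of reported builds in which the test failed, as a percentage.
// Returns 0 when there are no builds to avoid dividing by zero.
double FlakeRatePercent(const JobStats &_stats)
{
  assert(_stats.failures >= 0 && _stats.failures <= _stats.builds);
  if (_stats.builds == 0)
    return 0.0;
  return 100.0 * _stats.failures / _stats.builds;
}

// The counts quoted in the comment above.
std::vector<JobStats> ReportedStats()
{
  return {
    {"ignition_rendering-ci-ign-rendering3-bionic-amd64", 14, 25},
    {"ignition_rendering-ci-ign-rendering4-bionic-amd64", 13, 23},
    {"ignition_rendering-ci-ign-rendering5-bionic-amd64", 10, 21},
    {"ignition_rendering-ci-main-bionic-amd64", 9, 34},
  };
}
```

Calling `FlakeRatePercent` on each entry of `ReportedStats()` and rounding to one decimal place gives the figures quoted for the four jobs.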

iche033 commented 2 years ago

I'm testing this in https://github.com/ignitionrobotics/ign-rendering/pull/450. Setting the device pixel ratio to 1.0f seems to have fixed the issue on the CI machines: https://github.com/ignitionrobotics/ign-rendering/blob/ign-rendering5/ogre2/src/Ogre2Camera.cc#L250

Blast545 commented 2 years ago

This test is failing consistently on our buildfarm ign-rendering3 job; see: https://build.osrfoundation.org/job/ignition_rendering-ci-ign-rendering3-focal-amd64/11/

Is it possible to backport this one? Or should we disable the test on Citadel and document it as a known issue? @chapulina

iche033 commented 2 years ago

if we want to backport, these are probably the relevant changes: https://github.com/gazebosim/gz-rendering/pull/446/files#diff-7d09d6004075a16fceed8b6665c62249a2f928e86838438073c983b1c2ad6ad7R107-R110

Blast545 commented 2 years ago

I took some time to check and try to backport this one. The link you attached here is showing me this:

src/Utils_TEST.cc

  EXPECT_EQ(0u, rayResult.objectId);
  VisualPtr root = scene->RootVisual();

Is that right? To me it looks like the important changes are in ogre2/src/Ogre2SelectionBuffer.cc, and those changes were not trivial for me to backport.

iche033 commented 2 years ago

oh it should be:

  // the scaling factor seems to cause issues with mouse picking.
  // see https://github.com/ignitionrobotics/ign-gazebo/issues/147
#if 0

https://github.com/gazebosim/gz-rendering/blob/dc985f56fa4bc015891fd2f6617123286f7bf50e/src/Utils.cc#L108-L110

basically forcing that function to return 1.0
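
To make the idea concrete, here is a rough sketch of what "forcing that function to return 1.0" with an `#if 0` guard could look like. This is not the code from Utils.cc: the function name `ScalingFactorSketch` and the `QueryDisplayScale` call inside the disabled branch are placeholders assumed for illustration only.

```cpp
// Hypothetical stand-in for the scaling-factor helper discussed above.
// The name and the disabled branch are illustrative, not the library's code.
float ScalingFactorSketch()
{
  // Default ratio: one render pixel per screen pixel.
  float ratio = 1.0f;

#if 0
  // A platform-specific display-scale query would live here.
  // Because the block is compiled out, this placeholder call is never
  // seen by the compiler and the ratio keeps its default value.
  ratio = QueryDisplayScale();
#endif

  return ratio;
}
```

With the guard in place the function always returns 1.0, so any caller that multiplies pick coordinates by the ratio leaves them unchanged, regardless of the display settings on the machine running the test.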

iche033 commented 1 year ago

issue is back on CI machine for gz-rendering7: https://github.com/gazebosim/gz-rendering/pull/758#issuecomment-1329246890

Blast545 commented 1 year ago

FYI @Crola1702