gazebosim / gz-gui

Builds on top of Qt to provide widgets which are useful when developing robotics applications, such as a 3D view, plots, dashboard, etc, and can be used together in a convenient unified interface.
https://gazebosim.org
Apache License 2.0

Deflake X-display tests on GitHub actions #58

Open chapulina opened 4 years ago

chapulina commented 4 years ago

Many ign-gui tests are failing like this on GitHub actions:

qt.qpa.screen: QXcbConnection: Could not connect to display 
Could not connect to any X display.

We should prevent these tests from running when no display is detected. As a reference, this is how Gazebo-classic detects it: https://github.com/osrf/gazebo/blob/6fd426b3949c4ca73fa126cde68f5cc4a59522eb/cmake/CheckDRIDisplay.cmake
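For illustration, here is a minimal sketch of what such a display check could look like. It is written in Python rather than CMake and is not a port of the linked Gazebo-classic module, which may work differently. It assumes a local X server on Linux, where the server conventionally listens on a Unix socket at /tmp/.X11-unix/X<n>. The function names are hypothetical.

```python
import os
import re
from typing import Mapping, Optional

# Conventional location of local X server sockets on Linux (an assumption here).
X11_SOCKET_DIR = "/tmp/.X11-unix"


def parse_display_number(display: str) -> Optional[int]:
    """Return the display number from a local DISPLAY value such as ':1.0'."""
    match = re.fullmatch(r"(?:unix)?:(\d+)(?:\.\d+)?", display)
    return int(match.group(1)) if match else None


def local_display_available(env: Mapping[str, str] = os.environ,
                            socket_dir: str = X11_SOCKET_DIR) -> bool:
    """Best-effort check: DISPLAY is set and its local socket file exists."""
    number = parse_display_number(env.get("DISPLAY", ""))
    if number is None:
        # Unset, empty, or a remote display, which this sketch does not handle.
        return False
    return os.path.exists(os.path.join(socket_dir, f"X{number}"))
```

Note that an existing socket does not guarantee the server will accept a connection, so this only rules out the most obvious case of a missing display.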

j-rivero commented 4 years ago

We should prevent these tests from running when no display is detected

I would prefer a different approach from the one in Gazebo: add an option to the build/test system that indicates whether the GUI tests should be compiled and executed. That way, failures are easier to detect when the display is not working well, because the build fails instead of silently reporting success while hiding errors.
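The difference between the two approaches can be sketched as a small decision function. This is only an illustration in Python; the names `GuiTestDecision` and `decide_gui_tests` are hypothetical and do not exist in the build system. The point is that a missing display is an error when GUI tests were explicitly requested, rather than a reason to skip them quietly.

```python
from enum import Enum


class GuiTestDecision(Enum):
    RUN = "run"
    SKIP = "skip"


def decide_gui_tests(gui_tests_enabled: bool,
                     display_available: bool) -> GuiTestDecision:
    """Decide what to do with GUI tests given an explicit opt-in flag."""
    if not gui_tests_enabled:
        # The user opted out, so skipping is expected and visible in the config.
        return GuiTestDecision.SKIP
    if not display_available:
        # Opted in but no display: fail loudly instead of hiding the problem.
        raise RuntimeError("GUI tests were requested but no display is available")
    return GuiTestDecision.RUN
```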

chapulina commented 3 years ago

X display tests have been fixed on GitHub Actions in #98, but they're still flaky. This is the new error:

  [GUI] [Wrn] [Application.cc:649] [QT] could not connect to display :1.0
  [GUI] [Err] [Application.cc:653] [QT] This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.

  Available platform plugins are: eglfs, linuxfb, minimal, minimalegl, offscreen, vnc, xcb.

chapulina commented 3 years ago

As a reference, ign-rendering also uses Xvfb and suffers from the same flakiness. Here's an example error message:

   [ RUN      ] Camera/CameraTest.RenderTexture/ogre2
  [Err] [Ogre2RenderEngine.cc:338] Unable to open display: :1.0
  [Err] [Ogre2RenderEngine.cc:735]  Unable to create the rendering window
  [Err] [Ogre2RenderEngine.cc:735]  Unable to create the rendering window
  [Err] [Ogre2RenderEngine.cc:735]  Unable to create the rendering window
  [Err] [Ogre2RenderEngine.cc:735]  Unable to create the rendering window
  [Err] [Ogre2RenderEngine.cc:735]  Unable to create the rendering window
  [Err] [Ogre2RenderEngine.cc:735]  Unable to create the rendering window
  [Err] [Ogre2RenderEngine.cc:735]  Unable to create the rendering window
  [Err] [Ogre2RenderEngine.cc:735]  Unable to create the rendering window
  [Err] [Ogre2RenderEngine.cc:735]  Unable to create the rendering window
  [Err] [Ogre2RenderEngine.cc:735]  Unable to create the rendering window
  [Err] [Ogre2RenderEngine.cc:742] Unable to create the rendering window

        Start  2: check_UNIT_Camera_TEST
   2/67 Test  #2: check_UNIT_Camera_TEST ................***Failed    0.03 sec

It doesn't always happen to the same test, and I haven't been able to identify a pattern (e.g. it isn't always the first test). It looks like the display just can't be found for one test, but then it's found again.

I'm trying out different Xvfb arguments and also trying to make the failure more verbose. The problem is that this failure isn't very common, so it's hard to reproduce.
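One possible way to make the failure more verbose is to poll for the display before each test and record how long it took to become reachable. The sketch below is an assumption about how that could be done, not something that has been tried here. `wait_until` and its parameters are hypothetical, and `probe` stands for any display check.

```python
import time
from typing import Callable


def wait_until(probe: Callable[[], bool], timeout_s: float = 5.0,
               interval_s: float = 0.1) -> float:
    """Poll probe() until it returns True and return the seconds waited.

    Raises TimeoutError if the probe never succeeds within timeout_s.
    """
    start = time.monotonic()
    while True:
        if probe():
            return time.monotonic() - start
        if time.monotonic() - start >= timeout_s:
            raise TimeoutError(f"probe did not succeed within {timeout_s} s")
        time.sleep(interval_s)
```

Logging the returned wait time per test would show whether the display disappears briefly or is gone for the whole test.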


https://github.com/osrf/buildfarmer/issues/161

chapulina commented 3 years ago

Got a new error today that may help debug this a bit more:

   [ RUN      ] Scene3DTest.Events
  [GUI] [Wrn] [Application.cc:657] [QT] The X11 connection broke: Unknown error (code 80)
  XIO:  fatal IO error 2 (No such file or directory) on X server "0��LV"
        after 520 requests (520 known processed) with 0 events remaining.
  [GUI] [Wrn] [Application.cc:657] [QT] QObject::~QObject: Timers cannot be stopped from another thread
  [GUI] [Wrn] [Application.cc:657] [QT] QObject::~QObject: Timers cannot be stopped from another thread

It's possible that Xvfb is being killed due to high memory usage.

chapulina commented 2 years ago

Using EGL may solve this issue.

chapulina commented 2 years ago

have an option in the build/test system to indicate that GUI tests needs to be compiled/executed or not

We could revisit this idea and expose a CMake argument that lets us disable the tests that require a display on GitHub Actions, but leave them enabled on Jenkins.

Another alternative that @mjcarroll brought up was to try using one of the other platform plugins suggested in one of the errors above:

Available platform plugins are: eglfs, linuxfb, minimal, minimalegl, offscreen, vnc, xcb.

mjcarroll commented 2 years ago

Another option potentially? https://github.com/uwerat/qpagbm
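For anyone experimenting with the platform plugin alternatives: Qt selects its platform plugin from the QT_QPA_PLATFORM environment variable, so a different plugin can be tried without changing the tests. Below is a minimal Python sketch of building such an environment for a test subprocess. The helper name `env_with_qt_platform` is hypothetical, and whether the rendering tests actually pass under a plugin such as `offscreen` is untested here.

```python
import os
from typing import Dict, Mapping


def env_with_qt_platform(platform: str,
                         base: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Copy an environment and select a Qt platform plugin via QT_QPA_PLATFORM."""
    env = dict(base)  # copy, so the caller's environment is left untouched
    env["QT_QPA_PLATFORM"] = platform
    return env
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run` when launching a test binary.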

Blast545 commented 2 years ago

This should be significantly improved after #419. We won't close this issue until the PR is forward-ported and we have some confidence that there are no remaining odd failures on GitHub Actions.

Specific failures should be tracked in their own issues, like this one: https://github.com/gazebosim/gz-gui/issues/421