ros-visualization / rviz

ROS 3D Robot Visualizer
BSD 3-Clause "New" or "Revised" License

RViz black screen during launch #1747

Closed tim-gros closed 2 years ago

tim-gros commented 2 years ago

During startup of rviz, rviz may show only a black screen. No error occurs and after some time (~2-3min), the GUI recovers and runs as normal. During the black screen time, some CPU cores show a load of 100%. Meanwhile the other ROS nodes run fine. This happens after certain modifications, such as:

When reverting the modifications, rviz runs again without high CPU load and no black screen. Overall, it seems as if the black screen occurs for random modifications. We could reproduce this issue on multiple computers. However, when packaging a "failing" rviz modification, the packaged version then runs without problems.

Screenshot from 2022-05-23 14-13-53

Your environment

rhaschke commented 2 years ago

From your description, I understand that the undesired behavior stems from a custom panel. Did you introduce any race conditions making rviz wait for some event? What exactly do you mean by packaging a failing rviz modification? If that works, you should search for differences between your test build and the "packaged build". What about debug vs. release build? Uninitialized variables?

tim-gros commented 2 years ago

Thanks a lot for your quick reply! Indeed, the modification was made in a custom panel, but it seems kind of random where the modification is made. We noticed that after rviz has recovered from the black screen, a similar thing happens when dragging a panel to a different position. Rviz freezes for a short time and becomes unresponsive but works again afterwards.

What we find strange is that no error message is shown, and that the black screen disappears after a few minutes and everything looks normal again. Also, rviz itself seems to keep running in the background; only the rendering/GUI is not shown properly. We added several messages, and they are printed to the terminal as expected while the screen is black.

"packaging a failing rviz modification" = creating a catkin package from code which produces this issue when only doing a catkin build.

rhaschke commented 2 years ago

Random behaviour is often the consequence of uninitialized variables. Also, using different versions of a library within the same process results in such errors. I still don't get what you mean by "packaging a failing rviz modification". I guess your plugin defining the custom panel is part of some catkin package in your workspace, which - of course - is built with catkin build. What's the difference between this normal usage and your (working) packaging configuration?

tim-gros commented 2 years ago

Thank you very much for the hints. I have not yet found any uninitialized variables, but will continue looking for them. I am checking the libraries as well, but have not found conflicting versions so far.

I suppose my explanation regarding the package was a bit unclear. What I meant to say was that we build a binary .deb file with catkin and CPack, which I referred to as the package. This binary package is then installed on an identical setup. I mainly mentioned this to emphasize the somewhat random nature of the issue. Sorry for the confusion.

rhaschke commented 2 years ago

If the strange behavior is not reproducible from a generated .deb (vs. your catkin workspace), maybe your workspace is screwed. Try to remove devel, build, and log and start over?

tim-gros commented 2 years ago

I completely cleaned the workspace and initialized it from scratch. However, this did not yield any improvement. Also, other ROS nodes run without any problems. I think the really strange thing is that it occurs as soon as the GUI is started or changed, for example when dragging a panel or resizing the RViz window if it is not in full screen. Unfortunately, I have still not figured out what truly causes this behaviour.

rhaschke commented 2 years ago

You observe this strange behavior only with your custom plugins loaded, don't you?

tim-gros commented 2 years ago

Yes, I was not able to reproduce it with the basic rviz. However, I have just found that if I launch the basic rviz, add all panels manually in the GUI (top left corner, Panels->Add New Panel), and then save this configuration, rviz launches fine afterwards. So with this new default.rviz file it works fine, but the old one, which only differs in the order of the parameters, does not. Is there a specific order of the parameters required in the .rviz file that could cause the mentioned behaviour if not followed?

rhaschke commented 2 years ago

No, the order of panels and displays in the config file shouldn't matter. I think you should first identify the offending panel plugin, e.g. by removing panels one by one from your broken config. Once you have identified the offender, you can dig deeper into its code.

tim-gros commented 2 years ago

Update: I think we finally found the cause of this behavior. In one of the custom panels, there was a paint event from a Qt widget that was executed too often in some cases. This was responsible for the high CPU usage and slowed down rviz so much that it just showed a black screen. Thanks @rhaschke for your help and tips.
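For anyone hitting something similar: a paint event that fires too often can be spotted by counting calls per time window. Below is a minimal sketch of such a counter, assuming plain C++ with no Qt dependency. `CallRateMonitor`, the window length, and the threshold are hypothetical names and values chosen for illustration; they are not part of rviz or Qt, and this is not the code from our panel.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical diagnostic helper (not part of rviz or Qt): counts how often a
// function is called within a fixed time window, so that a runaway
// paintEvent() becomes visible.
class CallRateMonitor {
public:
  using Clock = std::chrono::steady_clock;

  CallRateMonitor(std::chrono::milliseconds window, std::size_t max_calls)
    : window_(window), max_calls_(max_calls) {}

  // Record one call at time `now`. Returns true once the number of calls in
  // the current window exceeds max_calls.
  bool tick(Clock::time_point now = Clock::now()) {
    if (!started_ || now - window_start_ >= window_) {
      // Start a new counting window.
      window_start_ = now;
      count_ = 0;
      started_ = true;
    }
    ++count_;
    return count_ > max_calls_;
  }

  // Number of calls recorded in the current window.
  std::size_t count() const { return count_; }

private:
  std::chrono::milliseconds window_;
  std::size_t max_calls_;
  Clock::time_point window_start_{};
  std::size_t count_ = 0;
  bool started_ = false;
};

// Assumed usage inside a custom widget (sketch only, requires Qt):
//
//   void MyWidget::paintEvent(QPaintEvent* event) {
//     if (monitor_.tick())
//       qWarning("paintEvent called suspiciously often");
//     ...
//   }
```

One common way to end up with a repaint loop is to call `update()`, or to change a widget property that triggers a repaint, from inside `paintEvent()` itself, since each paint then schedules the next one. Whether that applies depends on the widget in question.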

rhaschke commented 2 years ago

Great that you found the culprit.