CuarzoSoftware / Louvre

C++ library for building Wayland compositors.
MIT License

Black screen except for the top right text on one monitor #42

Closed 2xsaiko closed 6 months ago

2xsaiko commented 6 months ago

Let me finally report this :^) I originally thought this was the same as https://github.com/CuarzoSoftware/Louvre/discussions/41 when I saw that, but I don't think it is.

Louvre isn't updating the screen correctly: one monitor has a black screen except for the info text. When the clock (or, I assume, any of the text on the top bar) updates, the monitor with the broken background switches. These two photos were taken a minute apart: IMG_2326 IMG_2325

ehopperdietzel commented 6 months ago

Hmm, weird. It seems like the scene damage tracking is not working properly. What happens if you move the cursor to the black screen and then change the scaling factor with Left Super + Left Shift + [Up / Down arrows]? That should damage the entire screen. Also, are the displays connected to different GPUs?

2xsaiko commented 6 months ago

Yup, that fixes it. The screens are connected to the same GPU (RX 6800 XT).

ehopperdietzel commented 6 months ago

Phew, great! 😅 I am going to check louvre-views. I think I made some modifications in the last release and maybe forgot to do a full damage when a new display is hotplugged.
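
To illustrate what "a full damage when a new display is hotplugged" means, here is a minimal sketch of per-output damage tracking. It is not Louvre code: `Rect` and `OutputDamage` are hypothetical names invented for this example. The point is only that a newly created output starts with its whole area marked as damaged, so the first frame repaints everything rather than just the regions that changed.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical types for illustration only; these are not Louvre classes.
struct Rect {
    int x, y, w, h;
};

class OutputDamage {
public:
    OutputDamage(int width, int height) : width_(width), height_(height) {
        // A freshly initialized output has undefined contents,
        // so the first frame must repaint the whole area.
        damageAll();
    }

    // Discard partial damage and mark the entire output as dirty.
    void damageAll() {
        rects_.clear();
        rects_.push_back({0, 0, width_, height_});
    }

    // Record a changed region, clipped to the output bounds.
    void add(Rect r) {
        int x1 = std::max(r.x, 0);
        int y1 = std::max(r.y, 0);
        int x2 = std::min(r.x + r.w, width_);
        int y2 = std::min(r.y + r.h, height_);
        if (x2 > x1 && y2 > y1)
            rects_.push_back({x1, y1, x2 - x1, y2 - y1});
    }

    // Return the regions to repaint this frame and reset the list.
    std::vector<Rect> take() {
        std::vector<Rect> out;
        out.swap(rects_);
        return out;
    }

private:
    int width_, height_;
    std::vector<Rect> rects_;
};
```

If the constructor skipped `damageAll()`, only regions that change later (such as a clock) would be repainted, and the rest of the output would keep whatever was in the buffer, which would look like the black background described above.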

ehopperdietzel commented 6 months ago

I tested louvre-views on all my devices, using 2 displays, and I didn't see this bug. Can you tell me the context in which this happens? The situations where I would expect this to happen are:

  1. Compositor Initialization
  2. Display Hotplugging
  3. Returning from another TTY
  4. Maybe when using triple buffering (SRM_RENDER_MODE_ITSELF_FB_COUNT=3)

2xsaiko commented 6 months ago

It happens when starting the compositor. Switching TTYs or unplugging the display actually fixes it. And it seems to not happen with triple buffering.

ehopperdietzel commented 6 months ago

Thank you, does this happen with louvre-weston-clone as well? Maybe SRM is failing to do the first page flip. If you run louvre-views with SRM_DEBUG=3, are there any error messages related to faulty page flips?

2xsaiko commented 6 months ago

It prints nothing with SRM_DEBUG=3. I assume louvre-weston-clone has the same problem, but it shows up differently: the left monitor initially shows no time, and when it updates it is always one minute behind the right screen. It shows the correct time only while you mouse over the terminal icon in the top left on that monitor.

(this is just after starting it and doing nothing else, I haven't tried unplugging any monitor)

ehopperdietzel commented 6 months ago

It's weird because that doesn't happen to me, so I think the damage tracking should be okay. It seems like, for some reason, some of the first OpenGL calls have no effect in your case. Each time an output is initialized, a new rendering thread is created. So, maybe I should do an initial glFinish() or something similar to make sure the shared OpenGL context is fully initialized before doing rendering. I'll check that out. Thank you, by the way!
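
For context on why a glFinish() could matter here: most OpenGL calls only queue work and return immediately, while glFinish() blocks until all previously issued commands have completed. The sketch below is a toy model of that behavior using standard C++ threads. It contains no real OpenGL, SRM, or Louvre code, and `CommandStream` is a hypothetical name invented for this example.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Toy stand-in for a GPU command stream. Hypothetical; not OpenGL, SRM, or Louvre code.
class CommandStream {
public:
    CommandStream() : worker_([this] { run(); }) {}

    ~CommandStream() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        worker_.join();
    }

    // Queue a command and return immediately, like most GL calls.
    void submit(std::function<void()> cmd) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            pending_.push(std::move(cmd));
        }
        wake_.notify_all();
    }

    // Block until every queued command has executed, like glFinish().
    void finish() {
        std::unique_lock<std::mutex> lock(mutex_);
        idle_.wait(lock, [this] { return pending_.empty() && !busy_; });
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            wake_.wait(lock, [this] { return stopping_ || !pending_.empty(); });
            if (pending_.empty())
                return; // stopping, and nothing left to execute
            auto cmd = std::move(pending_.front());
            pending_.pop();
            busy_ = true;
            lock.unlock();
            cmd(); // execute outside the lock
            lock.lock();
            busy_ = false;
            idle_.notify_all();
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_, idle_;
    std::queue<std::function<void()>> pending_;
    bool busy_ = false;
    bool stopping_ = false;
    std::thread worker_; // declared last so the other members exist before it starts
};
```

In this model, code that inspects results right after `submit()` may see none of the work done yet, whereas code that calls `finish()` first always sees all of it. Whether the bug in this issue has that cause is only a hypothesis at this point in the thread.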

ehopperdietzel commented 6 months ago

I've added some glFinish() calls after initializing the backend, after output initialization, and following the rendering of the first frame in an output. Could you check if that fixes your issue? The changes are in the amd_gpu branch.

ehopperdietzel commented 6 months ago

It could be a driver issue. Perhaps AMDGPU doesn't handle multithreading properly yet.

2xsaiko commented 6 months ago

Looks like that patch fixes it, at least the original issue with louvre-views.

The issue with the clock in louvre-weston-clone stays the same though.

It might be a driver issue. Is there a way to disable multithreaded rendering right now?

ehopperdietzel commented 6 months ago

Nice! Hopefully, it was just a context/thread initialization problem. If you use louvre-views, launch apps, drag them around, and so on, do you see any black square glitches popping up? I am going to test louvre-weston-clone again. I remember seeing a similar problem with the clock, but that must be because that example does not use the scene system, so I had to implement damage tracking manually, which, of course, can lead to this kind of bug when not handled properly. Regarding your question, the answer is no: at least with the SRM backend, it is not possible to use a single thread only.

ehopperdietzel commented 6 months ago

I tested louvre-weston-clone, and the clock seems to work fine in my case. All the clocks for each display update simultaneously with the same values. Perhaps, in your case, the rendering commands may not be finishing in time when eglSwapBuffers() is called. However, according to this, calling glFinish() is not necessary. I will try adding glFlush() after rendering and before calling eglSwapBuffers() to see if it has any effect.

ehopperdietzel commented 6 months ago

I pushed the change to the amd_gpu branch.