anholt / libepoxy

Epoxy is a library for handling OpenGL function pointer management for you

1.5.0 breaks KWin #160

Closed jonnyrobbie closed 6 years ago

jonnyrobbie commented 6 years ago

Updating libepoxy from 1.4.3 to 1.5.0 breaks KWin compositing. Neither OpenGL 2 nor OpenGL 3.1 works. More info can be found here: https://bugs.kde.org/show_bug.cgi?id=391486 and https://bbs.archlinux.org/viewtopic.php?id=235021.

Is this a regression or an intentional change? It has been suggested that I report it upstream, which is here.

ebassi commented 6 years ago

Thanks for your report. Unfortunately, I don’t have an nvidia GPU using the binary blob driver, nor do I use kwin, so I’ll need your help to investigate it.

It would be great if you could bisect epoxy between the 1.4.3 and 1.5.0 tags, to identify the regression. Reading the issues you linked seems to point to glvnd, but it would be good to be sure.

Additionally, it would be good to understand what the regression actually is, i.e. why is the nvidia driver breaking when testing for the glvnd interface first.

nwnk commented 6 years ago

I've added a testing request to the bug:

https://bugs.kde.org/show_bug.cgi?id=391486#c11

ebassi commented 6 years ago

@nwnk should we add epoxyinfo to epoxy proper, as a debugging tool?

jonnyrobbie commented 6 years ago

Ok, interesting: I tried doing a git bisect. The catch is that 1.5.0 did not even build, because part of Arch's PKGBUILD is a check() routine that runs meson check.

1.5.0 failed

ninja: Entering directory `/var/cache/AUR/build/libepoxy/src/build'
ninja: no work to do.
 1/18 header_guards                           OK       0.01 s
 2/18 misc_defines                            OK       0.01 s
 3/18 khronos_typedefs                        OK       0.01 s
 4/18 egl_has_extension_nocontext             OK       0.02 s
 5/18 egl_gl                                  SKIP     0.04 s
 6/18 egl_gles1_without_glx                   SKIP     0.02 s
 7/18 egl_gles2_without_glx                   SKIP     0.01 s
 8/18 glx_beginend                            OK       0.20 s
 9/18 glx_public_api                          FAIL     0.22 s
10/18 glx_public_api_core                     FAIL     0.25 s
11/18 glx_glxgetprocaddress_nocontext         OK       0.10 s
12/18 glx_has_extension_nocontext             OK       0.09 s
13/18 glx_shared_znow                         FAIL     0.25 s
14/18 glx_alias_prefer_same_name              SKIP     0.27 s
15/18 glx_gles2                               FAIL     0.35 s
16/18 egl_and_glx_different_pointers_glx      FAIL     0.66 s
17/18 egl_and_glx_different_pointers_egl      SKIP     0.04 s
18/18 egl_and_glx_different_pointers_egl_glx  OK       0.37 s

OK:         8
FAIL:       5
SKIP:       5
TIMEOUT:    0

1.4.3 passed

ninja: Entering directory `/var/cache/AUR/build/libepoxy/src/build'
ninja: no work to do.
 1/18 header_guards                           OK       0.01 s
 2/18 misc_defines                            OK       0.01 s
 3/18 khronos_typedefs                        OK       0.01 s
 4/18 egl_has_extension_nocontext             OK       0.01 s
 5/18 egl_gl                                  SKIP     0.04 s
 6/18 egl_gles1_without_glx                   SKIP     0.01 s
 7/18 egl_gles2_without_glx                   SKIP     0.02 s
 8/18 glx_beginend                            OK       0.16 s
 9/18 glx_public_api                          OK       0.18 s
10/18 glx_public_api_core                     OK       0.26 s
11/18 glx_glxgetprocaddress_nocontext         OK       0.13 s
12/18 glx_has_extension_nocontext             OK       0.07 s
13/18 glx_shared_znow                         OK       0.24 s
14/18 glx_alias_prefer_same_name              OK       0.25 s
15/18 glx_gles2                               OK       0.25 s
16/18 egl_and_glx_different_pointers_glx      OK       0.26 s
17/18 egl_and_glx_different_pointers_egl      SKIP     0.04 s
18/18 egl_and_glx_different_pointers_egl_glx  SKIP     0.19 s

OK:        13
FAIL:       0
SKIP:       5
TIMEOUT:    0

git bisect says that the first offending commit is:

e5372a25baa9034b6223b32a0cab838c42779a39 is the first bad commit
commit e5372a25baa9034b6223b32a0cab838c42779a39
Author: Adam Jackson <ajax@redhat.com>
Date:   Thu Sep 7 17:02:22 2017 -0400

    dispatch: Fix the libOpenGL soname

    Brown-paper-bag-for: Adam Jackson <ajax@redhat.com>

:040000 040000 02efd85a11cf2cd2abac215723f6adc5fe67de40 223766d5368ca2368007f031570b8c4dfeb90f2a M      src

Full meson-logs/testlog.txt of failed check:

testlog.txt

hmhofman commented 6 years ago

Not sure if this helps any, but here it goes... Because of this (Arch BBS) report, I went to check the issues on libepoxy. Because I couldn't downgrade (I had only the latest package version on my PC), I tried reinstalling libepoxy from the AUR, because it was listed as version libepoxy-git 1.4.0.r0.g9628670-1.

It removed libepoxy (libepoxy 1.5.0-1) from the extra repository and installed libepoxy-git:

[2018-03-08 09:51] [ALPM] transaction started
[2018-03-08 09:51] [ALPM] installed xorg-util-macros (1.19.1-1)
[2018-03-08 09:51] [ALPM] installed ninja (1.8.2-1)
[2018-03-08 09:51] [ALPM] installed meson (0.45.0-1)
[2018-03-08 09:51] [ALPM] transaction completed
[2018-03-08 09:51] [ALPM] running 'systemd-update.hook'...
[2018-03-08 09:51] [ALPM] transaction started
[2018-03-08 09:51] [ALPM] removed libepoxy (1.5.0-1)
[2018-03-08 09:51] [ALPM] installed libepoxy-git (1.5.0.r1.gc28759f-1)
[2018-03-08 09:51] [ALPM] transaction completed
[2018-03-08 09:51] [ALPM] running 'systemd-update.hook'...

This seems to fix the issue at hand. Question is: What are the differences between (Arch and AUR) versions 1.5.0-1 and 1.5.0.r1 ?

Hope this can help you on your way. I'd like to install the officially supported repository version ASAP again ;)

ebassi commented 6 years ago

@hmhofman that’s a question for Arch packagers.

@jonnyrobbie could you try to do what @nwnk suggested in the kwin bug?

jonnyrobbie commented 6 years ago

@ebassi @nwnk done. epoxyinfo on libepoxy150 segfaults. The remaining three results are posted there as attachments.

nwnk commented 6 years ago

@nwnk should we add epoxyinfo to epoxy proper, as a debugging tool?

Sure. I'll put a branch together at some point if nobody beats me to it.

michalsrb commented 6 years ago

@jonnyrobbie I have bisected the kwin issue and got the same breaking commit (e5372a25baa9034b6223b32a0cab838c42779a39). Reverting it fixes the issue.

The nvidia driver provides the libOpenGL.so.0 file. I think that before it failed to open it and continued to libGL.so.1. After the change it succeeds but is somehow broken.
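To make the suspected behaviour change concrete, here is a minimal sketch of "try each soname in order, keep the first one that opens". It is not libepoxy's actual loader. `open_fn`, `open_first_available` and the two stand-in openers are hypothetical names for illustration; a real opener would wrap `dlopen(name, RTLD_LAZY | RTLD_LOCAL)`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical opener type. A real one would wrap
 * dlopen(name, RTLD_LAZY | RTLD_LOCAL); it is injectable here so the
 * fallback logic can be exercised without any GL driver installed. */
typedef void *(*open_fn)(const char *soname);

/* Try each candidate soname in order. Return the first handle that opens
 * and report the winning name through *chosen (NULL if nothing opened). */
static void *open_first_available(open_fn opener,
                                  const char *const *candidates,
                                  size_t count,
                                  const char **chosen)
{
    assert(opener != NULL && candidates != NULL && chosen != NULL);

    *chosen = NULL;
    for (size_t i = 0; i < count; i++) {
        void *handle = opener(candidates[i]);
        if (handle != NULL) {
            *chosen = candidates[i];
            return handle;
        }
    }
    return NULL;
}

static int dummy_handle; /* any non-NULL address serves as a fake handle */

/* Stand-in for a loader where the libOpenGL candidate fails to open,
 * so only libGL.so.1 succeeds. */
static void *opener_without_libopengl(const char *soname)
{
    return strcmp(soname, "libGL.so.1") == 0 ? &dummy_handle : NULL;
}

/* Stand-in for a loader where every candidate opens, including
 * libOpenGL.so.0 (as provided by the nvidia driver in this report). */
static void *opener_with_libopengl(const char *soname)
{
    (void)soname;
    return &dummy_handle;
}
```

With the candidate list `{"libOpenGL.so.0", "libGL.so.1"}`, the first stand-in ends up on `libGL.so.1` and the second on `libOpenGL.so.0`. If the guess above is right, that is how a one-line soname fix can switch which library an application actually talks to.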

jonnyrobbie commented 6 years ago

@ebassi what is the purpose of the offending commit? I have a feeling that simply reverting it is not the best option.

nwnk commented 6 years ago

The purpose is:

hyoscyamine:~% rpm -ql libglvnd-opengl
/usr/lib/.build-id
/usr/lib/.build-id/2a
/usr/lib/.build-id/2a/f15f1061e796bcd682fb2ddd7acc32cbdb1d68
/usr/lib64/libOpenGL.so.0
/usr/lib64/libOpenGL.so.0.0.0

Under glvnd we can avoid loading libGL.so.1, which we might very much like to do, because it pulls in libX11 and friends. Instead we would load libOpenGL.so (and libGLX.so only if we can tell it's a GLX rather than an EGL context), and that's what the patch attempts to do. So in that sense I think the patch is correct, and something else is going wrong elsewhere. A backtrace from epoxyinfo from the broken configuration would still be useful.
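The selection rule described in this comment can be sketched roughly as follows. This is an illustration only, not the code from the patch. `context_kind` and `pick_gl_libraries` are hypothetical names, and the `libGLX.so.0` soname is an assumption (the comment only says libGLX.so).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of the current context. */
enum context_kind { CONTEXT_UNKNOWN, CONTEXT_GLX, CONTEXT_EGL };

/* Fill out[] (capacity 2) with the sonames to load and return how many
 * were written.
 *   - Without glvnd: fall back to the monolithic libGL.so.1.
 *   - With glvnd: always libOpenGL.so.0, plus libGLX.so.0 only when the
 *     context is known to be GLX, so libX11 is not pulled in otherwise. */
static size_t pick_gl_libraries(bool have_glvnd, enum context_kind kind,
                                const char *out[2])
{
    assert(out != NULL);

    size_t n = 0;
    if (!have_glvnd) {
        out[n++] = "libGL.so.1";
        return n;
    }

    out[n++] = "libOpenGL.so.0";
    if (kind == CONTEXT_GLX)
        out[n++] = "libGLX.so.0"; /* assumed soname */
    return n;
}
```

For example, `pick_gl_libraries(true, CONTEXT_EGL, libs)` writes only `libOpenGL.so.0`, while `pick_gl_libraries(true, CONTEXT_GLX, libs)` adds the GLX library as a second entry.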

jonnyrobbie commented 6 years ago

I apologize for spreading the issue across two threads at the same time. I hope it's not too inconvenient. Anyway, here's the trace from the segfaulting epoxyinfo with libepoxy 1.5.0, created by mostly following the Arch guide.

trace.log

nwnk commented 6 years ago

That shows us getting a null GL extension string and feeding it to strstr. Arguably we shouldn't crash like that, but also a GL with a null extension string is not a thing (assuming you're not GL 1.0, and I promise you aren't), so what's really happening is the call to glGetString() is fizzling out and the "NULL" it returns is itself the problem.
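For the "we shouldn't crash like that" part, a defensive extension check might look like the sketch below. It is not epoxy's implementation, and `extension_in_string` is a hypothetical helper name. It treats a NULL list, such as the one a failed `glGetString(GL_EXTENSIONS)` returns, as "no extensions" instead of handing it to `strstr`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if `ext` appears as a whole space-separated token in
 * `extension_list`. A NULL list is reported as "extension absent"
 * rather than being passed to strstr(), which would crash. */
static bool extension_in_string(const char *extension_list, const char *ext)
{
    assert(ext != NULL && *ext != '\0');

    if (extension_list == NULL)
        return false;

    size_t len = strlen(ext);
    const char *p = extension_list;

    while ((p = strstr(p, ext)) != NULL) {
        /* Accept only matches bounded by spaces or the string ends, so a
         * name that is a prefix of another extension does not match. */
        bool starts_token = (p == extension_list) || (p[-1] == ' ');
        bool ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return true;
        p += len;
    }
    return false;
}
```

A guard like this only hides the symptom. As noted above, the NULL coming back from glGetString() is the actual problem to track down.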

I'm not entirely sure why that would happen, offhand. I'll try to come up with either another test or some trace code for epoxy itself.

LW-archlinux commented 6 years ago

Question is: What are the differences between (Arch and AUR) versions 1.5.0-1 and 1.5.0.r1 ?

The AUR libepoxy-git package doesn't run any tests. There are also small differences in the meson setup between the two packages; link-time optimization (used in libepoxy 1.5.0) is the biggest one.

NuLogicSystems commented 6 years ago

I'm having a similar issue on an Intel gpu using the open source drivers. Could this alternatively be caused by a bug in libglvnd 1.0.0-1?

nwnk commented 6 years ago

If someone experiencing this problem can test this patch, it would be much appreciated:

https://github.com/nwnk/libepoxy/commit/a8c3faaa1990d98047e3c566409200604105fa9c

hmhofman commented 6 years ago

@nwnk :+1: I cloned the current master branch (https://github.com/anholt/libepoxy.git), applied your patch by hand (just to make sure that the patch is the only thing changed), and ran the install commands. It does not take effect on the fly, so I rebooted the system. Now it seems to work.

Small side note: the nvidia-340xx drivers have also been updated on my system, and this was the first boot since. So this might not prove to be the full (or only) solution, but it might be. At least it DOES work on my system: the KDE KWin compositor now runs on both OpenGL 3.1 and OpenGL 2.0.

Here's my system:

Arch Linux
KDE Plasma: 5.12.4
KDE Frameworks: 5.44.0
Qt: 5.10.1
Kernel: 4.15.15-1-ARCH
Type OS: 64-bit

4x Intel Core i5-4430 CPU @3.00GHz
15.5 GiB RAM

2560 x 1024 pixels (765 x 302 mm)
85 x 86 dpi
Depth: 24, 1, 4, 8, 15, 16, 32

OpenGL (GLX & EGL)
NVIDIA Corporation
NVidia GT218 (GeForce 210)
GeForce 210/PCIe/SSE2
3.3.0 NVIDIA 340.106

hwinfo.txt glxinfo.txt

hmhofman commented 6 years ago

@nwnk Spoke too soon. While kwin/compositor does not crash anymore, some functions do not work. These include (but are not limited to)

Could it be that the compositor defaults back to XRender even though it says it is using OpenGL ? Before applying this patch, the compositor would crash on OpenGL. So for the last couple of weeks I was running XRender.

nwnk commented 6 years ago

Can you try this branch?

https://github.com/nwnk/libepoxy/tree/even-more-gentle-glx-detection

ebassi commented 6 years ago

The branch in question was merged.

No comment in 6 months ⇒ closing.