dcommander closed this issue 8 years ago.
This is definitely related to LD_PRELOAD. What seems to be happening is that Steam adds ~/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so:~/.local/share/Steam/ubuntu12_64/gameoverlayrenderer.so to the existing LD_PRELOAD variable set by vglrun, so the resulting LD_PRELOAD variable is:

```
libdlfaker.so:libvglfaker.so:~/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so:~/.local/share/Steam/ubuntu12_64/gameoverlayrenderer.so
```
Because I have no access to the gameoverlayrenderer.so source, I'm not sure exactly what that interposer is doing, but in my testing, it is clear that gameoverlayrenderer.so has to be preloaded ahead of VirtualGL for things to work properly. For instance, if I edit ~/.local/share/Steam/steamapps/common/dota 2 beta/game/dota.sh and add

```
export LD_PRELOAD=~/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so:~/.local/share/Steam/ubuntu12_64/gameoverlayrenderer.so:libdlfaker.so:libvglfaker.so
```

to the top, I can make Dota 2 launch and play correctly. This exposes a second issue whereby the game locks up when exiting (still investigating that).
As far as I can tell, the LD_PRELOAD variable is modified within some non-hackable part of Steam. I think a good argument could be made that this is incorrect behavior on Steam's part. gameoverlayrenderer.so should be as close to the application in the preload order as possible, and other interposers should be placed after it.
Still investigating how I might be able to work around this within VirtualGL, but I wanted to share my findings thus far. Note that I also tried disabling the in-game overlay, in hopes that that would make Steam stop trying to preload gameoverlayrenderer.so, but no such luck. :|
Update:

The underlying cause of these problems is that gameoverlayrenderer.so interposes dlsym(), glXGetProcAddress(), and glXGetProcAddressARB(), thus creating a situation similar to the one that used to exist when the nVidia 180.xx drivers were interposing dlsym() (refer to https://sourceforge.net/p/virtualgl/mailman/message/22260702 and https://sourceforge.net/p/virtualgl/mailman/message/21377897 for historical context). However, in this case, setting VGL_GLLIB doesn't work around it. gameoverlayrenderer.so is closed-source, so it's difficult to ascertain exactly what's happening, but this seems to be an approximation:
- Normally, VirtualGL obtains pointers to the "real" versions of interposed functions, such as glXGetProcAddress() or glXGetProcAddressARB(), using dlsym(RTLD_NEXT, ...). Because VirtualGL itself interposes glXGetProcAddress() and glXGetProcAddressARB(), it has to use dlopen()/dlsym() to obtain pointers to the "real" versions of those functions. VGL subsequently calls the "real" glXGetProcAddress[ARB]() function to obtain pointers to the "real" versions of other GLX/OpenGL functions. It does the latter because certain libGL implementations (specifically, this was known to be the case with the AMD Catalyst drivers, although I'm not sure if it still is) failed to properly expose some of the OpenGL PBO functions in the LD version script for libGL.so.1, so the only way to invoke those functions was by loading them using glXGetProcAddress[ARB]().
- Because gameoverlayrenderer.so interposes dlsym(), VirtualGL's call to dlsym(RTLD_NEXT, "glXGetProcAddress[ARB]") returns a pointer to gameoverlayrenderer.so's interposed version of glXGetProcAddress[ARB](). Even if gameoverlayrenderer.so did not interpose dlsym(), this would still occur, because it is the next library in the dynamic link order after libvglfaker.so.
- VirtualGL then calls what it believes to be the "real" glXGetProcAddress[ARB]() to obtain a pointer to the "real" version of each GLX/OpenGL function from libGL. However, VirtualGL's pointer to glXGetProcAddress[ARB]() is actually pointing to the interposed version of that function in gameoverlayrenderer.so.
- The interposed version of glXGetProcAddress[ARB]() in gameoverlayrenderer.so sees that it doesn't interpose the function that VirtualGL is requesting, so it apparently uses its own version of dlsym() to try to obtain a pointer to the "real" function from libGL. For reasons that aren't fully understood, gameoverlayrenderer.so's version of dlsym() returns VirtualGL's interposed version of the requested GLX/OpenGL function instead. My best guess is that gameoverlayrenderer.so's version of dlsym() is not using the RTLD_NEXT handle like it should.

Things I tried:
- Setting VGL_GLLIB=libGL.so.1, which causes VirtualGL to load GLX/OpenGL function symbols directly from libGL.so.1 rather than from the next library in the dynamic link order. In this case, when VirtualGL calls dlsym() to obtain the address of glXGetProcAddress[ARB](), gameoverlayrenderer.so intercepts the dlsym() call. A problem similar to the above occurs, whereby gameoverlayrenderer.so's version of dlsym() returns a pointer to VirtualGL's interposed version of glXGetProcAddress[ARB](), for reasons that aren't fully understood.
- Modifying vglrun to add an option (-ldafter) that adds the VGL fakers to the end of the existing LD_PRELOAD environment variable rather than the beginning. I had hoped that, with this modification, I could invoke LD_PRELOAD=~/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so:~/.local/share/Steam/ubuntu12_64/gameoverlayrenderer.so vglrun -ldafter steam and achieve a similar effect to the one I achieved by manually modifying the game launch scripts. However, Steam still adds ~/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so:~/.local/share/Steam/ubuntu12_64/gameoverlayrenderer.so to the end of the LD_PRELOAD variable, and things still go awry (because, since gameoverlayrenderer.so is still specified after the VGL fakers, RTLD_NEXT still picks up symbols from that library instead of libGL). This wouldn't have been a particularly good idea anyhow, since it would have caused gameoverlayrenderer.so to be preloaded into the entire Steam process rather than just the games.
- Modifying VirtualGL such that, if its symbol loader called glXGetProcAddress[ARB]() and, in turn, gameoverlayrenderer.so's version of glXGetProcAddress[ARB]() called back VirtualGL's version of glXGetProcAddress[ARB](), VirtualGL would return the "real" OpenGL/GLX symbols rather than the fake ones. However, this revealed that gameoverlayrenderer.so is never actually calling VGL's version of glXGetProcAddress[ARB](). It is apparently implementing its version of glXGetProcAddress[ARB]() using its own (presumably broken) version of dlsym().
- Avoiding the use of glXGetProcAddress[ARB]() altogether in the VGL symbol loader, i.e. using the Solaris code in faker-sym.cpp, which works more similarly to the function loader in VGL 2.4.x. This allows the program to run up until the first call to glXSwapBuffers(), but apparently VirtualGL's call to the "real" glXSwapBuffers() function is intercepted by gameoverlayrenderer.so. Its version of glXSwapBuffers() doesn't like being passed a Pbuffer handle and throws a GLX error, and for reasons not well understood, when gameoverlayrenderer.so throws this error, a segfault occurs:
```
#0  0x00000000 in ?? ()
#1  0xf7666856 in ?? ()
    from /home/drc/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so
#2  0xf74b47c6 in _XError () from /lib/libX11.so.6
#3  0xf74b0ff6 in handle_error () from /lib/libX11.so.6
#4  0xf74b25c0 in _XReply () from /lib/libX11.so.6
#5  0xf74a7cf2 in XQueryTree () from /lib/libX11.so.6
#6  0xf766340a in ?? ()
    from /home/drc/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so
#7  0xf765a177 in glXSwapBuffers ()
    from /home/drc/.local/share/Steam/ubuntu12_32/gameoverlayrenderer.so
#8  0xf76c8317 in _glXSwapBuffers (drawable=23068674, dpy=0x94ac058)
    at /home/drc/src/vglhead/server/faker-sym.h:368
#9  vglserver::VirtualDrawable::OGLDrawable::swap (this=0x95454d0)
    at /home/drc/src/vglhead/server/VirtualDrawable.cpp:185
```
Setting VGL_TRAPX11=1 works around the segfault and allows execution to continue, but VirtualGL traps and reports numerous BadWindow and BadDrawable errors from gameoverlayrenderer.so. It does at least appear that the buffer is being swapped, despite these errors being thrown, but (predictably) the in-game overlay doesn't display.

NOTE: Trying this approach with VGL_GLLIB=libGL.so.1 doesn't work because, per above, any call to gameoverlayrenderer.so's interposed version of dlsym() will return a pointer to VirtualGL's interposed version of the requested function unless RTLD_NEXT is specified as the handle.
Workarounds:

- Editing the game launch scripts such that LD_PRELOAD is set to a new value which places the VGL fakers after gameoverlayrenderer.so. Problems: this has to be done for each individual game.
- Adding a new environment variable (VGL_DLSYM) that, when set to 1, will cause VGL to use the aforementioned Solaris code path, thus avoiding the use of glXGetProcAddress[ARB]() to load GLX/OpenGL symbols. This, in combination with VGL_TRAPX11=1, allows Steam games to run, but the in-game overlay doesn't work properly with this method. gameoverlayrenderer.so was designed to work properly when it intercepts calls specifically made by the Steam game, and VirtualGL was designed to work properly when it makes calls directly to libGL. In short, the only approach that allows all of the features in Steam to fully work is to reverse the LD_PRELOAD order set within Steam, per above.

Note that I can no longer reproduce the lock-up when exiting Dota. That may have been due to my own error.
Longer-term, my best advice would be to encourage Valve to change the order of LD_PRELOAD so that gameoverlayrenderer.so is put ahead of any other interposers, or to allow such behavior to be configured with an environment variable. It seems that we're not the only ones having problems with gameoverlayrenderer.so (https://github.com/GhostSquad57/Steam-Installer-for-Wheezy/issues/37).
NOTE: A better workaround is to set the launch options for each game to:

```
LD_PRELOAD="${LD_PRELOAD/libdlfaker.so:libvglfaker.so:/}:libdlfaker.so:libvglfaker.so" %command%
```
This at least allows the issue to be worked around without hacking the individual launch scripts, and it works for games without launch scripts. However, it's still not particularly user-friendly.
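For reference, the effect of that substitution can be checked outside of Steam (using a made-up path in place of Steam's gameoverlayrenderer.so entries):

```shell
# Simulated value after vglrun prepends the fakers and Steam appends its overlay:
LD_PRELOAD="libdlfaker.so:libvglfaker.so:/fake/path/gameoverlayrenderer.so"

# ${VAR/pattern/} (a bash extension) deletes the first match of the pattern,
# stripping the fakers from the front; they are then re-appended at the end.
LD_PRELOAD="${LD_PRELOAD/libdlfaker.so:libvglfaker.so:/}:libdlfaker.so:libvglfaker.so"

echo "$LD_PRELOAD"
# -> /fake/path/gameoverlayrenderer.so:libdlfaker.so:libvglfaker.so
```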
NOTE: Unfortunately, I'm now seeing the lock-up again when exiting games. :(
Ubuntu 16.04 can't use this workaround because of a bash issue. A workaround for Ubuntu 16.04 is to set the launch options to:

```
LD_PRELOAD="${LD_PRELOAD#libdlfaker.so:libvglfaker.so:}:libdlfaker.so:libvglfaker.so" %command%
```
This launch option works well for dota2.
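The difference is the expansion operator: ${VAR#pattern} is POSIX prefix removal (supported by non-bash shells such as dash), whereas ${VAR/pattern/} is a bash extension. A quick check with a made-up overlay path:

```shell
# Simulated value after vglrun prepends the fakers and Steam appends its overlay:
LD_PRELOAD="libdlfaker.so:libvglfaker.so:/fake/path/gameoverlayrenderer.so"

# ${VAR#pattern} strips the fakers only if they appear at the very front of
# the value; they are then re-appended after Steam's overlay library.
LD_PRELOAD="${LD_PRELOAD#libdlfaker.so:libvglfaker.so:}:libdlfaker.so:libvglfaker.so"

echo "$LD_PRELOAD"
# -> /fake/path/gameoverlayrenderer.so:libdlfaker.so:libvglfaker.so
```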
Refer to this thread. Specifically, this has been tested under Fedora 22 with the July 8, 2016 build of Steam (installed from the Fedora DNF repository) but probably affects all Linux platforms. It has specifically been tested with Dota 2 but probably affects most or all Steam games.
I've reproduced the problem, but I'm clueless as to what's causing it. Symptomatically, what's happening is that, when the VGL faker attempts to load a symbol from libGL using dlsym(), dlsym() returns the interposed symbol from the VGL faker instead. No idea why, but it seems that Steam is somehow interfering with VGL's function dispatching mechanism. When I've seen such problems in the past with other applications, I was able to work around them by setting VGL_GLLIB=/usr/lib64/libGL.so.1 or VGL_GLLIB=/usr/lib/libGL.so.1, which forces VirtualGL to load the "real" OpenGL functions directly from the underlying OpenGL library instead of relying on the dynamic loader to pick those symbols from the next library in the search order. That doesn't work with Steam, however.

I've spent 15 hours of unpaid labor and am unfortunately no closer to solving this. I give up.