Closed felixdoerre closed 6 years ago
@cyberconan Let's continue here so we don't spam the Bumblebee issue. I think your problem is that you use /usr/lib/libGLX_nvidia.so.0 as your nvDriver. Do you have a libGL.so.1 provided by nvidia? When I use libGLX (/usr/lib/x86_64-linux-gnu/nvidia/current/libGLX_nvidia.so.0 under Debian), the game that I run with primus-vk also crashes.
Hello! Good idea to continue here. I tried /usr/lib/libGL.so.1 before, but then the error occurs even earlier, when I try to switch from OpenGL to Vulkan.
[dom sep 23 05:42:05 2018] dolphin-emu[3906]: segfault at 0 ip 0000000000000000 sp 00007fffc8f016c8 error 14 in dolphin-emu[5564de992000+e6000]
[dom sep 23 05:42:05 2018] Code: Bad RIP value.
[dom sep 23 05:42:05 2018] audit: type=1701 audit(1537674126.304:87): auid=1000 uid=1000 gid=1000 ses=1 pid=3906 comm="dolphin-emu" exe="/usr/bin/dolphin-emu" sig=11 res=1
[dom sep 23 05:42:05 2018] audit: type=1130 audit(1537674126.321:88): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@4-3942-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[dom sep 23 05:42:06 2018] nvidia-modeset: Unloading
[dom sep 23 05:42:06 2018] nvidia-nvlink: Unregistered the Nvlink Core, major device number 238
[dom sep 23 05:42:06 2018] bbswitch: disabling discrete graphics
[dom sep 23 05:42:06 2018] pci 0000:03:00.0: Refused to change power state, currently in D0
[dom sep 23 05:42:06 2018] audit: type=1131 audit(1537674127.117:89): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@4-3942-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
For quick debugging, I tried vulkaninfo and vulkan-smoketest, with identical results:
[dom sep 23 12:35:01 2018] vulkaninfo[10046]: segfault at 0 ip 0000000000000000 sp 00007ffea0c45b28 error 14 in vulkaninfo[55c6bd343000+1b000]
[dom sep 23 12:35:01 2018] Code: Bad RIP value.
[dom sep 23 12:35:01 2018] audit: type=1701 audit(1537698901.390:77): auid=1000 uid=1000 gid=1000 ses=1 pid=10046 comm="vulkaninfo" exe="/home/karl/Software/sources/primus_vk-master/vulkaninfo" sig=11 res=1
[dom sep 23 12:35:01 2018] audit: type=1130 audit(1537698901.400:78): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@5-10047-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[dom sep 23 12:35:01 2018] audit: type=1131 audit(1537698901.627:79): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@5-10047-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[dom sep 23 12:35:09 2018] vulkan-smoketes[10056]: segfault at 0 ip 0000000000000000 sp 00007ffe979d8a78 error 14 in vulkan-smoketest[55af8e516000+2b000]
[dom sep 23 12:35:09 2018] Code: Bad RIP value.
[dom sep 23 12:35:09 2018] audit: type=1701 audit(1537698909.303:80): auid=1000 uid=1000 gid=1000 ses=1 pid=10056 comm="vulkan-smoketes" exe="/home/karl/Software/sources/primus_vk-master/vulkan-smoketest" sig=11 res=1
[dom sep 23 12:35:09 2018] audit: type=1130 audit(1537698909.313:81): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@6-10057-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[dom sep 23 12:35:09 2018] audit: type=1131 audit(1537698909.907:82): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@6-10057-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
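The segfault lines above can be read mechanically. The following is a minimal sketch, assuming the usual x86-64 kernel message format in which the trailing error code is printed in hexadecimal; the helper name decode_segfault is hypothetical. With ip 0 and error 14 (0x14), it reports a user-mode instruction fetch, which is what a call through a NULL function pointer would look like.

```python
import re

# x86 page-fault error-code bits. Assumption: the kernel prints the code in hex.
PF_BITS = {
    0x01: "protection violation (page was present)",
    0x02: "write access",
    0x04: "user mode",
    0x08: "reserved bit set in page table",
    0x10: "instruction fetch",
}

SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_segfault(line):
    """Return the fields of a dmesg segfault line, or None if it is another kind of line."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        return None
    err = int(match.group("err"), 16)
    return {
        "comm": match.group("comm"),
        "pid": int(match.group("pid")),
        "addr": int(match.group("addr"), 16),
        "ip": int(match.group("ip"), 16),
        # Names of all error-code bits that are set, lowest bit first.
        "flags": [name for bit, name in PF_BITS.items() if err & bit],
    }
```

An ip of 0 only says that the jump target was NULL; it does not say which library held the bad pointer, so a backtrace is still needed.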
This bug report discusses using libGLX_nvidia.so.0 in nvidia_icd.json: https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers/+bug/1725236
With libGLX I get debug output from vulkaninfo (not with libGL):
Vulkan Instance Version: 1.1.70
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_parameter_validation.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_object_tracker.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_core_validation.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_unique_objects.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_threading.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_standard_validation.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/icd.d/nvidia_icd.json - The file has already been read once
PrimusVK: CreateInstance
PrimusVK: Getting devices
PrimusVK: 0x55eb0ebe3840: PrimusVK: got render!
PrimusVK: Device: GeForce 840M
PrimusVK: Type: 2
/build/vulkan-Kbdbga/vulkan-1.1.70+dfsg1/demos/vulkaninfo.c:770: failed with VK_ERROR_INITIALIZATION_FAILED
OK, so we should get vulkaninfo/vulkan-smoketest running first. It looks like the problem is not related to dolphin-emu.
Here are my next two guesses, which you could check to narrow down possible causes:
Do vulkaninfo/vulkan-smoketest run without primus-vk?
Do you have the mesa-vulkan drivers installed (i.e. /usr/share/vulkan/icd.d/intel_icd.x86_64.json)?
From the vulkaninfo output that you gave, it seems that primus-vk can't find the internal device, so the VkDevice object for the internal device can't be created.
This is the output with vulkan-intel:
===========
VULKAN INFO
===========
Vulkan Instance Version: 1.1.70
Xlib: extension "NV-GLX" missing on display ":0".
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_parameter_validation.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_object_tracker.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_core_validation.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_unique_objects.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_threading.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_standard_validation.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/icd.d/intel_icd.x86_64.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/icd.d/nvidia_icd.json - The file has already been read once
INTEL-MESA: warning: Haswell Vulkan support is incomplete
Instance Extensions:
====================
Instance Extensions count = 17
VK_KHR_device_group_creation : extension revision 1
VK_KHR_external_fence_capabilities : extension revision 1
VK_KHR_external_memory_capabilities : extension revision 1
VK_KHR_external_semaphore_capabilities: extension revision 1
VK_KHR_get_display_properties2 : extension revision 1
VK_KHR_get_physical_device_properties2: extension revision 1
VK_KHR_get_surface_capabilities2 : extension revision 1
VK_KHR_surface : extension revision 25
VK_KHR_wayland_surface : extension revision 6
VK_KHR_xcb_surface : extension revision 6
VK_KHR_xlib_surface : extension revision 6
VK_KHR_display : extension revision 23
VK_EXT_acquire_xlib_display : extension revision 1
VK_EXT_debug_report : extension revision 8
VK_EXT_direct_mode_display : extension revision 1
...
And this is the output if I remove mesa-vulkan:
===========
VULKAN INFO
===========
Vulkan Instance Version: 1.1.70
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_parameter_validation.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_object_tracker.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_core_validation.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_unique_objects.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_threading.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_standard_validation.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/icd.d/nvidia_icd.json - The file has already been read once
Xlib: extension "NV-GLX" missing on display ":0".
/build/vulkan-Kbdbga/vulkan-1.1.70+dfsg1/demos/vulkaninfo.c:2700: failed with VK_ERROR_INITIALIZATION_FAILED
And of course, with vulkan-intel I can run vulkan-smoketest.
PS: With both the Intel and the NVIDIA drivers installed, the output is this:
$ ENABLE_PRIMUS_LAYER=1 optirun ./vulkaninfo | more
===========
VULKAN INFO
===========
Vulkan Instance Version: 1.1.70
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_parameter_validation.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_object_tracker.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_core_validation.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_unique_objects.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_threading.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/explicit_layer.d/VkLayer_standard_validation.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/icd.d/intel_icd.x86_64.json - The file has already been read once
WARNING: [Loader Message] Code 0 : Skipping manifest file /usr/share/vulkan/icd.d/nvidia_icd.json - The file has already been read once
PrimusVK: CreateInstance
PrimusVK: Getting devices
INTEL-MESA: warning: Haswell Vulkan support is incomplete
PrimusVK: 0x55b16d696ce0: PrimusVK: got render!
PrimusVK: Device: GeForce 840M
PrimusVK: Type: 2
PrimusVK: 0x55b16d696ce0: PrimusVK: got display!
PrimusVK: Device: Intel(R) Haswell Mobile
PrimusVK: Type: 1
PrimusVK: 0x55b16d696ce0 --> 0x55b16d696ce0
Instance Extensions:
====================
Instance Extensions count = 17
VK_KHR_device_group_creation : extension revision 1
VK_KHR_external_fence_capabilities : extension revision 1
VK_KHR_external_memory_capabilities : extension revision 1
So that looks good.
We need a configuration where vulkaninfo shows both GPUs.
So, with libGLX_nvidia set in libnv_vulkan_wrapper.so: does optirun vulkaninfo | grep "^GPU id" show both GPUs?
What about optirun env DISPLAY=:8 vulkaninfo | grep "^GPU id"? Does that show both GPUs?
With libGL.so there are no results, but with libGLX I get these results for optirun vulkaninfo (both with and without DISPLAY=:8):
INTEL-MESA: warning: Haswell Vulkan support is incomplete
GPU id : 0 (GeForce 840M)
GPU id : 1 (Intel(R) Haswell Mobile)
With ENABLE_PRIMUS_LAYER=1 optirun env DISPLAY=:8 ./vulkaninfo | grep "^GPU id"
INTEL-MESA: warning: Haswell Vulkan support is incomplete
GPU id : 0 (GeForce 840M)
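The "does it show both GPUs" check can also be scripted. This is a small sketch that assumes the "GPU id : N (name)" line format shown in this thread; list_gpus and has_both_gpus are hypothetical helper names, and the input is whatever text vulkaninfo printed.

```python
import re

# Matches lines such as "GPU id : 1 (Intel(R) Haswell Mobile)".
GPU_RE = re.compile(r"^GPU id\s*:\s*(\d+)\s*\((.*)\)\s*$", re.MULTILINE)


def list_gpus(vulkaninfo_output):
    """Map GPU id to device name; repeated lines for the same id collapse."""
    gpus = {}
    for gpu_id, name in GPU_RE.findall(vulkaninfo_output):
        gpus[int(gpu_id)] = name
    return gpus


def has_both_gpus(vulkaninfo_output):
    """True if at least two distinct GPU ids are listed."""
    return len(list_gpus(vulkaninfo_output)) >= 2
```

With the primus layer enabled, a single listed GPU is the expected outcome in the runs above, so the check is mainly useful for the plain optirun case.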
And if I try ENABLE_PRIMUS_LAYER=1 optirun env DISPLAY=:8 ./vulkan-smoketest
INTEL-MESA: warning: Haswell Vulkan support is incomplete
2339 presents in 5.00123 seconds (FPS: 467.685)
2399 presents in 5.00129 seconds (FPS: 479.677)
And with ./vulkan-smoketest
Xlib: extension "NV-GLX" missing on display ":0".
INTEL-MESA: warning: Haswell Vulkan support is incomplete
41 presents in 5.03435 seconds (FPS: 8.14405)
40 presents in 5.08266 seconds (FPS: 7.8699)
That all looks fine as well. Does vulkan-smoketest work with ENABLE_PRIMUS_LAYER=1 optirun ./vulkan-smoketest (it does with libGLX on my system, but the more complex game I run with wine does not)?
I see two possibilities for debugging this further:
So: could you run dolphin-emu with gdb and inspect/post a stack trace of where the segfault appears?
The vulkan-smoketest works, but I can't see anything with DISPLAY=:8, only the FPS counter in the terminal. As for dolphin-emu, I'm using the native Linux version, not the Windows one.
Yes, but when you leave out DISPLAY=:8, i.e. run ENABLE_PRIMUS_LAYER=1 optirun ./vulkan-smoketest, does it display the application, or does it crash/abort?
About dolphin-emu: Yes. But it is still interesting to know in which shared library/call stack the Linux application crashes. Of course it will not contain winex11/libwine, but is it crashing in the Vulkan path or in the GL path? Is it crashing while initializing or while doing something else?
When I launch vulkan-smoketest without DISPLAY=:8, it crashes with no console errors, only the dmesg info posted previously.
As for dolphin, I attached the debug output I see when I try to launch a game, before the crash. debug.txt
Can you run vulkan-smoketest with gdb to provide a backtrace? I.e. run ENABLE_PRIMUS_LAYER=1 optirun gdb vulkan-smoketest, enter run, and when gdb detects a segfault, enter bt to print a backtrace.
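The interactive gdb steps can also be run non-interactively. The sketch below only builds the command line and environment; it assumes gdb's standard -batch, -ex and --args options, and the helper names gdb_backtrace_command and primus_env are hypothetical.

```python
import os


def gdb_backtrace_command(program, wrapper=("optirun",)):
    """Build a command that runs the program under gdb and prints a backtrace when it stops."""
    return [*wrapper, "gdb", "-batch", "-ex", "run", "-ex", "bt", "--args", program]


def primus_env(base_env=None):
    """Return a copy of the environment with the primus-vk layer enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["ENABLE_PRIMUS_LAYER"] = "1"
    return env
```

It could then be started with subprocess.run(gdb_backtrace_command("vulkan-smoketest"), env=primus_env()). If the program exits normally instead of crashing, gdb has no stack to print.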
Another, independent idea: when I select libGLX_nvidia in nv_vulkan_wrapper and have an application that uses both OpenGL and Vulkan (at least for discovery), I need to force OpenGL-Primus to also load libGLX_nvidia as the backend for the nvidia driver. I do this with PRIMUS_libGLa=/usr/lib/libGLX_nvidia.so.0. So can you try ENABLE_PRIMUS_LAYER=1 PRIMUS_libGLa=/usr/lib/libGLX_nvidia.so.0 optirun vulkan-smoketest (or the same with dolphin-emu)?
To get a better trace, can you attach the output of dolphin-emu with VK_LOADER_DEBUG=info,warn,error LD_DEBUG=libs? That should show which dynamic libraries are loaded and what the vulkan-loader thinks/complains about.
gdb output is the same with or without PRIMUS_libGLa:
#0 0x000055555556a3c5 in ?? ()
#1 0x000055555556c9db in ?? ()
#2 0x00005555555710db in ?? ()
#3 0x000055555555831d in ?? ()
#4 0x00007ffff78a3223 in __libc_start_main () from /usr/lib/libc.so.6
#5 0x000055555555849a in _start ()
As for the last trace, it is a very, very large file ;-) debug2.txt Edit: Sorry, in the previous debug file, dolphin was configured for OpenGL. This new one is with Vulkan and nVidia selected in dolphin's graphics options.
From the trace:
2328: calling init: /usr/lib/libGLX_nvidia.so.0
2328: /usr/lib/libpthread.so.0: error: symbol lookup error: undefined symbol: pthread_setname_np, version GLIBC_2.2.5 (fatal)
2328: calling init: /usr/lib/libGLX.so.0
2328: calling init: /usr/lib/libGL.so.1
we can see that after the primus-libGL is loaded, primus loads libGLX. However directly after that the nvidia libGL is loaded. That seems really strange.
In debug2.txt I can't see that vulkan is loaded at all. There is no reference to libvulkan in that trace. The stack trace looks like this is a segfault in dolphin and not in primus-vk or the display drivers, so now would be the time to look for a dolphin build with debug symbols...
However, if vulkan-smoketest still crashes (?), let's focus on that. Can you also see (with gdb) that it crashes in vulkan-smoketest? Do you have debug symbols there?
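Searching a large LD_DEBUG=libs trace for a library can be automated. This is a rough sketch that assumes the "PID: calling init: PATH" line format quoted from the trace above; initialized_libraries and loads_library are hypothetical names. The dynamic linker may not print such a line for a library that has no initializers, so a missing entry is a hint rather than proof.

```python
import os
import re

# Matches lines such as "2328: calling init: /usr/lib/libGL.so.1".
INIT_RE = re.compile(r"^\s*\d+:\s*calling init: (\S+)\s*$", re.MULTILINE)


def initialized_libraries(ld_debug_output):
    """Return library paths in the order their initializers were called."""
    return INIT_RE.findall(ld_debug_output)


def loads_library(ld_debug_output, name_prefix):
    """True if any initialized library's file name starts with name_prefix."""
    return any(
        os.path.basename(path).startswith(name_prefix)
        for path in initialized_libraries(ld_debug_output)
    )
```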
Hi! I have found another test program with more internal debug output. Instead of crashing, it shows a black window:
$ ENABLE_PRIMUS_LAYER=1 optirun cube
PrimusVK: CreateInstance
PrimusVK: Getting devices
INTEL-MESA: warning: Haswell Vulkan support is incomplete
PrimusVK: 0x55c1e681a870: PrimusVK: got render!
PrimusVK: Device: GeForce 840M
PrimusVK: Type: 2
PrimusVK: 0x55c1e681a870: PrimusVK: got display!
PrimusVK: Device: Intel(R) Haswell Mobile
PrimusVK: Type: 1
PrimusVK: 0x55c1e681a870 --> 0x55c1e681a870
Support: e681a870d, 1
Support: e681a870d, 1
PrimusVK: in function: creating device
PrimusVK: spawning secondary device creation
PrimusVK: Thread running
PrimusVK: getting rendering suff: 0x55c1e681a870
PrimusVK: fetching dispatch for 0x55c1e6b92d70
PrimusVK: Create Swapchain KHR is: 0x7f4b445ef580
PrimusVK: CreateDevice done
PrimusVK: Gpus: 2
PrimusVK: phys[1]: 0x55c1e6b8d5e0
PrimusVK: render queues: 1
PrimusVK: flags: 7
PrimusVK: Creating Graphics: 0,
PrimusVK: in function: creating device
PrimusVK: joining secondary device creation
Then I found this and tried it. Now Intel is ignored, but it stops with a new error message and exits without opening any window:
VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/nvidia_icd.json ENABLE_PRIMUS_LAYER=1 optirun cube
PrimusVK: CreateInstance
PrimusVK: Getting devices
PrimusVK: 0x55c3aa843be0: PrimusVK: got render!
PrimusVK: Device: GeForce 840M
PrimusVK: Type: 2
vkCreateInstance failed.
Do you have a compatible Vulkan installable client driver (ICD) installed?
Please look at the Getting Started guide for additional information.
The last library loaded is libprimus_vk.so. I have attached debug3 (with ICD variable) and debug4 (no ICD variable). Does it give you any clue? debug3.txt debug4.txt
Thank you very much for the trace files.
The (updated) debug2.txt shows that primus-vk cannot be loaded into dolphin-emu because it loads libvulkan with RTLD_LOCAL (see https://github.com/dolphin-emu/dolphin/blob/master/Source/Core/VideoBackends/Vulkan/VulkanLoader.cpp#L120).
This is currently listed in the repository's README.md as 4.II, as wine has the same problem. I enhanced primus-vk with a workaround so that this problem should not occur anymore --> Please update to the current master and try to run dolphin-emu again with traces (the same way you created debug2.txt).
The ICD variable disables the intel driver. To use primus-vk, however, we need both drivers, the intel and the nvidia driver, working. So it is not surprising that debug3.txt shows a crash. With the new master, the error message that kills the program should be improved to print out the problem.
debug4.txt shows the initialization deadlock (issue #3). This problem currently cannot be avoided, so the program has to be re-run until you are lucky and the deadlock does not occur. I am trying to find a real solution to this problem, but I haven't received any response yet on the corresponding issue from the maintainers of Vulkan-Loader.
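Until there is a real fix, the "re-run until it does not deadlock" step can be scripted for short test programs. This is only a sketch of that workaround, not part of primus-vk: run_until_no_hang is a hypothetical helper that treats any run exceeding a timeout as a hang, so it fits tools like vulkaninfo but not a game that is meant to keep running.

```python
import subprocess


def run_until_no_hang(cmd, timeout_s, max_attempts=5):
    """Re-run cmd until one attempt finishes within timeout_s seconds.

    Returns (attempt_number, return_code) for the first attempt that
    finished, or None if every attempt timed out.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            # subprocess.run kills the child process when the timeout expires.
            result = subprocess.run(cmd, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            continue
        return attempt, result.returncode
    return None
```

When the command goes through a wrapper such as optirun, only the direct child is killed on timeout, so leftover processes may need to be cleaned up by hand.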
Excellent! You've got it! Now I can run and see the smoketest!! debug5.txt As for dolphin-emu, I can now launch a game; the FPS is lower than with OpenGL, but Vulkan works!!
There is a problem when I change to fullscreen.
Here is the debug file with the new primus_vk.cpp, launching a game in fullscreen and later switching to windowed mode. debug6.txt
Great! Thanks for debugging this with me for so long.
The FPS seems to be capped at 30, and that time is spent in the memcpy. Yes, this is still not optimal :). However, I expect the FPS not to go down that fast with higher graphics settings, as I think the bottleneck is the transfer and not general GPU usage.
Regarding the broken image: I was able to reproduce this issue (just by resizing vulkan-smoketest often). Vulkan can allocate images in a non-contiguous region (where there is space between the image rows). I was lazy and didn't implement the transfer correctly for that case. This is corrected in the current master.
So: please update and try again :)
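To illustrate the row-gap problem described above: when each image row is followed by padding, a single flat copy drags the padding along and shears the picture, so the copy has to go row by row. The sketch below is not the primus-vk code; pack_rows is a hypothetical helper, and row_pitch stands for the byte distance between row starts (what Vulkan reports as rowPitch in VkSubresourceLayout).

```python
def pack_rows(src, row_pitch, row_bytes, height):
    """Copy height rows of row_bytes each out of a padded buffer into a tightly packed one."""
    if row_bytes > row_pitch:
        raise ValueError("row_bytes must not exceed row_pitch")
    needed = (height - 1) * row_pitch + row_bytes if height > 0 else 0
    if len(src) < needed:
        raise ValueError("source buffer is too small for the given layout")
    if row_pitch == row_bytes:
        # No padding between rows: one flat copy is enough.
        return bytes(src[: row_bytes * height])
    out = bytearray(row_bytes * height)
    for y in range(height):
        start = y * row_pitch
        # Copy only the pixel bytes of this row and skip the padding after it.
        out[y * row_bytes : (y + 1) * row_bytes] = src[start : start + row_bytes]
    return bytes(out)
```

For example, pack_rows(b"AAAA--BBBB--CCCC--", 6, 4, 3) drops the two padding bytes after each row.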
@felixdoerre, just perfect. You can consider the problem with dolphin-emu solved. Thank you very much for your work on this implementation, and lots of encouragement for the optimizations to improve the FPS; I'm sure you'll get there.
I hope many distributions will soon include your code in their packages; surely it won't take long for Arch Linux to include it in the AUR.
Thanks again!
Glad to hear that everything works now.
(Copied from https://github.com/Bumblebee-Project/Bumblebee/issues/769#issuecomment-423752484) Hi felixdoerre! I'm trying to test your code on Arch Linux. I made these changes in these files:
Next, I tried testing with the dolphin emulator, launching it like this: ENABLE_PRIMUS_LAYER=1 optirun dolphin-emu
Now, in the graphics configuration, I can switch from OpenGL to Vulkan, and it shows my GeForce 840M, wow! But when I try to start a game, dolphin crashes with a core dump. In dmesg I see these errors:
Any advice, or is it too soon to try launching a game with your Vulkan layer? Thanks!!!