felixdoerre / primus_vk

Vulkan GPU-offloading layer
BSD 2-Clause "Simplified" License

Arch Linux vk_layer_utils.h is provided in vulkan-validation-layers package #11

Closed minecraft2048 closed 5 years ago

minecraft2048 commented 6 years ago

Without that package vk_layer_utils.h is missing

felixdoerre commented 6 years ago

Thanks for the info. So should I just add that to README.md, or should I do something else?

minecraft2048 commented 6 years ago

Adding it to README.md would be nice, but it would be better if we have a full list of Arch Linux dependencies, or better yet an AUR package, although that will hit that icd.d problem

felixdoerre commented 6 years ago

Yes, @dontdieych seems to be working on the package (see #10). So what can I do to help with that? Can you check whether my hacky idea from https://github.com/felixdoerre/primus_vk/issues/10#issuecomment-429707476 works?

dontdieych commented 6 years ago

Oh, sorry, I'm not working on packaging.

Thaodan commented 5 years ago

Is it still required to add that header to primus_vk.cpp?

I'm trying to run it on Arch, but I get this error during load:

ERROR: [Loader Message] Code 0 : loader_scanned_icd_add: ICD libnv_vulkan_wrapper.so doesn't support interface version compatible with loader, skip this ICD.

Is there something I'm missing? I'm trying to replicate the RPM package for Arch.

Thaodan commented 5 years ago

My package for Arch: https://aur.archlinux.org/packages/primus-vk-git/

Thaodan commented 5 years ago

strace.vulkaninfo.log Strace log

felixdoerre commented 5 years ago

Is the interface version from primus_vk_wrapper.json identical to the one from nvidia_icd.json? As the wrapper proxies almost all functions unchanged, the interface version should be the same.
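The comparison asked for here can be sketched in a few lines. This is a minimal sketch that assumes the version in question is the `api_version` field of the ICD manifest; the loader-ICD interface version negotiated at runtime is a separate value that this does not inspect. The function names are hypothetical, and the sample values mirror the manifests posted later in this thread.

```python
import json


def icd_api_version(manifest_text):
    """Return the api_version string of an ICD manifest, or None if absent."""
    manifest = json.loads(manifest_text)
    return manifest.get("ICD", {}).get("api_version")


def same_api_version(manifest_a, manifest_b):
    """True only when both manifests declare an api_version and they match."""
    version_a = icd_api_version(manifest_a)
    version_b = icd_api_version(manifest_b)
    return version_a is not None and version_a == version_b


# Sample manifests for illustration; on a real system, read the files from
# /usr/share/vulkan/icd.d/ instead of using inline strings.
NVIDIA_ICD = """
{"file_format_version": "1.0.0",
 "ICD": {"library_path": "libGLX_nvidia.so.0", "api_version": "1.1.84"}}
"""
WRAPPER_ICD = """
{"file_format_version": "1.0.0",
 "ICD": {"library_path": "libnv_vulkan_wrapper.so", "api_version": "1.1.84"}}
"""
```

Calling `same_api_version(NVIDIA_ICD, WRAPPER_ICD)` on the two strings reports whether the wrapper advertises the same version as the driver it wraps.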

Thaodan commented 5 years ago

yes:

~/:cat /usr/share/vulkan/icd.d/nvidia_icd.json 
{
    "file_format_version" : "1.0.0",
    "ICD": {
        "library_path": "libGLX_nvidia.so.0",
        "api_version" : "1.1.84"
    }
}
~/:cat /usr/share/vulkan/icd.d/primus_vk_wrapper.json 
{
    "file_format_version" : "1.0.0",
    "ICD": {
        "library_path": "libnv_vulkan_wrapper.so",
        "api_version" : "1.1.84"
    }
}

felixdoerre commented 5 years ago

Maybe running with VK_LOADER_DEBUG=info or VK_LOADER_DEBUG=all tells more why the loader thinks the wrapped driver is incompatible. Could you post the output of running the application with VK_LOADER_DEBUG=all?
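For anyone who wants to capture that output to a file, here is a minimal sketch. `VK_LOADER_DEBUG` is the variable named above; the helper name and the idea of collecting both output streams are assumptions made for illustration.

```python
import os
import subprocess


def run_with_loader_debug(command, level="all"):
    """Run `command` with VK_LOADER_DEBUG set to `level`.

    Returns (exit_code, stdout, stderr) so that the loader messages can be
    written to a log file and attached to the issue.
    """
    env = dict(os.environ)           # copy, so the caller's environment is untouched
    env["VK_LOADER_DEBUG"] = level   # "info" or "all", as suggested above
    result = subprocess.run(command, env=env, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr
```

For example, `run_with_loader_debug(["vulkaninfo"])` would run vulkaninfo with full loader debugging, provided vulkaninfo is on the PATH.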

Thaodan commented 5 years ago

vkloader.debug.log

Thaodan commented 5 years ago

vkloader.all.log

Thaodan commented 5 years ago

Fixed: the defined libGL path was wrong because I had installed my old, broken package. But now I get: /build/vulkan-tools/src/Vulkan-Tools/vulkaninfo/vulkaninfo.c:3636: failed with VK_ERROR_INITIALIZATION_FAILED

Thaodan commented 5 years ago

https://paste.kde.org/pqxo9jrsn

felixdoerre commented 5 years ago

Generally: The paste looks like you have the corrected nvidia driver and intel driver installed multiple times (?). The expected output after "Getting devices" is that you have both GPUs exactly once.
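A quick way to check for repeated devices is to count the device lines in the trace output. This is a rough sketch that assumes log lines of the form `PrimusVK: Device: <name>`, as printed in the primus-vk-diag output in this thread; the function names are hypothetical.

```python
from collections import Counter

DEVICE_PREFIX = "PrimusVK: Device: "


def count_devices(log_text):
    """Count how often each device name appears in PrimusVK trace output."""
    counts = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith(DEVICE_PREFIX):
            counts[line[len(DEVICE_PREFIX):]] += 1
    return counts


def duplicated_devices(log_text):
    """Return the sorted names of devices that are listed more than once."""
    return sorted(name for name, n in count_devices(log_text).items() if n > 1)
```

This counts lines across the whole log, so if the devices are enumerated more than once per run, the totals scale accordingly. Comparing the counts of the two GPUs against each other is then more telling than the absolute numbers.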

The line number from the dump does not seem to fit, even in the mentioned version: https://github.com/KhronosGroup/Vulkan-Tools/blob/sdk-1.1.92.0/vulkaninfo/vulkaninfo.c#L3636. Maybe the distribution you're using applies special patches that shift the line numbers. It would be useful to know where vulkaninfo fails. From the message you would expect an ERR_EXIT macro.

Thaodan commented 5 years ago

That's because you looked at the wrong file; it's 1.1.85: https://github.com/KhronosGroup/Vulkan-Tools/blob/sdk-1.1.85/vulkaninfo/vulkaninfo.c#L3636

felixdoerre commented 5 years ago

Aah, your vulkan and vulkaninfo are different versions. OK, then it's the problem mentioned here: https://github.com/felixdoerre/primus_vk/issues/15#issuecomment-440433997. We haven't hit that in a real game yet, and I haven't gotten around to fixing it yet.

Thaodan commented 5 years ago

It's not only vulkaninfo that fails; WoW through dxvk fails too. Running primus-vk-diag throws this backtrace:

#0  0x00007fa191b29d7f in raise () from /usr/lib/libc.so.6
#1  0x00007fa191b14672 in abort () from /usr/lib/libc.so.6
#2  0x00007fa191ee058e in __gnu_cxx::__verbose_terminate_handler () at /build/gcc/src/gcc/libstdc++-v3/libsupc++/vterminate.cc:95
#3  0x00007fa191ee6dfa in __cxxabiv1::__terminate (handler=<optimized out>) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:47
#4  0x00007fa191ee6e57 in std::terminate () at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_terminate.cc:57
#5  0x00007fa191ee70ac in __cxxabiv1::__cxa_throw (obj=obj@entry=0x56281efa70a0, tinfo=0x7fa191fd8fd8 <typeinfo for std::logic_error>, dest=0x7fa191efca20 <std::logic_error::~logic_error()>) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/eh_throw.cc:95
#6  0x00007fa191ee26dd in std::__throw_logic_error (__s=0x7fa1915da0b0 "basic_string::_M_construct null not valid") from /usr/lib/libstdc++.so.6
#7  0x00007fa1915d930d in vk_icdNegotiateLoaderICDInterfaceVersion () from /usr/lib/libnv_vulkan_wrapper.so
#8  0x00007fa1920058dc in ?? () from /usr/lib/libvulkan.so.1
#9  0x00007fa19200c325 in vkCreateInstance () from /usr/lib/libvulkan.so.1
#10 0x000056281db9456e in VulkanContext::VulkanContext (this=0x7ffc566d0260) at primus_vk_diag.cpp:64
#11 0x000056281db95245 in main (argc=2, argv=0x7ffc566d03a8) at primus_vk_diag.cpp:209

EDIT: it only throws it without pvkrun.

pvk-diag output:

~/:pvkrun ./primus-vk-diag vulkan

PrimusVK-diagnostic: Creating Vulkan instance
PrimusVK: CreateInstance
PrimusVK: Getting devices
PrimusVK: 0x55ee61534e90: 
PrimusVK: got display!
PrimusVK: Device: Intel(R) HD Graphics 530 (Skylake GT2)
PrimusVK:   Type: 1
PrimusVK: 0x55ee61534e90: 
PrimusVK: got render!
PrimusVK: Device: Quadro M2000M
PrimusVK:   Type: 2
PrimusVK: 0x55ee61534e90: 
PrimusVK: got render!
PrimusVK: Device: Quadro M2000M
PrimusVK:   Type: 2
PrimusVK: 0x55ee61534e90: 
PrimusVK: got display!
PrimusVK: Device: Intel(R) HD Graphics 530 (Skylake GT2)
PrimusVK:   Type: 1
PrimusVK: 0x55ee61534e90: 
PrimusVK: got render!
PrimusVK: Device: Quadro M2000M
PrimusVK:   Type: 2
PrimusVK: 0x55ee61534e90: 
PrimusVK: got render!
PrimusVK: Device: Quadro M2000M
PrimusVK:   Type: 2
PrimusVK-diagnostic: Device: Quadro M2000M
PrimusVK-diagnostic:  Type: 2
PrimusVK-diagnostic:  API: 1.1.84
PrimusVK-diagnostic:    Queues: 2
PrimusVK-diagnostic: Destroying Vulkan: 0x55ee614c7a20

felixdoerre commented 5 years ago

The stacktrace looks like DISPLAY is not set (i.e. prev is nullptr in https://github.com/felixdoerre/primus_vk/blob/master/nv_vulkan_wrapper.cpp#L71). Maybe it is not set correctly by the primus-vk-diag.sh script, and re-launching it with pvkrun sets it correctly?
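If a missing DISPLAY were the cause, a launcher could rule it out before starting the diagnostic. The following is only a sketch of such a pre-flight check under that hypothesis; the function name and the error text are made up for illustration and are not part of primus_vk.

```python
import os


def require_display(environ=None):
    """Return the DISPLAY value, or raise if it is unset or empty.

    `environ` defaults to the process environment; passing a plain dict
    makes the check easy to exercise without touching real variables.
    """
    env = os.environ if environ is None else environ
    display = env.get("DISPLAY")
    if not display:
        raise RuntimeError("DISPLAY is not set; start the program from an X session")
    return display
```

Failing early with a readable message is easier to diagnose than an uncaught C++ exception inside the wrapper library.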

Thaodan commented 5 years ago

The first one was run via ssh with DISPLAY set.

Thaodan commented 5 years ago

I tried vkquake and it mostly works except a crash at the end: https://paste.kde.org/pjf2fxwfa

backtrace:

#0  0x0000000000000000 in ?? ()
#1  0x00007fe0847ff274 in PrimusSwapchain::copyImageData(unsigned int, std::vector<VkSemaphore_T*, std::allocator<VkSemaphore_T*> >) () from /usr/lib/libprimus_vk.so
#2  0x00007fe0847ffb9e in PrimusSwapchain::present(PrimusSwapchain::QueueItem const&) () from /usr/lib/libprimus_vk.so
#3  0x00007fe0848019d1 in PrimusSwapchain::run() () from /usr/lib/libprimus_vk.so
#4  0x00007fe0891b5063 in std::execute_native_thread_routine (__p=0x55a45dc18180) at /build/gcc/src/gcc/libstdc++-v3/src/c++11/thread.cc:80
#5  0x00007fe09be35a9d in start_thread () from /usr/lib/libpthread.so.0
#6  0x00007fe09bd65b23 in clone () from /usr/lib/libc.so.6

WoW crashes instantly after spawning a window: https://paste.kde.org/pizuoyyzs

felixdoerre commented 5 years ago

I've just pushed a stabilization (related to another game) onto master: https://github.com/felixdoerre/primus_vk/commit/61f41762bbdac58cafcf8974107b18141eae3de7 Could you update and test if that changes things? Also it would be nice to have a backtrace from the crashing WoW.

Thaodan commented 5 years ago

This was done after the commit. I'll see if I can get a backtrace. Sadly, WoW blocks Wine's crash handler.

felixdoerre commented 5 years ago

I usually just run Wine under gdb (on Debian I need to strip a wrapper script), e.g. gdb --args /usr/lib/wine-development/wine64 path/to/game.exe; that usually catches segfaults.
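The same invocation can be assembled programmatically, for instance to collect a backtrace without an interactive session. This is a sketch: the helper names are hypothetical, and the optional batch mode relies on gdb's `-batch` and `-ex` options, which go beyond the command given above.

```python
import shlex


def gdb_command(wine_binary, game_exe, batch=False):
    """Build the argument list for running a Wine game under gdb.

    With batch=True, gdb runs the program and prints a backtrace once the
    program stops (for example on a segfault), then exits.
    """
    cmd = ["gdb"]
    if batch:
        cmd += ["-batch", "-ex", "run", "-ex", "bt"]
    # Everything after --args is the program to debug and its arguments.
    cmd += ["--args", wine_binary, game_exe]
    return cmd


def as_shell(cmd):
    """Render the argument list as a copy-pasteable shell command."""
    return shlex.join(cmd)
```

Passing the resulting list to `subprocess.run` avoids shell quoting problems when the game path contains spaces.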

Thaodan commented 5 years ago

Here is a minidump + backtrace from wow. wow_dmp_info.zip

Thaodan commented 5 years ago

Wow_dxgi.log

felixdoerre commented 5 years ago

Was the dxgi log captured without ENABLE_PRIMUS_LAYER=1? I think it would be best to get the system configuration under control such that vulkaninfo lists only the Intel GPU, optirun vulkaninfo lists both GPUs, and ENABLE_PRIMUS_LAYER=1 optirun vulkaninfo lists only the Nvidia GPU. Then the listing from dxvk (shown in dxgi.log) should be identical.

The installed Nvidia driver should never contribute any physical devices, because it refuses to load when the display server on :0 does not have the Nvidia extensions. The wrapped Nvidia driver should provide a physical device when using optirun, as only then is a display server running on :8. ENABLE_PRIMUS_LAYER=1 should filter the Intel GPU from the view of the application.
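The expected listings can be written down as a small table and checked against what each invocation actually reports. This is an illustrative sketch: the three invocations and their expected GPUs come from the description above, while the short labels `intel` and `nvidia` and the function name are assumptions.

```python
INTEL = "intel"
NVIDIA = "nvidia"

# Which GPUs each invocation is expected to list, exactly once each.
EXPECTED = {
    "vulkaninfo": {INTEL},
    "optirun vulkaninfo": {INTEL, NVIDIA},
    "ENABLE_PRIMUS_LAYER=1 optirun vulkaninfo": {NVIDIA},
}


def check_listing(invocation, observed):
    """Return a list of problems for the devices observed under `invocation`.

    `observed` is the list of GPU labels in the order they were reported;
    an empty result means the listing matches the expectation.
    """
    problems = []
    expected = EXPECTED[invocation]
    seen = set(observed)
    if len(observed) != len(seen):
        problems.append("duplicate devices")
    for gpu in sorted(expected - seen):
        problems.append(f"missing {gpu}")
    for gpu in sorted(seen - expected):
        problems.append(f"unexpected {gpu}")
    return problems
```

A listing such as `["intel", "nvidia", "nvidia"]` under the layer-enabled invocation would be flagged both for the duplicate and for the Intel GPU that should have been filtered out.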

I am not sure how well the primus-vk code generally copes with multiple instances of those GPUs, or how it should choose one of them. Do they all behave identically?

Thaodan commented 5 years ago

vulkaninfo without optirun/primusrun (the latter crashes) only shows the intel gpu. With optirun/primusrun only the nvidia gpu is shown in vulkaninfo. I have no idea why dxvk shows both.

Thaodan commented 5 years ago

Backtrace from dxvk + Path of Exile: https://paste.kde.org/pdtjpjmf9 I think the issue comes from dxvk, because even for PoE two entries per GPU are shown.

felixdoerre commented 5 years ago

I think the issue comes from your driver/vulkan/system-configuration. What I can read from the backtrace:

The primus layer does not seem to be active at all: it prints "PrimusVK" trace messages all over, none of which are shown; you see more than one device (PrimusVK explicitly returns only one device: https://github.com/felixdoerre/primus_vk/blob/master/primus_vk.cpp#L1124); and the layer does not appear in the backtrace of vkGetPhysicalDeviceSurfaceCapabilitiesKHR, which the layer hooks: https://github.com/felixdoerre/primus_vk/blob/master/primus_vk.cpp#L1041. However, the loader (https://github.com/felixdoerre/Vulkan-Loader/blob/master/loader/wsi.c#L238) is also missing from the backtrace, which could indicate that some optimization is taking place here. Still, I find the missing PrimusVK output suspicious. Have you tried running ENABLE_PRIMUS_LAYER=1 optirun <command to run path of exile> explicitly?

felixdoerre commented 5 years ago

It would be nice to have VK_LOADER_DEBUG output from an application that actually sees both GPUs, to understand why they show up multiple times. Do you get loader debug output when running Path of Exile with VK_LOADER_DEBUG=all, so we can understand where those multiple devices come from, and whether the primus layer loads or why it doesn't?

Thaodan commented 5 years ago

I ran Path of Exile with the same command; usenew is just a Wine wrapper that calls wine with the WINEPREFIX path set. The only difference is that I used primus rather than virtualgl.

But I'll try again later with VK_LOADER_DEBUG=all.

Thaodan commented 5 years ago

The issue was that dxvk wants Vulkan 1.1.x, not 1.0.0. Fixing the layer's JSON file resolved it.
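A check along these lines would catch the mismatch before starting a game. It is a minimal sketch assuming the usual layer manifest layout with a `layer` object carrying `api_version`; the layer name and library path in the sample are hypothetical, and whether the loader or dxvk applies exactly this comparison is an assumption.

```python
import json


def parse_version(text):
    """Turn a version string like "1.1.84" into a comparable tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def layer_supports(manifest_text, required="1.1.0"):
    """True if the layer manifest declares an api_version >= `required`."""
    layer = json.loads(manifest_text)["layer"]
    return parse_version(layer["api_version"]) >= parse_version(required)


# Hypothetical manifest for illustration only; the name and library_path
# are placeholders, not the values shipped by primus_vk.
SAMPLE_LAYER = """
{"file_format_version": "1.0.0",
 "layer": {"name": "VK_LAYER_example_layer",
           "type": "GLOBAL",
           "library_path": "libexample_layer.so",
           "api_version": "1.0.0",
           "implementation_version": "1",
           "description": "Example layer"}}
"""
```

Here `layer_supports(SAMPLE_LAYER)` is false because the sample declares 1.0.0, which matches the situation described above; raising `api_version` in the manifest to a 1.1.x value makes the check pass.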

pae.pvkrun.debug.1488.log

felixdoerre commented 5 years ago

So now the game runs with primus_vk? That's great! Thanks for finding this out.

Thaodan commented 5 years ago

Yes, every dxvk game. However, the performance is still worse. I need to find out where the bottleneck is. Is primus_vk Vulkan 1.1 compatible?

felixdoerre commented 5 years ago

Probably the only thing that would be incompatible is the missing implementation of vkEnumeratePhysicalDeviceGroupsKHR. I am not aware of any other incompatibilities.

felixdoerre commented 5 years ago

I've just pushed a change that implements vkEnumeratePhysicalDeviceGroupsKHR. I now know of no remaining reason why primus_vk should not be Vulkan 1.1 compatible. This should solve the error /build/vulkan-tools/src/Vulkan-Tools/vulkaninfo/vulkaninfo.c:3636: failed with VK_ERROR_INITIALIZATION_FAILED that is returned by vulkaninfo.

felixdoerre commented 5 years ago

I think there is nothing more to do here, so I'm closing this issue.