felixdoerre / primus_vk

Vulkan GPU-offloading layer
BSD 2-Clause "Simplified" License

vkEnumerateInstanceExtensionProperties failed with ERROR_INITIALIZATION_FAILED #63

Closed bLuka closed 4 years ago

bLuka commented 4 years ago

Hello,

I'm trying to configure Vulkan to use my discrete GPU instead of the integrated one.

Independently of each other, Bumblebee and Vulkan are configured correctly.

Here's my glxinfo:

$ LD_LIBRARY_PATH="/usr/lib64/opengl/nvidia/lib/" pvkrun glxinfo | grep -i opengl | head -n2
OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: GeForce GTX 1050/PCIe/SSE2

As for vulkaninfo, running it natively, with primusrun, or with optirun doesn't change the output. You can find my paste here. vkcube also runs fine on my Intel integrated GPU.

However, when trying to run vulkaninfo with pvkrun:

$ VK_ICD_FILENAMES=nvidia_icd.json pvkrun vulkaninfo
ERROR at /var/tmp/portage/dev-util/vulkan-tools-1.2.137/work/Vulkan-Tools-1.2.137/vulkaninfo/vulkaninfo.h:240:vkEnumerateInstanceExtensionProperties failed with ERROR_INITIALIZATION_FAILED

My /etc/vulkan/icd.d/nvidia_icd.json looks like this:

{
    "file_format_version" : "1.0.0",
    "ICD": {
        "library_path": "libnv_vulkan_wrapper.so",
        "api_version" : "1.1.70"
    }
}

I tried both Vulkan version 1.1.125 and the unstable 1.2.137, with no difference.

I'm on Gentoo Linux 5.4.28.

What does "vkEnumerateInstanceExtensionProperties failed with ERROR_INITIALIZATION_FAILED" mean?

felixdoerre commented 4 years ago

First of all:

What does "vkEnumerateInstanceExtensionProperties failed with ERROR_INITIALIZATION_FAILED" mean?

This means you were trying to start a driver which refused to start. The driver is probably telling you that it didn't find its hardware or is in some other way unhappy about the current situation.

In general, I am not really sure what you are doing, what you have installed, or what you are trying to achieve. Most of your commands look like you are using "primus" and the other tools incorrectly. Have you compiled primus-vk manually? What paths have you specified? Where is your NVIDIA Vulkan driver normally located? Here are some guesses and explanations that may help with your experiments:

With Optimus hardware and primus-vk there is no need to use nvidia_icd.json at all. VK_ICD_FILENAMES=nvidia_icd.json will likely break whatever you are trying, because when running with primus-vk you are explicitly selecting the one ICD that will not work at all. To use primus-vk, you need to have both nv_vulkan_wrapper.json and the Mesa ICD active; otherwise primus-vk cannot work.

Running vulkaninfo natively or with primusrun/optirun does not change the output:

When nv_vulkan_wrapper.json is correctly installed (you can remove nvidia_icd.json, as it will never be of any use and might sometimes prevent correct operation), you should see the NVIDIA card reported in addition to the Intel integrated graphics. Your pasted output lists two problems that can prevent that:

ERROR: [Loader Message] Code 0 : libGLX_nvidia.so.0: cannot open shared object file: No such file or directory
ERROR: [Loader Message] Code 0 : libnv_vulkan_wrapper.so: cannot open shared object file: No such file or directory

The first message seems to stem from a misconfigured nvidia_icd.json. I'd suggest you remove that configuration file and try again. The second message seems to indicate that you did not correctly compile libnv_vulkan_wrapper.so and put it in your library search path. This is required for primus-vk to work (unless you have some other way to make the NVIDIA driver misbehave less).

So did you follow the installation instructions from here (https://github.com/felixdoerre/primus_vk#howto) ?

felixdoerre commented 4 years ago

Oops, missed your edit. So you have patched nvidia_icd.json? Or is that file in addition to the one in /usr/share/vulkan/icd.d? Having a file with the same name in /etc/ does not override an ICD; it just adds another one. Did you adjust the NVIDIA driver path ( https://github.com/felixdoerre/primus_vk/blob/master/nv_vulkan_wrapper.cpp#L11 ) to the location specified in the original nvidia_icd.json? Did you put libnv_vulkan_wrapper.so in a location where it will be found automatically?
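
The point that manifests in /etc and /usr/share add up rather than override each other can be illustrated with a small Python sketch. This is not the Vulkan loader's actual algorithm (the real loader searches more locations and honours environment variables such as VK_ICD_FILENAMES); the function name collect_icd_manifests and the two-directory list are assumptions made purely for illustration. It lists every ICD manifest it finds together with the library_path it points at, so a stray extra NVIDIA manifest becomes visible.

```python
import json
from pathlib import Path

# Assumed search directories for this sketch; the real loader consults more.
DEFAULT_ICD_DIRS = ("/etc/vulkan/icd.d", "/usr/share/vulkan/icd.d")


def collect_icd_manifests(dirs=DEFAULT_ICD_DIRS):
    """Return (manifest path, library_path) pairs from every directory.

    Manifests are accumulated across directories: a file in one directory
    does not hide a file with the same name in another directory.
    """
    found = []
    for directory in dirs:
        base = Path(directory)
        if not base.is_dir():
            continue
        for manifest in sorted(base.glob("*.json")):
            try:
                data = json.loads(manifest.read_text())
            except (OSError, ValueError):
                continue  # unreadable or malformed manifest: skip it
            icd = data.get("ICD") if isinstance(data, dict) else None
            library = icd.get("library_path") if isinstance(icd, dict) else None
            if library:
                found.append((str(manifest), library))
    return found


if __name__ == "__main__":
    for manifest, library in collect_icd_manifests():
        print(f"{manifest} -> {library}")
```

If the listing shows a manifest that references libGLX_nvidia.so.0 directly next to nv_vulkan_wrapper.json, that is the kind of duplicate described above.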

bLuka commented 4 years ago

Thank you for your explanations.

So did you follow the installation instructions from here (https://github.com/felixdoerre/primus_vk#howto) ?

Sure, I tried manually, and I even tried cleaning everything up and using the distributed ebuild package for Gentoo, without any change in the results.

To be more exact, after installing primus_vk, I followed this article, which provides the nvidia_icd.json, among other things.

It also suggests removing nv_vulkan_wrapper.json, as it leads to segfaults. It does indeed lead to segfaults on my machine too. I need to investigate further whether that's related to #61 or not.

So you have patched nvidia_icd.json? Or is that file in addition to the one in /usr/share/vulkan/icd.d?

Just to be sure: /usr/share/vulkan/icd.d/ and /etc/vulkan/icd.d/ both behave the same way, right? I only put mine in /etc, but moving it doesn't change anything.

Did you adjust the nvidia driver path

I did that too, yup

felixdoerre commented 4 years ago

OK, then we should probably go into step-by-step debugging. What files do you have in the Vulkan directories (i.e. ls -als /{etc,usr/share}/vulkan/*.d)? What is the content of all your ICD JSON files? Is one of the files referencing libGLX_nvidia.so.0 directly? In the paste you provided, the loader could not find libnv_vulkan_wrapper.so. Where did you install it? Is that location in your normal shared-library search path? You can check whether the loader tries to load the wrapper with optirun -b none env LD_DEBUG=libs vulkaninfo.

For now I assume that this is not related to #61 and is just a configuration/setup error on your system.

bLuka commented 4 years ago

With the given primus_vk.json and nv_vulkan_wrapper.json:

Here's more context:

─── Assembly ───
Cannot access memory at address 0x0
─── Expressions ───
─── History ───
─── Memory ───
─── Registers ───
   rax 0x00007fffffffc044     rbx 0x00005555555e7a60     rcx 0x00005555556d1060     rdx 0x0000000000000000     rsi 0x0000000000000000     rdi 0x00007fffffffc044     rbp 0x00007fffffffbfc0     rsp 0x00007fffffffbf58      r8 0x0000000000000000      r9 0x0000000000000008
   r10 0x0000000000000020     r11 0x00007ffff7b76c80     r12 0x0000000000000000     r13 0x00005555555e76e0     r14 0x0000000000000000     r15 0x00007fffffffc048     rip 0x0000000000000000  eflags [ PF ZF IF RF ]         cs 0x00000033              ss 0x0000002b
    ds 0x00000000              es 0x00000000              fs 0x00000000              gs 0x00000000
─── Source ───
─── Stack ───
[0] from 0x0000000000000000
(no arguments)
[1] from 0x00007ffff77d43c0 in vk_icdNegotiateLoaderICDInterfaceVersion+167 at nv_vulkan_wrapper.cpp:93
arg pSupportedVersion = 0x7fffffffc044
[+]
─── Threads ───
[1] id 18222 name vulkaninfo from 0x0000000000000000
───
>>> print(init)
$1 = {
  nvDriver = 0x5555556cc2d0,
  glLibGL = 0x5555555e7f20,
  libdl = 0x7ffff7f7b000,
  instanceProcAddr = 0x0,
  phyProcAddr = 0x0,
  negotiateVersion = 0x0
}

felixdoerre commented 4 years ago

Which driver path did you choose? It seems that the driver library you chose is missing the Vulkan driver symbols. This is what it should look like:

$ nm -D /usr/lib/x86_64-linux-gnu/nvidia/current/libGLX_nvidia.so.0  | grep vk
000000000009c050 T vk_icdGetInstanceProcAddr
000000000009bfe0 T vk_icdGetPhysicalDeviceProcAddr
000000000009bfb0 T vk_icdNegotiateLoaderICDInterfaceVersion

What symbols are present in the library you provided to nv_vulkan_wrapper?
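
The nm check above can be automated. The following Python sketch parses `nm -D` output and reports which of the three ICD entry points listed above are missing. The function names and the parsing approach are illustrative assumptions and not part of primus_vk; the sample text is simply the nm output quoted above.

```python
# The three entry points shown in the nm output above.
REQUIRED_ICD_SYMBOLS = (
    "vk_icdGetInstanceProcAddr",
    "vk_icdGetPhysicalDeviceProcAddr",
    "vk_icdNegotiateLoaderICDInterfaceVersion",
)


def exported_functions(nm_output):
    """Return the names of symbols marked 'T' in `nm -D` output."""
    names = set()
    for line in nm_output.splitlines():
        fields = line.split()
        # Defined symbols have three fields: address, type, name.
        if len(fields) == 3 and fields[1] == "T":
            # Drop a symbol-version suffix such as "@@VERSION" if present.
            names.add(fields[2].split("@")[0])
    return names


def missing_icd_symbols(nm_output):
    """Return the required ICD entry points that are not exported."""
    exported = exported_functions(nm_output)
    return [name for name in REQUIRED_ICD_SYMBOLS if name not in exported]


SAMPLE = """\
000000000009c050 T vk_icdGetInstanceProcAddr
000000000009bfe0 T vk_icdGetPhysicalDeviceProcAddr
000000000009bfb0 T vk_icdNegotiateLoaderICDInterfaceVersion
"""

print(missing_icd_symbols(SAMPLE))  # prints []
```

To check a real library, pass in the output of `nm -D` for that file, for example captured with subprocess.run(["nm", "-D", path], capture_output=True, text=True).stdout. A non-empty result would suggest that the path points at a library that is not the NVIDIA Vulkan driver.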

bLuka commented 4 years ago

What files do you have in the vulkan directories?

After a bit of cleaning, right now:

$ la /{etc,usr/share}/vulkan/*.d
/etc/vulkan/icd.d:
total 1,5K
drwxr-xr-x 2 root root 3 18 mai   01:35 ./
drwxr-xr-x 4 root root 4 16 mai   13:58 ../
-rw-r--r-- 1 root root 0 17 mai   04:59 .keep_media-libs_vulkan-loader-0

/etc/vulkan/implicit_layer.d:
total 2,0K
drwxr-xr-x 2 root root   3 17 mai   01:31 ./
drwxr-xr-x 4 root root   4 16 mai   13:58 ../
-rw-r--r-- 1 root root 643 17 mai   01:31 nvidia_layers.json

/usr/share/vulkan/explicit_layer.d:
total 2,0K
drwxr-xr-x 2 root root    3 17 mai   17:11 ./
drwxr-xr-x 6 root root    6 18 mai   02:09 ../
-rw-r--r-- 1 root root 1,8K 17 mai   17:11 VkLayer_khronos_validation.json

/usr/share/vulkan/icd.d:
total 5,0K
drwxr-xr-x 2 root root   5 18 mai   02:17 ./
drwxr-xr-x 6 root root   6 18 mai   02:09 ../
-rw-r--r-- 1 root root 146 16 mai   21:10 intel_icd.i686.json
-rw-r--r-- 1 root root 148 16 mai   21:16 intel_icd.x86_64.json
-rw-r--r-- 1 root root 146 18 mai   02:17 nv_vulkan_wrapper.json

/usr/share/vulkan/implicit_layer.d:
total 2,0K
drwxr-xr-x 2 root root   3 18 mai   02:17 ./
drwxr-xr-x 6 root root   6 18 mai   02:09 ../
-rw-r--r-- 1 root root 582 18 mai   02:17 primus_vk.json

What is the content of all your icd-json-files?

intel_icd.i686.json: 
{
    "ICD": {
        "api_version": "1.2.131",
        "library_path": "/usr/lib/libvulkan_intel.so"
    },
    "file_format_version": "1.0.0"
}

intel_icd.x86_64.json: 
{
    "ICD": {
        "api_version": "1.2.131",
        "library_path": "/usr/lib64/libvulkan_intel.so"
    },
    "file_format_version": "1.0.0"
}

nv_vulkan_wrapper.json: 
{
    "file_format_version" : "1.0.0",
    "ICD": {
        "library_path": "libnv_vulkan_wrapper.so.1",
        "api_version" : "1.1.84"
    }
}

bLuka commented 4 years ago

Which driver path did you choose? It seems that the driver library you chose is missing the Vulkan driver symbols.

I get it. I used the one provided by Mesa: /usr/lib64/libGL.so.1. I bet that's the wrong one here, haha.

bLuka commented 4 years ago

It was! The one I needed was /usr/lib64/opengl/nvidia/lib/libGLX_nvidia.so. I was able to find it thanks to your previous comment!

Finally it works! Ugh, a lot of headaches, but thank you very much!

Is there a donation link so I can buy you a beer? :D

felixdoerre commented 4 years ago

Hi, glad that it works now. Sorry, I don't have a donation link, but thanks for the offer.