ValveSoftware / steam-runtime

A runtime environment for Steam applications

Proton 5.13-1 ignores Vulkan custom layers like frame limiter #295

Closed zaps166 closed 3 years ago

zaps166 commented 4 years ago

libstrangle and vk-layer-flimes no longer work with Proton 5.13-1.

ranplayer commented 4 years ago

mangohud as well

Galcian79 commented 4 years ago

I can confirm that mangohud works fine with proton 5.13-1 and vulkan, but it doesn't work anymore with dxvk.

Galcian79 commented 4 years ago

As suggested in another discussion, in order to re-enable the Vulkan layers, you need to bypass the LRS container by editing the file named _v2-entry-point in your SteamLinuxRuntime_soldier folder. After the line #!/bin/bash you need to add the lines

shift 4
exec "${@}"

This is a temporary fix and you need to redo it every time LRS gets updated.
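To make the effect of those two lines concrete, here is a minimal Python sketch of what `shift 4` followed by `exec "${@}"` does to an argument list. The example arguments are hypothetical placeholders; the real arguments that Steam passes to _v2-entry-point are not shown in this thread.

```python
def bypass_container(argv, skip=4):
    """Mimic `shift 4; exec "${@}"` for an argument list that has at
    least `skip` entries: drop the first `skip` positional arguments and
    return the rest, which the shell would then exec directly instead of
    starting the container."""
    return list(argv[skip:])


# Hypothetical positional arguments ($1, $2, ...); placeholders only.
example = ["opt1", "opt2", "opt3", "opt4", "proton", "waitforexitandrun", "game.exe"]
print(bypass_container(example))
```

Because everything the entry point would normally set up is skipped, the game runs against the host's libraries, which is why this only works as a stopgap.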

TiZ-HugLife commented 4 years ago

This completely breaks render offload on laptops with Optimus; it is provided by the optimus vulkan layer. This particular breakage means that an entire class of users can't play their games with any of the performance they're entitled to if they use Proton 5.13.

kisak-valve commented 4 years ago

Hello @HugLifeTiZ, funny you mention that, I tested Lara Croft and the Temple of Osiris with PRIME render offload a few minutes ago (__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia %command% in the game's launch options) and that did work to get it to switch over to the GTX 1060 in my test box. That was with nVidia 450.66. Optimus users are not completely out of options.

TiZ-HugLife commented 4 years ago

@kisak-valve, you put that inside the game's launch options? I set those variables when I launch steam. I just kind of figured I should so that it generates vulkan shaders for the right GPU before the game launches. Does the Linux Runtime that Proton 5.13 is built against reset environment variables?

EDIT: Also, if that's going to be the supported method going further, are you going to expose GUI options within Steam to run games on the discrete GPU by setting the various environment vars? I feel like you guys aren't the kind of people who will tell all optimus users to muck with launch options for all of their graphically intensive games for the foreseeable future.

EDIT 2: That didn't work for me. I tried SoulCalibur VI, and the entire game was insanely choppy before I even hit the title screen. I have nVidia 450.80; current version in Xubuntu 20.04.

NerosTie commented 4 years ago

I have Optimus (Nvidia 730M) + PRIME render offload and I have no issue with Proton 5.13 about this. It works as with previous versions.

Araly commented 3 years ago

I have the same issue as @HugLifeTiZ , I have an optimus laptop, and on 5.13, games run on the CPU. I can see them appear in nvtop, but the GPU says 0%, and the GPU stays cold, while the CPU gets hot, and performance is abysmal.

@kisak-valve's solution doesn't work for me. I usually run the whole of Steam with __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME="nvidia" __VK_LAYER_NV_optimus="NVIDIA_only" to run games on the GPU. I've tried that, and I've also tried setting those variables in the game's launch options as mentioned, preceding %command%, but neither makes the games run on the GPU when using 5.13.

I'm using nvidia 455 drivers for a GTX 1050, and https://github.com/Askannz/optimus-manager in hybrid mode (so the GPU is mostly inactive, unless I run an app with the previously mentioned variables).

EDIT: following comments I found on reddit, I moved every file except /usr/share/vulkan/icd.d/nvidia_icd.json out of that folder, and that seems to work for now. I half expect that solution to go bad at some point, but it seems to be good enough for now. Apparently, the intel iGPU gets loaded before the nvidia GPU can, because its file is read first.

kisak-valve commented 3 years ago

Tracking note: I've transferred this issue to the steam-runtime issue tracker because it's an issue with Pressure Vessel bringing system-wide Vulkan layers into the Steam Linux Runtime - Soldier container environment.

mcoffin commented 3 years ago

@kisak-valve and others, is there a temporary workaround (probably another way to set layers), that we can use while you all work on getting this resolved?

Thanks in advance!

zaps166 commented 3 years ago

@mcoffin https://github.com/ValveSoftware/steam-runtime/issues/295#issuecomment-710683545

GloriousEggroll commented 3 years ago

Just throwing this in as well -- MESA_VK_DEVICE_SELECT uses a vulkan layer to switch between devices on mesa. Without it laptops with intel+amd or amd+amd are stuck using the iGPU for native vulkan games such as DOOM and Strange Brigade as those games default to the iGPU with no in-game options to change it.

Samsagax commented 3 years ago

As suggested in another discussion, in order to re-enable the Vulkan layers, you need to bypass the LRS container by editing the file named _v2-entry-point in your SteamLinuxRuntime_soldier folder. After the line #!/bin/bash you need to add the lines

shift 4
exec "${@}"

This is a temporary fix and you need to redo it every time LRS gets updated.

I did this and the game AOE 2: DE won't start playing; it gets stuck at the summary screen before the game. I assume this is because I'm bypassing the entire runtime setup and there is some missing library on my system.

EDIT: following comments I found on reddit, I moved every file except /usr/share/vulkan/icd.d/nvidia_icd.json out of that folder, and that seems to work for now. I half expect that solution to go bad at some point, but it seems to be good enough for now. Apparently, the intel iGPU gets loaded before the nvidia GPU can, because its file is read first.

I did this and it worked fine (same game). The files there were:

ls /usr/share/vulkan/icd.d/
intel_icd.i686.json  intel_icd.x86_64.json  nvidia_icd.json

Seems like intel ones are loaded first regardless of which card the environment is meant to use.
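For anyone who wants to check what is in that folder before moving files around, here is a rough Python sketch that lists ICD manifests by file name together with the driver library each one points at. It assumes the usual manifest layout with an "ICD" object containing "library_path". The library paths in the sample are illustrative guesses, and alphabetical ordering is only the hypothesis from this thread, not a documented guarantee of the Vulkan loader.

```python
import json
import tempfile
from pathlib import Path


def list_icd_manifests(icd_dir):
    """Return (file name, library_path) pairs, sorted by file name."""
    pairs = []
    for path in sorted(Path(icd_dir).glob("*.json")):
        manifest = json.loads(path.read_text())
        pairs.append((path.name, manifest.get("ICD", {}).get("library_path")))
    return pairs


# Recreate the directory listing from the comment in a temporary folder.
# The library_path values are illustrative assumptions, not from the thread.
sample = {
    "intel_icd.i686.json": "/usr/lib/i386-linux-gnu/libvulkan_intel.so",
    "intel_icd.x86_64.json": "/usr/lib/x86_64-linux-gnu/libvulkan_intel.so",
    "nvidia_icd.json": "libGLX_nvidia.so.0",
}
with tempfile.TemporaryDirectory() as tmp:
    for name, library in sample.items():
        manifest = {"file_format_version": "1.0.0", "ICD": {"library_path": library}}
        (Path(tmp) / name).write_text(json.dumps(manifest))
    manifests = list_icd_manifests(tmp)

for name, library in manifests:
    print(name, "->", library)
```

On a real system you would call list_icd_manifests("/usr/share/vulkan/icd.d") instead of using the temporary folder.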

GloriousEggroll commented 3 years ago

https://github.com/ValveSoftware/Proton/issues/4289#issuecomment-725343211

This method allows vulkan layers, overlays, etc to work while still running inside the container:

https://www.reddit.com/r/linux_gaming/comments/jc2b77/mangohud_workaround_for_proton_513/gbx4cz6/?utm_source=reddit&utm_medium=web2x&context=3

plasticbomb1986 commented 3 years ago

Just throwing this in as well -- MESA_VK_DEVICE_SELECT uses a vulkan layer to switch between devices on mesa. Without it laptops with intel+amd or amd+amd are stuck using the iGPU for native vulkan games such as DOOM and Strange Brigade as those games default to the iGPU with no in-game options to change it.

Little note here: @aejsmith's vkdevicechooser is working; it still lets me override the Vulkan GPU device selection.

TiZ-HugLife commented 3 years ago

I've tried adapting @GloriousEggroll's instructions on Xubuntu 20.04 in order to get back MangoHud and Optimus, but the only thing that works is completely neutering the containerization as mentioned by this comment.

TTimo commented 3 years ago

soldier runtime >= 0.20201124.0 now imports vulkan layers as well, which should help with this problem. See https://steamcommunity.com/app/221410/discussions/2/2962768718547168164/ for details.

DadSchoorse commented 3 years ago

soldier runtime >= 0.20201124.0 now imports vulkan layers as well, which should help with this problem. See https://steamcommunity.com/app/221410/discussions/2/2962768718547168164/ for details.

I tested it with vkBasalt and there still seem to be issues: while it's working with 64bit games, the loader tries to load the wrong shared library for 32bit. Here's a log of Besiege launched with ENABLE_VKBASALT=1 VK_LOADER_DEBUG=all PROTON_LOG=1 %command% (steam-346010.log). Relevant section:

ERROR: /overrides/lib/x86_64-linux-gnu/vulkan_imp_layer/10/steamoverlayvulkanlayer.so: wrong ELF class: ELFCLASS64
DEBUG: Loading layer library /overrides/lib/i386-linux-gnu/vulkan_imp_layer/9/steamoverlayvulkanlayer.so
INFO: Insert instance layer VK_LAYER_VALVE_steam_overlay_32 (/overrides/lib/i386-linux-gnu/vulkan_imp_layer/9/steamoverlayvulkanlayer.so)
ERROR: /overrides/lib/x86_64-linux-gnu/vulkan_imp_layer/8/libVkLayer_steam_fossilize.so: wrong ELF class: ELFCLASS64
DEBUG: Loading layer library /overrides/lib/i386-linux-gnu/vulkan_imp_layer/7/libVkLayer_steam_fossilize.so
INFO: Insert instance layer VK_LAYER_VALVE_steam_fossilize_32 (/overrides/lib/i386-linux-gnu/vulkan_imp_layer/7/libVkLayer_steam_fossilize.so)
ERROR: /overrides/lib/x86_64-linux-gnu/vulkan_imp_layer/6/libbathingshots.so: wrong ELF class: ELFCLASS64
ERROR: /overrides/lib/x86_64-linux-gnu/vulkan_imp_layer/4/libVkLayer_MESA_device_select.so: wrong ELF class: ELFCLASS64
ERROR: /overrides/lib/x86_64-linux-gnu/vulkan_imp_layer/1/libvkbasalt.so: wrong ELF class: ELFCLASS64
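If you want to pull these failures out of a long VK_LOADER_DEBUG=all log yourself, a small Python sketch along these lines can help. It only assumes the "ERROR: <path>: wrong ELF class: <class>" line shape visible in the excerpt above; the function name is made up for illustration.

```python
import re

# Matches lines such as:
# ERROR: /overrides/.../libvkbasalt.so: wrong ELF class: ELFCLASS64
ELF_ERROR = re.compile(
    r"^ERROR: (?P<path>\S+): wrong ELF class: (?P<elf_class>ELFCLASS\d+)$"
)


def wrong_elf_class(log_lines):
    """Return (library path, ELF class) for every wrong-ELF-class error."""
    hits = []
    for line in log_lines:
        match = ELF_ERROR.match(line.strip())
        if match:
            hits.append((match.group("path"), match.group("elf_class")))
    return hits


log_excerpt = [
    "ERROR: /overrides/lib/x86_64-linux-gnu/vulkan_imp_layer/6/libbathingshots.so: wrong ELF class: ELFCLASS64",
    "DEBUG: Loading layer library /overrides/lib/i386-linux-gnu/vulkan_imp_layer/7/libVkLayer_steam_fossilize.so",
    "ERROR: /overrides/lib/x86_64-linux-gnu/vulkan_imp_layer/1/libvkbasalt.so: wrong ELF class: ELFCLASS64",
]
for path, elf_class in wrong_elf_class(log_excerpt):
    print(elf_class, path)
```

A layer that shows up here without a matching "Loading layer library" line for the other word size is one the 32-bit game never got.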

Full system info as a gist

My educated guess on what's going on: The runtime creates two layers for each layer, one with the 32bit path and one with the 64bit path. Both have the same name.

      {
        "json_path" : "/overrides/share/vulkan/implicit_layer.d/1-i386-linux-gnu.json",
        "name" : "VK_LAYER_VKBASALT_post_processing",
        "description" : "a post processing layer",
        "type" : "GLOBAL",
        "api_version" : "1.2.136",
        "implementation_version" : "1",
        "library_path" : "/overrides/lib/i386-linux-gnu/vulkan_imp_layer/1/libvkbasalt.so"
      },
      {
        "json_path" : "/overrides/share/vulkan/implicit_layer.d/1-x86_64-linux-gnu.json",
        "name" : "VK_LAYER_VKBASALT_post_processing",
        "description" : "a post processing layer",
        "type" : "GLOBAL",
        "api_version" : "1.2.136",
        "implementation_version" : "1",
        "library_path" : "/overrides/lib/x86_64-linux-gnu/vulkan_imp_layer/1/libvkbasalt.so"
      },

But iirc I've run into this issue while developing layers in the past: the Vulkan loader only uses one manifest json for each name. So in this case it's probably using the 64bit json, and the 32bit json gets ignored.
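The suspected collision is easy to check for mechanically. The following Python sketch groups layer entries by "name" and reports any name that is claimed by more than one manifest. The two entries are copied from the listing above; the function name is hypothetical, and this checks only the symptom, not how the loader actually picks a manifest.

```python
from collections import defaultdict


def find_name_collisions(layers):
    """Map each layer name used by several manifests to their json paths."""
    by_name = defaultdict(list)
    for layer in layers:
        by_name[layer["name"]].append(layer["json_path"])
    return {name: paths for name, paths in by_name.items() if len(paths) > 1}


layers = [
    {
        "json_path": "/overrides/share/vulkan/implicit_layer.d/1-i386-linux-gnu.json",
        "name": "VK_LAYER_VKBASALT_post_processing",
        "library_path": "/overrides/lib/i386-linux-gnu/vulkan_imp_layer/1/libvkbasalt.so",
    },
    {
        "json_path": "/overrides/share/vulkan/implicit_layer.d/1-x86_64-linux-gnu.json",
        "name": "VK_LAYER_VKBASALT_post_processing",
        "library_path": "/overrides/lib/x86_64-linux-gnu/vulkan_imp_layer/1/libvkbasalt.so",
    },
]
for name, paths in find_name_collisions(layers).items():
    print(name, "is defined by", len(paths), "manifests")
```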

Let me know if you need additional information, and thanks for trying to support layers, they are imo a really useful vulkan feature.

smcv commented 3 years ago

The Vulkan loader only uses one manifest json for each name. So in this case it's probably using the 64bit json, and the 32bit json gets ignored.

What does the manifest JSON look like on the host system? Is there a single JSON file shared between 32- and 64-bit, using special tokens like $LIB and $PLATFORM, or is there one JSON file per word size?

It's difficult for us to share a single JSON file between word-sizes even if the version on the host system did that successfully, because we will sometimes be using the glibc from the host system and sometimes the glibc from the runtime, which means we can't know where to put libraries to have $LIB pick them up: depending whose glibc we are using, it might be lib64 and lib (Red Hat-derived), or lib and lib32 (Arch-derived), or lib/x86_64-linux-gnu and lib/i386-linux-gnu (Debian-derived), or something else entirely.
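To illustrate why $LIB is awkward here, this Python sketch expands the same manifest path under the three directory conventions listed above. The table only restates those examples, the layer path is a made-up placeholder, and in reality the expansion is performed by the dynamic linker of whichever glibc is in use, not by code like this.

```python
# $LIB values per distro family, restating the examples from the comment.
LIB_DIRS = {
    "redhat": {"x86_64": "lib64", "i386": "lib"},
    "arch": {"x86_64": "lib", "i386": "lib32"},
    "debian": {"x86_64": "lib/x86_64-linux-gnu", "i386": "lib/i386-linux-gnu"},
}


def expand_lib(template, family, word_size):
    """Substitute $LIB the way the given family's glibc would."""
    return template.replace("$LIB", LIB_DIRS[family][word_size])


# Hypothetical library_path taken from some host manifest.
template = "/usr/$LIB/libexample_layer.so"
for family in LIB_DIRS:
    for word_size in ("x86_64", "i386"):
        print(family, word_size, expand_lib(template, family, word_size))
```

The same string resolves to six different locations, so the container cannot know in advance where a library has to be placed for $LIB to find it.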

For one of the ICD loaders (I think it's VDPAU?) we have a horrible, horrible hack involving $PLATFORM, but I'd prefer to use that as little as possible (the $PLATFORM for 32-bit PCs can be i386, i486, i586 or i686, depending on CPU, OS and phase of the moon).

DadSchoorse commented 3 years ago

What does the manifest JSON look like on the host system? Is there a single JSON file shared between 32- and 64-bit, using special tokens like $LIB and $PLATFORM, or is there one JSON file per word size?

It's one json with "library_path": "libvkbasalt.so", libvkbasalt.so is in standard system locations (/usr/local/lib/libvkbasalt.so, /usr/local/lib32/libvkbasalt.so). But there's a build time option to use an absolute path with $LIB and mangohud also does that.

smcv commented 3 years ago

Is it valid to change the name of a layer, or is the name referenced elsewhere?

If we can change the name freely, then we could maybe avoid this problem by turning your VK_LAYER_VKBASALT_post_processing into VK_LAYER_VKBASALT_post_processing_x86_64-linux-gnu and VK_LAYER_VKBASALT_post_processing_i386-linux-gnu. But if it's referenced elsewhere, then we'd be breaking that connection by renaming it.
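As a sketch of that renaming idea (not something the runtime is confirmed to do), the Python snippet below appends the multiarch tuple to each layer name and shows that the two copies then no longer share a name. The function name is hypothetical.

```python
def per_arch_name(layer_name, multiarch_tuple):
    """Build an architecture-specific layer name by appending the tuple."""
    return f"{layer_name}_{multiarch_tuple}"


original = "VK_LAYER_VKBASALT_post_processing"
renamed = [
    per_arch_name(original, tuple_)
    for tuple_ in ("x86_64-linux-gnu", "i386-linux-gnu")
]
for name in renamed:
    print(name)
```

Anything that enables the layer by its original name would stop matching after such a rename, which is the trade-off raised in the question.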

For libraries that are in the system library search path, it should be possible to keep the different architectures' JSON manifests unified and make them refer to the library by its basename - it will take some work to achieve that, but probably a lot less than dealing with $LIB.

DadSchoorse commented 3 years ago

Is it valid to change the name of a layer, or is the name referenced elsewhere?

The name is referenced elsewhere, but that's mostly only an issue for explicit layers (e.g. you enable them with VK_INSTANCE_LAYERS=VK_LAYER_KHRONOS_validation or with the name in vkCreateInstance). For implicit layers I've yet to find an example where the name is actually relevant. Imo, the best solution is to fix this problem in the Vulkan Loader, so that two jsons with the same layer name work.

smcv commented 3 years ago

The runtime creates two layers for each layer, one with the 32bit path and one with the 64bit path. Both have the same name.

This specific part of the problem is now tracked at https://gitlab.steamos.cloud/steamrt/steam-runtime-tools/-/issues/39.

DadSchoorse commented 3 years ago

Layers are now working correctly for me since vulkan-icd-loader 1.2.169-1.

RyuzakiKK commented 3 years ago

Layers are now working correctly for me since vulkan-icd-loader 1.2.169-1.

Thank you very much for the confirmation. The next missing thing is backporting it to Soldier and Scout, so that Vulkan layers should work as expected even if the host system still uses an older Vulkan-Loader.

smcv commented 3 years ago

A new beta released today contains Vulkan-Loader version 1.2.169, which should resolve at least part of this, and maybe all of it. For details of how to try the beta, please see https://github.com/ValveSoftware/steam-runtime/blob/master/doc/reporting-steamlinuxruntime-bugs.md#using-a-beta-or-an-older-version.

To have this change, SteamLinuxRuntime_soldier/VERSIONS.txt should say you have soldier version 0.20210217.0 or later.
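If you would rather script that check than read the file by eye, a Python sketch like the following works under the assumption that VERSIONS.txt is whitespace-separated with the component name in the first column and its version in the second, as in the copies pasted in this thread. The helper names are made up for illustration.

```python
def parse_version(text):
    """Turn '0.20210217.0' into (0, 20210217, 0) for comparison."""
    return tuple(int(part) for part in text.split("."))


def soldier_version(versions_txt):
    """Return the version on the 'soldier' row, or None if it is missing."""
    for line in versions_txt.splitlines():
        fields = line.split()
        if fields and not line.startswith("#") and fields[0] == "soldier":
            return fields[1]
    return None


def has_layer_fix(versions_txt, minimum="0.20210217.0"):
    """True if the soldier runtime is at least the version named above."""
    found = soldier_version(versions_txt)
    return found is not None and parse_version(found) >= parse_version(minimum)


sample = """#Name Version Runtime Runtime_Version Comment
SteamLinuxRuntime v0.20210309.0-0-gb38a1fb # Entry point scripts, etc.
pressure-vessel 0.20210305.0+srt1 scout 0.20210309.0 # pressure-vessel-bin.tar.gz
soldier 0.20210309.0 soldier 0.20210309.0 # runtime tarball
"""
print(soldier_version(sample), has_layer_fix(sample))
```

On a real installation you would pass in the contents of SteamLinuxRuntime_soldier/VERSIONS.txt instead of the sample string.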

Please revert any workarounds you have applied before re-testing.

With this new version, I have been able to get MangoHUD to display in both 32- and 64-bit Proton/Wine/DXVK games. Here are some example free-to-play games to try:

The two 32-bit games have native Linux versions, so to make Steam run the Windows binaries under Proton/Wine/DXVK, you have to use right-click -> Properties... -> Compatibility -> Force the use of a specific Steam Play compatibility tool and select Proton 5.13-6.

smcv commented 3 years ago

Please note that because this version is better at enabling Vulkan layers, it might trigger new bugs involving a combination of multiple Vulkan layers, which were previously hidden by not all of the layers getting enabled. This might be specific to particular Mesa versions, it isn't entirely clear yet.

In particular, there seems to be a problematic interaction between MangoHUD and the Mesa device selection layer that would not always have been visible before this beta because MangoHUD was not reliably enabled, but becomes visible in this beta. Workaround: either disable MangoHUD with DISABLE_MANGOHUD=1, or disable the Mesa device selection layer with NODEVICE_SELECT=1. See #363 and #365 for more details.

Leopard1907 commented 3 years ago

Fwiw, mesa device select layer is fixed on Mesa master and fix is available on Kisak Mesa PPA too for Ubuntu based system users.

https://gitlab.freedesktop.org/mesa/mesa/-/commit/38ce8d4d00c2b0e567b6dd36876cf171acb1dbc7

https://launchpad.net/~kisak/+archive/ubuntu/kisak-mesa

smcv commented 3 years ago

mesa device select layer is fixed on Mesa master

Yes, we've had a report on another issue that this change avoids the bad interaction. However, this is not something we can fix from within the Steam Runtime - it has to come through your distro (or third-party PPAs etc.) as part of Mesa - so the speed of getting that change deployed is not under our control.

Leopard1907 commented 3 years ago

Ofc, just adding as a note so users who are able to opt into those noted options but didn't yet can do it.

Yes, imagine getting this fix on Debian Stable. 🐸

Leopard1907 commented 3 years ago

At least in my limited testing (Doom 2016), Nvidia PRIME Render Offload works correctly now without disabling the runtime, for both GL and Vulkan.

smcv commented 3 years ago

At least in my limited testing (Doom 2016), Nvidia PRIME Render Offload works correctly now without disabling the runtime.

Thanks! I was hoping this might fix some or all of the multi-GPU use-cases as a side-effect.

What is "now"? Is it soldier version 0.20210217.0? (Please check SteamLinuxRuntime_soldier/VERSIONS.txt)

Leopard1907 commented 3 years ago

#Name   Version     Runtime Runtime_Version Comment
SteamLinuxRuntime   v0.20210114.1-2-g2bd4ab9            # Entry point scripts, etc.
pressure-vessel 0.20210203.0+srt1   scout   0.20210217.0    # pressure-vessel-bin.tar.gz
soldier 0.20210217.0    soldier 0.20210217.0    # com.valvesoftware.SteamRuntime.Platform-amd64,i386-soldier-runtime.tar.gz

I use Steam with these:

__NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia __VK_LAYER_NV_optimus=NVIDIA_only

http://us.download.nvidia.com/XFree86/Linux-x86_64/460.56/README/primerenderoffload.html

tgurr commented 3 years ago

For me this is still not working with the latest beta runtimes

#Name   Version     Runtime Runtime_Version Comment
SteamLinuxRuntime   v0.20210309.0-0-gb38a1fb            # Entry point scripts, etc.
pressure-vessel 0.20210305.0+srt1   scout   0.20210309.0    # pressure-vessel-bin.tar.gz
soldier 0.20210309.0    soldier 0.20210309.0    # com.valvesoftware.SteamRuntime.Platform-amd64,i386-soldier-runtime.tar.gz

tested with Life Is Strange 2 (64-bit) mentioned above and vkBasalt (with effects = monochrome to easily spot if things are working or not):

Launch command: ENABLE_VKBASALT=1 VK_LOADER_DEBUG=all PROTON_LOG=1 %command%

Operating System: Exherbo Linux

Notable error from the logs:

ERROR: /lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by /overrides/lib/x86_64-linux-gnu/vulkan_imp_layer/libvkbasalt.so)

glibc is at 2.33, both glibc and vkBasalt are compiled from source and are working fine in combination with Proton 5.0-10 or plain ENABLE_VKBASALT=1 vkcube.

smcv commented 3 years ago

ERROR: /lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by /overrides/lib/x86_64-linux-gnu/vulkan_imp_layer/libvkbasalt.so)

@tgurr, please could you open a separate issue for this, mentioning version GLIBCXX_3.4.26 not found in the title, and provide all the information requested in https://github.com/ValveSoftware/steam-runtime/blob/master/doc/reporting-steamlinuxruntime-bugs.md#essential-information in that issue report?

The overall issue - "Vulkan layers don't always load" - is huge and complicated, with several root causes, and we are not going to be able to fully solve it any time soon. However, if we can disentangle your specific version of this issue into a separate issue report and you give us the required information to understand it, then we can hopefully solve that sooner.

tgurr commented 3 years ago

@smcv thanks, done so here: https://github.com/ValveSoftware/steam-runtime/issues/381

RalfJung commented 3 years ago

I have an optimus laptop, and on 5.13, games run on the CPU. I can see them appear in nvtop, but the GPU says 0%, and the GPU stays cold, while the CPU gets hot, and performance is abysmal.

I have exactly the same symptoms. Everything worked fine two weeks ago (2021-03-08) without any special setup (games picked up the dGPU / optimus automatically), and I am reasonably sure that I did not install any system updates since then (I am not Debian testing), but now it stopped working -- so I guess it is caused by a Steam Runtime update. I seem to be on version v0.20210309.0-0-gb38a1fb of said runtime. I tried using Proton 5.0 instead of 5.13; that made no difference.

I was able to fix this by adjusting the game launch settings to __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia __VK_LAYER_NV_optimus=NVIDIA_only %command%. The __VK_LAYER_NV_optimus part is important here; I had previously not included this in my optimus setup and that was insufficient.
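For reference, here is a tiny Python sketch that assembles a launch options string of that shape from a set of variables, so that none of them gets dropped by accident. The variable names and values are the ones quoted above; the helper itself is hypothetical and is not part of Steam.

```python
def launch_options(env):
    """Join VAR=value pairs and end with Steam's %command% placeholder."""
    parts = [f"{name}={value}" for name, value in env.items()]
    return " ".join(parts + ["%command%"])


prime_offload = {
    "__NV_PRIME_RENDER_OFFLOAD": "1",
    "__GLX_VENDOR_LIBRARY_NAME": "nvidia",
    "__VK_LAYER_NV_optimus": "NVIDIA_only",
}
print(launch_options(prime_offload))
```

The printed line is what goes into the game's launch options field in its Steam properties.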

smcv commented 3 years ago

Let's try not to have this issue report drift away from the topic that was originally reported, because if people repurpose the same issue number for different problems, then we'll end up with an issue that we can never close because it means too many different things, and that doesn't help anyone to get their games running well.

The container runtime is used by Windows games with Proton 5.13+, and native Linux games with compatibility tool Steam Linux Runtime specifically selected. It is not used by Windows games with Proton 5.0 or older, and native Linux games with no special options.
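That rule can be written down as a small decision function. This Python sketch encodes only the cases stated in the paragraph above, with Proton versions given as (major, minor) tuples; the function and parameter names are hypothetical.

```python
def uses_container(is_windows_game, proton_version=None, slr_selected=False):
    """Return True if the game runs inside the container runtime.

    Windows games: only with Proton 5.13 or newer.
    Native Linux games: only when the Steam Linux Runtime compatibility
    tool has been selected explicitly.
    """
    if is_windows_game:
        return proton_version is not None and proton_version >= (5, 13)
    return slr_selected


print(uses_container(True, proton_version=(5, 13)))   # Windows game, Proton 5.13
print(uses_container(True, proton_version=(5, 0)))    # Windows game, Proton 5.0
print(uses_container(False, slr_selected=True))       # native, runtime selected
print(uses_container(False))                          # native, no special options
```

If this returns False for your setup, the problem you are seeing is not covered by this issue.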

This issue, #295, is about Vulkan layers not getting loaded into games that run in the container runtime. This is a lot more straightforward to recognise than it is to solve, because Vulkan layers are complicated and there are multiple root causes. We're gradually improving the situation over time.

There is a related issue, #312, which is: some users with dual GPUs (NVIDIA Optimus, AMD Switchable Graphics) find that games that run in the container runtime use the integrated GPU, while games that do not run in the container runtime use the discrete GPU. This is also much easier to recognise than it is to solve, because there are multiple root causes (including #295).

@RalfJung seems to be experiencing something different: games are running on the integrated GPU even when the game is not running in the container runtime. I have already asked @RalfJung to open a separate issue report for this.

If you are not sure whether to open a separate issue report or reply to an existing issue report, it is usually best to open a new issue report, especially for complicated, long-running issues like this one. Please make sure to include the information requested in https://github.com/ValveSoftware/steam-runtime/blob/master/doc/reporting-steamlinuxruntime-bugs.md - we cannot solve problems unless you give us the clues we need.

I know some projects have different rules, like Proton asking for only one issue report per game - but steam-runtime is not working at the same layer as Proton and we need different things from you if we are going to be able to solve problems. If we get multiple reports with the same root cause, we can close some of them as duplicates, but if we get multiple problems mixed up in one issue report, we end up spending a lot of time keeping track of which parts are fixed and which parts aren't fixed, and every hour we spend doing that is an hour we can't spend on fixing the code.

at46 commented 3 years ago

Since some improvements were implemented in Mesa and the Soldier Runtime, I tested some games again with Proton 5.13 and libstrangle/mangohud. While there is indeed some progress the results are a bit strange. First off mangohud seems to work in all games/configs I tested which is great. Unfortunately it is not the same for libstrangle.

Proton 5.13 with libstrangle (strangle 55 %command%):

Proton 5.13 with libstrangle and mangohud (strangle 55 mangohud %command%):

I've no idea why some games are working fine with libstrangle and mangohud at the same time (with the frame rate capped at 55 fps) but directly crash if I only use libstrangle. My steam_sys_info: https://gist.github.com/at46/51134de8e41a8aa24351b8a47a1344f4

P.S. Since Proton 6.3 was released maybe the title of the bug report should be changed to something like starting with version 5.13-1 Proton ignores Vulkan custom layers like frame limiter #295

RyuzakiKK commented 3 years ago

@at46 Can you please also provide the runtime log (slr*.log) of a game that fails to start? You can find how to obtain it here https://github.com/ValveSoftware/steam-runtime/blob/master/doc/reporting-steamlinuxruntime-bugs.md#essential-information

at46 commented 3 years ago

@RyuzakiKK I added the logs to my post above.

smcv commented 3 years ago

@at46, please could you open a separate issue for the crashes you are seeing, and be as specific as possible in its title? You can link to the gists and attachments you already provided rather than creating new ones. If libstrangle is causing a crash for you, then it is obviously not having the effect that you want it to have, but it's also definitely not being ignored, so the topic of this issue report doesn't really apply.

If we let the scope of this issue expand into "Vulkan layers don't work perfectly" then it will never be possible to close it, which is bad for everyone, because it will be increasingly difficult for us to find the information we need. The longer it takes us to find relevant information, the longer it will take to fix anything.

If you can reproduce the crash by forcing a free game like Life Is Strange to run under Proton, then that will help to debug this by making sure everyone can easily try the same thing.

Since some improvements were implemented in Mesa and the Soldier Runtime

I'll need to recheck this when I'm back at work, but I think we have already resolved the issue that was originally reported (layers not getting loaded at all), so I would really prefer it if anyone who is still having problems with layers opens new, separate issues with the full details.

at46 commented 3 years ago

@smcv I made #389 for my issue

kisak-valve commented 3 years ago

At this point it's reasonable to say that vulkan layers get imported into the container environment and any follow up issues need to be evaluated separately. Closing as fixed.