SveSop / wine-nvoptix

Relay for nvoptix to use with Wine
MIT License

Running OptiX SDK Samples with wine > 7.12 #29

Open SveSop opened 1 year ago

SveSop commented 1 year ago

It seems that OptiX SDK samples do not run when using wine-7.13 and newer for me. This has been discussed somewhat in this thread: https://github.com/Saancreed/wine-nvml/issues/13#issuecomment-1616806228 and has been moved here for further discussion.

I can provide precompiled samples if interested in testing, but they HAVE to be installed in the WINEPREFIX at this exact location: C:\ProgramData\NVIDIA Corporation\OptiX SDK 7.x.x\ (where 7.x.x is the version of the SDK).
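For anyone unpacking the samples from the Linux side, the location above maps onto the prefix's `drive_c` directory. A minimal sketch, assuming the default prefix `~/.wine` when `WINEPREFIX` is unset; `optix_sdk_dir` is a hypothetical helper name and `7.4.0` below is only an example version:

```shell
# Print the Unix path that corresponds to
# C:\ProgramData\NVIDIA Corporation\OptiX SDK <version>\ inside a Wine prefix.
# Falls back to ~/.wine when WINEPREFIX is not set.
optix_sdk_dir() {
    prefix="${WINEPREFIX:-$HOME/.wine}"
    printf '%s\n' "$prefix/drive_c/ProgramData/NVIDIA Corporation/OptiX SDK $1"
}
```

The directory could then be created with `mkdir -p "$(optix_sdk_dir 7.4.0)"` before copying the samples in. The quotes matter because the path contains spaces.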

Saancreed commented 1 year ago

I can provide precompiled samples if interested in testing, but they HAVE to be installed in the WINEPREFIX at this exact location: C:\ProgramData\NVIDIA Corporation\OptiX SDK 7.x.x\ (where 7.x.x is the version of the SDK).

:pray:

Actually, I was able to find and dig up 7.4 SDK samples from here and they are still working for me with wine-tkg-staging-protonified-8.9.1. Here's a sample log from optixTriangle.exe when launched with WINEDEBUG='+nvcuda,+nvoptix': optixTriangle.log

No hangs as far as I can tell but I haven't checked every single sample.

SveSop commented 1 year ago

No hangs as far as I can tell but I haven't checked every single sample.

I have the same issue with that one too. Standard wine-staging-8.8 from the WineHQ repo hangs, as does wine-staging-7.13. 7.12 works fine.

I wonder if that TKG/Protonified build got some patches fixing stuff then. I have not tried to build that, as i only tried the "tkg" version. Care to share your customization.cfg and advanced-customization.cfg so i can try to build the same?

Saancreed commented 1 year ago

It's more or less this plus a few less interesting patches not included in TKG:

_NOCCACHE="true"
_NOINITIALPROMPT="true"
_LOCAL_PRESET="none"
_PKGNAME_OVERRIDE="none"
_GCC_FLAGS+=" -march=x86-64-v3"
_CROSS_FLAGS+=" -march=x86-64-v3"
_staging_commit='#tag=v8.9.1'
_user_deps+=" pcsclite lib32-pcsclite"
_user_makedeps+=" winesync-header"
_configure_userargs64+=" --without-wayland"
_configure_userargs32+=" --without-wayland"
_use_fastsync="true"
_use_vkd3dlib="true"
_proton_fs_hack="false"
_win10_default="true"
_protonify="true"
_proton_battleye_support="true"
_proton_eac_support="true"
_user_patches_no_confirm="true"
_hotfixes_no_confirm="false"
_allow_server_rt_prio="true"
_allow_wine_net_raw="true"

And on top of that: MR 2710, revert of 354a8bb1f4a65bdec052606f2799db9e2907b5b1, removal of UplayWebCore hack, removal of ntdll: Guard against syscall stack overrun. (because it was removed only from patchset applied to more recent revisions) and addition of rebased winex11-fs-no_above_state from community-patches.

SveSop commented 1 year ago

@Saancreed I could not really get this to patch cleanly from what i tried.. You would not happen to have the "prepare" log and the "wine-tkg-config.txt" file so i can compare a bit with my build?

Saancreed commented 1 year ago

I recently upgraded to 8.11 because I figured out my 32-bit application breakage: it was actually a compiler bug. So my new version of wine-tkg.conf now has _staging_commit='#tag=v8.11' and one new thing, _user_deps+=" libgcrypt lib32-libgcrypt" needed for some bcrypt patches. Still works fine for me here.

My changes to wine-tkg-git repo:

diff --git a/wine-tkg-git/wine-tkg-scripts/build.sh b/wine-tkg-git/wine-tkg-scripts/build.sh
index b4003bef..2a8c1f75 100644
--- a/wine-tkg-git/wine-tkg-scripts/build.sh
+++ b/wine-tkg-git/wine-tkg-scripts/build.sh
@@ -128,6 +128,7 @@ _build_serial() {
   if [ "$_NOLIB32" != "true" ] && [ "$_NOLIB32" != "wow64" ]; then
     # build wine 32-bit
     # nomakepkg
+    export CROSSCFLAGS+=' -mincoming-stack-boundary=2'
     if [ "$_nomakepkg_midbuild_prompt" = "true" ]; then
       msg2 '64-bit side has been built, 32-bit will follow.'
       msg2 'This is the time to install the 32-bit devel packages you might need.'
diff --git a/wine-tkg-git/wine-tkg-scripts/prepare.sh b/wine-tkg-git/wine-tkg-scripts/prepare.sh
index 049d9214..3c1510a6 100644
--- a/wine-tkg-git/wine-tkg-scripts/prepare.sh
+++ b/wine-tkg-git/wine-tkg-scripts/prepare.sh
@@ -865,6 +865,7 @@ _prepare() {
    if [ "$_NUKR" != "debug" ] && [ "$_unfrog" != "true" ] || [[ "$_DEBUGANSW1" =~ [yY] ]]; then
      if [ "$_LOCAL_PRESET" != "staging" ] && [ "$_LOCAL_PRESET" != "mainline" ]; then
        _userpatch_target="wine-mainline" _userpatch_ext="myearly" hotfixer
+       _userpatch_target="wine-mainline" _userpatch_ext="myearly" user_patcher
      fi
    fi

The first change is a workaround for the GCC bug, but it shouldn't be necessary if you get rid of my -march=x86-64-v3 flags, the second adds missing early patches support for user_patcher.

prepare.log

wine-tkg-config.txt

Maybe your nvcuda.dll needs to be ancient enough? Because I haven't updated mine in a long time now (and I built it with my own LUID patch).

SveSop commented 1 year ago

The first change is a workaround for the GCC bug, but it shouldn't be necessary if you get rid of my -march=x86-64-v3 flags, the second adds missing early patches support for user_patcher.

I build my wine source on the OBS build service that WineHQ uses for building debian packages, so i dont really use any particular gcc flags for that other than the "official" settings. I just use the TKG patched sources as a base source and build a separate "wine-tkg-custom" debian package that i install instead of the "winehq-staging" one.

Afaik ubuntu 22.04 build bot seems to be using GCC-12 + Mingw-w64-10.3.0 so i am a bit unsure if this is relevant for that particular build? I see that it's Zebediah that posted the bug, so it is "wine related", but seemingly is for gcc-13? I'm willing to give it a try and see if i can pop that flag there and check.

[   17s] [7/710] installing gcc-12-base-12-20220319-1ubuntu1
[   17s] [8/710] installing gcc-mingw-w64-base-10.3.0-14ubuntu1+24.3

Will look into using an older nvcuda in case something is messed up, but the same nvcuda version works fine with older wine versions, so i am not sure what it could be? Some long long uint size crap that messes with the stack on newer "not so forgiving" wine versions or some crap? 😄

SveSop commented 1 year ago

Tested building with the -mincoming-stack-boundary=2 flag, but that did not really do much. Even tho debian/ubuntu may backport patches from gcc-13 to their gcc-12, the bug would probably only show up when optimizing for a specific target, and the .deb packages i build are just "generic -O2", so it makes sense it would not be an issue even if i had used gcc-13 i suppose.

Well.. Will keep on trying 👍

Saancreed commented 1 year ago

Since OptiX is 64-bit-only, you probably wouldn't encounter any issues related to that anyway because I'm assuming that all the samples must be 64-bit as well.

SveSop commented 1 year ago

I was looking into the previous GPU Caps Viewer issue to see if the 32-bit loading there had any relevance, which it did not for me. The OptiX thing probably has something to do with nvcuda i suppose.

Currently dabbling in 3 different wine issues - OptiX (probably nvcuda), 32-bit GPU Caps Viewer (could be related to this 32-bit thing), and Using "Virtual Desktop" with wine-staging-8.10 or newer (currently also broken). So... hard for me to keep track of bug-report-threads 😢

Building the "protonify" version of tkg-8.11 now, so I'll see what comes up. Did not use "fastsync" patch, and not sure what this 2710.mypatch is, but will add the revert and such and see if the protonify version got some added hacks that can help perhaps 😄

Saancreed commented 1 year ago

Did not use "fastsync" patch

Fwiw it's a patch to support winesync as an alternative to esync/fsync, see https://lore.kernel.org/lkml/f4cc1a38-1441-62f8-47e4-0c67f5ad1d43@codeweavers.com/T/ and https://repo.or.cz/linux/zf.git/shortlog/refs/heads/winesync4

But I suppose it shouldn't matter, I'd expect fsync to work fine too.

and not sure what this 2710.mypatch is

That one is just https://gitlab.winehq.org/wine/wine/-/merge_requests/2710 because I'm on Intel i9-13980HX right now.

SveSop commented 1 year ago

Oki. Summary of what i have tried so far:

My TKG-Protonify prepare log, compared to yours, shows these differences:

  1. Not using "Fastsync" patch
  2. Not using 2710.mypatch (using Intel 9700K here)
  3. Not using "Uplay.mypatch"

Same issue. Tested nvidia-libs-0.7.4 from back when i released it for DAZ Studio (100% know that was working). This is PRE wine-7.12. Does not make a difference with newer wine versions.

WINEPREFIX is set up with DXVK (from GIT) + nvidia-libs (old/new does not matter). ANY wine version after wine-7.12 just hangs when starting OptiX samples, both the updated samples i have compiled and the ones provided earlier that you test with. ALL samples work perfectly fine with <= wine-staging-7.12, but break with wine-(staging)-7.13 and anything newer for me. Also tested the usual suspects, such as adding vcrun2019 and d3dcompiler_47 in case something had changed for wine > 7.12: no dice.

Tested with WINEFSYNC=1, with WINEESYNC=1, and also with the default "server-side synchronization" - no dice.
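For anyone wanting to repeat that comparison, the three synchronization modes can be cycled through with a small wrapper. This is only a sketch: `for_each_sync_mode` is a hypothetical helper name, and it assumes a Wine build that honours the `WINEESYNC` and `WINEFSYNC` variables.

```shell
# Run the given command once per synchronization mode:
# server-side (neither variable set), esync, then fsync.
# Each run happens in a subshell so the variables do not leak out.
for_each_sync_mode() {
    for mode in server esync fsync; do
        (
            unset WINEESYNC WINEFSYNC
            case "$mode" in
                esync) export WINEESYNC=1 ;;
                fsync) export WINEFSYNC=1 ;;
            esac
            echo "== $mode"
            "$@"
        )
    done
}
```

Usage would be something like `for_each_sync_mode wine optixTriangle.exe`, run from the directory that holds the sample.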

Wine binaries are compiled on the OBS build server (same as WineHQ uses), for Ubuntu 22.04 and run on Ubuntu 22.04 - using whatever "generic -O2" optimizations WineHQ uses for standard .deb packages.

If you use Arch and get ALL samples to run with wine-tkg-8.11-protonify, then i am at a complete loss as to what it could be. (All samples will run for me with <= wine-staging-7.12: optixHair.exe, optixCutouts.exe and so on.)

wine-tkg-8.11 = not working
wine-tkg-8.12 = not working
wine-tkg-8.11-protonify = not working
ANY wine-staging release or otherwise > 7.12 = not working

nvidia-libs "verified" released for DAZ Studio and OptiX SDK 7.3 works fine with wine-staging-7.12, but not anything newer.

Conf settings:

DXVK_LOG_LEVEL=info
DXVK_ENABLE_NVAPI=1
WINEFSYNC=1
STAGING_SHARED_MEMORY=1
WINE_LARGE_ADDRESS_AWARE=1
WINE_HIDE_NVIDIA_GPU=0

(switching these staging settings 0/1 does not matter)

dxvk.conf file containing "dxgi.nvapiHack = False"

Saancreed commented 1 year ago

Okay yeah, this is strange. I suppose we could start making some educated guesses if we knew which Wine commit caused this, but it would probably require you to bisect this issue on your end. Does non-Staging 7.12 work? If so then it should make bisecting easier, because bisecting Staging only gives us a day's worth of commits between Staging rebase commits.

If bisecting is not an option, you could try acquiring a build of Wine > 7.12 with debug symbols and attaching gdb or winedbg to the app before/when it hangs and inspecting stack traces, which could give us an idea where to look for potential issues, but unfortunately I wouldn't expect this to be as useful as a bisect would.
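If bisecting does turn out to be an option, it could be started roughly like this. This is a sketch that assumes an upstream Wine checkout with the release tags `wine-7.12` and `wine-7.13`, and that non-Staging 7.12 really is good (not confirmed at this point); `start_bisect` and `bisect_steps` are hypothetical helper names.

```shell
# Start a bisect in the Wine checkout given as $1.
# Assumes the release tags wine-7.12 (good) and wine-7.13 (bad).
start_bisect() {
    git -C "$1" bisect start
    git -C "$1" bisect bad wine-7.13
    git -C "$1" bisect good wine-7.12
}

# Rough estimate of how many build-and-test rounds a bisect over
# $1 commits needs: halve the range until one commit is left.
bisect_steps() {
    n="$1"
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}
```

The number of commits in the range comes from `git rev-list --count wine-7.12..wine-7.13`, and feeding that to `bisect_steps` gives an idea of how many Wine builds to expect. After each build, the result is reported with `git bisect good` or `git bisect bad` until git names the first bad commit.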

By the way, I don't remember ever using Wine's virtual desktop feature, I don't even know what benefits it brings.

SveSop commented 1 year ago

By the way, I don't remember ever using Wine's virtual desktop feature, I don't even know what benefits it brings.

Some programs do not like Linux window managers (WMs) fiddling with their windows, so they sometimes break, and running wine without letting the WM "control windows" causes all kinds of crap when it comes to which window is on top and so on. A good example is the "Elevator (star wars) demo": it is set to run at 1920x1080, and if you have a monitor NOT using that exact resolution when running wine, it will bork. Running a "Virtual Desktop" at 1920x1080 forces the demo to think the monitor resolution is 1920x1080, so it runs correctly fullscreen. DAZ Studio is also a weird one where you cant run the window fullscreen, and each time you click render or whatnot, it will minimize, and so on. So.. usually just slightly misbehaving apps that have issues.

I was under the impression you had all OptiX samples running with wine-tkg-8.11-protonify ?

Saancreed commented 1 year ago

I was under the impression you had all OptiX samples running with wine-tkg-8.11-protonify ?

Well, to be precise:

Caught exception: GL interop is only available on display device, please use display device for optimal performance. Alternatively you can disable GL interop with --no-gl-interop and run with degraded performance.

(most likely because my display is driven by Intel GPU) but they show a window and work fine when started with --no-gl-interop argument as suggested
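For reference, a launch that combines the debug channels mentioned earlier with extra sample arguments could be wrapped like this. A minimal sketch; `run_optix_sample` is a hypothetical helper name, and it relies on Wine writing its debug output to stderr.

```shell
# Run an OptiX sample under Wine with the nvcuda and nvoptix debug
# channels enabled, writing the debug output to <sample>.log.
# Any further arguments are passed through to the sample.
run_optix_sample() {
    sample="$1"
    shift
    log="${sample%.exe}.log"
    WINEDEBUG='+nvcuda,+nvoptix' wine "$sample" "$@" 2> "$log"
}
```

With that, `run_optix_sample optixTriangle.exe --no-gl-interop` would leave its log in `optixTriangle.log` next to the sample.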

SveSop commented 1 year ago

What a horrible experience... Bisecting is not an exact science for sure 😏 Anyway.. found the culprit: https://gitlab.winehq.org/wine/wine/-/merge_requests/416 If you revert that MR (3 patches), i am able to run the OptiX samples on wine-tkg-8.11 "protonified" as well..

OptiX samples only need wine, no staging or extra patches needed, and only nvcuda/nvoptix - no need for dxvk/nvapi++ in case you wanna do more testing.

Gonna open an issue on DXVK-NVAPI because i found something interesting regarding nvapi when it comes to virtual desktop usage and that other patch here https://bugs.winehq.org/show_bug.cgi?id=55085 - the Vulkan demo crashes only when nvapi is enabled.. if i disable it, the vulkan demo runs fine on virtual desktop for some reason 😮

Attaching two patches - the "Cladun X2" revert patch , and the Virtual Desktop patch. Reverts.zip

PS. This might be worth some experimentation perhaps? https://gitlab.winehq.org/wine/wine/-/commit/d328af75fee0f10dab67ab64fee78e2e2da7a447 Seems vague, with this description:

Sometimes error event serial is zero. For example, NVIDIA driver may send X_GLXCopyContext errors with the event serial set to zero.

SveSop commented 1 year ago

It is actually THIS one causing the hang https://gitlab.winehq.org/wine/wine/-/merge_requests/416/diffs?commit_id=18ae96e5fb3cbbd53f1a022ba81203de6b431228

I guess locking the Display and not unlocking it until it "passes" some other function does NOT work for OptiX samples, so it just locks "forever" that way. Maybe this Cladun X2 game will pass some "genuine" XErrorEvent and continue, but when running OptiX samples i checked the error_code put out and it is in the "ignore list" in the ignore_error function. Just removing the lock makes it continue...

Why the need for a XDisplay lock + mutex lock? Would not the mutex "protect" the calling values even if the XDisplay is up? Seems somewhat not really needed to me tho...

Subject: [PATCH] Remove XLockDisplay

Remove XLockDisplay from the https://gitlab.winehq.org/wine/wine/-/merge_requests/416 MR
since this causes hang when running NVIDIA OptiX Samples
---
 dlls/winex11.drv/x11drv_main.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/dlls/winex11.drv/x11drv_main.c b/dlls/winex11.drv/x11drv_main.c
index c4d537d6ada..c1c49792905 100644
--- a/dlls/winex11.drv/x11drv_main.c
+++ b/dlls/winex11.drv/x11drv_main.c
@@ -262,7 +262,6 @@ static inline BOOL ignore_error( Display *display, XErrorEvent *event )
 void X11DRV_expect_error( Display *display, x11drv_error_callback callback, void *arg )
 {
     pthread_mutex_lock( &error_mutex );
-    XLockDisplay( display );
     err_callback         = callback;
     err_callback_display = display;
     err_callback_arg     = arg;
@@ -281,7 +280,6 @@ int X11DRV_check_error(void)
 {
     int res = err_callback_result;
     err_callback = NULL;
-    XUnlockDisplay( err_callback_display );
     pthread_mutex_unlock( &error_mutex );
     return res;
 }
--
2.34.1

Seems to also do the trick for wine-8.12 git....
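To check whether a given Wine source tree still carries the display lock (or whether the patch above applied), the file can simply be searched for the two calls. A minimal sketch; `count_display_locks` is a hypothetical helper name, and it counts every matching line in the file, not only those inside `X11DRV_expect_error` and `X11DRV_check_error`.

```shell
# Print how many lines in the given source file mention
# XLockDisplay or XUnlockDisplay (0 means none are left).
count_display_locks() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c -E 'X(Lock|Unlock)Display' "$1" || true
}
```

From the top of a Wine checkout this would be run as `count_display_locks dlls/winex11.drv/x11drv_main.c`.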

SveSop commented 1 year ago

https://bugs.winehq.org/show_bug.cgi?id=55270

Saancreed commented 1 year ago

Interesting. It's possible that I was simply unaffected by this, either because I'm using Wayland (so my X is actually Xwayland) or because I'm running a Prime/Optimus setup (so my primary GL driver is Iris and not Nvidia). Even in an X11 session, though, the Nvidia X driver could behave differently from the modesetting one that's used with Intel GPUs. No idea why though, especially when the related MR acknowledges the Nvidia driver's behavior in another commit.

SveSop commented 1 year ago

Interesting or not - does not seem like it's going to be fixed anytime soon... https://bugs.winehq.org/show_bug.cgi?id=53428 Last comment from nVidia: https://forums.developer.nvidia.com/t/deadlock-possibly-caused-by-x-display-lock-inversion-in-glxcreatecontext/222074

Almost 1 year without anything...

Scratch that.. Thought it was fixed

Saancreed commented 1 year ago

Ah well, that's disappointing. But probably not much for us to act on, other than acknowledge that it's an issue coming from the driver we have little control over.

Saancreed commented 4 months ago

@SveSop I've noticed that one of the Wine commits you mentioned had been reverted quite recently: https://gitlab.winehq.org/wine/wine/-/merge_requests/5179

Does this fix the issue you were encountering? If yes, then we can probably close this as resolved. If not, you might want to comment on the linked bug that the issue still persists.

SveSop commented 4 months ago

Afaik wine-9.4 with this fix should be released this upcoming weekend, so i wanted to verify with that before closing this.

SveSop commented 4 months ago

@Saancreed Nope.. still no dice. It could be that the upstream revert fixes some other bug, but tested wine-staging-9.4 and i get the same issue. I will try to compile a custom version with the old workaround and see if it still fixes stuff.

EDIT: Yeah, used the old patch and now it works.. so not fixed upstream still.

---
 dlls/winex11.drv/x11drv_main.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/dlls/winex11.drv/x11drv_main.c b/dlls/winex11.drv/x11drv_main.c
index c4d537d6ada..c1c49792905 100644
--- a/dlls/winex11.drv/x11drv_main.c
+++ b/dlls/winex11.drv/x11drv_main.c
@@ -262,7 +262,6 @@ static inline BOOL ignore_error( Display *display, XErrorEvent *event )
 void X11DRV_expect_error( Display *display, x11drv_error_callback callback, void *arg )
 {
     pthread_mutex_lock( &error_mutex );
-    XLockDisplay( display );
     err_callback         = callback;
     err_callback_display = display;
     err_callback_arg     = arg;
@@ -281,7 +280,6 @@ int X11DRV_check_error(void)
 {
     int res = err_callback_result;
     err_callback = NULL;
-    XUnlockDisplay( err_callback_display );
     pthread_mutex_unlock( &error_mutex );
     return res;
 }
--