jp7677 / dxvk-nvapi

Alternative NVAPI implementation on top of DXVK.
MIT License

Daz Studio lacks some NvAPI methods for iRay rendering #64

Closed PetitMote closed 2 years ago

PetitMote commented 2 years ago

Hello there,

It’s been a while, but some update (from Daz Studio) broke iRay rendering on nvidia GPU through Wine. I’m only a recent user, so I don’t know how it used to work or if dxvk_nvapi was needed, but I think it used to work before the first releases of dxvk, when wine-staging was needed. Rendering works fine on CPU, but it’s much slower.

I’ve checked the logs and tried with and without dxvk_nvapi. Here are the logs from dxvk_nvapi:

---------- 2022-01-04 18:46:40 ----------
NvAPI_QueryInterface 0xad298d3f: Unknown function ID
DXVK-NVAPI v0.5-20-ge23d450 (DAZStudio.exe)
NVML loaded and initialized successfully
NvAPI Device: NVIDIA GeForce RTX 3060 (495.46.0)
NvAPI Output: \\.\DISPLAY1
NvAPI_Initialize: OK
NvAPI_QueryInterface 0x33c7358c: Unknown function ID
NvAPI_QueryInterface 0x593e8644: Unknown function ID
NvAPI_GetInterfaceVersionString: OK
NvAPI_EnumLogicalGPUs: OK
NvAPI_EnumPhysicalGPUs: OK
NvAPI_QueryInterface 0x1efc3957: Unknown function ID
NvAPI_EnumNvidiaDisplayHandle 0: OK
NvAPI_GetPhysicalGPUsFromDisplay: OK
NvAPI_QueryInterface NvAPI_GetAssociatedNvidiaDisplayName: Not implemented method
NvAPI_GetErrorMessage -3 (NVAPI_NO_IMPLEMENTATION): OK
NvAPI_EnumNvidiaDisplayHandle 1: End enumeration
NvAPI_EnumNvidiaUnAttachedDisplayHandle 0: End enumeration
NvAPI_QueryInterface NvAPI_GPU_GetBusType: Not implemented method
NvAPI_GetErrorMessage -3 (NVAPI_NO_IMPLEMENTATION): OK
NvAPI_GPU_GetFullName: OK
NvAPI_GPU_GetVbiosVersionString: OK
NvAPI_Initialize: OK
NvAPI_SYS_GetDriverAndBranchVersion: OK
NvAPI_EnumPhysicalGPUs: OK
NvAPI_EnumLogicalGPUs: OK
NvAPI_QueryInterface NvAPI_EnumTCCPhysicalGPUs: Not implemented method
NvAPI_GetErrorMessage -3 (NVAPI_NO_IMPLEMENTATION): OK
NvAPI_GetPhysicalGPUsFromLogicalGPU: OK

I’ve started by reading the logs from Daz Studio, and there isn’t more, but I can still provide the log file if needed. Also, when dxvk_nvapi is disabled, I get the error on the NvAPI_Initialize method (3 times), and not on these 3 methods.

I believe the issue is due to these methods not being implemented on dxvk_nvapi. I would sincerely love to help and try to add them, but I’m merely a python beginner and c++ looks like sorcery to me (and, well, we’re talking about some low level coding). But I’d be happy to help or try if someone can point me in the right direction.
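For triage, the missing entry points can be pulled out of a dxvk-nvapi log mechanically. A minimal Python sketch, assuming only the two message shapes seen in the log above ("Unknown function ID" and "Not implemented method"); the function name `missing_entry_points` is made up for illustration:

```python
import re

# Message shapes copied from the log excerpt above; other dxvk-nvapi
# versions may word them differently.
UNKNOWN_ID = re.compile(r"NvAPI_QueryInterface (0x[0-9a-f]+): Unknown function ID")
NOT_IMPLEMENTED = re.compile(r"NvAPI_QueryInterface (\w+): Not implemented method")


def missing_entry_points(log_text):
    """Return (unknown_ids, unimplemented_names), de-duplicated, in first-seen order."""
    unknown, unimplemented = [], []
    for line in log_text.splitlines():
        m = UNKNOWN_ID.search(line)
        if m:
            if m.group(1) not in unknown:
                unknown.append(m.group(1))
            continue
        m = NOT_IMPLEMENTED.search(line)
        if m and m.group(1) not in unimplemented:
            unimplemented.append(m.group(1))
    return unknown, unimplemented


sample = """\
NvAPI_QueryInterface 0xad298d3f: Unknown function ID
NvAPI_Initialize: OK
NvAPI_QueryInterface NvAPI_GPU_GetBusType: Not implemented method
NvAPI_QueryInterface NvAPI_EnumTCCPhysicalGPUs: Not implemented method
"""
unknown_ids, unimplemented = missing_entry_points(sample)
print(unknown_ids)
print(unimplemented)
```

A list like this only shows what the application asked for; it does not show whether the application can cope with the missing answer.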

Also, DazStudio physic simulation, dforce, used to work, but now it only says "no OpenCL 1.2 compatible device found". I have absolutely no idea if it’s related, there is no information in the logs about it, but well, maybe if we fix the NvAPI issue we’ll get something.

Thank you guys for your work, it’s still pretty amazing.

jp7677 commented 2 years ago

Could you please describe what you mean by “broke”? Do you get an actual error message, or anything that might hint at what the application is missing? Did you also try spoofing an AMD card in DXVK? The DXVK logs should tell you the effective setting.

Looking at the logs themselves, I would say the application handles the missing methods just fine and the actual issue is somewhere else.

PetitMote commented 2 years ago

Hello, sorry, my report did indeed lack some detail. If I allow only the GPU in the render settings, it starts compiling the shaders, but it won’t render. Daz considers the render finished and presents an empty image to save.

It may be clearer with some logs: dazstudio-logs.txt, dxvk-nvapi.log

After re-reading the logs, I’m not as sure as before… There is a cuda error.

jp7677 commented 2 years ago

Looking at those logs, I think this is the actual error:


The version of your CUDA driver is 0.0, but 11.0 is the required minimum

CUDA module initialization failed with error 'CUDA driver version is insufficient for CUDA runtime version' (0x23); iray photoreal can only run in CPU mode. Please update your NVIDIA driver (www.nvidia.com).

~So I guess the cuda implementation in wine-staging is no longer sufficient for this application.~

PetitMote commented 2 years ago

Yeah, that seems more like it. Although, that’s strange how it finds only a "0.0" version, and not an actual version of cuda.

jp7677 commented 2 years ago

Yeah, I agree, also looking at https://github.com/SveSop/nvidia-libs/blob/master/dlls/nvcuda/nvcuda.c#L1031 (not the actual wine staging source, that is here https://github.com/wine-staging/wine-staging/tree/master/patches/nvcuda-CUDA_Support), the driver version call is just passed through, so the question is indeed how that application determines the cuda version.
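For context, the CUDA version calls (cuDriverGetVersion, cudaRuntimeGetVersion) report the version as a single integer, 1000 × major + 10 × minor, so an output value that was never filled in decodes to "0.0". A small Python sketch of that decoding; whether Iray builds its message this way is an assumption, and the helper names are illustrative:

```python
def decode_cuda_version(raw):
    """Split CUDA's packed version integer (1000 * major + 10 * minor) into (major, minor)."""
    return raw // 1000, (raw % 1000) // 10


def format_cuda_version(raw):
    """Render the packed integer the way it usually appears in messages, e.g. '11.5'."""
    major, minor = decode_cuda_version(raw)
    return f"{major}.{minor}"


print(format_cuda_version(11050))  # packed value for CUDA 11.5
print(format_cuda_version(6050))   # packed value for CUDA 6.5
print(format_cuda_version(0))      # an output variable that was never written
```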

jp7677 commented 2 years ago

Could you take a look at the Wine logs with the nvcuda channel enabled (and loaddll)? This should indicate whether cuda is loaded at all.
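Wine debug channels are selected through the WINEDEBUG environment variable, as a comma-separated list of +channel entries. A hedged Python sketch of launching an application that way; the executable name and log file name are placeholders, and it assumes `wine` is on PATH:

```python
import os
import subprocess


def winedebug_value(channels):
    """Build a WINEDEBUG string such as '+nvcuda,+loaddll' from channel names."""
    return ",".join(f"+{name}" for name in channels)


def run_with_channels(exe_path, channels, log_path):
    """Run exe_path under Wine and write Wine's debug output (stderr) to log_path."""
    env = dict(os.environ, WINEDEBUG=winedebug_value(channels))
    with open(log_path, "w") as log:
        return subprocess.run(["wine", exe_path], env=env, stderr=log).returncode


print(winedebug_value(["nvcuda", "loaddll"]))
# Placeholder invocation, not executed here:
# run_with_channels("DAZStudio.exe", ["nvcuda", "loaddll"], "wine-log.txt")
```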

PetitMote commented 2 years ago

Yep, I’ve tried looking at the actual version, but I’ve a hard time reading C++ + patch notes + I don’t understand how it works.

Here is the log. I found an error while importing nvcuda and nvcuvid:

0534:err:module:import_dll Loading library nvcuda.dll (which is needed by L"C:\\Daz 3D\\Applications\\64-bit\\DAZ 3D\\DAZStudio4\\libs\\iray\\nvcuvid_video_decoder.dll") failed (error c000000f).
0534:err:module:import_dll Loading library nvcuvid.dll (which is needed by L"C:\\Daz 3D\\Applications\\64-bit\\DAZ 3D\\DAZStudio4\\libs\\iray\\nvcuvid_video_decoder.dll") failed (error c000000f).

wine-log.txt

But I need to make sure there isn’t any dll from all my tries and retries!
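The status code in those lines is informative: c000000f is the NT status STATUS_NO_SUCH_FILE, which suggests the loader did not find the DLL at all, rather than finding it and failing to initialize it. A minimal Python sketch that extracts failed imports from a Wine log, assuming the err:module:import_dll line shape shown above; the helper name, the one-entry status table and the shortened path in the sample are illustrative:

```python
import re

# Only the status seen in this thread; extend as needed.
NT_STATUS = {0xC000000F: "STATUS_NO_SUCH_FILE"}

IMPORT_FAILURE = re.compile(
    r"err:module:import_dll Loading library (\S+) \(which is needed by "
    r'L"(.+?)"\) failed \(error ([0-9a-f]+)\)'
)


def failed_imports(log_text):
    """Return (library, needed_by, status_name) for every failed DLL import in the log."""
    results = []
    for m in IMPORT_FAILURE.finditer(log_text):
        code = int(m.group(3), 16)
        results.append((m.group(1), m.group(2), NT_STATUS.get(code, hex(code))))
    return results


# Shortened, made-up path; the line shape matches the log above.
sample = (
    "0534:err:module:import_dll Loading library nvcuda.dll (which is needed by "
    'L"C:\\\\libs\\\\iray\\\\nvcuvid_video_decoder.dll") failed (error c000000f).\n'
)
for library, needed_by, status in failed_imports(sample):
    print(library, status)
```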

jp7677 commented 2 years ago

Hmm, maybe it would be an idea to stub EnumTCCPhysicalGPUs in dxvk-nvapi, since it is related to cuda. I can take a look tomorrow.

jp7677 commented 2 years ago

Thanks. Do you actually have nvcuda in your Wine environment? I think you can use GPU Caps Viewer in your prefix to get some info about cuda.

PetitMote commented 2 years ago

Sorry, I think I need to do a clean setup of Daz, because I might have a bit messed-up my environment :sweat_smile: Just a few minutes

PetitMote commented 2 years ago

Ok, got it! wine-log-2.txt

Now it loads, also I looked into the Daz log and found this:

2022-01-04 23:21:44.392 Initializing NVIDIA Iray...
2022-01-04 23:21:44.393 Iray [INFO] - API:DATABASE ::   0.0   API    db   info : Loaded "C:\Daz 3D\Applications\64-bit\DAZ 3D\DAZStudio4\libs\iray\libneuray.dll"
2022-01-04 23:21:44.393 Iray [INFO] - API:MISC ::   0.0   API    misc info : Iray RTX 2020.1.6, build 334300.9558, 27 Mar 2021, nt-x86-64
2022-01-04 23:21:44.393 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(480): Could not add path: "C:/users/timothee/AppData/Roaming/DAZ 3D/Studio4/shaders/iray". Due to unknown error -2
2022-01-04 23:21:44.393 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(480): Could not add path: "C:/users/timothee/AppData/Roaming/DAZ 3D/Studio4/temp/shaders/iray". Due to unknown error -2
2022-01-04 23:21:44.413 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(359): Iray [ERROR] - GPU:RENDER ::   0.0   GPU    rend error: NvAPI call NvAPI_GetAssociatedNvidiaDisplayName returned an error:
2022-01-04 23:21:44.413 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(359): Iray [ERROR] - GPU:RENDER ::   0.0   GPU    rend error:   NVAPI_NO_IMPLEMENTATION
2022-01-04 23:21:44.413 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(359): Iray [ERROR] - GPU:RENDER ::   0.0   GPU    rend error: NvAPI call NvAPI_GPU_GetBusType returned an error:
2022-01-04 23:21:44.414 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(359): Iray [ERROR] - GPU:RENDER ::   0.0   GPU    rend error:   NVAPI_NO_IMPLEMENTATION
2022-01-04 23:21:44.414 Iray [INFO] - GPU:RENDER ::   0.0   GPU    rend info : Found 1 GPU with vendor's API.
2022-01-04 23:21:44.438 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(359): Iray [WARNING] - CUDA:RENDER ::   0.0   CUDA   rend warn : CUDA module initialization failed.
2022-01-04 23:21:44.438 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(359): Iray [WARNING] - CUDA:RENDER ::   0.0   CUDA   rend warn : cudaRuntimeGetVersion returned with error 'initialization error'
2022-01-04 23:21:44.438 NVIDIA Iray Rendering Configuration:
2022-01-04 23:21:44.438     CPU Fallback: enabled
2022-01-04 23:21:44.438     GPU Detection: enabled
2022-01-04 23:21:44.438     GPU Driver Version Check: enabled
2022-01-04 23:21:44.438 Loading NVIDIA Iray Plugins...
[…]
2022-01-04 23:21:44.658 Iray [INFO] - IRAY:RENDER ::   1.1   IRAY   rend info : NVIDIA display driver version: 495.46
2022-01-04 23:21:44.658 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(359): Iray [WARNING] - IRAY:RENDER ::   1.1   IRAY   rend warn : CUDA module initialization failed with error 'initialization error' (0x3); iray photoreal can only run in CPU mode. Please update your NVIDIA driver (www.nvidia.com).
2022-01-04 23:21:44.658 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(359): Iray [ERROR] - IRAY:RENDER ::   1.1   IRAY   rend error: NvAPI call NvAPI_EnumTCCPhysicalGPUs returned an error:
2022-01-04 23:21:44.659 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(359): Iray [ERROR] - IRAY:RENDER ::   1.1   IRAY   rend error:   NVAPI_NO_IMPLEMENTATION
2022-01-04 23:21:44.659 Iray [INFO] - IRAY:RENDER ::   1.1   IRAY   rend info : Using iray plugin version 5.1, build 334300.9558 n, 27 Mar 2021, nt-x86-64-vc14.
2022-01-04 23:21:44.659 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(359): Iray [WARNING] - IRAY:RENDER ::   1.1   IRAY   rend warn : There is no CUDA-capable GPU available to the iray photoreal renderer.
2022-01-04 23:21:44.659 Iray [VERBOSE] - IRAY:RENDER ::   1.1   IRAY   rend stat : Environment cache size capacity: 5.
[…]
2022-01-04 23:21:44.661 NVIDIA Iray GPUs:
2022-01-04 23:21:44.661     GPU: 1 - 
2022-01-04 23:21:44.661     Memory Size: -1 bytes
2022-01-04 23:21:44.661     Clock Rate: -1 kHz
2022-01-04 23:21:44.661     Multi Processor Count: -1
2022-01-04 23:21:44.661     CUDA Device ID: -1
2022-01-04 23:21:44.661     CUDA Compute Capability: NA
2022-01-04 23:21:44.661     PCI Bus ID: -1
2022-01-04 23:21:44.661     PCI Device ID: 260
2022-01-04 23:21:44.661     TCC Mode: disabled
2022-01-04 23:21:44.661     Display: attached
2022-01-04 23:21:44.662 WARNING: ..\..\..\..\..\src\pluginsource\DzIrayRender\dzneuraymgr.cpp(359): Iray [WARNING] - IRAY:RENDER ::   1.0   IRAY   rend warn : There is no CUDA-capable GPU available to the iray photoreal renderer.
2022-01-04 23:21:44.662 Iray [INFO] - IRT:RENDER ::   1.0   IRT    rend info : Resource assignment for host 0 has changed.
2022-01-04 23:21:44.662 NVIDIA Iray Scheduling Configuration:
2022-01-04 23:21:44.662     CPU Load Limit: 12
2022-01-04 23:21:44.662     CPU Thread Affinity: disabled
2022-01-04 23:21:44.662     GPU Load Limit: 1

Obviously, I had fucked up my wine prefix :grimacing: I guess it showed "0.0" because there was some formatting around the get version function. But it still won’t work, sadly :(

PetitMote commented 2 years ago

Ok, maybe we did all this just to be stuck at the same point we were before. According to Cuda-z, the runtime dll version is 6.5, which means a dinosaur.

image

Well, still, thank you very much! It was really kind of you to help me.

Saancreed commented 2 years ago

@PetitMote That 6.50 you see in your screenshot is just the version of the library CUDA-Z was linked to (obtained using cudaRuntimeGetVersion), it will always report 6.50 unless you recompile and link it with newer runtime libraries because that's the version the latest binary release of CUDA-Z ships with.

Fwiw this is what CUDA-Z says on my machine, on the left is the Linux version and on the right is the Windows one running in Wine.

image

So your problem is most likely related to application failing to load Wine's nvcuda.dll and nvcuvid.dll. Actually, after taking a look at your latest log, it seems that application loads the library correctly, but fails to use it. But it's hard to say if that's because of DXVK-NVAPI missing implementation of some functions or Wine's CUDA library not supporting newer interfaces.

PetitMote commented 2 years ago

Hello again. Indeed, I agree with all you said. In fact, I checked the Daz Studio requirements, and they only say it needs Compute Capability 2.0. I’m guessing a function might be missing, or there could be a bug, either in Wine’s cuda or in dxvk-nvapi. Or maybe there is a fundamental difference between Cuda on Windows and Cuda on Linux? Either way, I need more logs to find out. Do you know how to get them? There isn’t a nvcuda channel for WINEDEBUG, so I don’t know how to get more information. Also, I’ll try to get in touch with Daz Studio to see if they can provide a debug version of their software.

Saancreed commented 2 years ago

> There isn’t a nvcuda channel for WINEDEBUG

Actually there is, which is why we can see the following in the log:

04d0:trace:nvcuda:DllMain (0x7f0798320000, 1, (nil))
04d0:trace:nvcuda:wine_cuDriverGetVersion (0x145103e0)
04d0:trace:nvcuda:wine_cuInit (0)
04d0:trace:nvcuda:wine_cuGetExportTable (0x145103b8, 0x144e53c0)
04d0:fixme:nvcuda:cuda_check_table WARNING: Your CUDA version supports a newer interface for Unknown1 then the Wine implementation.
04d0:trace:nvcuda:wine_cuGetExportTable (0x145103c0, 0x144e53a0)
04d0:trace:nvcuda:wine_cuDeviceGetCount (0x145189c0)
04d0:trace:nvcuda:wine_cuDeviceGet (0x101ee80, 0)
04d0:trace:nvcuda:Unknown1_func1_relay (0x14518be8, (nil))
04d0:trace:nvcuda:wine_cuDeviceGetName (0x14518c28, 256, 0)
04d0:trace:nvcuda:wine_cuDeviceTotalMem_v2 (0x14518d48, 0)
04d0:trace:nvcuda:wine_cuDeviceGetAttribute (0x14518d90, 75, 0)
…

But it's not enough to tell if Wine has all the stuff Daz Studio needs. I think implementing NvAPI_GetAssociatedNvidiaDisplayName, NvAPI_GPU_GetBusType and NvAPI_EnumTCCPhysicalGPUs could maybe tell us more. Or attempting to use DXVK and DXVK-NVAPI with Nvidia's nvcuda.dll on Windows somehow, assuming we can make it load our own nvapi64.dll to see if CUDA still works.
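To see at a glance how far CUDA initialization gets, the nvcuda trace can be reduced to its call sequence. A Python sketch, assuming the `tid:class:nvcuda:function (args)` line shape shown above; the attribute numbers decoded are 75 and 76, which are the compute capability major and minor attributes in cuda.h, and everything else is left as a raw number:

```python
import re

TRACE_LINE = re.compile(r"^([0-9a-f]{4}):(trace|fixme|warn|err):nvcuda:(\w+) ?(.*)$")

# Attribute numbers from cuda.h; only the two this sketch relies on.
DEVICE_ATTRIBUTES = {
    75: "CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR",
    76: "CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR",
}


def call_sequence(log_text):
    """Return the traced nvcuda calls in order, with the wine_ prefix stripped."""
    calls = []
    for line in log_text.splitlines():
        m = TRACE_LINE.match(line.strip())
        if not m or m.group(2) != "trace":
            continue  # skip fixme/warn lines and anything that is not nvcuda
        name = m.group(3)
        if name.startswith("wine_"):
            name = name[len("wine_"):]
        if name == "cuDeviceGetAttribute":
            # Arguments are printed as (pointer, attribute, device).
            attribute = int(m.group(4).strip("()").split(",")[1])
            name += f"[{DEVICE_ATTRIBUTES.get(attribute, attribute)}]"
        calls.append(name)
    return calls


sample = """\
04d0:trace:nvcuda:wine_cuInit (0)
04d0:fixme:nvcuda:cuda_check_table WARNING: Your CUDA version supports a newer interface
04d0:trace:nvcuda:wine_cuDeviceGetCount (0x145189c0)
04d0:trace:nvcuda:wine_cuDeviceGetAttribute (0x14518d90, 75, 0)
"""
print(call_sequence(sample))
```

The last entry of such a list is the last call that reached Wine's nvcuda before the application gave up, which narrows down where to look next.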

PetitMote commented 2 years ago

Ok, thank you. There isn’t, by any chance, a nvapi channel ? I can try installing nvidia driver on wine, or if someone has the dll at hand…

(here is a log with only nvcuda) wine-log-3-nvcuda.txt

Also, I tried installing Nvidia Runtime and replacing cudart dll from Daz Studio by the one from nvidia (well, I tried something), but it changed absolutely nothing.

Saancreed commented 2 years ago

> There isn’t, by any chance, a nvapi channel ?

There is, but it will only function with Wine Staging's nvapi implementation. When it comes to DXVK-NVAPI, I think we have all the logs we need for now.

As a last resort, you could try reporting this as a Wine Staging bug to WineHQ. When doing so, make sure you are using neither DXVK nor DXVK-NVAPI and that your build of Wine has no custom patches other than Staging itself (so nvapi comes from Wine Staging). If the application really needs nvapi to work, though, I expect it to give up even earlier than it does now. Just something to consider if we can't figure out what exactly it needs; maybe the Wine wizards will be able to help in this case.

PetitMote commented 2 years ago

I’ve tried, and I do think it’s worse. According to the logs, my GPU is reported as an AMD GPU, so it doesn’t even try to load Cuda on it, and it doesn’t appear in the render settings. (I didn’t say it before, but my GPU appears as "CUDA 4294967295" in Daz, though I’m not sure whether that number ever changes.)
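A note on that number: 4294967295 is 2^32 − 1, which is what −1 looks like when a 32-bit value is printed as unsigned. That would be consistent with the "CUDA Device ID: -1" line in the Iray GPU listing, although the two values sharing a source is an assumption. In Python:

```python
import struct


def as_unsigned32(value):
    """Reinterpret a signed 32-bit integer as unsigned."""
    return value & 0xFFFFFFFF


def as_signed32(value):
    """Reinterpret an unsigned 32-bit integer as signed (two's complement)."""
    return struct.unpack("<i", struct.pack("<I", value & 0xFFFFFFFF))[0]


print(as_unsigned32(-1))
print(as_signed32(4294967295))
```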

As we can see in wine logs, it reports the GPU as AMD, and in DazStudio at 2022-01-05 11:18:33.584 there isn’t any GPU: dazstudio-logs-2-staging-nvapi.txt wine-log-4-staging-nvapi.txt

Does this mean it could be caused by the nvapi? How could I help implementing these methods? I’ll compile and try them on my computer (I’ll start downloading Clion)

Also, I’ve already filed a bug report on Wine; they told me a new cuda version is coming in the next wine-staging release.

jp7677 commented 2 years ago

@PetitMote Could you please test with the binaries from https://github.com/jp7677/dxvk-nvapi/pull/65 ? I've added (relatively quick and cheap) implementations for GetBusType and GetAssociatedNvidiaDisplayName. Let's see what the app is doing now.

Edit: The build container for building on GitHub doesn't want to come up (probably an upstream arch issue, will certainly be fixed soon). Let me know if I should provide binaries here.

PetitMote commented 2 years ago

Hello, There is an upcoming update to wine nvcuda on staging, but so far it didn’t fix the issue.

I’ve tried without and with the patch. Here are the dxvk-nvapi logs. We still get problems about unimplemented methods. dxvk-nvapi.log

PetitMote commented 2 years ago

I’ve checked the Daz logs, and it doesn’t impact the issue, still get the:

CUDA module initialization failed.
cudaRuntimeGetVersion returned with error 'initialization error'

At least with dxvk-nvapi it tries, because it won’t bother with wine-staging nvapi.

jp7677 commented 2 years ago

Thanks, I've added implementations with hard-coded values for the methods that were still missing. Let's hope the app stops wanting more entry points at some point :) Edit: clarified sentence.

PetitMote commented 2 years ago

Hello again! So, there aren’t any more errors! And there is some progress: Daz now knows the name of my graphics card. Sadly, it didn’t fix my Cuda issue. dxvk-nvapi.log

Do you know if these 2 errors might cause an issue?

NvAPI_QueryInterface 0x33c7358c: Unknown function ID
NvAPI_QueryInterface 0x593e8644: Unknown function ID

jp7677 commented 2 years ago

As far as I know, those two are entry points for debugging; it would be very weird if the app depended on them. I also don’t think that the last missing entry point is an actual issue.

So the bottom line, I guess, is that the cuda issue is unfortunately not related to dxvk-nvapi. Waiting for the updated cuda wine-staging patch set is probably your best bet for now. Please let me know if that makes a difference.

PetitMote commented 2 years ago

I can’t find any information on these 2 function IDs; the nvapi documentation is either absent or very poorly referenced.

I’ve already tried the patched wine-staging, and it changed nothing. I’ll try to get more information. I’ll keep you updated, if you want :)

Well, anyway, thank you very very much. I greatly appreciated your help! I don’t think I can be very useful, but if I can be of any help, I’m available!

(Well, to be precise, there is still this weird issue. Your last updates did improve the information in this, but some are still lacking.)

> NVIDIA Iray GPUs:
> GPU: 1 - NVIDIA GeForce RTX 3060
> Memory Size: 12.2 GB
> Clock Rate: -1 kHz
> Multi Processor Count: -1
> CUDA Device ID: -1
> CUDA Compute Capability: NA
> PCI Bus ID: 1
> PCI Device ID: 0
> TCC Mode: disabled

jp7677 commented 2 years ago

Yeah, the CUDA device ID and compute capability don’t look correct. Unfortunately, I don't know where those come from. Likely this is the result of a failed cuda initialization in the application.

That said, as Saancreed already mentioned above, your nvcuda Wine trace actually looks good:

04bc:trace:nvcuda:DllMain (0x7f7e2c570000, 1, (nil))
04bc:trace:nvcuda:wine_cuDriverGetVersion (0x14d903e0)
04bc:trace:nvcuda:wine_cuInit (0)
04bc:trace:nvcuda:wine_cuGetExportTable (0x14d903b8, 0x14d653c0)
04bc:fixme:nvcuda:cuda_check_table WARNING: Your CUDA version supports a newer interface for Unknown1 then the Wine implementation.
04bc:trace:nvcuda:wine_cuGetExportTable (0x14d903c0, 0x14d653a0)
04bc:trace:nvcuda:wine_cuDeviceGetCount (0x14d989c0)
04bc:trace:nvcuda:wine_cuDeviceGet (0x101ee80, 0)
04bc:trace:nvcuda:Unknown1_func1_relay (0x14d98be8, (nil))
04bc:trace:nvcuda:wine_cuDeviceGetName (0x14d98c28, 256, 0)
04bc:trace:nvcuda:wine_cuDeviceTotalMem_v2 (0x14d98d48, 0)
...

Edit: looking again at the DAZ logs, cudaRuntimeGetVersion returned with error 'initialization error' actually looks very specific and matches a possible outcome as specified here https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART____VERSION.html#group__CUDART____VERSION

Edit2: Is cudaRuntimeGetVersion implemented in Wine's cuda? Ah, wait, according to Saancreed that value is fixed at link time.

PetitMote commented 2 years ago

Hello, I hadn’t seen your edit, here I am.

The theory we have on the wine issue is that cudart, the Cuda Runtime library, is only a toolkit for developers, and makes calls to nvcuda. That seems logical given that installing cuda on Linux gives only libcuda.so, and you need to install cuda-toolkit to get a cudart version, which does not go in /usr/lib, only in /opt/cuda.

Also, by watching the logs, I saw this:

### It launches the Daz Studio ###
NvAPI_QueryInterface 0xad298d3f: Unknown function ID
DXVK-NVAPI v0.5.1-4-gaa65a65 (DAZStudio.exe)
info:  Game: DAZStudio.exe
info:  DXVK: v1.9.2-21-g2e66f45a

### It detects the GPU ###
info:  NVIDIA GeForce RTX 3060:
info:    Driver: 495.46.0
info:    Vulkan: 1.2.186
info:    Memory Heap[0]: 
info:      Size: 12288 MiB
info:      Flags: 0x1
info:      Memory Type[7]: Property Flags = 0x1

### NvAPI calls ###
NVML loaded and initialized successfully
NvAPI Device: NVIDIA GeForce RTX 3060 (495.46.0)
NvAPI Output: \\.\DISPLAY1
NvAPI_Initialize: OK
NvAPI_QueryInterface 0x33c7358c: Unknown function ID
NvAPI_QueryInterface 0x593e8644: Unknown function ID
NvAPI_GetInterfaceVersionString: OK
NvAPI_EnumLogicalGPUs: OK
NvAPI_EnumPhysicalGPUs: OK
NvAPI_QueryInterface 0x1efc3957: Unknown function ID
NvAPI_EnumNvidiaDisplayHandle 0: OK
NvAPI_GetPhysicalGPUsFromDisplay: OK
NvAPI_GetAssociatedNvidiaDisplayName: OK
NvAPI_EnumNvidiaDisplayHandle 1: End enumeration
NvAPI_EnumNvidiaUnAttachedDisplayHandle 0: End enumeration
NvAPI_GPU_GetBusType: OK
NvAPI_GPU_GetCurrentPCIEDownstreamWidth: OK
NvAPI_GPU_GetFullName: OK
NvAPI_GPU_GetVbiosVersionString: OK
NvAPI_GPU_GetPhysicalFrameBufferSize: OK
NvAPI_GPU_GetThermalSettings: OK
NvAPI_GPU_GetBusId: OK
NvAPI_GPU_GetBusSlotId: OK
NvAPI_SYS_GetDriverAndBranchVersion: OK
NvAPI_GetInterfaceVersionString: OK

### NvCuda calls ###
04b4:trace:nvcuda:DllMain (0x7f63545c0000, 1, (nil))
04b4:trace:nvcuda:wine_cuDriverGetVersion (0x14d903e0)
04b4:trace:nvcuda:wine_cuInit (0)
04b4:trace:nvcuda:wine_cuGetExportTable (0x14d903b8, 0x14d653c0)
04b4:fixme:nvcuda:cuda_check_table WARNING: Your CUDA version supports a newer interface for Unknown1 then the Wine implementation.
04b4:trace:nvcuda:wine_cuGetExportTable (0x14d903c0, 0x14d653a0)
04b4:trace:nvcuda:wine_cuDeviceGetCount (0x14d989c0)
04b4:trace:nvcuda:wine_cuDeviceGet (0x101ee80, 0)
04b4:trace:nvcuda:Unknown1_func1_relay (0x14d98be8, (nil))
04b4:trace:nvcuda:wine_cuDeviceGetName (0x14d98c28, 256, 0)
04b4:trace:nvcuda:wine_cuDeviceTotalMem_v2 (0x14d98d48, 0)
04b4:trace:nvcuda:wine_cuDeviceGetAttribute (0x14d98d90, 75, 0)
### A lot more GetAttribute ###
04b4:trace:nvcuda:wine_cuDeviceGetUuid (0x14d98d28, 0)
04b4:trace:nvcuda:wine_cuDeviceGetLuid (0x14d98d38, 0x14d98d40, 0)

### The error occurs ###
04b4:trace:nvcuda:DllMain (0x7f63545c0000, 0, (nil))
04b4:trace:seh:dispatch_exception code=40010006 flags=0 addr=000000007B0123AE ip=000000007B0123AE tid=04b4
04b4:warn:seh:dispatch_exception "WARNING: ..\\..\\..\\..\\..\\src\\pluginsource\\DzIrayRender\\dzneuraymgr.cpp(359): Iray [WARNING] - CUDA:RENDER ::   0.0   CUDA   rend warn : CUDA module initialization failed.\n"

### Then we have the Cuda Runtime error ###
04b4:warn:seh:dispatch_exception "WARNING: ..\\..\\..\\..\\..\\src\\pluginsource\\DzIrayRender\\dzneuraymgr.cpp(359): Iray [WARNING] - CUDA:RENDER ::   0.0   CUDA   rend warn : cudaRuntimeGetVersion returned with error 'initialization error'\n"

### Then we have a lot of these ###
04b4:trace:nvcuda:DllMain (0x7f63545c0000, 1, (nil))
04b4:trace:nvcuvid:DllMain (0x7f6354590000, 1, (nil))
0534:trace:nvcuda:DllMain (0x7f63545c0000, 2, (nil))
0534:trace:nvcuda:cuda_process_tls_callbacks (2)
0538:trace:nvcuda:DllMain (0x7f63545c0000, 2, (nil))

### It tries cuInit again ###
0534:trace:nvcuda:wine_cuInit (0)
0534:trace:nvcuda:wine_cuInit (0)

### NvAPI calls and error again ###
NvAPI_Initialize: OK
NvAPI_SYS_GetDriverAndBranchVersion: OK
0534:trace:seh:dispatch_exception code=40010006 flags=0 addr=000000007B0123AE ip=000000007B0123AE tid=0534
0534:warn:seh:dispatch_exception "WARNING: ..\\..\\..\\..\\..\\src\\pluginsource\\DzIrayRender\\dzneuraymgr.cpp(359): Iray [WARNING] - IRAY:RENDER ::   1.1   IRAY   rend warn : CUDA module initialization failed with error 'initialization error' (0x3); iray photoreal can only run in CPU mode. Please update your NVIDIA driver"...
NvAPI_EnumPhysicalGPUs: OK
NvAPI_EnumLogicalGPUs: OK
NvAPI_EnumTCCPhysicalGPUs: OK
NvAPI_GetPhysicalGPUsFromLogicalGPU: OK
0534:warn:seh:dispatch_exception "WARNING: ..\\..\\..\\..\\..\\src\\pluginsource\\DzIrayRender\\dzneuraymgr.cpp(359): Iray [WARNING] - IRAY:RENDER ::   1.1   IRAY   rend warn : There is no CUDA-capable GPU available to the iray photoreal renderer.\n"

### Then we get some of these calls ###
0568:trace:nvcuda:DllMain (0x7f63545c0000, 3, (nil))
0568:trace:nvcuda:cuda_process_tls_callbacks (3)

### This error again ###
04b4:warn:seh:dispatch_exception "WARNING: ..\\..\\..\\..\\..\\src\\pluginsource\\DzIrayRender\\dzneuraymgr.cpp(359): Iray [WARNING] - IRAY:RENDER ::   1.0   IRAY   rend warn : There is no CUDA-capable GPU available to the iray photoreal renderer.\n"

### A lot more Dllmain ###
058c:trace:nvcuda:DllMain (0x7f63545c0000, 2, (nil))
058c:trace:nvcuda:cuda_process_tls_callbacks (2)

What’s interesting is that the DllMain calls have a (nil) instead of the device id (should be 0). But mostly, we can see the calls to nvcuda, and the fact that it tries to call cuInit multiple times before reporting an error. I’m investigating this.
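For reading those DllMain lines: the Windows DllMain signature is (hinstDLL, fdwReason, lpvReserved). If the trace prints its arguments in that order (an assumption about Wine's nvcuda trace format), the middle number is the reason code and the trailing (nil) is the reserved pointer rather than a device id. The reason codes themselves are fixed by the Windows API. A Python sketch that annotates such lines:

```python
import re

# Reason codes as defined by the Windows API (winnt.h).
DLL_REASONS = {
    0: "DLL_PROCESS_DETACH",
    1: "DLL_PROCESS_ATTACH",
    2: "DLL_THREAD_ATTACH",
    3: "DLL_THREAD_DETACH",
}

DLLMAIN_LINE = re.compile(r"^([0-9a-f]+):trace:(\w+):DllMain \((\S+), (\d+), (\S+)\)")


def describe_dllmain(line):
    """Return 'tid module REASON' for a DllMain trace line, or None for other lines."""
    m = DLLMAIN_LINE.match(line)
    if not m:
        return None
    tid, module, reason = m.group(1), m.group(2), int(m.group(4))
    return f"{tid} {module} {DLL_REASONS.get(reason, reason)}"


print(describe_dllmain("04b4:trace:nvcuda:DllMain (0x7f63545c0000, 0, (nil))"))
print(describe_dllmain("0534:trace:nvcuda:DllMain (0x7f63545c0000, 2, (nil))"))
```

Read this way, the `DllMain (..., 0, (nil))` line just before the first warning would be nvcuda being unloaded, and the later lines with 2 and 3 would be threads attaching and detaching.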

Edit: So, I tried a little modification that did nothing (the parameter wasn’t named as in the documentation).

Now that I think of it, this message is concerning:

04b4:trace:nvcuda:Unknown1_func1_relay (0x14d98be8, (nil))

But I can’t make sense of the code it implies.

SveSop commented 2 years ago

@PetitMote

> Now that I think of it, this message is concerning:
>
> 04b4:trace:nvcuda:Unknown1_func1_relay (0x14d98be8, (nil))
>
> But I can’t make sense of the code it implies.

I think we're better off discussing this here rather than on bugzilla, because the wine devs are not really too interested in off-projects like dxvk-nvapi.

The function you ask about here is somewhat of a "mystery" when looking at the code, and is named "Unknown1" for a reason. I have experimented a bit with this before without getting much wiser about it, but my guess is that this is ALSO some unknown stuff (nVidia NDA) in Cuda, in the same manner as loads of stuff in nvapi.

nvcuda does not log calls to functions that do not exist, in the way nvapi throws an "unknown function" into the logs when an app tries to use something that is not there. Hence the latest addition to staging nvcuda made DAZ crash on the two functions I added in my posted patch. I think MORE stuff may need to be added, so I will look into adding some stubs from Cuda 11 and see if it crashes on something more.

Adding stubs has its drawbacks, since just adding a lot of them WILL make things crash (as we proved it did), but the upside is that we can figure out which function it crashes on.

Adding MOST of the "relay" functions (I like to call them that) is kind of trivial, even though it takes some time to do. Some relay functions require additional work though, especially if the UUIDs differ between dxvk's DXGI and Linux native -> Linux libcuda.

This requires some additional testing, I guess.

So, in the meantime we can keep testing and suggesting patches here instead of cluttering bugzilla, since I think the issue will be dropped rather fast by the WineHQ devs once we start commenting about dxvk-nvapi there.

If we get this solved, I will consider suggesting a staging patch for nvapi, since it lacks a couple of the calls needed for this to work with the staging version of nvapi.

PetitMote commented 2 years ago

Thank you, that was my thought too. Except I still see this as an issue in nvcuda and not nvapi, is it not? From what we can see, it would be a missing implementation in nvcuda? I’ve been looking into the cuda headers to try to find a function missing from wine nvcuda. If I’m not wrong, it should be one with only one parameter? (The unknown function relays only one parameter, and the other is nil.)

As you say, it is very long (and boring). I’m still looking over cuda.h

SveSop commented 2 years ago

I think the current status of nvapi seems "ok", but the problem is that the staging implementation of nvapi will not get us this far when it comes to using DAZ. So, if we keep posting new issues with nvcuda on bugzilla without any reliable "this works", there may be no way for the WineHQ devs to test. Maybe I am just being overly pessimistic here, but if someone posts issues on bugzilla that in any way seem to need dxvk, dxvk-nvapi or vkd3d-proton, they tend to end up in a "wontfix" situation, so I would rather try to figure stuff out a bit more here before asking too much.

@jp7677 My old implementation of GetBusType needed nvidia-settings, so it is a no-go, and it seems as nvml does not have something to actually check this "type". However, there is this: nvmlDeviceGetCurrPcieLinkGeneration . Maybe it could be checked and "if not" it will fallback to reporting PCI or AGP? Then again.. If you dont have nvml lib installed, it would also fall back.. Probably just fine to hardcode it to pcie anyway, since running any sort of game that would make you need dxvk-nvapi at all would mean you probably use a pcie adapter? 😄

PetitMote commented 2 years ago

@SveSop I’ve built a little system to check the functions from /opt/cuda/include/somefile.h. It’s very ugly, I’m using LibreOffice Calc, but it’s still quicker than searching inside nvcuda.spec :sweat_smile: I’ll post the list of unimplemented functions when I’m done. I’m looking into the files referenced here: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#driver-function-typedefs
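The same comparison can be scripted instead of done in a spreadsheet. A rough Python sketch, assuming header declarations of the form `CUresult CUDAAPI cuName(...)` and Wine spec lines of the form `@ stdcall cuName(...) target` or `@ stub cuName`; the sample header and spec snippets below are illustrative and not copied from wine-staging, and `#define` aliases such as `_v2` renames in cuda.h are not handled:

```python
import re

HEADER_FUNCTION = re.compile(r"\bCUDAAPI\s+(cu\w+)\s*\(")
SPEC_EXPORT = re.compile(r"^@\s+(stdcall|cdecl|stub)\s+(cu\w+)", re.MULTILINE)


def missing_from_spec(header_text, spec_text):
    """Return header functions the spec does not export, or exports only as a stub."""
    declared = sorted(set(HEADER_FUNCTION.findall(header_text)))
    implemented = {name for kind, name in SPEC_EXPORT.findall(spec_text) if kind != "stub"}
    return [name for name in declared if name not in implemented]


# Illustrative inputs; real use would read cuda.h and nvcuda.spec from disk.
header = """
CUresult CUDAAPI cuInit(unsigned int Flags);
CUresult CUDAAPI cuDriverGetVersion(int *driverVersion);
CUresult CUDAAPI cuDeviceGetLuid(char *luid, unsigned int *deviceNodeMask, CUdevice dev);
"""
spec = """
@ stdcall cuInit(long) wine_cuInit
@ stdcall cuDriverGetVersion(ptr) wine_cuDriverGetVersion
@ stub cuDeviceGetLuid
"""
print(missing_from_spec(header, spec))
```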

PetitMote commented 2 years ago

Here are all the stubs from the missing functions: stubs.txt

I couldn’t use the D3D headers, since there are none on linux

EDIT: doesn’t change anything. I’ll try again with some more functions; I think there are some other headers I can use.

EDIT 2: I’ve put in all the stubs I could get (and edited the file), and still no progress.

SveSop commented 2 years ago

I am a bit sceptical about the number before my adapter name when I am in DAZ. On Windows (where this works) it is 0, but under wine it seems to be either some address or some pointer. I wonder if this is some uuid thing that does not work as it should (perhaps from one I "implemented" before). :weary:

PetitMote commented 2 years ago

It could be! GetLuid is the last function called before the initialization error. I had done the same implementation as you, I think, and it crashed all the same.

PetitMote commented 2 years ago

Ok, according to this article, an LUID should be a Windows feature… Am I missing something? https://www.microsoftpressstore.com/articles/article.aspx?p=2224373&seqNum=7

EDIT: Looks like someone else was wondering too https://github.com/tmcdonell/cuda

EDIT2: I really can’t get the +relay channel to work. Whenever I use it, Lutris crashes and the wine processes don’t appear. If you know the right parameters to make it work…

jp7677 commented 2 years ago

It could be! The GetLuid is the last function called before the initialization error. I had done the same implementation as you, I think, and it crashed all the same.

Where did you see a call to GetLuid?

SveSop commented 2 years ago

I find it somewhat odd that NvAPI_GPU_CudaEnumComputeCapableGpus is not called from DAZ. "All" other cuda capable demos and whatnot seem to need to call this function to get the ID of the cuda device... or by all means - need something else to get this id.

0100:trace:nvcuda:wine_cuDeviceGetUuid (0x15a68d28, 0)
0100:trace:nvcuda:wine_cuDeviceGetAttribute (0x15a68ef0, 106, 0)
0100:trace:nvcuda:wine_cuDeviceGetAttribute (0x15a68ef4, 109, 0)
0100:trace:nvcuda:wine_cuDeviceGetAttribute (0x101ee88, 111, 0)
0100:trace:nvcuda:wine_cuDeviceGetLuid (0x15a68d38, 0x15a68d40, 0)

Used when starting DAZ

jp7677 commented 2 years ago

It would be interesting to know if the LUID returned from wine-cuda matches the LUID obtained from other sources (D3D, Vulkan, NVAPI etc.). The LUID is the way to link the same physical GPU between different graphics APIs, so no wrongdoing there.
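
The comparison suggested here could be sketched as follows, assuming both sources hand back the LUID as 8 raw bytes (cuDeviceGetLuid writes into a `char *` buffer, and Vulkan reports `deviceLUID` as an array of `VK_LUID_SIZE` = 8 bytes). The helper name `luid_equal` is hypothetical:

```c
#include <string.h>

/* A LUID is 8 bytes: a 32-bit low part followed by a 32-bit high part. */
#define LUID_SIZE 8

/* Returns 1 when both buffers hold the same 8 LUID bytes, 0 otherwise. */
static int luid_equal(const void *a, const void *b)
{
    return memcmp(a, b, LUID_SIZE) == 0;
}
```

If the two LUIDs differ, an application that matches its CUDA device against the D3D or Vulkan adapter would presumably fail to find it, which is only one possible explanation for the initialization error.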

SveSop commented 2 years ago

Hmm.. Is LUID a "static" thing tho? "Locally" UID? Have not really a clue how to obtain this anyplace.. mostly it is UUID and GUID

The allocated LUID is unique to the local system only, and uniqueness is guaranteed only until the system is next restarted. https://docs.microsoft.com/en-us/windows/win32/api/securitybaseapi/nf-securitybaseapi-allocatelocallyuniqueid

Dunno if that is "it" tho?

PetitMote commented 2 years ago

I know nothing of C/C++, but maybe it’s doable to put the Luid in the logs?

We also have an issue with the OpenCL simulation, if you wanna change a little bit :smile:

jp7677 commented 2 years ago

You can get the adapter LUID by running vulkaninfo and/or the dxvk-nvapi tests executable. For wine-nvcuda, you have to add a log statement to the Wine sources.

PetitMote commented 2 years ago

So, I changed the GetLuid function to:

CUresult WINAPI wine_cuDeviceGetLuid(char *luid, unsigned int *deviceNodeMask, CUdevice dev)
{
    TRACE("(%p, %p, %d)\n", luid, deviceNodeMask, dev);
    auto error = pcuDeviceGetLuid(luid, deviceNodeMask, dev);
    TRACE("LUID ? (%d)", *luid);
    return error;
}

And I get:

0630:trace:nvcuda:wine_cuDeviceGetLuid (0x15e98d38, 0x15e98d40, 0)
0630:trace:nvcuda:wine_cuDeviceGetLuid LUID ? (0)(0x7f75106e0000, 0, (nil))

Where with nvapi-tests I get:

LUID high part:    0x00000000
LUID low part:    0x000003f0

Also, I took a look at wine-staging nvapi, is it normal that the nvapi.spec is almost empty?

@ cdecl nvapi_QueryInterface(long)
@ stub DllCanUnloadNow
@ stub DllGetClassObject
@ stub DllRegisterServer
@ stub DllUnregisterServer

SveSop commented 2 years ago

Also, I took a look at wine-staging nvapi, is it normal that the nvapi.spec is almost empty?

@ cdecl nvapi_QueryInterface(long)
@ stub DllCanUnloadNow
@ stub DllGetClassObject
@ stub DllRegisterServer
@ stub DllUnregisterServer

Yes, this is correct. All the "calls" we make go through nvapi_QueryInterface(0x12345678) internally, where 0x12345678 is the ID used to look up the actual function. Why, you ask? Oh.. that is because half of nvapi is NDA-Secret-Sauce that they do not want people to get without paying $$$ :smile:
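
A rough sketch of that lookup pattern, assuming a simple table from function ID to function pointer. The IDs `0x00000001` and `0x00000002` and the `example_*` functions are made up for illustration and are not real NVAPI IDs:

```c
#include <stddef.h>
#include <stdint.h>

/* Generic function pointer type; the caller casts it back to the real signature. */
typedef void (*generic_fn)(void);

/* Placeholder implementations standing in for real entry points. */
static int example_initialize(void) { return 0; }
static int example_enum_gpus(void)  { return 0; }

/* Hypothetical ID table: only functions listed here are "implemented". */
static const struct {
    uint32_t   id;
    generic_fn fn;
} interface_table[] = {
    { 0x00000001u, (generic_fn)example_initialize },
    { 0x00000002u, (generic_fn)example_enum_gpus  },
};

/* Returns the function registered for the ID, or NULL for an unknown ID. */
static generic_fn query_interface(uint32_t id)
{
    size_t i;

    for (i = 0; i < sizeof(interface_table) / sizeof(interface_table[0]); i++) {
        if (interface_table[i].id == id)
            return interface_table[i].fn;
    }
    return NULL;
}
```

An ID that is missing from the table, such as the 0xad298d3f from the log at the top of this issue, would come back as NULL, which matches the "Unknown function ID" lines there.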

SveSop commented 2 years ago

Not sure if you knew this, but it seems DAZ release 4.15.1.86 and newer (currently i got DAZ 4.16.0.3 installed) needs

Update to NVIDIA Iray 2021.0.2 (344800.7839)

    Minimum driver is 465.89 on Windows for CPU-only rendering
    Minimum driver is 471.41 on Windows for GPU rendering

http://docs.daz3d.com/doku.php/public/software/dazstudio/4/change_log

If this in any way hinges on having CUDA according to driver rev, it means CUDA 11.4 Update 1 or newer i guess. It could ofc just be a "safe bet" they are putting out, but since this is from updating the Iray renderer, i would think it could mean possibly something from the CUDA 11.4+ library is needed.
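
If the renderer does gate on the reported driver version, the check might look something like this sketch. It assumes the version arrives as a "major.minor" string such as the 495.46.0 seen in the dxvk-nvapi log; the function name `driver_meets_minimum` is hypothetical and nothing here is taken from Iray itself:

```c
#include <stdio.h>

/*
 * Returns 1 if the "major.minor[.patch]" version string is at least
 * min_major.min_minor, 0 if it is lower or cannot be parsed.
 */
static int driver_meets_minimum(const char *version,
                                unsigned int min_major, unsigned int min_minor)
{
    unsigned int major, minor;

    /* Only the first two components are compared; anything after is ignored. */
    if (sscanf(version, "%u.%u", &major, &minor) != 2)
        return 0;

    if (major != min_major)
        return major > min_major;
    return minor >= min_minor;
}
```

With the GPU rendering minimum from the changelog, `driver_meets_minimum("495.46.0", 471, 41)` passes, so a version gate alone would not explain the failure unless a different version is being reported under Wine.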

PetitMote commented 2 years ago

Well, it doesn’t change much I think, I tried implementing every missing cuda driver API as stubs from the cuda toolkit 11.5.1. (Also, the bug’s been there since Daz 4.12 according to other Linux users)

But maybe there is something with a function returning a driver version? However, I don’t think it would crash the runtime initialization.

jp7677 commented 2 years ago

CUresult WINAPI wine_cuDeviceGetLuid(char *luid, unsigned int *deviceNodeMask, CUdevice dev)
{
    TRACE("(%p, %p, %d)\n", luid, deviceNodeMask, dev);
    auto error = pcuDeviceGetLuid(luid, deviceNodeMask, dev);
    TRACE("LUID ? (%d)", *luid);
    return error;
}

The luid here is a char*, from my limited c knowledge it should be %s. See also https://wiki.winehq.org/Wine_Developer%27s_Guide/Debug_Logging#Helper_functions , may be one of the helpers is useful here to get valid output.

PetitMote commented 2 years ago

That’s strange. I compiled with %s, and I get a warning that it should be %d since it’s an int?

../../wine/dlls/nvcuda/nvcuda.c: In function 'wine_cuDeviceGetLuid':
../../wine/dlls/nvcuda/nvcuda.c:1008:10: warning: type defaults to 'int' in declaration of 'error' [-Wimplicit-int]
 1008 |     auto error = pcuDeviceGetLuid(luid, deviceNodeMask, dev);
      |          ^~~~~
../../wine/dlls/nvcuda/nvcuda.c:1008:5: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
 1008 |     auto error = pcuDeviceGetLuid(luid, deviceNodeMask, dev);
      |     ^~~~
In file included from ../../wine/dlls/nvcuda/nvcuda.c:33:
../../wine/dlls/nvcuda/nvcuda.c:1009:11: warning: format '%s' expects argument of type 'char *', but argument 5 has type 'int' [-Wformat=]
 1009 |     TRACE("LUID ? (%s)", *luid);
      |           ^~~~~~~~~~~~~  ~~~~~
      |                          |
      |                          int
../../wine/include/wine/debug.h:89:49: note: in definition of macro '__WINE_DBG_LOG'
   89 |     wine_dbg_log( __dbcl, __dbch, __FUNCTION__, args); } } while(0)
      |                                                 ^~~~
../../wine/include/wine/debug.h:477:36: note: in expansion of macro '__WINE_DPRINTF'
  477 | #define WINE_TRACE                 __WINE_DPRINTF(_TRACE,__wine_dbch___default)
      |                                    ^~~~~~~~~~~~~~
../../wine/include/wine/debug.h:520:36: note: in expansion of macro 'WINE_TRACE'
  520 | #define TRACE                      WINE_TRACE
      |                                    ^~~~~~~~~~
../../wine/dlls/nvcuda/nvcuda.c:1009:5: note: in expansion of macro 'TRACE'
 1009 |     TRACE("LUID ? (%s)", *luid);
      |     ^~~~~
../../wine/dlls/nvcuda/nvcuda.c:1009:21: note: format string is defined here
 1009 |     TRACE("LUID ? (%s)", *luid);
      |                    ~^
      |                     |
      |                     char *
      |                    %d

And I got this result:

04d0:trace:nvcuda:wine_cuDeviceGetLuid LUID ? ((null))(0x7fd2d0090000, 0, (nil))

jp7677 commented 2 years ago

That’s still the pointer, what happens with %s and without the asterisks, thus luid instead of *luid? Or with debugres(luid)?

PetitMote commented 2 years ago

I didn’t want to find out which was the right one :smile:

04c0:trace:nvcuda:wine_cuDeviceGetLuid LUID* s ((null))
04c0:trace:nvcuda:wine_cuDeviceGetLuid LUID s ()
04c0:trace:nvcuda:wine_cuDeviceGetLuid LUID* d (0)
04c0:trace:nvcuda:wine_cuDeviceGetLuid LUID d (349801784)

PetitMote commented 2 years ago

That’s still the pointer, what happens with %s and without the asterisks, thus luid instead of *luid? Or with debugres(luid)?

Also, I wanted to try debugres, but it seems the tool is no longer in wine
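
Without a ready-made helper, one option is to format the bytes by hand. The sketch below assumes that `luid` points at 8 raw bytes (a 32-bit low part followed by a 32-bit high part, little-endian) rather than at a NUL-terminated string, which would explain why neither %s nor %d on the pointer prints anything useful. The helper names `read_le32` and `format_luid` are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assembles a 32-bit value from 4 little-endian bytes. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Writes "high=0x........ low=0x........" for the 8 LUID bytes into out. */
static int format_luid(const char *luid, char *out, size_t out_len)
{
    const unsigned char *bytes = (const unsigned char *)luid;
    uint32_t low  = read_le32(bytes);      /* bytes 0..3: low part  */
    uint32_t high = read_le32(bytes + 4);  /* bytes 4..7: high part */

    return snprintf(out, out_len, "high=0x%08lx low=0x%08lx",
                    (unsigned long)high, (unsigned long)low);
}
```

Inside wine_cuDeviceGetLuid this could be used as `char buf[64]; format_luid(luid, buf, sizeof(buf)); TRACE("LUID %s\n", buf);`, which gives output in the same high/low form as the nvapi-tests values. Declaring the result as `CUresult error` instead of `auto error` and logging it too would show whether pcuDeviceGetLuid filled the buffer at all.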