RenderKit / oidn

Intel® Open Image Denoise library
https://www.openimagedenoise.org/
Apache License 2.0
1.74k stars 160 forks source link

Hip sharing buffer with DX12 not working #165

Open Nielsbishere opened 1 year ago

Nielsbishere commented 1 year ago

Hi, cpu and cuda support is working fine for me but it seems like HIP is still not working. I'm getting a black output (but the devices do show up and dlls are in correct position). I've tested that oidn2 branch from the example fork linked from a different issue, but that seems to have the same issue (a few commits back). The current commit in that fork seems to crash because it picks on LUID which is not supported in HIP, but the one before that change has the same problem I do. Is that a problem that's fixed in the latest dev version?

As a tip for how to deal with lack of LUID currently: I currently use the UUID to distinguish how many unique GPUs are present in the system. If there's only one gpu then I can simply pick the first one. If there's multiple then I can pick the one that matches the adapter desc name from dx12. If there's multiple with the same name and different uuid, then I can fallback to cpu. I've tested this on different multi gpu setups and seems to work fine; unless two of the same name are installed.

atafra commented 1 year ago

Hi. From the issue title my understanding is that HIP does not work at all. Is this really the case, that HIP doesn't work under any circumstances? From the description it seems that you're using OIDN with buffer sharing and for that you get a black output. But this does not necessarily mean that HIP does not work at all with the current release. It might also mean that it's just buffer sharing that doesn't work. Could you please check without buffer sharing? By my experience buffer sharing for HIP devices kind of works on Linux (but the UUIDs mismatch so this shouldn't be used in production code!) but I also got a black output on Windows. I think buffer sharing simply doesn't work with HIP on Windows, and this is very likely a driver issue and not a bug in OIDN. HIP is not supported officially on Windows, so I'm not that surprised about this.

You're right that one could work around the lack of LUID, but I really do not recommend it. It has some flaws but most importantly it really hides the fact that something's wrong in the driver. And it is. If LUID is not reported or UUID does not match, it suggests that you really shouldn't try to match the device in some other way because there are issues in the driver. This is why I would highly recommend to avoid such workarounds and always fall back to copying through host memory in such cases. If you would do this in the application, as recommended by the documentation, with current AMD drivers the application would fall back to copying and OIDN would run correctly (but with an overhead).

The bottom line, if this is a driver issue (and I think very likely it is), then it should be fixed in the driver, and meanwhile applications should rely on the recommended fallback mechanism. What you're currently doing is useful as an experiment, as we discussed in a previous issue but there shouldn't any expectation about its correctness because this is an unsupported scenario and thus shouldn't be done in production code.

Nielsbishere commented 1 year ago

Aha, sounds good. We're on windows and it's indeed just with buffer sharing. I'll hold off on reporting this to our AMD rep as we filled a couple more bugs since the last HIP issue.

atafra commented 1 year ago

Could you please update the title to make it more specific? Just to avoid any confusion for other users.