RenderKit / oidn

Intel® Open Image Denoise library
https://www.openimagedenoise.org/
Apache License 2.0
1.74k stars 160 forks source link

Issues with directx 12 #162

Closed VZout closed 1 year ago

VZout commented 1 year ago

When creating a handle with CreateSharedHandle on windows and passing it to oidnNewSharedBufferFromWin32Handle with EXTERNAL_MEMORY_TYPE_FLAG_D3D12_RESOURCE gives me the error "invalid argument". However EXTERNAL_MEMORY_TYPE_FLAG_D3D12_HEAP does seem to work but then i run into the following:

if i use a standard committed texture resource for inputs/outputs i get a corrupted result (possibly because its not linearly tiled?). Trying to copy a texture into a buffer and then creating a oidn buffer from that for input gives me no output. One thing I haven't tried yet is using a D3D12_TEXTURE_LAYOUT_ROW_MAJOR texture resource (shouldn't be needed if im copy to a buffer I think?). (I'm using R32G32B32A32_FLOAT but am specifying the row stride and pixel stride correctly afaik)

Is the EXTERNAL_MEMORY_TYPE_FLAG_D3D12_HEAP/RESOURCE working as expected? And what is the flow/requirements for using dx12 input/ouputs? are buffers supposed to work? and if not do only row major textures work as input/output?

Nielsbishere commented 1 year ago

Fancy seeing you here @VZout. Afaik, the flow is the following: One resource (per input/output) is to be used only for interop output. This one may not have UAV, RTV or SRVs created. It should use cross adapter flag and be in a cross adapter + shared heap, it should also be non tiled (linear layout only). You can use copy texture region to get to & from this resource. So output needs a copy and 3 inputs too. I'm currently getting an invalid operation in oidn though: "exactly one of the external memory handle and name must be non-null". But CreateSharedHandle didn't generate an invalid HR or validation error, so that should be good. Keep in mind that if you use D3D12MA and supply uav flag anyways, it will silently remove the heap flags (in my version anyways) and the create shared handle will return a validation error. I'll let you know if I get it working.

atafra commented 1 year ago

@VZout First, it would help a lot if you could share what GPU are you using. Different vendors and drivers support different kinds of handles, could be buggy (very likely), etc. OIDN just calls the buffer import function of the underlying compute runtime (SYCL, CUDA or HIP), so whether the buffer creation succeeds is mostly up to that compute runtime / driver. Could you share the exact error you get? Is "invalid argument" the actual error string or just the type of the error (in which case what's the exact error string)?

As the handle type I would recommend to try OIDN_EXTERNAL_MEMORY_TYPE_FLAG_OPAQUE_WIN32 instead.

OIDN currently can import only buffers, not textures. The oidnNewSharedBufferFromWin32Handle function is specifically for importing buffers only. However, you can import a buffer that backs a texture with linear layout which is necessary because the runtime is not aware that you're actually importing a texture. This is all described in the documentation in more detail.

@Nielsbishere You are correct that if you want to share a texture with OIDN, it should be created with D3D12_HEAP_FLAG_SHARED | D3D12_HEAP_FLAG_SHARED_CROSS_ADAPTER flags and D3D12_TEXTURE_LAYOUT_ROW_MAJOR. But if you're doing this, I don't think you need to copy anything. At least it works fine on the GPUs I tried.

The "exactly one of the external memory handle and name must be non-null" error means that either both the handle and name are null, or both are non-null. Neither of these are correct. You can import a buffer either through a handle or a name (if you have a named handle), but not both. You should just simply pass the handle you get from CreateSharedHandle and set the name to null.

I don't have example code for sharing a linear texture, but I do have one for sharing a buffer: https://github.com/atafra/ChameleonRT/blob/0a8e4a17f3ced1d7461632933e4f12b7819daee8/backends/dxr/render_dxr.cpp#L140

Nielsbishere commented 1 year ago

Awesome, seems like that did something (still have some problems w textures tho). I tried keeping the name ar NULL before but that messed up because it was set to ID3D12Resource mode instead of the generic Win32 handle. The reason I did a copy is because textures with uav, rtv, srv and row major weren't allowed (at least not on nvidia and amd). So only way to get data across would be with a copy. The 64Ki swizzle or srv from a row major are optional features that neither supports. Thanks so much for the example, I think I'll be using a buffer too, since it seems like you are allowed to make those UAV as well (preventing a copy and ensuring the same exact data is there without dealing with any texture alignment).

atafra commented 1 year ago

I tried sharing row-major textures only on Intel GPUs but it seems other vendors do not support this. Buffers should definitely work with all vendors though.

VZout commented 1 year ago

Finally got buffers to work by not creating the filter every single frame from scratch. Not entirely sure why this caused issues for me. Closing for now but feel free to re-open. probably just user error on my part.

atafra commented 1 year ago

Glad to hear that it's working now! BTW creating the filter from scratch for every frame is very expensive and should be avoided anyway.

Nielsbishere commented 1 year ago

@atafra thanks for the help, with buffers it works properly. Still gotta move the filters to the resize function, but that should be trivial

atafra commented 1 year ago

@VZout There was a bug for D3D12_RESOURCE in the CUDA and HIP backends, so if you're using an NVIDIA or AMD GPU, this was the reason why it didn't work for you. This is now fixed in the devel branch.

However, OPAQUE_WIN32 is still the recommended handle type because this is the only one supported by all GPU vendors (D3D12_RESOURCE is currently not supported by Intel GPUs).