NVIDIAGameWorks / NRI

Low-level abstract render interface
MIT License

[RFE] Instancing problem in GPU-driven rendering #61

Closed vertver closed 6 months ago

vertver commented 6 months ago

I'm trying to implement GPU-driven rendering with bindless descriptors and instancing, and I've run into a problem with the instance ID (specifically on D3D12). On D3D11/D3D12, SV_InstanceID always starts from 0. On Vulkan, however, it starts from the firstInstance argument of the indexed instanced draw command. That means I can't use the instance ID to index per-instance data on D3D12 or D3D11, because it always starts from 0 regardless of the draw's start instance. I could work around this with an additional per-instance vertex buffer that carries the instance index for each draw, but that adds complexity. What do you think, how can we fix this?

https://github.com/gpuweb/gpuweb/issues/901 - Problem description
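To make the mismatch concrete, here is a minimal CPU-side sketch in C++ that models which instance indices a shader would observe for the same draw on each API, and how passing the base instance separately makes the two agree. The names `Api`, `ObservedInstanceIds`, and `GlobalInstanceIndex` are hypothetical and exist only for this illustration; they are not part of NRI, and the constant-based workaround is an assumption about one possible fix, not necessarily what was merged.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model, for illustration only (not NRI API).
enum class Api { D3D, Vulkan };

// Instance indices a vertex shader would observe for one instanced draw.
std::vector<uint32_t> ObservedInstanceIds(Api api, uint32_t instanceCount, uint32_t firstInstance) {
    std::vector<uint32_t> ids;
    ids.reserve(instanceCount);
    for (uint32_t i = 0; i < instanceCount; ++i) {
        // D3D: SV_InstanceID counts from 0.
        // Vulkan: the instance index already includes firstInstance.
        ids.push_back(api == Api::Vulkan ? firstInstance + i : i);
    }
    return ids;
}

// Assumed workaround: the application passes the base instance to the shader
// separately (for example as a constant), and only the D3D path adds it.
uint32_t GlobalInstanceIndex(Api api, uint32_t observedId, uint32_t baseInstanceConstant) {
    return api == Api::D3D ? observedId + baseInstanceConstant : observedId;
}
```

With this model, a draw of 3 instances starting at instance 5 is observed as 0, 1, 2 on D3D and as 5, 6, 7 on Vulkan; after the adjustment both paths index the same per-instance data.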

vertver commented 6 months ago

Now it's working. Waiting for your feedback. [image]

dzhdanNV commented 6 months ago

"D3D12 + bindless" flickers on RTX 4080. Objects randomly disappear. I'm trying to understand what's going on...

dzhdanNV commented 6 months ago

Works now. Here is the root cause. BEFORE:

groupshared uint DrawCount = 0;

It's a NOP: the compiler ignores the initializer and generates a warning about it. AFTER:

groupshared uint DrawCount;

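[numthreads(THREAD_COUNT, 1, 1)]
void main(uint ThreadID : SV_DispatchThreadID)
{
    // groupshared variables can't have initializers, so one thread zeroes the counter...
    if (ThreadID == 0)
        DrawCount = 0;

    // ...and the whole group waits here until that write is visible
    GroupMemoryBarrierWithGroupSync();

    // ...
}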

Polishing, thinking, improving...

dzhdanNV commented 6 months ago

Thanks for your work. Attached is the polished version. Sorry, I'm too lazy to mess with GitHub :) Please copy over all the files (assuming you haven't changed anything in the past 16 hours), and arrange two MRs:

I will accept them, merge in my unrelated changes, update GitHub and make a new release.

NRI changes:

SAMPLE changes:

Modified.zip

dzhdanNV commented 6 months ago

Forgot to mention: please read the comments at the top of NRICompatibility.hlsli and in NRIDescs.h (where the command signatures are).

My personal goal is to not lose the default behavior. To reach this:

Open question - RWByteAddressBuffer is undesired... Is it worth switching to RWStructuredBuffer?

vertver commented 6 months ago

> Open question - RWByteAddressBuffer is undesired... Is it worth switching to RWStructuredBuffer?

Yes, but it requires using a different structure on each platform. I would rather use RWBuffer<uint> than RWByteAddressBuffer, because it doesn't require a stride and can also be cleared with clear commands.

dzhdanNV commented 6 months ago

RWBuffer<uint> is a good candidate too, I agree. Strictly speaking, BaseVertex can be negative; I hope casting back and forth won't break anything. A stride is needed in any case (if I understand correctly).
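On the concern about a negative BaseVertex going through a uint buffer: a minimal C++ sketch of the round trip, assuming the value is stored as its raw 32-bit pattern and reinterpreted on the way out. The helper names `PackBaseVertex` and `UnpackBaseVertex` are hypothetical and only illustrate the idea; they are not taken from NRI.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers, for illustration only (not NRI API).

// Store a signed base vertex as its raw 32-bit pattern, e.g. for a uint buffer.
uint32_t PackBaseVertex(int32_t baseVertex) {
    uint32_t bits;
    std::memcpy(&bits, &baseVertex, sizeof(bits));  // bit-exact copy, no value conversion
    return bits;
}

// Recover the signed value from the raw 32-bit pattern.
int32_t UnpackBaseVertex(uint32_t bits) {
    int32_t baseVertex;
    std::memcpy(&baseVertex, &bits, sizeof(baseVertex));
    return baseVertex;
}
```

Because the bits are copied unchanged in both directions, a negative value survives the round trip as long as the consumer reads the same 32 bits back as a signed integer.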

vertver commented 6 months ago

PR is ready - https://github.com/NVIDIAGameWorks/NRI/pull/64

vertver commented 6 months ago

Oops, some problems with Vulkan... fixing them.

vertver commented 6 months ago

Fixed, but Vulkan still has 2 allocations on app close

dzhdanNV commented 6 months ago

> Fixed, but Vulkan still has 2 allocations on app close

Validation layer enabled? If yes, it's a VK problem.

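Sorry about this, the bug was mine (a logical && instead of a bitwise &, and the stride was never shifted into the upper bits). MINE: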

    return ((uint64_t)rootSignature && ((1ull<<52) - 1)) | stride;

FIXED:

    return (stride << 52ull) | ((uint64_t)rootSignature & ((1ull << 52) - 1));

(Will be) fixed in the main branch.
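For readers following the fix, here is a self-contained C++ sketch of the same packing idea: the low 52 bits carry the pointer bits and the bits above them carry the stride. The names `PackKey`, `StrideFromKey`, `PointerBitsFromKey`, and the constants are hypothetical; the sketch also assumes a 64-bit build and a stride that fits in 12 bits, neither of which is stated in the thread. The stride is widened to 64 bits before shifting, since shifting a 32-bit value by 52 would be undefined.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names, for illustration only (not NRI API).
constexpr uint32_t kPointerBits = 52;
constexpr uint64_t kPointerMask = (1ull << kPointerBits) - 1;

// Pack a pointer and a stride into one 64-bit key.
uint64_t PackKey(const void* rootSignature, uint32_t stride) {
    assert(stride < (1u << 12));  // assumption: stride fits in the upper 12 bits
    const uint64_t pointerBits = reinterpret_cast<uintptr_t>(rootSignature);
    // Widen the stride first, then shift it above the pointer bits.
    return (static_cast<uint64_t>(stride) << kPointerBits) | (pointerBits & kPointerMask);
}

// Extract the stride from the upper bits of the key.
uint32_t StrideFromKey(uint64_t key) {
    return static_cast<uint32_t>(key >> kPointerBits);
}

// Extract the masked pointer bits from the lower bits of the key.
uint64_t PointerBitsFromKey(uint64_t key) {
    return key & kPointerMask;
}
```

By contrast, the buggy form collapses the pointer part to 0 or 1 (the result of `&&`) and ORs the unshifted stride on top, so different root signatures with the same stride produce the same key.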

dzhdanNV commented 6 months ago

Merged. Thank you!