vertver closed this issue 6 months ago
Now it's working. Waiting for your feedback
"D3D12 + bindless" flickers on RTX 4080. Objects randomly disappear. I'm trying to understand what's going on...
Works now. The root cause was the initialization of the groupshared counter. BEFORE:
groupshared uint DrawCount = 0;
The initializer is a no-op: the compiler generates a warning that it is ignored. AFTER:
groupshared uint DrawCount;
[numthreads(THREAD_COUNT, 1, 1)]
void main(uint ThreadID : SV_DispatchThreadId)
{
    if (ThreadID == 0)
        DrawCount = 0;
    GroupMemoryBarrierWithGroupSync();
Polishing, thinking, improving...
Thanks for your work. Attached is the polished version. Sorry, I'm too lazy to mess with GitHub :) Please copy over all files (assuming you haven't changed anything in the past 16 hours) and arrange two MRs:
I will accept them, merge in my unrelated changes, update GitHub and make a new release.
NRI changes:
NRICompatibility.hlsli moved to NRI
baseAttribute(s) => drawParameters
NRIDescs.h
enableDrawParametersEmulation
CommandBufferD3D12::Draw
DeviceDesc::isDrawParametersEmulationEnabled, which can be false if emulation is requested but SM6.8 is supported
printf
999 register space to avoid conflicts
SAMPLE changes:
Forgot to mention: please read the comments in NRICompatibility.hlsli (at the top) and in NRIDescs.h (where the command signatures are).
My personal goal is to not lose the default behavior. To reach this:
DeviceDesc::isDrawParametersEmulationEnabled, which can be false if SM6.8 is supported
Pipeline gets "emulation" on if emulation is requested and "confirmed" by the backend
NRI_ENABLE_DRAW_PARAMETERS_EMULATION defined prior to NRICompatibility.hlsli inclusion. If SM6.8 is supported, it will be ignored, so we are on the safe side
Open question - RWByteAddressBuffer is undesired... Is it worth switching to RWStructuredBuffer?
Yes, but it requires using a different structure on each platform. I would prefer RWBuffer<uint> over RWByteAddressBuffer because it doesn't need a stride to be provided and can also be cleared by commands.
RWBuffer<uint> is a good candidate too, I agree. Strictly speaking, BaseVertex can be negative; I hope casting back and forth won't break anything. Stride is needed in any case (if I understand correctly).
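To illustrate the concern about a negative BaseVertex, here is a minimal CPU-side sketch, assuming the value is stored as a 32-bit unsigned element (as an RWBuffer<uint> would hold it) and reinterpreted on read. PackBaseVertex and UnpackBaseVertex are hypothetical names for illustration, not part of NRI. On the shader side, the HLSL intrinsics asuint and asint perform the same bit-preserving reinterpretation.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: store a signed base vertex in an unsigned 32-bit slot.
// memcpy copies the bit pattern unchanged, so no value conversion takes place.
inline uint32_t PackBaseVertex(int32_t baseVertex) {
    uint32_t bits;
    std::memcpy(&bits, &baseVertex, sizeof(bits));
    return bits;
}

// Hypothetical helper: recover the signed value from the stored bit pattern.
inline int32_t UnpackBaseVertex(uint32_t bits) {
    int32_t baseVertex;
    std::memcpy(&baseVertex, &bits, sizeof(baseVertex));
    return baseVertex;
}
```

Because only the bit pattern is copied, UnpackBaseVertex(PackBaseVertex(v)) returns v for every int32_t, including negative values.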
PR is ready - https://github.com/NVIDIAGameWorks/NRI/pull/64
Oops, some problems with Vulkan... fixing them
Fixed, but Vulkan still has 2 allocations on app close
Validation layer enabled? If yes, it's a VK problem.
Sorry about this. MINE:
return ((uint64_t)rootSignature && ((1ull<<52) - 1)) | stride;
FIXED:
return (stride << 52ull) | ((uint64_t)rootSignature & ((1ull << 52) - 1));
(Will be) fixed in the main branch.
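The difference between the two lines above is that the first uses a logical && (which collapses the pointer part to 0 or 1) and never shifts the stride into the upper bits. A self-contained sketch of the intended packing follows, assuming the low 52 bits carry the pointer value and the high 12 bits carry the stride. MakeKey, KeyStride and KeyPointerBits are hypothetical names, and the stride is widened to 64 bits before shifting in case its real type is narrower.

```cpp
#include <cstdint>

constexpr uint64_t kPointerBits = 52;
constexpr uint64_t kPointerMask = (1ull << kPointerBits) - 1;

// Hypothetical helper: low 52 bits hold the pointer value, high 12 bits hold the stride.
// Assumption: stride fits in 12 bits (< 4096); larger values would be truncated by the shift.
inline uint64_t MakeKey(const void* rootSignature, uint32_t stride) {
    const uint64_t pointerBits =
        static_cast<uint64_t>(reinterpret_cast<uintptr_t>(rootSignature)) & kPointerMask;
    // Widen before shifting: shifting a 32-bit value by 52 would be undefined behavior.
    return (static_cast<uint64_t>(stride) << kPointerBits) | pointerBits;
}

// Extract the stride stored in the upper 12 bits.
inline uint32_t KeyStride(uint64_t key) {
    return static_cast<uint32_t>(key >> kPointerBits);
}

// Extract the masked pointer bits stored in the lower 52 bits.
inline uint64_t KeyPointerBits(uint64_t key) {
    return key & kPointerMask;
}
```

With this layout, two keys built from the same pointer but different strides differ only in their upper 12 bits.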
Merged. Thank you!
I'm trying to implement GPU-driven rendering with bindless descriptors and instances, and I have a problem with the instance ID (specifically on D3D12). SV_InstanceID on D3D11/D3D12 starts from 0. However, in Vulkan it starts from firstInstance in the indexed instanced draw command. That means I can't rely on the instance ID in D3D12 or D3D11, because it always starts from 0. I can hack around this with an additional per-instance vertex buffer holding the instance index for each draw, but this method adds complexity. So what do you think, how can we fix this problem?
https://github.com/gpuweb/gpuweb/issues/901 - Problem description
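The mismatch can be modeled on the CPU with a small sketch. All function names here are hypothetical and only simulate which instance IDs a vertex shader would observe. The third function shows one common workaround, which the draw-parameters emulation discussed in this thread appears to resemble: pass the base instance to the shader as an extra constant and add it to the zero-based ID.

```cpp
#include <cstdint>
#include <vector>

// D3D11/D3D12 semantics: SV_InstanceID ignores the start instance and counts from 0.
inline std::vector<uint32_t> D3DInstanceIDs(uint32_t instanceCount, uint32_t /*firstInstance*/) {
    std::vector<uint32_t> ids;
    for (uint32_t i = 0; i < instanceCount; ++i)
        ids.push_back(i);
    return ids;
}

// Vulkan semantics: the instance index counts from firstInstance.
inline std::vector<uint32_t> VulkanInstanceIDs(uint32_t instanceCount, uint32_t firstInstance) {
    std::vector<uint32_t> ids;
    for (uint32_t i = 0; i < instanceCount; ++i)
        ids.push_back(firstInstance + i);
    return ids;
}

// Emulation sketch: the application supplies baseInstance as a constant and the
// shader adds it to the zero-based D3D instance ID.
inline std::vector<uint32_t> EmulatedInstanceIDs(uint32_t instanceCount, uint32_t baseInstance) {
    std::vector<uint32_t> ids = D3DInstanceIDs(instanceCount, baseInstance);
    for (uint32_t& id : ids)
        id += baseInstance;
    return ids;
}
```

For a draw with 3 instances and firstInstance = 10, the D3D model yields 0, 1, 2, while the Vulkan and emulated models both yield 10, 11, 12.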