GPUOpen-Drivers / AMDVLK

AMD Open Source Driver For Vulkan
MIT License
1.7k stars 160 forks

Bug: Driver incorrectly returns first ray generation SBT record for all ray tracing programs #290

Closed: natevm closed 2 months ago

natevm commented 1 year ago

If I write a program like this:

struct RayGen {
  int one;
  int two;
};

struct ClosestHit {
  int three;
  int four;
};

[[vk::shader_record_ext]]
ConstantBuffer<RayGen> RGSBT;
[shader("raygeneration")]
void raygen()
{
  printf("raygen %d, %d\n", RGSBT.one, RGSBT.two);
  TraceRay(...);
}

[[vk::shader_record_ext]]
ConstantBuffer<ClosestHit> CHSBT;
[shader("closesthit")]
void closesthit(inout PayloadType payload)
{
  printf("closesthit %d, %d\n", CHSBT.three, CHSBT.four);
}

And then upload the values 1 and 2 to the raygen SBT record and 3 and 4 to the closesthit SBT record, then on NVIDIA cards I correctly see

raygen 1 2
closesthit 3 4

but on AMD, I see

raygen 1 2
closesthit 1 2

For some reason, on AMD cards, the returned SBT record in all closest hits (at least on my RX 6750 XT) is incorrectly the raygen record's data...
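For reference, here is a rough host-side sketch of the layout I am describing: each SBT record is a shader group handle followed by that record's inline data, and each shader should read the data that follows its own handle. The handle size, base alignment, handle fill bytes, and every helper name below are assumptions made up for illustration; a real application would query the sizes from VkPhysicalDeviceRayTracingPipelinePropertiesKHR and copy real handles from vkGetRayTracingShaderGroupHandlesKHR.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed values for illustration only; query the real ones from
// VkPhysicalDeviceRayTracingPipelinePropertiesKHR.
constexpr uint32_t kHandleSize = 32;     // shaderGroupHandleSize (assumed)
constexpr uint32_t kBaseAlignment = 64;  // shaderGroupBaseAlignment (assumed)

struct RayGen { int32_t one, two; };
struct ClosestHit { int32_t three, four; };

constexpr uint32_t alignUp(uint32_t value, uint32_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Hypothetical helper: writes one record as [handle bytes][inline data].
// handleFill stands in for a real shader group handle.
template <typename T>
void writeRecord(std::vector<uint8_t>& sbt, uint32_t offset,
                 uint8_t handleFill, const T& data) {
    assert(offset + kHandleSize + sizeof(T) <= sbt.size());
    std::memset(sbt.data() + offset, handleFill, kHandleSize);
    std::memcpy(sbt.data() + offset + kHandleSize, &data, sizeof(T));
}

// Hypothetical helper: reads back the inline data that follows the handle,
// i.e. what a shader_record_ext buffer is expected to expose for that record.
template <typename T>
T readRecordData(const std::vector<uint8_t>& sbt, uint32_t offset) {
    assert(offset + kHandleSize + sizeof(T) <= sbt.size());
    T out;
    std::memcpy(&out, sbt.data() + offset + kHandleSize, sizeof(T));
    return out;
}

struct Sbt {
    std::vector<uint8_t> bytes;
    uint32_t raygenOffset;
    uint32_t hitOffset;
};

// Builds a two-record table: raygen data {1, 2} and closest-hit data {3, 4},
// with each region starting on a base-alignment boundary.
Sbt buildExampleSbt() {
    Sbt sbt;
    sbt.raygenOffset = 0;
    sbt.hitOffset = alignUp(
        kHandleSize + static_cast<uint32_t>(sizeof(RayGen)), kBaseAlignment);
    sbt.bytes.resize(sbt.hitOffset + alignUp(
        kHandleSize + static_cast<uint32_t>(sizeof(ClosestHit)), kBaseAlignment));
    writeRecord(sbt.bytes, sbt.raygenOffset, 0xAA, RayGen{1, 2});
    writeRecord(sbt.bytes, sbt.hitOffset, 0xBB, ClosestHit{3, 4});
    return sbt;
}
```

Reading the closest-hit data at hitOffset gives 3 and 4, which is what NVIDIA reports. Reading it at raygenOffset instead gives 1 and 2, which matches the symptom I see on AMD, though this sketch says nothing about what the driver actually does internally.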

natevm commented 1 year ago

Interestingly, when I try another example codebase from https://github.com/Twinklebear/ChameleonRT/releases, I do not see this issue, so it appears to happen only under certain circumstances. EDIT: never mind, I see issues here too, but only with the Vulkan backend.

[image]

versus this on an NVIDIA card: [image]

natevm commented 1 year ago

I forked Sascha Willems' Vulkan Examples repo and added a small example that demonstrates how to use SBT record data. The target is called "raytracingsbtdata". I have a PR open, but in the meantime the code can be found here: https://github.com/natevm/Vulkan

On NVIDIA cards, I correctly see this: [image: raytracingsbtdata_BmYgJJ4zqF]

But on AMD cards, I see only red: [image: raytracingsbtdata_o2oFH4oTXq]

On AMD, the triangle in the center incorrectly receives the raygen record's SBT data.

natevm commented 1 year ago

I'm also discovering that I have a similar issue in the miss program: the miss shader record returned by the AMD driver is actually the first raygen record's data.

natevm commented 1 year ago

I sort of wonder if AMD assumes the "MultiplierForGeometryContributionToHitGroupIndex" on a TraceRay call is 0 when it isn't... EDIT: err, well, that doesn't explain miss programs receiving the raygen record's data, since a miss program is independent of "MultiplierForGeometryContributionToHitGroupIndex".
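To make that reasoning concrete, here is a small sketch of the record addressing rules as I understand them from the Vulkan specification. The struct and function names are made up for illustration; SbtRegion only mirrors the deviceAddress and stride fields of VkStridedDeviceAddressRegionKHR. In HLSL terms, sbtRecordOffset is RayContributionToHitGroupIndex, sbtRecordStride is MultiplierForGeometryContributionToHitGroupIndex, and missIndex is MissShaderIndex.

```cpp
#include <cstdint>

// Hypothetical stand-in for the deviceAddress and stride fields of
// VkStridedDeviceAddressRegionKHR.
struct SbtRegion {
    uint64_t deviceAddress;
    uint64_t stride;
};

// Hit group record address: the geometry index is scaled by sbtRecordStride
// (MultiplierForGeometryContributionToHitGroupIndex in HLSL), then added to
// the per-instance offset and the per-trace offset.
uint64_t hitRecordAddress(const SbtRegion& hit,
                          uint32_t instanceSbtOffset,
                          uint32_t geometryIndex,
                          uint32_t sbtRecordStride,
                          uint32_t sbtRecordOffset) {
    const uint64_t index = instanceSbtOffset
                         + static_cast<uint64_t>(geometryIndex) * sbtRecordStride
                         + sbtRecordOffset;
    return hit.deviceAddress + hit.stride * index;
}

// Miss record address: depends only on the miss index, not on the multiplier.
uint64_t missRecordAddress(const SbtRegion& miss, uint32_t missIndex) {
    return miss.deviceAddress + miss.stride * missIndex;
}
```

With a multiplier of 0, every geometry in an instance resolves to the same hit record, which could explain wrong closest-hit data. The miss address has no multiplier term at all, so that theory cannot account for the miss shader seeing the raygen record.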

jammm commented 1 year ago

I was able to reproduce it on my RX 6800 (navi21), in both debug and release modes: [image]

~Can someone look into this?~ I created an internal ticket to track this. Hopefully this gets fixed soon!