HansKristian-Work / vkd3d-proton

Fork of VKD3D. Development branches for Proton's Direct3D 12 implementation.
GNU Lesser General Public License v2.1

Remnant II: flickering vegetation when VK_EXT_mesh_shader is not disabled #1664

Open Saancreed opened 1 year ago

Saancreed commented 1 year ago

When playing without VKD3D_DISABLE_EXTENSIONS=VK_EXT_mesh_shader in the environment, the vegetation is rapidly flickering. The issue can be observed even in Ward 13 but it's much more severe on Yaesha.

With VKD3D_DISABLE_EXTENSIONS=VK_EXT_mesh_shader:

https://cdn.discordapp.com/attachments/855613305296257054/1143228467668860999/R2_NoMesh.mp4

Without VKD3D_DISABLE_EXTENSIONS=VK_EXT_mesh_shader:

https://cdn.discordapp.com/attachments/855613305296257054/1143228469526941787/R2_Mesh.mp4
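For anyone scripting the comparison above, here is a minimal sketch of toggling the variable from a launcher script. The variable name and value are taken verbatim from this report; the helper names are hypothetical, and the sketch assumes it is fine to overwrite any value the variable already holds.

```python
import os


def env_with_mesh_shaders_disabled(base_env=None):
    """Return a copy of the environment that asks vkd3d-proton to skip VK_EXT_mesh_shader."""
    env = dict(os.environ if base_env is None else base_env)
    # Name and value as quoted in the report. Assumption: overwriting any
    # previous value of the variable is acceptable for this test.
    env["VKD3D_DISABLE_EXTENSIONS"] = "VK_EXT_mesh_shader"
    return env


def env_with_mesh_shaders_enabled(base_env=None):
    """Return a copy of the environment with the override removed (the flickering case)."""
    env = dict(os.environ if base_env is None else base_env)
    env.pop("VKD3D_DISABLE_EXTENSIONS", None)
    return env
```

Either dictionary can be passed as the `env=` argument of `subprocess.run` together with whatever command launches the game on your setup.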

Software information

Remnant II, settings around High, happens both with DLSS Quality and with upscaling disabled

System information

Log files

steam-1282100.log

doitsujin commented 1 year ago

how long does it take to get to a place to repro this?

(I'm assuming that sharing saves isn't possible with this game because, well, online memes)

Saancreed commented 1 year ago

About 10-15 minutes, I'd say? You have to explore a bit and complete the fight with the first boss, that's all. And save sharing should actually be possible, seeing how there are a bunch of save files on Nexusmods and such.

runar-work commented 1 year ago

I took a look at the leaves in Ward 13, but I haven't been able to repro this so far with a GTX 1660 Ti, at least.

Blisto91 commented 1 year ago

Just noting here also that I have not been able to reproduce on a 7900 XTX.

dlshinobi commented 1 year ago

Can confirm objects are still flickering like crazy on an NVIDIA RTX 4080 (drivers: 535.98, vkd3d-proton: master)

Videos from the start of the game:

https://drive.google.com/file/d/1nSCqIOclPOIuR6wdWm-VpYV5XGzyz1mP/view?usp=drive_link

https://drive.google.com/file/d/1nTBBAITZwaNG6qRPVnuQIjz2m2gWyT8Z/view?usp=drive_link

doitsujin commented 1 year ago

Unable to reproduce on either the RTX 2060 or RX 6900XT, seems like this is specific to 4000 series.

HansKristian-Work commented 1 year ago

@runar-work https://github.com/HansKristian-Work/vkd3d-proton/pull/1675 should work around it.

HansKristian-Work commented 1 year ago

I'm convinced now this is an NV compiler bug that affects Ada only. There is something happening with the vertex allocation scheme that is completely bogus.

The shader is actually okay here as far as I can tell.

Gigabyte29 commented 1 year ago

I have the same issue with the flickering vegetation on my RTX 3080 as well, and I can confirm that using VKD3D_DISABLE_EXTENSIONS=VK_EXT_mesh_shader fixes the issue for me too. I'm on the latest NVIDIA driver, 535.104.05.

doitsujin commented 1 year ago

First time we hear about anyone running this on RTX 3000 series, so I guess it's those two architectures. Turing is not affected.

But yeah, since this looks like a straight-up driver bug there's not much we can do; disabling mesh shaders on our end isn't really a great option either considering that the legacy path in UE5 is (generally) significantly slower on GPUs where mesh shaders actually work properly.

dlshinobi commented 9 months ago

Just tested the game with the latest NVIDIA 550 (beta) drivers; sadly, the issue still persists. P.S. GPU: RTX 4080

adamnv commented 9 months ago

I'm relaying this analysis from NVIDIA's compiler folk, so apologies if I'm mangling it in the process, but the conclusion (as I understand it) is that the generated SPIR-V is actively using an uninitialized variable:

       %main = OpFunction %void None %3
          %5 = OpLabel
>>>>    %149 = OpUndef %uint
...
       %1292 = OpLabel
        %147 = OpPhi %uint %86 %1290 %141 %1291
>>>>    %148 = OpPhi %uint %149 %1290 %143 %1291
>>>>    %150 = OpPhi %uint %149 %1290 %145 %1291

Full spirv dump attached: spv.txt

Couple of notes:

HansKristian-Work commented 9 months ago

Thanks. I'll investigate further.

doitsujin commented 9 months ago

@adamnv Thanks for looking into this.

However, I don't think the undefined value should be the issue here if we look at the actual uses more closely:

%1290 = OpLabel
...
        OpBranchConditional %99 %1291 %1292
%1291 = OpLabel
...
        OpBranch %1292
%1292 = OpLabel
 %147 = OpPhi %uint %86 %1290 %141 %1291

// %148 and %150 are well-defined if we come from %1291, i.e. if %99 was true
 %148 = OpPhi %uint %149 %1290 %143 %1291
 %150 = OpPhi %uint %149 %1290 %145 %1291
...
 %586 = OpISub %uint %150 %148

// Only uses %148 and %150 if %99 was true, fallback value otherwise
 %587 = OpSelect %uint %99 %586 %571
 %588 = OpSelect %uint %99 %148 %uint_0

In other words, the pattern is similar to this, where the undefined values always end up being discarded:

uint x, y;
if (cond) {
  x = foo;
  y = bar;
}
...
uint value = cond ? x + y : 0u;
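To make the argument concrete, here is a small executable model of that pattern. It is only a sketch: the function names are hypothetical, the undefined value is modelled as an arbitrary but fixed 32-bit number (real `OpUndef` may differ per use, which does not change the outcome here), and it follows the `x + y` form of the pseudo-code rather than the exact SPIR-V arithmetic.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit unsigned wrap-around


def phi_select(cond, foo, bar, undef):
    """Model the phi/select pattern: x and y are only defined when cond is true."""
    # Phi nodes: defined values arrive from the 'if' block, otherwise we get
    # whatever the undefined value happens to be.
    x = foo if cond else undef
    y = bar if cond else undef
    # The arithmetic runs unconditionally, as it does ahead of the select.
    combined = (x + y) & MASK32
    # Select: the possibly-garbage result is kept only when cond is true.
    return combined if cond else 0


def result_is_independent_of_undef(cond, foo, bar,
                                   samples=(0, 1, 0xDEADBEEF, MASK32)):
    """Check that every sampled 'undefined' value produces the same result."""
    return len({phi_select(cond, foo, bar, u) for u in samples}) == 1
```

For both values of `cond`, the result does not depend on what the undefined value was, which is the sense in which it is always discarded.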

I also noticed that this shader exports a clip distance, which isn't well-tested. I wrote a test now which passes on Nvidia drivers, but I'll still look into whether or not this could be the culprit, especially w.r.t. invariance.

adamnv commented 9 months ago

Thanks for the info. Reckon you're right! Compiler folk are still investigating, I'll comment here if they have further findings.

doitsujin commented 8 months ago

@adamnv Testing this on a 4070 now. It looks like there's a problem with how the compiler handles SetMeshOutputsEXT inputs.

For reference, the shader I'm working with is 05d3d244d33727ed from Talos Principle 2.

If I manually insert some uniform control flow after SetMeshOutputsEXT that depends on the vertex and primitive count, the issue seems to go away. We'll have to implement this as a workaround in vkd3d-proton to confirm for sure whether this is actually a legitimate fix or not.

Specifically, changing

               OpSetMeshOutputsEXT %722 %611

to

               OpSetMeshOutputsEXT %722 %611
       %2002 = OpExtInst %uint %310 UMin %722 %611
       %2003 = OpGroupNonUniformBroadcastFirst %uint %uint_3 %2002
       %2004 = OpIEqual %bool %2003 %uint_0
               OpSelectionMerge %2501 None
               OpBranchConditional %2004 %2500 %2501
       %2500 = OpLabel
               OpReturn
       %2501 = OpLabel

or, in GLSL terms,

    SetMeshOutputsEXT(_813, _686);
    if (subgroupBroadcastFirst(min(_813, _686)) == 0u)
        return;

seems to make the problem go away without changing how the shader behaves (the condition is always false in practice).
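As a rough illustration of what the inserted guard evaluates, the sketch below models a subgroup as parallel lists of per-lane values. The helper names are hypothetical and this is not how a driver or vkd3d-proton implements anything; it only mirrors the GLSL condition above.

```python
def broadcast_first(values, active):
    """Model subgroupBroadcastFirst: the value held by the lowest-numbered active lane."""
    for value, is_active in zip(values, active):
        if is_active:
            return value
    raise ValueError("no active lanes")


def guard_returns_early(vertex_counts, primitive_counts, active):
    """Evaluate `subgroupBroadcastFirst(min(v, p)) == 0u` for one subgroup."""
    per_lane_min = [min(v, p) for v, p in zip(vertex_counts, primitive_counts)]
    return broadcast_first(per_lane_min, active) == 0
```

In this model the guard only fires when the first active lane sees a zero vertex or primitive count, so a meshlet with non-zero counts passes through unchanged.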

adamnv commented 8 months ago

@doitsujin Thanks again for the new findings; forwarded to the compiler folk!