RPCS3 / rpcs3

PlayStation 3 emulator and debugger
https://rpcs3.net/
GNU General Public License v2.0

rsx: Broken depth on AMD GPUs (GoW 3 Demo) #5830

Closed AniLeo closed 5 years ago

AniLeo commented 5 years ago

Opening this issue for tracking purposes. There's a known depth issue that affects shadows on AMD GPUs; on NVIDIA they work fine.

Demonstration of issue (R9 280X): https://www.youtube.com/watch?v=uakYMsKwd1U

Working fine on NVIDIA (GTX 1080): Screenshot1 Screenshot2

I don't know whether this affects the full game, as I don't have it yet.

kd-11 commented 5 years ago

This appears to be some kind of spec violation or an oversight on Khronos' part. It is not well defined what should happen with border texels when a compare operation is also active. The spec has this special condition governing the replacement of border texels:

If the read is the result of an image sample instruction or image gather instruction

When doing a comparison, the operation is not a real image sample operation: we do not get back a sample, we get back a comparison result. With comparison enabled, the border replacement does not happen for the DEPTH32_SFLOAT_S8_UINT format on both AMD and NVIDIA hardware. The border comparison always fails in this case, which is not correct: the invalid fetch should be replaced with the border color, which is float4(1) here, and the comparison should be done after that. This very much looks like a driver bug, but for the moment I'm designing a simple workaround in case IHVs are not interested in fixing it. Can anyone confirm whether Intel drivers are also affected by this issue?
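To make the two behaviours concrete, here is a minimal CPU-side sketch of a depth compare at a border texel. It is only an illustration, not driver or RPCS3 code: the function names, the list-of-lists texture, and the LESS_OR_EQUAL compare op are all assumptions made for the example.

```python
# Hypothetical model of a shadow-sampler lookup with ClampToBorder addressing.
# Assumptions: border depth is 1.0 (the float4(1) border color mentioned above)
# and the compare op is LESS_OR_EQUAL.

BORDER_DEPTH = 1.0


def in_bounds(texture, x, y):
    """True if (x, y) addresses a real texel of the 2D depth texture."""
    return 0 <= y < len(texture) and 0 <= x < len(texture[0])


def compare_expected(texture, x, y, dref):
    """Replace the out-of-range fetch with the border depth, then compare."""
    depth = texture[y][x] if in_bounds(texture, x, y) else BORDER_DEPTH
    return 1.0 if dref <= depth else 0.0


def compare_observed(texture, x, y, dref):
    """Behaviour described in the comment: a border fetch always fails."""
    if not in_bounds(texture, x, y):
        return 0.0
    return 1.0 if dref <= texture[y][x] else 0.0


depth_map = [[0.5, 0.5],
             [0.5, 0.5]]

# Outside the texture the two models disagree; inside they agree.
print(compare_expected(depth_map, 5, 5, 0.9), compare_observed(depth_map, 5, 5, 0.9))
print(compare_expected(depth_map, 0, 0, 0.9), compare_observed(depth_map, 0, 0, 0.9))
```

In this model a fragment sampling outside the shadow map counts as lit when the border is replaced first, and as shadowed when the border compare always fails.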

kd-11 commented 5 years ago

After looking deeper into the issue, it appears it may be more complex than initially thought. In the GoW case, the texel replacement does not seem to be the culprit, as the SamplerAddressingMode is quite different (ClampToEdge, unlike the Yakuza series, which uses ClampToBorder).
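As a rough illustration of why the addressing mode matters here, the sketch below resolves an integer texel coordinate under the two modes. The function name and the string mode labels are made up for the example; only the clamping behaviour itself is standard.

```python
def resolve_coord(i, size, mode):
    """Resolve texel index i along one axis of a texture with `size` texels.

    Returns a valid index, or None when the lookup lands on the border texel.
    """
    if mode == "clamp_to_edge":
        # Out-of-range coordinates snap to the nearest edge texel,
        # so the border color is never read.
        return min(max(i, 0), size - 1)
    if mode == "clamp_to_border":
        # Out-of-range coordinates read the border color instead.
        return i if 0 <= i < size else None
    raise ValueError(f"unknown addressing mode: {mode}")


print(resolve_coord(7, 4, "clamp_to_edge"))    # edge texel index
print(resolve_coord(7, 4, "clamp_to_border"))  # None, i.e. border texel
```

With ClampToEdge every lookup resolves to a real texel, so border texel replacement never comes into play.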

kd-11 commented 5 years ago

This build fixes this issue. This is not really a bug, it's just how NVIDIA does the comparison. DEPTH24 is a fixed-point format, so when comparing D to Dref, NVIDIA hw converts Dref into fixed point as well, and any values larger than 1 get truncated. When using DEPTH32_FLOAT, the storage format is FP32 and the comparison fails if the incoming Dref has a value larger than 1 (overflow). For now, the comparison is emulated in software on all drivers where D24 is missing (we still get PCF using textureGather, so no big loss there). The same is done for NVIDIA hw if the 'Use high precision Z' option is enabled.
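The difference between the two formats can be sketched on the CPU as follows. This is a simplified model under stated assumptions, not the shader code from the fix: the function names are hypothetical, the compare op is assumed to be LESS_OR_EQUAL, "truncated" is modelled as clamping Dref to [0, 1], and the PCF helper weights the four gathered texels equally instead of bilinearly.

```python
D24_MAX = (1 << 24) - 1  # largest value of a 24-bit unsigned normalized depth


def clamp01(value):
    return min(max(value, 0.0), 1.0)


def to_d24(value):
    """Convert a float depth to 24-bit fixed point, clamping to [0, 1] first."""
    return int(clamp01(value) * D24_MAX + 0.5)


def compare_d24(stored, dref):
    """Fixed-point path: Dref is converted too, so Dref > 1 behaves like 1."""
    return to_d24(dref) <= to_d24(stored)


def compare_d32f(stored, dref):
    """Float path: Dref is used as-is, so Dref > 1 fails against any depth <= 1."""
    return dref <= stored


def compare_d32f_emulated(stored, dref):
    """Software emulation of the fixed-point behaviour on a float format."""
    return clamp01(dref) <= stored


def pcf_from_gather(texels, dref):
    """Average the emulated compare over four gathered texels (equal weights)."""
    return sum(compare_d32f_emulated(t, dref) for t in texels) / len(texels)


# A cleared depth of 1.0 compared against an overflowing reference of 1.5:
print(compare_d24(1.0, 1.5), compare_d32f(1.0, 1.5), compare_d32f_emulated(1.0, 1.5))
print(pcf_from_gather([1.0, 1.0, 0.5, 0.5], 1.5))
```

In this model the float path fails wherever Dref overflows, which matches the broken shadows described above, while clamping Dref before the compare restores the fixed-point result.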

AniLeo commented 5 years ago

Confirmed fixed

kd-11 commented 5 years ago

Fixed by https://github.com/RPCS3/rpcs3/pull/5860