gfx-rs / portability

Vulkan Portability Implementation
Mozilla Public License 2.0

Out-of-place shadows on Dolphin #183

Closed kemenaran closed 3 years ago

kemenaran commented 5 years ago

Dolphin runs great with libportability (rather than the default packaged MoltenVK).

However, there is one graphical glitch that appears in several games: the player's shadow is duplicated in the top-left corner of the screen.

It looks like an issue specific to the libportability layer – as it doesn't show when using MoltenVK (or another Dolphin backend). I hope this is a good place to report such issues.

Screenshots

Wind Waker

Notice the boat shadow in the top-left corner of the screen, in the sky.

[Screenshot: Wind Waker]

1080° Avalanche

Notice the surfer shadow in the top-left corner of the screen, under the HUD.

[Screenshot: 1080° Avalanche]

Testing environment

Tested by running LIBVULKAN_PATH=$HOME/gfx-portability-0.6/libportability.dylib /Applications/Dolphin.app/Contents/MacOS/Dolphin.

kemenaran commented 5 years ago

FIFO log for reproducing the issue within Dolphin: WindWakerFIFO.dff.zip

kvark commented 5 years ago

Thank you for the detailed issue! Sorry for the silence - I'm just back from vacation, where I was completely off the grid. Will take a look shortly.

kvark commented 5 years ago

@kemenaran I just tested 1080° Avalanche on an Intel GPU (Iris 550) with macOS 10.14.5 and wasn't able to reproduce the issue: the game looks normal to me. Are there any specific settings you have changed?

kvark commented 5 years ago

@kemenaran just to be completely paranoid, I downloaded the exact Dolphin and libportability versions, launched with the same command line, and directly opened the FIFO log you provided. It shows Wind Waker without any issues.

Could you provide more information about the hardware and settings?

kemenaran commented 5 years ago

Oh, that's strange. I just checked again with a fresh build of Dolphin, and the issue still appears on my machine.

To be more precise, the integrated GPU on my machine is reported as Iris 5100. The machine specs are the ones mentioned on this page.

(And I know this specific Intel Iris version seems prone to driver bugs, more so than chips a few revisions later.)

Miksel12 commented 3 years ago

@kemenaran Is this still a problem? I have seen quite a few reports of issues with MoltenVK, and considering the performance improvement of gfx over MoltenVK, I thought it might be interesting to open a PR to switch from MVK to gfx (and to get more data on possible shortcomings of gfx). I don't have a Mac, though, so I thought maybe you or @kvark could open a PR for Dolphin.

kemenaran commented 3 years ago

Yes, the same problem still occurs on my hardware. (Just tested again with libportability-0.8.1 and Dolphin 5.0-13129.)

That said, I would be all in favor of replacing MoltenVK with gfx in Dolphin, even if there's a small regression on this specific hardware. gfx performs much better than MoltenVK, and the switch would probably give this specific problem more exposure and help get it fixed.

kvark commented 3 years ago

I'll be happy to investigate and fix this as soon as I can reproduce. Alternatively, if any of you folks could make a Metal GPU trace and share it, I'd be happy to look at it as well.

kemenaran commented 3 years ago

Thanks. I know how to generate a FIFO log for Dolphin, but if the issue is hardware-dependent this won't help to reproduce the issue. Do you have any pointers on how I can generate a suitable Metal GPU trace?

kvark commented 3 years ago

It requires launching the app from Xcode, where there is a button in the UI to make a GPU capture. Once done, the capture can be exported to a file. See https://github.com/gfx-rs/wgpu/wiki/Debugging-with-Xcode for more pointers.

Another (simpler) thing one could do is run with METAL_DEVICE_WRAPPER_TYPE=1 in the environment. Some errors may show up that hint at the problem.
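
For anyone reproducing this, the two environment variables mentioned in this thread can be combined in a single launch. The following is only a sketch: the default paths are the ones quoted in the reports in this thread, the function name run_dolphin_validated is made up for illustration, and the comment about Metal API validation reflects what that variable is generally understood to do, not something confirmed here.

```shell
#!/bin/sh
# Defaults copied from the commands quoted in this thread; override LIB and
# DOLPHIN in the environment if your files live elsewhere.
LIB="${LIB:-$HOME/gfx-portability-0.8.1/libportability.dylib}"
DOLPHIN="${DOLPHIN:-/Applications/Dolphin.app/Contents/MacOS/Dolphin}"

# Hypothetical helper: launch Dolphin on libportability with
# METAL_DEVICE_WRAPPER_TYPE=1 set (assumed to turn on Metal API validation).
run_dolphin_validated() {
    if [ ! -f "$LIB" ]; then
        echo "libportability not found at $LIB" >&2
        return 1
    fi
    # Both variables apply to this one process only, not to the calling shell.
    LIBVULKAN_PATH="$LIB" METAL_DEVICE_WRAPPER_TYPE=1 "$DOLPHIN" "$@"
}
```

Calling run_dolphin_validated from a terminal starts Dolphin as usual; any messages that show up should appear in that same terminal.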

kemenaran commented 3 years ago

Thanks for the debugging guide. Running with METAL_DEVICE_WRAPPER_TYPE=1 didn't report any errors.

So here's a capture of a faulty frame:

[Screenshot 2020-12-01 at 00:15:34]

And the associated traces:

(As you know, you can load the FIFO log into Dolphin to render the same frame on your hardware.)

Capture environment

Tested by running LIBVULKAN_PATH=$HOME/gfx-portability-0.8.1/libportability.dylib /Applications/Dolphin.app/Contents/MacOS/Dolphin.

kvark commented 3 years ago

This is amazing, thank you for the help! I can see the issue on my end now.

kvark commented 3 years ago

Here is what I got. First, I do see the issue when opening the GPU capture. That may be because (due to some specifics of how Dolphin processes frames) the frame is not self-contained, and the saved input data is already broken. I tried running Dolphin on the latest gfx-portability (on Intel/macOS 11) and had to fix a small thing, but otherwise I'm not seeing the issue. Here is the replay of the FIFO log you provided: latest-replay

Taking a GPU capture from my run and comparing it to your GPU capture also seems a bit off, since my render graph shows more nodes, and it's different. I'll keep looking for clues.

kvark commented 3 years ago

Here is the exact binary I used: libportability.dylib.zip. It's just a debug build of the code from #233. All the regular "binary downloaded from the internet" caution should still apply! If you are able to run with it and still see the issue, we are possibly dealing with a driver bug that for some reason doesn't manifest itself when running on MoltenVK.

kemenaran commented 3 years ago

So, I ran the FIFO log using your debug build – and the issue is gone! The shadow is rendered correctly, without being overlaid on the scene.

[Screenshot 2020-12-01 at 11:46:51]

While loading the FIFO log, the console printed some messages, though I guess none are related to the issue:

gfx-portability backend: Metal
[2020-12-01T10:46:37Z ERROR gfx_backend_metal::device] Lod bias Lod(-0.5) is not supported
[2020-12-01T10:46:37Z ERROR gfx_backend_metal::device] Lod bias Lod(-0.5) is not supported
[2020-12-01T10:46:37Z ERROR gfx_backend_metal::device] Lod bias Lod(-0.5) is not supported
[2020-12-01T10:46:37Z ERROR gfx_backend_metal::device] Lod bias Lod(-0.5) is not supported
[2020-12-01T10:46:37Z ERROR gfx_backend_metal::device] Lod bias Lod(-1.5) is not supported
[2020-12-01T10:46:37Z ERROR gfx_backend_metal::device] Lod bias Lod(-0.5) is not supported
[2020-12-01T10:46:38Z ERROR gfx_backend_metal::device] Lod bias Lod(-0.5) is not supported
[2020-12-01T10:46:38Z ERROR gfx_backend_metal::device] Lod bias Lod(-1.5) is not supported
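
Since the same message repeats many times, it can help to collapse the output before reading it. A minimal sketch, assuming the log has been saved to a file or is piped in on stdin; the function name summarize_log is hypothetical, and the pattern only matches timestamps in the format shown above:

```shell
#!/bin/sh
# Drop the leading timestamp so identical messages compare equal,
# then count duplicates and list the most frequent message first.
summarize_log() {
    sed -E 's/^\[[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9:]+Z /[/' \
        | sort \
        | uniq -c \
        | sort -rn
}
```

For example, `summarize_log < dolphin.log` (the file name is a placeholder) reduces the errors quoted above to two distinct messages: six for Lod(-0.5) and two for Lod(-1.5).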

Here is a GPU trace of the same frame being rendered using your gfx debug build (and so with a correct output on my machine):

kemenaran commented 3 years ago

Further testing: I compiled #233 on my own machine, and the graphics issue is also gone when compiling in --release mode.

So I guess that's it: I don't know what caused the original issue, but it seems to be fixed on master.

kvark commented 3 years ago

Wow, that's great news! Going to close this now. Thank you for all the help!