bevyengine / bevy

A refreshingly simple data-driven game engine built in Rust
https://bevyengine.org
Apache License 2.0

The new ClearPass leads to flickering when fps is less than 60 #3307

Closed folke closed 2 years ago

folke commented 2 years ago

Bevy version

Latest git

Operating system & version

Fedora

What you did

cargo run --example 3d_scene_pipelined

What you expected to happen

The scene should render without flickering.

What actually happened

Sometimes the scene stays blank. Other times, it flickers between the correct frame and a blank frame.

This seems to be caused by the changes in #3209

Additional information

The scene runs at 40 fps on my laptop. Most of the time seems to be spent in light clustering.

Improving the performance of the clustering might fix this issue, but the real issue seems to be that clear happens every frame, while new frames might not be ready yet, leading to the blank frames.

If I go to a commit right before the ClearPass, then the scene renders as expected.

mockersf commented 2 years ago

I tried reproducing this by adding a thread::sleep in a user system and in the assign_lights_to_clusters system. That lowered the FPS to 15, but I saw no flickering.

folke commented 2 years ago

The shadow pass node seems to take 4ms and runs after Clear and before Main, so maybe that's why I'm seeing flickering? And that's why it worked before?

Try adding a sleep in the shadow pass.

folke commented 2 years ago

The pbr_pipelined example normally runs as it should, but if I add a thread sleep of 16ms inside ShadowPassNode, then I get a black screen (in windowed mode). Whenever I move the mouse, I get flickering frames coming through.

What's weird though is that when I maximize the screen, things seem to work as expected.

Edit: still flickering on fullscreen, but much less than in windowed mode.

I have these issues on Linux both for X11 and Wayland.

mockersf commented 2 years ago

Can't reproduce on my mac with a sleep inside ShadowPassNode.

It may be a Linux issue, or something in how your graphics driver handles the commands. Could you add your GPU info so we can see whether someone else with the same hardware can reproduce it?

folke commented 2 years ago

This is on a Dell XPS 9310:

AdapterInfo {
  name: "Intel(R) Xe Graphics (TGL GT2)",
  vendor: 32902,
  device: 39497,
  device_type: IntegratedGpu,
  backend: Vulkan
}

It might be a graphics card issue, but if I disable the ClearPass and do the clear as before inside MainPass3d, then it works again without flickering.

folke commented 2 years ago

I can reproduce the issue on my end with the following minimal example. The screen flickers very rapidly between a grey frame and a black window.

use bevy::{
    prelude::{App, Commands},
    render2::camera::PerspectiveCameraBundle,
    utils::Duration,
    PipelinedDefaultPlugins,
};

fn main() {
    App::new()
        .add_plugins(PipelinedDefaultPlugins)
        .add_startup_system(setup)
        .add_system(sleep)
        .run();
}

fn setup(mut commands: Commands) {
    commands.spawn_bundle(PerspectiveCameraBundle::default());
}

// Block for 20 ms every frame so the app cannot reach 60 fps.
fn sleep() {
    std::thread::sleep(Duration::from_millis(20));
}

zamazan4ik commented 2 years ago

Do we need to test it on additional environments? I have:

folke commented 2 years ago

Did a lot more testing and I think I'm hitting three different issues, but they give similar results:

folke commented 2 years ago

@zamazan4ik it would be great if you could confirm that you see the same behavior on your Fedora install. (I'm testing on Fedora 35 with a Tiger Lake integrated GPU.) It would also be good to check on Windows.

NiklasEi commented 2 years ago

I can reproduce this on Ubuntu 18 (Intel® UHD Graphics 630). Attaching a heavily flickering run of cargo run --example 3d_scene_pipelined on latest main without any code changes.

https://user-images.githubusercontent.com/12236672/145731394-e8e27504-9a0f-4c44-9bb7-a26b1cde198a.mp4

Weasy666 commented 2 years ago

cargo run --example 3d_scene_pipelined and the minimal example work fine on Windows 10 with a Radeon RX 570 and auto-detected vulkan backend.

manokara commented 2 years ago

Running 3d_scene_pipelined on a Ryzen 5 2400G APU (Vega 11) on Linux (Solus, Linux v5.14.16), using open-source amdgpu drivers.

AdapterInfo {
    name: "AMD RADV RAVEN",
    vendor: 4098,
    device: 5597,
    device_type: IntegratedGpu,
    backend: Vulkan
}

It could definitely be a GPU issue. From what I've heard Intel's drivers for Vulkan on Linux are not very good.

manokara commented 2 years ago

Ok so I just remembered I also had an Intel laptop around: An i5-3210M (HD Graphics 4000) Ivy Bridge processor running Arch Linux.

AdapterInfo {
    name: "Intel(R) HD Graphics 4000 (IVB GT2)",
    vendor: 32902,
    device: 358,
    device_type: IntegratedGpu,
    backend: Vulkan
}

Maybe it's still an Intel issue, but only with newer iGPUs? We'll need more testing from people with Intel chips to see. This might be a stretch, but there could be other factors involved such as monitor refresh rates? Still, using the older method the flickering doesn't happen, so... stuff is happening.

folke commented 2 years ago

@manokara I think the main issue with ClearPass only occurs when the ShadowPass takes a noticeable amount of time. This is the order of the render passes:

  1. ClearPass: clears the entire frame
  2. ShadowPass: renders shadows. This pass takes 4ms for me, so the screen will only be showing shadows for 4ms
  3. MainPass: here the rest of the scene is added again

Before the ClearPass PR, clearing and painting the whole scene happened at the same time.

Could you see what happens if you add a thread::sleep inside the ShadowPassNode? I don't see why this would not lead to the visible flickering. Unless I'm totally misunderstanding how all of this works, which is very likely :)

folke commented 2 years ago

I was able to fix the flickering on X11 by disabling vsync: Immediate and Mailbox are both fine, while Fifo triggers the issue.

For Wayland, the issue still persists.
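
For readers unfamiliar with the three modes named above, here is a rough sketch of how they differ. The enum is a local stand-in that only mirrors the variant names of wgpu::PresentMode, and the two helper functions are hypothetical; the behavior described follows the usual Vulkan meaning of these modes, not anything verified against the driver in this thread.

```rust
/// Local stand-in that mirrors the variant names of `wgpu::PresentMode`.
/// It is not the real type; it only models the behavior discussed here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PresentMode {
    Fifo,
    Mailbox,
    Immediate,
}

/// Fifo queues every frame and shows one per vertical blank, so the
/// application ends up throttled to the display's refresh rate ("vsync on").
fn throttles_to_refresh_rate(mode: PresentMode) -> bool {
    mode == PresentMode::Fifo
}

/// Immediate swaps without waiting for a vertical blank, so tearing is
/// possible. Mailbox still swaps on a vertical blank but keeps only the
/// newest frame, so it neither tears nor throttles the application.
fn may_tear(mode: PresentMode) -> bool {
    mode == PresentMode::Immediate
}

fn main() {
    for mode in [PresentMode::Fifo, PresentMode::Mailbox, PresentMode::Immediate] {
        println!(
            "{:?}: throttled = {}, may tear = {}",
            mode,
            throttles_to_refresh_rate(mode),
            may_tear(mode)
        );
    }
}
```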

manokara commented 2 years ago

I assume that "putting a sleep in ShadowPassNode" means putting it in its Node::run implementation, right? It just lowers the framerate, with no flickering at all. And flickering from that wouldn't really make sense if you think about it, because the entire window goes blank.

The problem might be somewhere in the process of passing graphics data to wgpu, maybe related to double buffering. Until recently the theory was that it's Intel related, but it works fine on my 3210M (albeit slowly), as I commented before.

cart commented 2 years ago

This seems like a logical error somewhere below us in the stack (wgpu or a driver). The final output from our render graph is a single command queue that does clear passes first, then draws on top of them after that. Timing shouldn't come into play here because the final "command list" submitted to the gpu will be identical no matter how long each step takes.

Pulling in wgpu folks to get their take on this: @kvark and @cwfitzgerald.
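
The argument about timing can be illustrated with a toy model. This is only a sketch: Command and record_frame are hypothetical names, not Bevy or wgpu APIs, and the sleep stands in for any slow CPU-side work done while a pass is being encoded. The point it tries to show is that the recorded list is the same however long recording takes, because nothing reaches the GPU until the whole list is submitted.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical stand-ins for the three passes discussed in this thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Command {
    Clear,
    DrawShadows,
    DrawMain,
}

/// Record one frame's passes in graph order. `shadow_delay` simulates CPU
/// time spent while encoding the shadow pass (for example a `thread::sleep`).
fn record_frame(shadow_delay: Duration) -> Vec<Command> {
    let mut commands = Vec::new();
    commands.push(Command::Clear);
    // Slow CPU-side work: nothing has been submitted at this point.
    sleep(shadow_delay);
    commands.push(Command::DrawShadows);
    commands.push(Command::DrawMain);
    // Only the finished list would be handed to the GPU queue.
    commands
}

fn main() {
    let fast = record_frame(Duration::ZERO);
    let slow = record_frame(Duration::from_millis(16));
    // The delay changes when the list is ready, not what it contains.
    assert_eq!(fast, slow);
    println!("{:?}", slow);
}
```

Under this model, a blank frame on screen would have to come from how frames are presented, not from the order or duration of the passes.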

kvark commented 2 years ago

Does it reproduce if you capture an API trace and then replay it, either locally or on a different machine? See https://github.com/gfx-rs/wgpu/wiki/Debugging-wgpu-Applications#tracing-infrastructure

Could somebody post repro instructions?

cart commented 2 years ago

Someone experiencing this issue can capture the trace by running the lighting example on main (where the new renderer is now the default and the 3d_scene_pipelined example has been renamed to lighting):

cargo run --example lighting --features wgpu_trace

The trace should show up in a wgpu_trace folder in the root of your project.

folke commented 2 years ago

@kvark I've attached two traces for the lighting example on X11 and Wayland.

Both cause the flickering (wgpu 0.11.5).

When I maximize the window, the flickering disappears. When I disable vsync (using Mailbox or Immediate mode), the flickering goes away for the X11 build but remains in the Wayland build.

Thank you for looking into this!

AdapterInfo {
  name: "Intel(R) Xe Graphics (TGL GT2)",
  vendor: 32902,
  device: 39497,
  device_type: IntegratedGpu,
  backend: Vulkan
}

wayland.zip x11.zip

kvark commented 2 years ago

Replaying the first capture doesn't show me any flickering. So this isn't about any kind of logical ordering issues within wgpu itself. The only strange thing I noticed in the trace is your surface being the 2nd surface. What happened to the first? Is there any way you could test this without creating extra surfaces, just to be sure? And again, what would be good repro steps for me?

folke commented 2 years ago

I don't fully understand the Bevy code, but I was able to disable the creation of the first surface, so there's only one being used. The flickering is still happening though.

@cart a surface gets created in bevy_render/lib:116 and then another one in bevy_render/view/window:132.

To reproduce the issue, I clone the bevy repo and then run cargo run --example 3d_scene.

kvark commented 2 years ago

Running this example on Bevy revision c825fda74a7c81b4e904f0c579b13a3b115d346d doesn't expose any flickering to me on Linux.

AdapterInfo { name: "Intel(R) Xe Graphics (TGL GT2)", vendor: 32902, device: 39497, device_type: IntegratedGpu, backend: Vulkan }
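
When comparing adapters across reports, it can help to convert the decimal vendor and device fields that AdapterInfo prints into the four-digit hex form used by PCI ID listings and by vulkaninfo. A minimal sketch follows; pci_hex is a hypothetical helper, not part of wgpu or Bevy.

```rust
/// Format a decimal PCI vendor or device ID as four hex digits.
fn pci_hex(id: u32) -> String {
    format!("0x{:04x}", id)
}

fn main() {
    // Values copied from the AdapterInfo dumps in this thread.
    let reports = [
        ("Intel Xe (TGL GT2)", 32902u32, 39497u32),
        ("AMD RADV RAVEN", 4098, 5597),
        ("Intel HD Graphics 4000 (IVB GT2)", 32902, 358),
    ];
    for (name, vendor, device) in reports {
        println!("{}: vendor {} device {}", name, pci_hex(vendor), pci_hex(device));
    }
}
```

Vendor 32902 is 0x8086 (Intel) and vendor 4098 is 0x1002 (AMD), so the three dumps cover two Intel iGPUs and one AMD APU.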

folke commented 2 years ago

You have the same graphics card as I do? What distro are you running?

The results seem to be different for me whether I run the X11 build under Xorg, or under XWayland (the latter gives much more flickering). The most flickering is for Wayland builds under Wayland.

The flickering only seems to happen for apps where present_frame is not called within 16ms.

You could try the lighting example cargo run --example 3d_scene. This one is heavier and will hopefully show the flickering.
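
The 16 ms figure above matches the refresh interval of a 60 Hz display. The sketch below checks which of the frame times mentioned in this thread exceed that interval. It assumes a 60 Hz panel, which the thread never states, and frame_budget and misses_refresh are hypothetical helper names.

```rust
use std::time::Duration;

/// Time between two vertical blanks for a display refreshing at `hz`.
fn frame_budget(hz: u32) -> Duration {
    Duration::from_secs_f64(1.0 / f64::from(hz))
}

/// True when producing a frame takes longer than one refresh interval.
fn misses_refresh(frame_time: Duration, hz: u32) -> bool {
    frame_time > frame_budget(hz)
}

fn main() {
    // Assumption: a 60 Hz display.
    let hz = 60;
    println!("budget: {:.2} ms", frame_budget(hz).as_secs_f64() * 1000.0);

    // 40 fps, as reported in the first comment, is 25 ms per frame.
    let at_40_fps = Duration::from_millis(25);
    // The minimal example sleeps for 20 ms per frame.
    let minimal_example = Duration::from_millis(20);

    println!("40 fps misses refresh: {}", misses_refresh(at_40_fps, hz));
    println!("20 ms sleep misses refresh: {}", misses_refresh(minimal_example, hz));
}
```

Both cases from the thread fall outside the budget, which is consistent with the observation that only slow frames flicker.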

kvark commented 2 years ago

Yes, looks like I have the same GPU, just in a different laptop. I'm running NixOS with KDE (with X11). VulkanInfo shows the following:

        apiVersion        = 4202678 (1.2.182)
        driverVersion     = 88088581 (0x5402005)
        vendorID          = 0x8086
        deviceID          = 0x9a49
        deviceType        = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
        deviceName        = Intel(R) Xe Graphics (TGL GT2)

Notice the driver version and see how it compares to yours.
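
To compare driver versions, the packed integers from vulkaninfo can be unpacked. The sketch below uses the classic VK_MAKE_VERSION layout (10 bits major, 10 bits minor, 12 bits patch), which reproduces the 1.2.182 shown next to apiVersion. Applying the same layout to driverVersion is an assumption: that field is vendor-defined, and this only works for drivers that pack it the same way. decode_vk_version is a hypothetical helper name.

```rust
/// Unpack a version packed with the classic `VK_MAKE_VERSION` layout:
/// 10 bits major, 10 bits minor, 12 bits patch.
fn decode_vk_version(v: u32) -> (u32, u32, u32) {
    (v >> 22, (v >> 12) & 0x3ff, v & 0xfff)
}

fn main() {
    // apiVersion from the output above; vulkaninfo itself prints 1.2.182.
    let (major, minor, patch) = decode_vk_version(4_202_678);
    println!("apiVersion    = {}.{}.{}", major, minor, patch);

    // driverVersion from the output above (0x5402005).
    // Assumption: the driver uses the same packing for this field.
    let (major, minor, patch) = decode_vk_version(88_088_581);
    println!("driverVersion = {}.{}.{}", major, minor, patch);
}
```

Under that assumption the driver version decodes to 21.2.5, which would correspond to a Mesa 21.2.x release.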

That 3d_scene doesn't make a difference for me. Still rendering fine.

KirmesBude commented 2 years ago

To add on to your point about graphics card and distro: which Mesa version are you running?

There are recent issues on the Mesa repository concerning flickering on Intel graphics:
https://gitlab.freedesktop.org/mesa/mesa/-/issues/5731
https://gitlab.freedesktop.org/mesa/mesa/-/issues/5744
https://gitlab.freedesktop.org/mesa/mesa/-/issues/5745

Edit: Though the flickering there seems to be different from here. So maybe unrelated.

folke commented 2 years ago

Nope, you're right! If I downgrade mesa-* to 21.2.3, then the flickering is gone, so definitely an issue with the new mesa drivers.

folke commented 2 years ago

Seems like a fix is underway https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14207

Fixes the flickering in https://gitlab.freedesktop.org/mesa/mesa/-/issues/5744

NiklasEi commented 2 years ago

I can reproduce that downgrading mesa to v21.2.5 fixes the issue.

Does someone have the setup to locally build and install mesa from source? It would be nice to know if https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14207 resolves our issue.

The documented build & install steps in the Readme fail for me at meson .. :disappointed:

folke commented 2 years ago

@NiklasEi there's a new PR for the issue https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14237

I'll just wait it out for now.

kvark commented 2 years ago

I can't build this right now because it requires libdrm that is too new, but generally building Mesa was straightforward for me, so I'd like to encourage you to try building it and resolve issues on Bevy discord (can cc me if needed). It would let Intel devs know that the patch is good, or not, before merging.

folke commented 2 years ago

I just installed the new 21.3.2 build for Fedora 35 from pending/testing and everything works as expected now!

NiklasEi commented 2 years ago

Version 21.3.2~kisak1~b also removes all flickering for me on Ubuntu 18. The examples look so much better now :tada: