Closed eero-lehtinen closed 7 months ago
Can you reproduce this upstream on winit's examples? If so, please open an issue there and link this issue for us <3
They mention testing a winit example:
I tested an example from the winit repository (cargo run --example window) and there were no issues.
It might be reproducible on wgpu instead
Yep, I tested winit upstream examples and found no issues. Now I also tested wgpu upstream examples and found no issues. Seems Bevy specific.
I tested older versions of bevy and it looks like 0.9 is the latest version that works
Bisected the issue to the introduction of pipelined rendering 2027af4c54082007fed2091f112a11cb0bc5fc08 :)
FYI @james7132 @hymm. Thanks a ton for bisecting!
Also works on main if I disable the multi-threading feature as it disables pipelined rendering.
I just ran into this myself. Arch Linux with nvidia drivers 550.54.14-2.
I have the same issue on the newest nvidia drivers and X11
Also works on main if I disable the multi-threading feature as it disables pipelined rendering.
After disabling multithreading I'm getting the following error:
thread 'main' panicked at /home/colin/.cargo/registry/s
c/index.crates.io-6f17d22bba15001f/bevy_render-0.13.0/s
c/view/window/mod.rs:346:18:
Error configuring surface: Outdated
note: run with `RUST_BACKTRACE=1` environment variable
o display a backtrace
Encountered a panic in system `bevy_render::view::windo
::prepare_windows`!
I'm also suffering from this issue:
I did some more investigation and found that we call surface.get_current_texture() directly after WindowEvent::Resized, while wgpu examples only do it on WindowEvent::RedrawRequested. If I modify the wgpu examples to do that, they crash too.
Excellent digging. I'd be happy to review a PR that changes this behavior in Bevy: it sounds like we're doing work overly eagerly for no real benefit.
I'm not actually sure if our code is too eager. It's just that with this specific driver surface.get_current_texture doesn't work if called just after surface.configure (or it works once at startup but not on resize). Maybe as a workaround we could skip drawing a frame on resize? I'm not really familiar with bevy or wgpu internals though, just testing stuff out.
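The "skip drawing a frame on resize" idea floated above could be sketched roughly as follows. This is only an illustration: ResizeGate and should_draw are hypothetical names, not Bevy or wgpu APIs, and recording the size stands in for the real surface.configure call. Note that this naive version also skips the very first frame, which the comment above suggests is not strictly needed.

```rust
/// Hypothetical per-window state (not a Bevy or wgpu type).
/// Remembers the size the surface was last configured with.
#[derive(Debug, Default)]
struct ResizeGate {
    configured: Option<(u32, u32)>,
}

impl ResizeGate {
    /// Returns `true` if a frame should be drawn for `size`.
    ///
    /// If `size` differs from the last configured size, the new size is
    /// recorded (standing in for `surface.configure`) and `false` is
    /// returned, so the caller does not acquire a texture in the same frame.
    fn should_draw(&mut self, size: (u32, u32)) -> bool {
        if self.configured == Some(size) {
            true
        } else {
            self.configured = Some(size);
            false
        }
    }
}
```

A caller would ask should_draw once per frame with the current window size and only acquire the surface texture when it returns true, so a reconfigure and an acquire never land in the same frame.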
2024-03-06T20:26:51.531143Z DEBUG present_frames: wgpu_core::present: Removing swapchain texture Id(12,115,vk) from the device tracker
2024-03-06T20:26:51.549641Z DEBUG present_frames: wgpu_core::present: Presented. End of Frame
2024-03-06T20:26:51.550759Z DEBUG wgpu_core::device::global: configuring surface with SurfaceConfiguration { usage: TextureUsages(RENDER_ATTACHMENT), format: Bgra8UnormSrgb, width: 1493, height: 840, present_mode: Fifo, desired_maximum_frame_latency: 2, alpha_mode: Auto, view_formats: [] }
2024-03-06T20:26:51.550953Z WARN wgpu_hal::vulkan::conv: Unrecognized present mode 1000361000
2024-03-06T20:26:51.551007Z DEBUG wgpu_core::device::life: Active submission 229 is done
VUID-VkSwapchainCreateInfoKHR-pNext-07781(ERROR / SPEC): msgNum: 1284057537 - Validation Error: [ VUID-VkSwapchainCreateInfoKHR-pNext-07781 ] | MessageID = 0x4c8929c1 | vkCreateSwapchainKHR(): pCreateInfo->imageExtent (1493, 840), which is outside the bounds returned by vkGetPhysicalDeviceSurfaceCapabilitiesKHR(): currentExtent = (1477,826), minImageExtent = (1477,826), maxImageExtent = (1477,826). The Vulkan spec states: If a VkSwapchainPresentScalingCreateInfoEXT structure was not included in the pNext chain, or it is included and VkSwapchainPresentScalingCreateInfoEXT::scalingBehavior is zero then imageExtent must be between minImageExtent and maxImageExtent, inclusive, where minImageExtent and maxImageExtent are members of the VkSurfaceCapabilitiesKHR structure returned by vkGetPhysicalDeviceSurfaceCapabilitiesKHR for the surface (https://www.khronos.org/registry/vulkan/specs/1.3-extensions/html/vkspec.html#VUID-VkSwapchainCreateInfoKHR-pNext-07781)
Objects: 0
2024-03-06T20:26:51.559037Z INFO wgpu_hal::vulkan::instance: GENERAL [NVIDIA (0x4)]
Requested image extent (1493x840) does not match surface (1477x826), marking swapchain out of date
2024-03-06T20:26:51.559068Z INFO wgpu_hal::vulkan::instance: objects: (type: SWAPCHAIN_KHR, hndl: 0x7e7f1801b060, name: ?)
thread 'Compute Task Pool (3)' panicked at crates/bevy_render/src/view/window/mod.rs:324:26:
Error reconfiguring surface: Outdated
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Encountered a panic in system `bevy_render::view::window::prepare_windows`!
thread 'Compute Task Pool (7)' panicked at crates/bevy_render/src/pipelined_rendering.rs:49:67:
called `Result::unwrap()` on an `Err` value: RecvError
I get this error when running with validation layers. Though all vk calls return success.
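To make the validation message concrete: the requested imageExtent (1493, 840) lies outside the range the surface reports, where minImageExtent and maxImageExtent are both (1477, 826). The sketch below shows the bounds rule the spec text describes. Extent and clamp_extent are hypothetical names used for illustration only; this is not what wgpu or Bevy actually do internally.

```rust
/// Width and height of a surface in physical pixels.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Extent {
    width: u32,
    height: u32,
}

/// Hypothetical helper: clamp a requested swapchain extent into the
/// inclusive [min, max] range reported by the surface capabilities.
/// Assumes `min <= max` in both dimensions (`u32::clamp` panics otherwise).
fn clamp_extent(requested: Extent, min: Extent, max: Extent) -> Extent {
    Extent {
        width: requested.width.clamp(min.width, max.width),
        height: requested.height.clamp(min.height, max.height),
    }
}
```

With the values from the log, clamping a requested 1493x840 against bounds of 1477x826 yields 1477x826, which is the only extent the driver would accept at that moment.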
I did some more investigation and found that we call surface.get_current_texture() directly after WindowEvent::Resized, while wgpu examples only do it on WindowEvent::RedrawRequested. If I modify the wgpu examples to do that, they crash too.
surface.get_current_texture() is called in the render app, so it's mostly unrelated to the order Bevy receives events from winit
I guess what I should have said is that we call both surface.configure and surface.get_current_texture at the same time in the render app after the window size has changed, while wgpu examples call them in separate winit events (Resized and RedrawRequested). That maybe gives the driver enough time to finish processing and not error out, at least it doesn't crash. Though using events like that might not be applicable in the bevy architecture.
@eero-lehtinen could you try with this branch? https://github.com/mockersf/bevy/tree/nvidia-outdated
I'm not sure how that will behave, it may fix the issue or make it worse 😄
Looks like that just works :D. I think I also at some point tried just blindly returning but had bad results, but your version is fine.
Though was that previous retrying behaviour needed on other platforms? wgpu examples do it too.
I ported the fix to my game and there are definitely still situations where it can crash. E.g. switching between fullscreen and windowed quickly at startup and resizing the window when it is one of the two windows tiled side by side. The error is the same.
It seems to break when there are resizes in consecutive frames, as the outdated error is ignored only in the code path where there is no resize. I made a version where the outdated error is always ignored, and that seems to work perfectly. https://github.com/eero-lehtinen/bevy/tree/v0.13.0-nvidia-fix
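As a rough illustration of "always ignore the outdated error", the sketch below treats Outdated as "skip this frame" on every code path instead of panicking. AcquireError is a local stand-in loosely modelled on wgpu's SurfaceError, and FrameAction and frame_action are hypothetical names. This is not the code from the linked branch, just a minimal model of the policy.

```rust
/// Stand-in for the surface acquisition errors discussed in this thread.
/// Loosely modelled on wgpu's `SurfaceError`; not the real type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AcquireError {
    Timeout,
    Outdated,
    Lost,
    OutOfMemory,
}

/// What the caller should do with the current frame.
#[derive(Debug, PartialEq, Eq)]
enum FrameAction<T> {
    /// A texture was acquired; render into it.
    Draw(T),
    /// Nothing to render into this frame; try again next frame.
    Skip,
}

/// Hypothetical policy: `Outdated` never panics. It always maps to
/// `Skip`, regardless of whether the surface was just reconfigured.
/// Every other error is passed through to the caller unchanged.
fn frame_action<T>(acquired: Result<T, AcquireError>) -> Result<FrameAction<T>, AcquireError> {
    match acquired {
        Ok(texture) => Ok(FrameAction::Draw(texture)),
        Err(AcquireError::Outdated) => Ok(FrameAction::Skip),
        Err(other) => Err(other),
    }
}
```

The point of routing both the resize and no-resize paths through one function like this is that consecutive resizes cannot reach a branch where Outdated is still fatal, which is the failure mode described above.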
Under arch linux, using nvidia driver version 550.54.14-4, with Xwayland.
Testing winit 0.29.14 examples, no issues.
Testing with wgpu 0.19.3 boids example, issue was reproducible.
[2024-03-09T04:53:35Z INFO winit::platform_impl::platform::x11::window] Guessed window scale factor: 1.5
[2024-03-09T04:53:35Z INFO wgpu_examples::framework] Initializing wgpu...
[2024-03-09T04:53:35Z INFO wgpu_core::instance] Adapter Vulkan AdapterInfo { name: "NVIDIA GeForce RTX 3060 Laptop GPU", vendor: 4318, device: 9504, device_type: DiscreteGpu, driver: "NVIDIA", driver_info: "550.54.14", backend: Vulkan }
[2024-03-09T04:53:35Z INFO wgpu_examples::framework] Using NVIDIA GeForce RTX 3060 Laptop GPU (Vulkan)
[2024-03-09T04:53:35Z INFO wgpu_examples::framework] Entering event loop...
[2024-03-09T04:53:35Z INFO wgpu_examples::framework] Surface resume PhysicalSize { width: 800, height: 600 }
[2024-03-09T04:53:35Z INFO wgpu_examples::framework] Surface resize PhysicalSize { width: 800, height: 600 }
[2024-03-09T04:53:35Z INFO wgpu_examples::framework] Surface resize PhysicalSize { width: 624, height: 1137 }
[2024-03-09T04:53:36Z INFO wgpu_examples::framework] Surface resize PhysicalSize { width: 800, height: 600 }
[2024-03-09T04:53:36Z INFO wgpu_examples::framework] Frame time 6.05ms (165.4 FPS)
[2024-03-09T04:53:37Z INFO wgpu_examples::framework] Frame time 6.06ms (165.0 FPS)
[2024-03-09T04:53:37Z INFO wgpu_examples::framework] Surface resize PhysicalSize { width: 801, height: 600 }
[2024-03-09T04:53:37Z INFO wgpu_examples::framework] Surface resize PhysicalSize { width: 802, height: 600 }
[2024-03-09T04:53:37Z INFO wgpu_examples::framework] Surface resize PhysicalSize { width: 802, height: 602 }
[2024-03-09T04:53:37Z INFO wgpu_examples::framework] Surface resize PhysicalSize { width: 806, height: 605 }
[2024-03-09T04:53:37Z INFO wgpu_examples::framework] Surface resize PhysicalSize { width: 807, height: 605 }
[2024-03-09T04:53:37Z INFO wgpu_examples::framework] Surface resize PhysicalSize { width: 808, height: 606 }
[2024-03-09T04:53:37Z INFO wgpu_examples::framework] Surface resize PhysicalSize { width: 808, height: 607 }
[2024-03-09T04:53:37Z INFO wgpu_examples::framework] Surface resize PhysicalSize { width: 815, height: 613 }
[2024-03-09T04:53:37Z INFO wgpu_examples::framework] Surface resize PhysicalSize { width: 818, height: 615 }
[2024-03-09T04:53:37Z INFO wgpu_examples::framework] Surface resize PhysicalSize { width: 829, height: 621 }
[2024-03-09T04:53:37Z INFO wgpu_examples::framework] Surface resize PhysicalSize { width: 832, height: 622 }
[2024-03-09T04:53:37Z INFO wgpu_examples::framework] Surface resize PhysicalSize { width: 844, height: 628 }
[2024-03-09T04:53:37Z INFO wgpu_examples::framework] Surface resize PhysicalSize { width: 847, height: 630 }
[2024-03-09T04:53:37Z INFO wgpu_examples::framework] Surface resize PhysicalSize { width: 855, height: 636 }
[2024-03-09T04:53:37Z INFO wgpu_examples::framework] Surface resize PhysicalSize { width: 863, height: 640 }
[2024-03-09T04:53:37Z INFO wgpu_examples::framework] Surface resize PhysicalSize { width: 879, height: 651 }
[2024-03-09T04:53:37Z INFO wgpu_examples::framework] Surface resize PhysicalSize { width: 882, height: 654 }
[2024-03-09T04:53:37Z INFO wgpu_examples::framework] Surface resize PhysicalSize { width: 885, height: 656 }
thread 'main' panicked at 'Failed to acquire next surface texture!: Outdated', examples/src/framework.rs:235:22
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
It seems to break when there are resizes in consecutive frames, as the outdated error is ignored only in the code path where there is no resize. I made a version where the outdated error is always ignored, and that seems to work perfectly. https://github.com/eero-lehtinen/bevy/tree/v0.13.0-nvidia-fix
Tried out your fix, works for me too in XWayland, nvidia drivers v550.54.14
It seems to break when there are resizes in consecutive frames, as the outdated error is ignored only in the code path where there is no resize. I made a version where the outdated error is always ignored, and that seems to work perfectly. https://github.com/eero-lehtinen/bevy/tree/v0.13.0-nvidia-fix
This appears to work for me too. Arch Linux 6.7.9-arch1-1, i3, X11, Nvidia 550.54.14.
Bevy version
Affects both 0.13 and main (21adeb684261eacd39acecaa8b056e0b9e918268)
Relevant system information
What you did
Run cargo run --example window_resizing or pretty much any example. Then drag a window corner to resize.
What went wrong
Outputs this and crashes.
Additional information
550 was very recently stabilized (https://www.phoronix.com/news/NVIDIA-550.54.14-Linux-Driver).
Everything starts working again if I downgrade to the 545 driver.
I tested an example from the winit repository (cargo run --example window) and there were no issues.
Other applications I use aren't affected by this issue, so it might not be Nvidia's fault.
Complete log of the bevy example with backtrace: