Open lemtea8 opened 2 months ago
@lemtea8 does vkcube work for you? See https://zed.dev/docs/linux#troubleshooting. It sounds like a GPU issue.
Yes, like below:
My computer can also run Godot Engine (uses Vulkan 1.3) quite smoothly, which I think is a pretty complex Vulkan program.
After installing Zed from script, I got the zed.log.
Based on the log, I think the problem is that my system is set to use the NVIDIA GPU only, but Zed picks the integrated graphics (which won't work), and that eventually causes the crash.
@lemtea8 Can you try setting VULKAN_DEVICE_INDEX=X to choose the correct GPU?
Would be good to know if there's something we/blade need to test for that we're not yet.
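For readers following along, here is a minimal sketch of how an index override such as VULKAN_DEVICE_INDEX could be validated before it is used. It is not taken from Zed or Blade: the `Gpu` struct, `parse_device_index`, `choose_device`, and the "prefer a discrete GPU" fallback are all assumptions made for illustration.

```rust
/// Hypothetical description of an enumerated GPU (not a Blade or Zed type).
#[derive(Debug, Clone, PartialEq)]
pub struct Gpu {
    pub name: String,
    pub is_discrete: bool,
}

/// Parse an override such as the value of VULKAN_DEVICE_INDEX.
/// Returns None when the value is missing, not a number, or out of range.
pub fn parse_device_index(raw: Option<&str>, device_count: usize) -> Option<usize> {
    let index = raw?.trim().parse::<usize>().ok()?;
    (index < device_count).then_some(index)
}

/// Pick a device index. A valid explicit override wins; otherwise this
/// sketch assumes the first discrete GPU is preferable, and falls back to
/// the first enumerated device when no discrete GPU exists.
pub fn choose_device(devices: &[Gpu], raw_override: Option<&str>) -> Option<usize> {
    if devices.is_empty() {
        return None;
    }
    if let Some(index) = parse_device_index(raw_override, devices.len()) {
        return Some(index);
    }
    devices.iter().position(|d| d.is_discrete).or(Some(0))
}
```

A caller could pass `std::env::var("VULKAN_DEVICE_INDEX").ok().as_deref()` as the override. Rejecting out-of-range values up front makes it easier to tell "the override was ignored" apart from "the chosen device failed".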
@lemtea8 could you try the following
cat /etc/prime-discrete
And
vkcube --gpu_number 0
vkcube --gpu_number 1
@ConradIrwin
Can you try setting VULKAN_DEVICE_INDEX=X to choose the correct GPU?
Both VULKAN_DEVICE_INDEX=0 and VULKAN_DEVICE_INDEX=1 crashed. It seems that whatever I set, it always uses the integrated graphics.
Zed.log (upper part is VULKAN_DEVICE_INDEX=0 and lower part is 1):
@kvark There is no file at /etc/prime-discrete, as I use envycontrol.
The output of envycontrol --query is nvidia.
vkcube --gpu_number 1 runs normally; it uses the discrete GPU.
However, vkcube --gpu_number 0 opens a black window and nothing appears. After closing the window, the graphics driver seems to have crashed and I need to reboot.
That's great, thank you! So you are affected by this platform issue (it's not a Blade implementation problem, strictly speaking), and Intel isn't able to present. I haven't heard of envycontrol, will look into it.
I'm still trying to understand what our options are:
@kvark Just so I understand: the problem is that surface creation fails in a few different ways (probably due to a platform issue on the Intel side?).
This can be fixed by:
There's obviously some trade-offs here, but my sense is:
So maybe a sensible plan would be:
I've implemented (3) in https://github.com/kvark/blade/pull/144. We need more data from running Blade and Zed on affected systems before we can consider this a proper fix.
I've pulled https://github.com/kvark/blade/pull/144 into zed main.
Can anyone seeing this issue try zed main and see if it has improved things?
FWIW I think this update broke Zed nightly on my Ubuntu 24.04 machine with Intel and NVIDIA graphics. Here are the relevant contents of the log:
2024-07-24T13:40:07.887128317+02:00 [INFO] ========== starting zed ==========
2024-07-24T13:40:08.039451662+02:00 [INFO] perform;
2024-07-24T13:40:08.039530387+02:00 [INFO] read_command;
2024-07-24T13:40:08.039683995+02:00 [INFO] read_command;
2024-07-24T13:40:08.039752146+02:00 [INFO] Opening main db
2024-07-24T13:40:08.040016554+02:00 [INFO] socket reader;
2024-07-24T13:40:08.040899818+02:00 [INFO] new;
2024-07-24T13:40:08.041824877+02:00 [INFO] keep_updated;
2024-07-24T13:40:08.058909588+02:00 [INFO] Using git binary path: None
2024-07-24T13:40:08.114844567+02:00 [ERROR] theme not found: Catppuccin Mocha (Blur)
2024-07-24T13:40:08.117272235+02:00 [INFO] extensions updated. loading 10, reloading 0, unloading 0
2024-07-24T13:40:08.122168924+02:00 [INFO] activate is not implemented on Linux, ignoring the call
2024-07-24T13:40:08.128470608+02:00 [INFO] Opening main db
2024-07-24T13:40:08.129020676+02:00 [INFO] perform;
2024-07-24T13:40:08.129097238+02:00 [INFO] read_command;
2024-07-24T13:40:08.129142818+02:00 [INFO] read_command;
2024-07-24T13:40:08.129280789+02:00 [INFO] socket reader;
2024-07-24T13:40:08.138700271+02:00 [INFO] new;
2024-07-24T13:40:08.139779953+02:00 [INFO] keep_updated;
2024-07-24T13:40:08.19678284+02:00 [INFO] Enabling Vulkan Portability
2024-07-24T13:40:08.196815659+02:00 [INFO] Enabling color space support
2024-07-24T13:40:08.22527139+02:00 [INFO] Testing presentation capability on Linux/Intel
@tepavcevic What happens after that in the logs?
It looks like it's working as expected up to that point
Reverting the change before preview out of an abundance of caution https://github.com/zed-industries/zed/pull/15095
@ConradIrwin nothing, that's the entire output. I'll try a clean install and give you an update on it. Update after install:
2024-07-24T17:33:26.703840756+02:00 [INFO] ========== starting zed ==========
2024-07-24T17:33:26.763189341+02:00 [INFO] perform;
2024-07-24T17:33:26.763285129+02:00 [INFO] read_command;
2024-07-24T17:33:26.763508477+02:00 [INFO] Opening main db
2024-07-24T17:33:26.763594316+02:00 [INFO] read_command;
2024-07-24T17:33:26.763765923+02:00 [INFO] socket reader;
2024-07-24T17:33:26.764684731+02:00 [INFO] new;
2024-07-24T17:33:26.765282301+02:00 [INFO] keep_updated;
2024-07-24T17:33:26.765826569+02:00 [INFO] Using git binary path: None
2024-07-24T17:33:26.801215613+02:00 [ERROR] theme not found: Catppuccin Mocha (Blur)
2024-07-24T17:33:26.802179142+02:00 [INFO] extensions updated. loading 10, reloading 0, unloading 0
2024-07-24T17:33:26.806281871+02:00 [INFO] activate is not implemented on Linux, ignoring the call
2024-07-24T17:33:26.806438441+02:00 [INFO] Opening main db
2024-07-24T17:33:26.808461174+02:00 [INFO] perform;
2024-07-24T17:33:26.808542943+02:00 [INFO] read_command;
2024-07-24T17:33:26.80869932+02:00 [INFO] read_command;
2024-07-24T17:33:26.808866055+02:00 [INFO] socket reader;
2024-07-24T17:33:26.814848709+02:00 [WARN] request completed with error: failed to connect to the server
2024-07-24T17:33:26.814931632+02:00 [WARN] request completed with error: failed to connect to the server
2024-07-24T17:33:26.814985427+02:00 [WARN] request completed with error: failed to connect to the server
2024-07-24T17:33:26.815117472+02:00 [WARN] request completed with error: failed to connect to the server
2024-07-24T17:33:26.815173786+02:00 [WARN] request completed with error: failed to connect to the server
2024-07-24T17:33:26.815226099+02:00 [WARN] request completed with error: failed to connect to the server
2024-07-24T17:33:26.815286654+02:00 [WARN] request completed with error: failed to connect to the server
2024-07-24T17:33:26.815339308+02:00 [WARN] request completed with error: failed to connect to the server
2024-07-24T17:33:26.815391507+02:00 [WARN] request completed with error: failed to connect to the server
2024-07-24T17:33:26.821057336+02:00 [INFO] new;
2024-07-24T17:33:26.822284672+02:00 [INFO] keep_updated;
2024-07-24T17:33:26.870165241+02:00 [INFO] Enabling Vulkan Portability
2024-07-24T17:33:26.870206221+02:00 [INFO] Enabling color space support
2024-07-24T17:33:26.886203132+02:00 [INFO] Testing presentation capability on Linux/Intel
Thanks!
Hi all, I have run into this issue on NixOS:
❯ zed --version
Zed – /nix/store/0r0i6wyr1lp0ppws32f6q0qj6qr9xrav-zed-0.147.2/libexec/zed-editor
NVIDIA GeForce RTX 4070
NVIDIA-SMI 560.31.02 Driver Version: 560.31.02 CUDA Version: 12.6
dustin@evo
----------
OS: NixOS 24.11.20240814.c3aa7b8 (Vicuna) x86_64
Host: XPS 15 9530
Bios (UEFI): 1.7.0 (1.7)
Bootmgr: Linux Boot Manager - systemd-bootx64.efi
Board: 00KH17 (A00)
Chassis: Notebook
Kernel: Linux 6.6.45
Init System: systemd 256.4
Loadavg: 0.72, 0.89, 0.72
Processes: 414
Packages: 2657 (nix-system)
Shell: fish 3.7.1
Editor: nvim
Display (SDC414D): 3456x2160 @ 60 Hz (as 1728x1080) in 16″ [Built-in]
Brightness (SDC414D): 31%
Monitor (eDP-1): 3456x2160 px @ 59.99 Hz - 340x210 mm (15.73 inches, 259.04 ppi)
LM: gdm-password (Wayland)
DE: GNOME 46.4
WM: Mutter (Wayland)
Terminal: alacritty 0.13.2
CPU: 13th Gen Intel(R) Core(TM) i9-13900H (12+8) @ 5.40 GHz - 45.0°C
CPU Cache (L1): 6x48.00 KiB (D), 6x32.00 KiB (I), 8x32.00 KiB (D), 8x64.00 KiB (I)
CPU Cache (L2): 6x1.25 MiB (U), 2x2.00 MiB (U)
CPU Cache (L3): 24.00 MiB (U)
CPU Usage: 0%
GPU 1: Intel Iris Xe Graphics @ 1.50 GHz [Integrated]
GPU 2: NVIDIA GeForce RTX 4070 Max-Q / Mobile [Discrete]
Memory: 4.48 GiB / 31.03 GiB (14%)
Swap: Disabled
Disk (/): 129.02 GiB / 934.94 GiB (14%) - ext4
Battery: 56% [Discharging]
DNS: 100.100.100.100
Wifi: SERHIENKO - WPA2 (54%)
Date & Time: 2024-08-19 22:13:10
Locale: en_CA.UTF-8
Vulkan: 1.3.280 - Intel open-source Mesa driver [Mesa 24.1.5]NVIDIA [560.31.02]
OpenGL: 4.6 (Compatibility Profile) Mesa 24.1.5
Bluetooth Radio (evo): Bluetooth 5.3 (Intel)
Sound: Raptor Lake-P/U/H cAVS Speaker (82%)
Camera 1: Integrated_Webcam_HD: Integrate - sRGB (1280x720 px)
Camera 2: Integrated_Webcam_HD: Integrate - sRGB (640x360 px)
Network IO (wlp0s20f3): 245 B/s (IN) - 253 B/s (OUT) *
Disk IO (PC801 NVMe SK hynix 1TB): 8.34 MiB/s (R) - 3.87 MiB/s (W)
@tepavcevic could you make sure you have the Vulkan validation layers installed? I wonder if we can get a concrete error when running with https://github.com/kvark/blade/pull/144. Alternatively, could you run it under gdb and get a call stack? That would help as well, albeit less.
@bashfulrobot that looks like the same issue. We are choosing the first GPU, which is Intel, and it fails to create a vulkan surface.
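To make the "first GPU fails to create a surface" failure mode concrete, here is a rough sketch of probing each device for presentation before committing to it, instead of taking the first one enumerated. This is an illustration only and not Blade's implementation: `ProbeError`, `first_presentable`, and the `probe` callback (standing in for "try to create a surface and swapchain on device i") are hypothetical names.

```rust
/// Hypothetical outcome of a failed presentation probe on one GPU.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ProbeError {
    SurfaceLost,
    InitializationFailed,
}

/// Return the index of the first device whose probe succeeds, together
/// with the errors from the devices that were tried and skipped before it.
/// `probe` is an assumed callback that attempts to present on device `i`.
pub fn first_presentable<F>(
    device_count: usize,
    mut probe: F,
) -> (Option<usize>, Vec<(usize, ProbeError)>)
where
    F: FnMut(usize) -> Result<(), ProbeError>,
{
    let mut failures = Vec::new();
    for index in 0..device_count {
        match probe(index) {
            Ok(()) => return (Some(index), failures),
            Err(error) => failures.push((index, error)),
        }
    }
    (None, failures)
}
```

Returning the skipped devices alongside the choice lets the caller log why a GPU was passed over, which is the kind of data being asked for in this thread.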
Ok, great, I'll assume an upcoming patch will resolve it. Appreciate your time.
We have ideas on how to improve/solve this, but we currently lack a good way to test them. If you can help, that would be great! You can start by checking out https://github.com/kvark/blade, running cargo run --example bunnymark, and seeing if it's able to present.
I'm also getting an NV+Intel desktop machine, which I hope will reproduce this.
Sorry for the late reply. I've just cloned blade and checked out the "intel present" branch. Once compilation finished, it reported a segmentation fault:
Finished `dev` profile [unoptimized + debuginfo] target(s) in 1m 05s
Running `target/debug/examples/bunnymark`
[1] 10876 segmentation fault (core dumped) cargo run --example bunnymark
edit: I've made sure to have vulkan libs installed
@tepavcevic great! Do you have Vulkan validation, too? Could you run the same thing under gdb to see the call stack and local variables? Also, please share the logs produced with RUST_LOG=blade_graphics=debug set in the environment.
I am not well versed in this kind of debugging but I hope this helps @kvark
logs produced with debug env flag:
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.37s
Running `target/debug/examples/bunnymark`
[2024-08-25T16:28:24Z INFO blade_graphics::hal::init] Enabling Vulkan Portability
[2024-08-25T16:28:24Z INFO blade_graphics::hal::init] Enabling color space support
[2024-08-25T16:28:24Z INFO blade_graphics::hal::init] Testing presentation capability on Linux/Intel
VUID-VkSwapchainCreateInfoKHR-presentMode-01281(ERROR / SPEC): msgNum: -1378015611 - Validation Error: [ VUID-VkSwapchainCreateInfoKHR-presentMode-01281 ] | MessageID = 0xaddd2685 | vkCreateSwapchainKHR(): pCreateInfo->presentMode (VK_PRESENT_MODE_IMMEDIATE_KHR) is not supported (the following are supported VK_PRESENT_MODE_MAILBOX_KHR VK_PRESENT_MODE_FIFO_KHR ). The Vulkan spec states: presentMode must be one of the VkPresentModeKHR values returned by vkGetPhysicalDeviceSurfacePresentModesKHR for the surface (https://vulkan.lunarg.com/doc/view/1.3.290.0/linux/1.3-extensions/vkspec.html#VUID-VkSwapchainCreateInfoKHR-presentMode-01281)
Objects: 0
[1] 13335 segmentation fault (core dumped) cargo run --example bunnymark
And for gdb I've gotten this output:
That's very helpful, thank you @tepavcevic ! I've just pushed an update to "intel present" branch (commit fee06c42f658b36dd9ac85444a9ee2a481383695), and it should work now. If you could check it on your side, that would be great!
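The validation error quoted above says VK_PRESENT_MODE_IMMEDIATE_KHR was requested while only MAILBOX and FIFO were supported. One common way to avoid that class of error is to intersect a preference list with the supported modes and fall back to FIFO, which the Vulkan spec requires every surface to support. The sketch below shows that idea with a stand-in `PresentMode` enum and a hypothetical `pick_present_mode` helper; it is not necessarily what the pushed commit does.

```rust
/// Stand-in for VkPresentModeKHR values (illustrative, not the ash type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PresentMode {
    Immediate,
    Mailbox,
    Fifo,
    FifoRelaxed,
}

/// Return the first preferred mode that the surface reports as supported.
/// Falls back to Fifo, which Vulkan guarantees to be available.
pub fn pick_present_mode(preferred: &[PresentMode], supported: &[PresentMode]) -> PresentMode {
    preferred
        .iter()
        .copied()
        .find(|mode| supported.contains(mode))
        .unwrap_or(PresentMode::Fifo)
}
```

With the modes from the log (`Mailbox` and `Fifo` supported), a preference of `[Immediate, Mailbox]` resolves to `Mailbox` rather than passing an unsupported mode to vkCreateSwapchainKHR.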
@kvark I've tested it now, was able to build and run it. It also produces a lot of logs, I saw three distinct messages:
If you need me to test something in the future on this setup, feel free to ping me.
Ok, sounds great, thank you! The validation error about "VkMemoryBarrier" I'll fix quickly, but overall this appears to work. @ConradIrwin would you be interested to give it a second shot for Zed users? The "intel-present" branch is updated on latest and appears to be working now.
In another thread @edwloef mentioned:
This is a bug with the Nvidia 560 drivers. Going back to the 555 drivers or running Zed with Xwayland, or using a compositor that doesn't support explicit sync, resolves the issue for me.
This thread isn't the same issue. As far as I can tell, the 560 issue is only the "Protocol error 0 on object wp_linux_drm_syncobj_manager_v1" spam, which can only occur on Wayland, and the original reporter isn't using Wayland.
@kvark it's early days, but so far on v0.152 the crashes seem similar. Is there something in particular I should check on?
Since release we've seen:
4
called `Result::unwrap()` on an `Err` value: ERROR_SURFACE_LOST_KHR | core::panicking::panic_fmt | c
core::panicking::panic_fmt
core::result::unwrap_failed
blade_graphics::hal::init::<impl blade_graphics::hal::Context>::resize
gpui::platform::blade::blade_renderer::BladeRenderer::new
gpui::platform::linux::wayland::window::WaylandWindow::new
<gpui::platform::linux::wayland::client::WaylandClient as gpui::platform::linux::platform::LinuxClient>::open_window
gpui::platform::linux::platform::<impl gpui::platform::Platform for P>::open_window
gpui::window::Window::new
workspace::Workspace::new_local::{{closure}}::{{closure}}
async_task::raw::RawTask<F,T,S,M>::run
<gpui::platform::linux::wayland::client::WaylandClient as gpui::platform::linux::platform::LinuxClient>::run
gpui::platform::linux::platform::<impl gpui::platform::Platform for P>::run
gpui::app::App::run
2
called `Result::unwrap()` on an `Err` value: ERROR_INITIALIZATION_FAILED | core::panicking::panic_fm
core::panicking::panic_fmt
core::result::unwrap_failed
blade_graphics::hal::init::<impl blade_graphics::hal::Context>::resize
gpui::platform::blade::blade_renderer::BladeRenderer::new
<gpui::platform::linux::x11::client::X11Client as gpui::platform::linux::platform::LinuxClient>::open_window
gpui::platform::linux::platform::<impl gpui::platform::Platform for P>::open_window
gpui::window::Window::new
workspace::Workspace::new_local::{{closure}}::{{closure}}
async_task::raw::RawTask<F,T,S,M>::run
<gpui::platform::linux::x11::client::X11Client as gpui::platform::linux::platform::LinuxClient>::run
gpui::platform::linux::platform::<impl gpui::platform::Platform for P>::run
gpui::app::App::run
zed::main
std::sys_common::backtrace::__rust_begin_short_backtrace
std::rt::lang_start::{{closure}}
std::rt::lang_start_internal
main
__libc_start_main
_start
2
called `Result::unwrap()` on an `Err` value: ERROR_SURFACE_LOST_KHR | core::panicking::panic_fmt | c
core::panicking::panic_fmt
core::result::unwrap_failed
blade_graphics::hal::init::<impl blade_graphics::hal::Context>::resize
gpui::platform::linux::wayland::window::WaylandWindowStatePtr::set_size_and_scale
<gpui::platform::linux::wayland::client::WaylandClientStatePtr as wayland_client::event_queue::Dispatch<wayland_protocols::wp::fractional_scale::v1::generated::client::wp_fractional_scale_v1::WpFractionalScaleV1,wayland_backend::sys::client::ObjectId>>::event
wayland_client::event_queue::queue_callback
<core::cell::RefCell<calloop::sources::DispatcherInner<S,F>> as calloop::sources::EventDispatcher<Data>>::process_events
<gpui::platform::linux::wayland::client::WaylandClient as gpui::platform::linux::platform::LinuxClient>::run
gpui::platform::linux::platform::<impl gpui::platform::Platform for P>::run
gpui::app::App::run
zed::main
std::sys_common::backtrace::__rust_begin_short_backtrace
std::rt::lang_start::{{closure}}
std::rt::lang_start_internal
main
__libc_start_call_main
__libc_start_main_alias_1
_start
2
Unexpected descriptor allocation error: ERROR_OUT_OF_DEVICE_MEMORY | core::panicking::panic_fmt | bl
core::panicking::panic_fmt
blade_graphics::hal::descriptor::<impl blade_graphics::hal::Device>::allocate_descriptor_set
gpui::platform::blade::blade_renderer::BladeRenderer::draw
<gpui::platform::linux::x11::window::X11Window as gpui::platform::PlatformWindow>::draw
gpui::window::Window::new::{{closure}}::{{closure}}
gpui::window::Window::new::{{closure}}
<core::cell::RefCell<calloop::sources::DispatcherInner<S,F>> as calloop::sources::EventDispatcher<Data>>::process_events
<gpui::platform::linux::x11::client::X11Client as gpui::platform::linux::platform::LinuxClient>::run
gpui::platform::linux::platform::<impl gpui::platform::Platform for P>::run
gpui::app::App::run
zed::main
std::sys_common::backtrace::__rust_begin_short_backtrace
std::rt::lang_start::{{closure}}
std::rt::lang_start_internal
main
__libc_start_main
_start
1
Aquire image error ERROR_SURFACE_LOST_KHR | core::panicking::panic_fmt | blade_graphics::hal::init::
core::panicking::panic_fmt
blade_graphics::hal::init::<impl blade_graphics::hal::Context>::acquire_frame
gpui::platform::blade::blade_renderer::BladeRenderer::draw
<gpui::platform::linux::x11::window::X11Window as gpui::platform::PlatformWindow>::draw
gpui::window::Window::new::{{closure}}::{{closure}}
gpui::window::Window::new::{{closure}}
<core::cell::RefCell<calloop::sources::DispatcherInner<S,F>> as calloop::sources::EventDispatcher<Data>>::process_events
<gpui::platform::linux::x11::client::X11Client as gpui::platform::linux::platform::LinuxClient>::run
gpui::platform::linux::platform::<impl gpui::platform::Platform for P>::run
gpui::app::App::run
zed::main
std::sys_common::backtrace::__rust_begin_short_backtrace
std::rt::lang_start::{{closure}}
std::rt::lang_start_internal
main
__libc_start_call_main
__libc_start_main_impl
_start
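Several of the traces above end in `Result::unwrap()` inside a resize call. As a rough illustration of the alternative, the sketch below returns the error to the caller and retries only when the error suggests the surface can be recreated. Everything here is hypothetical: `SurfaceError`, `Recovery`, `recovery_for`, `resize_with_retry`, and the error-to-recovery mapping are assumptions for the example, not Blade's or gpui's API or policy.

```rust
/// Stand-in for the Vulkan error codes seen in the crash reports.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SurfaceError {
    SurfaceLost,
    InitializationFailed,
    OutOfDeviceMemory,
}

/// What a caller might do after a failed resize (assumed policy).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Recovery {
    RecreateSurface,
    TryNextDevice,
    Abort,
}

/// Map an error to a recovery step. The mapping is an assumption.
pub fn recovery_for(error: SurfaceError) -> Recovery {
    match error {
        SurfaceError::SurfaceLost => Recovery::RecreateSurface,
        SurfaceError::InitializationFailed => Recovery::TryNextDevice,
        SurfaceError::OutOfDeviceMemory => Recovery::Abort,
    }
}

/// Call `resize` up to `max_attempts` times (at least once), retrying only
/// on errors that map to RecreateSurface. Returns the last error instead
/// of panicking, so the caller can log it or show a message.
pub fn resize_with_retry<F>(mut resize: F, max_attempts: u32) -> Result<(), SurfaceError>
where
    F: FnMut() -> Result<(), SurfaceError>,
{
    let mut attempts_left = max_attempts.max(1);
    loop {
        match resize() {
            Ok(()) => return Ok(()),
            Err(error) => {
                attempts_left -= 1;
                if attempts_left == 0 || recovery_for(error) != Recovery::RecreateSurface {
                    return Err(error);
                }
            }
        }
    }
}
```

In a real renderer the closure would recreate the surface before retrying; here it is left to the caller so the example stays self-contained.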
It's important to have the logs associated with these stack traces. In particular, I'd need to know:
On Void Linux, installing mesa-vulkan-intel fixes it for me (Intel CPU).
Check for existing issues
Describe the bug / provide steps to reproduce it
Zed instantly crashes on KDE Plasma. Message from zed --foreground (there's also an XIMClientError, but I assume it didn't cause the crash):
Environment
Operating System: Fedora Linux 38
KDE Plasma Version: 5.27.11
KDE Frameworks Version: 5.115.0
Qt Version: 5.15.12
Kernel Version: 6.8.9-100.fc38.x86_64 (64-bit)
Graphics Platform: X11
Graphics Processor: NVIDIA GeForce GTX 1660 Ti/PCIe/SSE2
If applicable, attach your ~/Library/Logs/Zed/Zed.log file to this issue.
There is no Zed.log found.