setzer22 / blackjack

A procedural, node-based modelling tool, made in rust 🦀
Mozilla Public License 2.0

Use default GPU / Segfault when starting #76

Closed edgarogh closed 1 year ago

edgarogh commented 1 year ago

Hello,

I have a Linux Mint 20.3 laptop (from 2014) with the following integrated & discrete GPUs:

Graphics:  Device-1: Intel 4th Gen Core Processor Integrated Graphics vendor: Micro-Star MSI 
           driver: i915 v: kernel bus ID: 00:02.0 chip ID: 8086:0416 
           Device-2: NVIDIA GK106M [GeForce GTX 765M] vendor: Micro-Star MSI driver: nvidia 
           v: 390.157 bus ID: 01:00.0 chip ID: 10de:11e2 
           Display: x11 server: X.Org 1.20.13 driver: modesetting,nvidia 
           unloaded: fbdev,nouveau,vesa resolution: 1920x1080~60Hz 
           OpenGL: renderer: GeForce GTX 765M/PCIe/SSE2 v: 4.6.0 NVIDIA 390.157 direct render: Yes 

The nvidia discrete GPU is clearly the default one, as confirmed by the nvidia prime control panel and by all games starting with it. I also just checked on the Linux Mint forums, and it looks like nothing is wrong with my OS setup or GPU config.

However, when starting blackjack, it segfaults immediately with the following error: MESA-INTEL: warning: Haswell Vulkan support is incomplete. Haswell is an Intel thing, so I'm guessing blackjack starts using my CPU's integrated graphics.

But the weirdest thing about that is that the backtrace (from gdb) mentions an nvidia shared object in the last stack frames before the crash:

gdb log

```
(gdb) run
Starting program: ./target/debug/blackjack_ui
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
MESA-INTEL: warning: Haswell Vulkan support is incomplete
[New Thread 0x7fffea928700 (LWP 5108)]

Thread 1 "blackjack_ui" received signal SIGSEGV, Segmentation fault.
0x00007ffff3cfadf9 in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.390.157
(gdb) bt
#0  0x00007ffff3cfadf9 in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.390.157
#1  0x00007ffff3cfbc45 in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.390.157
#2  0x00007ffff3d004df in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.390.157
#3  0x00007ffff3da4cd9 in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.390.157
#4  0x00007ffff3daee1d in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.390.157
#5  0x00007ffff3daefef in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.390.157
#6  0x00007ffff3daf1b2 in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.390.157
#7  0x00007ffff3daf34a in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.390.157
#8  0x00007ffff3cafaf7 in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.390.157
#9  0x00007ffff3cb03c7 in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.390.157
#10 0x00007ffff47e12a6 in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.390.157
#11 0x00007ffff47f3b61 in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.390.157
#12 0x00007ffff47f62fa in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.390.157
#13 0x00007ffff4d82627 in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.390.157
#14 0x00007ffff4d82a08 in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.390.157
#15 0x00007ffff4d83bdd in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.390.157
#16 0x00007ffff4d83c91 in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.390.157
#17 0x00007ffff4d87431 in ?? () from /lib/x86_64-linux-gnu/libnvidia-glcore.so.390.157
#18 0x0000555556978038 in ash::device::Device::create_graphics_pipelines (self=0x555557823ff0, pipeline_cache=..., create_infos=..., allocation_callbacks=...) at src/device.rs:1963
#19 0x0000555556920a9a in wgpu_hal::vulkan::device:: for wgpu_hal::vulkan::Device>::create_render_pipeline (self=0x555557cfd750, desc=0x7fffffff3ca8) at src/vulkan/device.rs:1660
#20 0x000055555680d05d in wgpu_core::device::Device::create_render_pipeline (self=0x555557cfd750, self_id=..., adapter=, desc=0x7fffffff4e30, implicit_context=..., hub=, token=) at /home/edgar/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-core-0.13.2/src/device/mod.rs:2839
#21 0x00005555568993f3 in wgpu_core::device::>::device_create_render_pipeline (self=0x55555780ac60, device_id=..., desc=0x0, id_in=..., implicit_pipeline_ids=...) at /home/edgar/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-core-0.13.2/src/device/mod.rs:4704
#22 0x000055555682e701 in ::device_create_render_pipeline (self=0x55555780ac60, device=0x555557827fe8, desc=) at src/backend/direct.rs:1389
#23 0x0000555556796d3a in wgpu::Device::create_render_pipeline (self=, desc=0x7fffffff1500) at src/lib.rs:2110
#24 0x0000555556644a4b in rend3_routine::depth::create_depth_inner (renderer=0x555557ebc500, samples=rend3_types::SampleCount::One, ty=rend3_routine::depth::DepthPassType::Shadow, pll=0x7fffffff5408, vert=0x7fffffff5380, frag=0x7fffffff53e8, name=..., unclipped_depth_supported=) at src/depth.rs:474
#25 rend3_routine::depth::DepthPipelines::new::{{closure}} (name=..., ty=rend3_routine::depth::DepthPassType::Shadow, pll=0x7fffffff5408, frag=0x7fffffff53e8, samples=rend3_types::SampleCount::One) at src/depth.rs:389
#26 rend3_routine::depth::DepthPipelines::new (renderer=0x555557ebc500, data_core=, interfaces=, per_material_bgl=0x7fffffff5540, abi_bgl=, unclipped_depth_supported=false) at src/depth.rs:410
#27 rend3_routine::depth::DepthRoutine::new (renderer=0x555557ebc500, data_core=, interfaces=, per_material=0x7fffffff5540, unclipped_depth_supported=false) at src/depth.rs:151
#28 0x000055555666c8b7 in rend3_routine::pbr::routine::PbrRoutine::new (renderer=0x555557ebc500, data_core=0x555557ebc510, interfaces=0x7fffffff5cd8) at src/pbr/routine.rs:33
#29 0x000055555590c1ac in blackjack_ui::render_context::RenderContext::new (window=0x7fffffff66f8) at blackjack_ui/src/render_context.rs:82
#30 0x0000555555843b21 in blackjack_ui::app_window::AppWindow::new () at blackjack_ui/src/app_window.rs:42
#31 0x00005555557d25eb in blackjack_ui::main () at blackjack_ui/src/main.rs:52
```

I'm sorry if this is an XY problem. I have no idea whether the issue has to do with Haswell or with nvidia, since both are mentioned before the crash. In the first case, is it normal that my integrated GPU is used when it isn't the default? In the second case, how can I help you diagnose the crash?


setzer22 commented 1 year ago

Thanks for reporting! AFAIK blackjack should already be selecting the preferred GPU. The detection logic is based on rend3's create_iad function: https://docs.rs/rend3/latest/rend3/fn.create_iad.html

But this is something I haven't done a lot of testing with, so there might be an issue in that logic. It's been a while since I last used an Nvidia Optimus system, but I have an old laptop lying around that I should be able to use to debug this :smile:

edgarogh commented 1 year ago

After a bit of reflection, I also came up with the hypothesis that "MESA-INTEL: warning: Haswell Vulkan support is incomplete" is just printed while the GPUs are being iterated, but that ultimately it's the nvidia GPU that is chosen and that causes the error. That would probably imply that the segfault I caught in GDB is an nvidia driver error that blackjack has nothing to do with.
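To make that hypothesis concrete, here is a minimal sketch of what "warn while iterating, then pick the discrete GPU" could look like. It is a toy model only: `Adapter`, `DeviceType`, `probe_warning`, `example_adapters` and `select_adapter` are hypothetical names invented for illustration, and none of this is rend3's or wgpu's actual API or selection logic.

```rust
// Toy model of the hypothesis; NOT the real rend3/wgpu code.
// All types and functions here are hypothetical.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DeviceType {
    Integrated,
    Discrete,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Adapter {
    name: &'static str,
    device_type: DeviceType,
    // Message a driver might print as soon as it is loaded and queried,
    // whether or not the adapter ends up being selected.
    probe_warning: Option<&'static str>,
}

/// The two GPUs from the report, as assumed inputs to the model.
fn example_adapters() -> Vec<Adapter> {
    vec![
        Adapter {
            name: "Intel 4th Gen Core Processor Integrated Graphics",
            device_type: DeviceType::Integrated,
            probe_warning: Some("MESA-INTEL: warning: Haswell Vulkan support is incomplete"),
        },
        Adapter {
            name: "GeForce GTX 765M",
            device_type: DeviceType::Discrete,
            probe_warning: None,
        },
    ]
}

/// Probes every adapter (collecting any driver warnings), then prefers a
/// discrete GPU and falls back to the first adapter if there is none.
fn select_adapter(adapters: &[Adapter]) -> (Option<&Adapter>, Vec<&'static str>) {
    let warnings: Vec<&'static str> = adapters
        .iter()
        .filter_map(|a| a.probe_warning)
        .collect();
    let chosen = adapters
        .iter()
        .find(|a| a.device_type == DeviceType::Discrete)
        .or_else(|| adapters.first());
    (chosen, warnings)
}

fn main() {
    let adapters = example_adapters();
    let (chosen, warnings) = select_adapter(&adapters);
    for w in &warnings {
        eprintln!("{w}");
    }
    println!("selected: {:?}", chosen.map(|a| a.name));
}
```

In this model the Intel warning shows up even though the nvidia adapter is the one selected, which would be consistent with the crash happening later inside the nvidia driver.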

Again, if I can do anything to prove/disprove any of these claims, I'd gladly help. I could try to narrow the problem down to a single crate (it's likely that blackjack itself isn't the root cause) but this would take time; I'll do it when I'm bored.

edgarogh commented 1 year ago

I don't know what changed, but it looks like I'm able to launch the latest version. It's a bit unstable (i.e. it crashes sometimes), but at least the GPU issue seems to be fixed. I thought about this because I now have the same issue with inlyne that I had here. We'll get there! It's probably just a matter of updating deps on their side.

edgarogh commented 1 year ago

Also, I should add that by following this comment, I managed to get rid of the Haswell warning. It definitely looks like the warning was just printed while iterating over GPUs, as the fix I linked supposedly forces the discrete GPU to be the only one that can be enumerated.