LizardByte / Sunshine

Self-hosted game stream host for Moonlight.
http://app.lizardbyte.dev/Sunshine/
GNU General Public License v3.0

Random segmentation fault on start #2944

Open Sidefix opened 1 month ago

Sidefix commented 1 month ago

Is there an existing issue for this?

Is your issue described in the documentation?

Is your issue present in the latest beta/pre-release?

This issue is present in the latest pre-release

Describe the Bug

Sunshine randomly segfaults on startup while reading the configuration file. I've only observed this on the second or a later startup after a machine reboot; the first startup after a fresh reboot never segfaults.

Expected Behavior

Sunshine should start up every time.

Additional Context

This has been occurring since before 0.21.0 (the first version I ever used). It has occurred across multiple versions of my OS (Ubuntu 22.10, Ubuntu 23.04, Ubuntu 23.10 and now Ubuntu 24.04) and across multiple versions of the graphics driver.

No configuration changes occur in between a failing startup and a successful startup.

The issue occurs randomly and not on most startups: roughly 3 out of 10 startups segfault.

Retrying the startup almost always works immediately. Only a handful of times has the segfault occurred on two consecutive startups.

Host Operating System

Linux

Operating System Version

24.04

Architecture

64 bit

Sunshine commit or version

v2024.730.191523

Package

Linux - deb

GPU Type

Nvidia

GPU Model

RTXA2000

GPU Driver/Mesa Version

560.28.03

Capture Method

NvFBC (Linux)

Config

log_path = /home/pi/sunshine/configs/sunshine.log
nv_preset = p1
origin_web_ui_allowed = pc
credentials_file = /home/pi/sunshine/configs/sunshine_state.json
nvenc_spatial_aq = enabled
file_apps = /home/pi/sunshine/configs/apps.json
resolutions = [
    1440x900
]
min_log_level = 1
file_state = /home/pi/sunshine/configs/sunshine_state.json
encoder = nvenc
nvenc_preset = 7
fps = [60]
nv_rc = vbr
capture = nvfbc
native_pen_touch = disabled
channels = 3
sunshine_name = Pi
global_prep_cmd = [{"do":"","undo":""}]
high_resolution_scrolling = disabled
nvenc_twopass = full_res

Apps

No response

Relevant log output

pi@pi:~/sunshine/configs$ sunshine sunshine.conf
[nvenc_twopass] -- [full_res]
[high_resolution_scrolling] -- [disabled]
[global_prep_cmd] -- [[{"do":"","undo":""}]]
[sunshine_name] -- [Pi]
[channels] -- [3]
[native_pen_touch] -- [disabled]
[capture] -- [nvfbc]
[log_path] -- [/home/pi/sunshine/configs/sunshine.log]
[nv_preset] -- [p1]
[min_log_level] -- [1]
[origin_web_ui_allowed] -- [pc]
[credentials_file] -- [/home/pi/sunshine/configs/sunshine_state.json]
[nvenc_spatial_aq] -- [enabled]
[file_apps] -- [/home/pi/sunshine/configs/apps.json]
[resolutions] -- [[
    1440x900
]]
[file_state] -- [/home/pi/sunshine/configs/sunshine_state.json]
[encoder] -- [nvenc]
[nvenc_preset] -- [7]
[fps] -- [[60]]
[nv_rc] -- [vbr]
Warning: Unrecognized configurable option [nv_preset]
Warning: Unrecognized configurable option [nv_rc]
[2024:08:02:12:24:13]: Info: Sunshine version: v2024.730.191523
Segmentation fault (core dumped)

Sidefix commented 1 month ago

Some additional context: I do end up killing the sunshine process with some regularity for testing a separate issue I have: #2614

ReenigneArcher commented 1 month ago

> Some additional context: I do end up killing the sunshine process with some regularity

I think the config file is locked by the other process and when you kill it, it doesn't release immediately?

Sidefix commented 1 month ago

@ReenigneArcher I've had it occur even with a long break between killing the original process and starting a new instance. I don't do it programmatically; I do it manually.

If I do end up killing the original process, I usually do so from the terminal with kill -9, or with Ctrl+C in the window it is running in.

That is, I don't think the previous process still holds anything that could cause this by the time I restart. I'll keep that in mind in my repros though.

Edit: sorry if that's a bit confusing. What I'm effectively saying is that I kill the process in a way that shouldn't leave the file locked anyway, and I usually leave enough time between killing the old instance and starting the new one that a lingering lock doesn't really make sense to me.
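
To illustrate the point about locks, here is a minimal C++ sketch, assuming the lock in question would be an advisory flock() lock on Linux. Whether Sunshine locks its config file at all is not established in this thread, and the function name lock_released_after_kill and the temporary file path are hypothetical. The sketch forks a child that takes an exclusive lock, kills it with SIGKILL (the same signal kill -9 sends), and checks that the parent can take the lock once the child has been reaped.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <cstdlib>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical demo, not Sunshine code. Returns true if a flock() lock held
// by a child process is free again once that child was killed with SIGKILL.
bool lock_released_after_kill() {
  char path[] = "/tmp/lock_demo_XXXXXX";  // hypothetical scratch file
  int parent_fd = mkstemp(path);
  if (parent_fd < 0) return false;

  int ready[2];
  if (pipe(ready) != 0) {
    close(parent_fd);
    unlink(path);
    return false;
  }

  pid_t child = fork();
  if (child < 0) {
    close(ready[0]);
    close(ready[1]);
    close(parent_fd);
    unlink(path);
    return false;
  }

  if (child == 0) {
    // Child: open the file separately, lock it, report, then wait to be killed.
    close(ready[0]);
    int fd = open(path, O_RDWR);
    char status = (fd >= 0 && flock(fd, LOCK_EX) == 0) ? 'L' : 'E';
    if (write(ready[1], &status, 1) != 1) _exit(1);
    for (;;) pause();
  }

  // Parent: wait until the child reports that it holds the lock.
  close(ready[1]);
  char status = 0;
  bool child_locked = read(ready[0], &status, 1) == 1 && status == 'L';
  close(ready[0]);

  // While the child is alive, a non-blocking lock attempt should be refused.
  bool blocked = child_locked &&
                 flock(parent_fd, LOCK_EX | LOCK_NB) == -1 &&
                 errno == EWOULDBLOCK;

  // Equivalent of `kill -9`, followed by reaping the child.
  kill(child, SIGKILL);
  waitpid(child, nullptr, 0);

  // The kernel closed the child's descriptors, so the lock should be free.
  bool free_again = flock(parent_fd, LOCK_EX | LOCK_NB) == 0;

  close(parent_fd);
  unlink(path);
  return blocked && free_again;
}
```

Calling lock_released_after_kill() from a small main() and checking that it returns true would support the expectation that a killed process does not keep this kind of lock. It says nothing about other kinds of shared state.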

cgutman commented 1 month ago

Please try to run under gdb and get a backtrace of the crash.

Sidefix commented 1 month ago

@cgutman here you go. I hope I did this correctly; I've never used gdb before:

Thread 4 "sunshine" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x731cec800000 (LWP 80512)]
__GI_getenv (name=0x731cee1ae017 "XAUTHORITY") at ./stdlib/getenv.c:31
warning: 31 ./stdlib/getenv.c: No such file or directory
(gdb) backtrace
#0  __GI_getenv (name=0x731cee1ae017 "XAUTHORITY") at ./stdlib/getenv.c:31
#1  0x0000731cee1ac6fb in XauFileName () from /lib/x86_64-linux-gnu/libXau.so.6
#2  0x0000731cee1acd56 in XauGetBestAuthByAddr ()
   from /lib/x86_64-linux-gnu/libXau.so.6
#3  0x0000731cee728d1d in xcb_connect_to_display_with_auth_info ()
   from /lib/x86_64-linux-gnu/libxcb.so.1
#4  0x0000731cf0f223ca in _XConnectXCB () from /lib/x86_64-linux-gnu/libX11.so.6
#5  0x0000731cf0f130fe in XOpenDisplay () from /lib/x86_64-linux-gnu/libX11.so.6
#6  0x0000731cf1776ebc in ?? () from /lib/x86_64-linux-gnu/libgdk-3.so.0
#7  0x0000731cf1721397 in gdk_display_manager_open_display ()
   from /lib/x86_64-linux-gnu/libgdk-3.so.0
#8  0x0000731cf2e0238a in gtk_init_check () from /lib/x86_64-linux-gnu/libgtk-3.so.0
#9  0x00005dcaf3d54b87 in ?? ()
#10 0x00005dcaf3d03ece in ?? ()
#11 0x0000731cf1ceabb4 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#12 0x0000731cf189ca94 in start_thread (arg=<optimized out>)
    at ./nptl/pthread_create.c:447
#13 0x0000731cf1929c3c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
(gdb)
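
The backtrace shows the fault inside getenv("XAUTHORITY"), reached from gtk_init_check() on a non-main thread (Thread 4). One possible explanation, which is an assumption and not confirmed anywhere in this thread, is that another thread modifies the environment at the same time: POSIX does not require getenv() to be safe against concurrent setenv()/putenv() calls. A minimal C++ sketch of the usual mitigation follows. It finishes all environment writes on the main thread before any worker starts, so that workers only read. The names prepare_environment, read_env, run_workers, and DEMO_TRAY_VAR are hypothetical and are not taken from Sunshine.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: apply every environment change on the main thread,
// before any worker thread exists.
void prepare_environment() {
  setenv("DEMO_TRAY_VAR", "ready", 1);  // hypothetical variable name
}

// Stand-in for a library call such as gtk_init_check(), which reads
// variables like XAUTHORITY through getenv() internally.
std::string read_env(const char *name) {
  const char *value = std::getenv(name);
  return value ? std::string(value) : std::string();
}

// Starts `count` workers that only read the environment and collects
// what each of them saw.
std::vector<std::string> run_workers(int count) {
  prepare_environment();  // all writes are done before any thread is created

  std::vector<std::string> results(count);
  std::vector<std::thread> workers;
  for (int i = 0; i < count; ++i) {
    // Each worker writes to its own slot, so no extra locking is needed.
    workers.emplace_back([&results, i] { results[i] = read_env("DEMO_TRAY_VAR"); });
  }
  for (auto &worker : workers) worker.join();
  return results;
}
```

Concurrent getenv() calls with no writer are safe, which is why ordering the writes before thread creation is enough here. A mutex around the application's own setenv() calls would not help in the situation from the backtrace, because the getenv() call happens inside libXau and cannot take that mutex.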