LizardByte / Sunshine

Self-hosted game stream host for Moonlight.
http://app.lizardbyte.dev/Sunshine/
GNU General Public License v3.0
19.59k stars 951 forks source link

[NvFBC] Sunshine fails to load video session after 4th attempt. Error 503/-1 #2974

Closed Sidefix closed 2 months ago

Sidefix commented 3 months ago

Is there an existing issue for this?

Is your issue described in the documentation?

Is your issue present in the latest beta/pre-release?

This issue is present in the latest pre-release

Describe the Bug

Almost with 100% reliability, sessions will stop working after the 4th.

For example, I initiate a new session by starting a game (or resuming on an already running game). Then if I close the session (CTRL+ALT+SHIFT+Q also works, you don't need to KILL the session) and resume/start a new one, repeat up to 4 times, the 4th time will invariably fail to start.

The error will be either a 503 or a -1 error on the client.

Once the issue occurs one, it will only be fixed by restarting the Sunshine process. Any number of reconnections will persist the issue.

Expected Behavior

Sunshine should be able to start/resume an indefinite amount of sessions

(This is NOT related to parallel sessions enabled by the Channels setting; this is strictly single sessions)

Additional Context

The clients are latest Moonlight. Occurs for both Android and PC. Issue occurs with any app running. I believe, based on the logs, that the issue is somehow related to starting the video output.

Sunshine host is up to date Ubuntu 24.04 with latest Nvidia drivers.

This has been occurring since before 0.21.0 (this was the first version I ever used). This has occurred across multiple versions of my OS (Ubuntu 22.10, Ubuntu 23.04, Ubuntu 23.10 and now Ubuntu 24.04). This has occurred across multiple versions of graphic drivers.

Host Operating System

Linux

Operating System Version

24.04

Architecture

64 bit

Sunshine commit or version

v2024.805.184417

Package

Linux - deb

GPU Type

Nvidia

GPU Model

RTXA2000

GPU Driver/Mesa Version

560.31.02

Capture Method

NvFBC (Linux)

Config

log_path = /home/pi/sunshine/configs/sunshine.log
nv_preset = p1
origin_web_ui_allowed = pc
credentials_file = /home/pi/sunshine/configs/sunshine_state.json
nvenc_spatial_aq = enabled
file_apps = /home/pi/sunshine/configs/apps.json
resolutions = [
    1440x900
]
min_log_level = 1
file_state = /home/pi/sunshine/configs/sunshine_state.json
encoder = nvenc
nvenc_preset = 7
fps = [60]
nv_rc = vbr
capture = nvfbc
native_pen_touch = disabled
channels = 3
sunshine_name = Pi
global_prep_cmd = [{"do":"","undo":""}]
high_resolution_scrolling = disabled
nvenc_twopass = full_res

Apps

No response

Relevant log output

[2024:08:07:10:13:21]: Debug: Initialized new control stream session by connect data match [v2]
[2024:08:07:10:13:21]: Debug: Control local address [192.168.0.107]
[2024:08:07:10:13:21]: Debug: Control peer address [192.168.0.215:27096]
[2024:08:07:10:13:21]: Info: CLIENT CONNECTED
[2024:08:07:10:13:21]: Debug: type [IDX_REQUEST_IDR_FRAME]
[2024:08:07:10:13:21]: Debug: type [IDX_START_B]
[2024:08:07:10:13:21]: Debug: System tray loop
[2024:08:07:10:13:21]: Debug: RAISE: 192.168.0.215:55402 :: VIDEO
[2024:08:07:10:13:21]: Debug: Received ping [v2] from 192.168.0.215:55402 [0100000042344534374444323846323841453332]
[2024:08:07:10:13:21]: Debug: Start capturing Video
[2024:08:07:10:13:21]: Debug: System tray loop
[2024:08:07:10:13:21]: Debug: --begin relative mouse move packet--
deltaX [1]
deltaY [1]
--end relative mouse move packet--
[2024:08:07:10:13:21]: Debug: --begin relative mouse move packet--
deltaX [-1]
deltaY [-1]
--end relative mouse move packet--
[2024:08:07:10:13:21]: Info: Found [1] outputs
[2024:08:07:10:13:21]: Info: Virtual Desktop: 1440x900
[2024:08:07:10:13:21]: Info: XrandR: available
[2024:08:07:10:13:21]: Info: -- Output --
[2024:08:07:10:13:21]: Debug:   ID: 540
[2024:08:07:10:13:21]: Debug:   Name: DP-0
[2024:08:07:10:13:21]: Info:   Resolution: 1440x900
[2024:08:07:10:13:21]: Info:   Offset: 0x0
[2024:08:07:10:13:21]: Info: Screencasting with NvFBC
[2024:08:07:10:13:21]: Info: SDR color coding [Rec. 601]
[2024:08:07:10:13:21]: Info: Color depth: 8-bit
[2024:08:07:10:13:21]: Info: Color range: [MPEG]
[2024:08:07:10:13:21]: Debug: [hevc_nvenc @ 0x71f71c043400] Loaded Nvenc version 12.2
[2024:08:07:10:13:21]: Debug: [hevc_nvenc @ 0x71f71c043400] Nvenc initialized successfully
[2024:08:07:10:13:21]: Debug: Sent HDR mode: false
[2024:08:07:10:13:21]: Debug: RAISE: 192.168.0.215:55400 :: AUDIO
[2024:08:07:10:13:21]: Debug: Received ping [v2] from 192.168.0.215:55400 [0200000042344534374444323846323841453332]
[2024:08:07:10:13:21]: Debug: Start capturing Audio
[2024:08:07:10:13:21]: Debug: Connecting to pulseaudio
[2024:08:07:10:13:21]: Debug: null-sink args: rate=48000 sink_name=sink-sunshine-stereo format=float channels=2 channel_map=front-left,front-right sink_properties=device.description=sink-sunshine-stereo
[2024:08:07:10:13:21]: Debug: null-sink args: rate=48000 sink_name=sink-sunshine-surround51 format=float channels=6 channel_map=front-left,front-right,front-center,lfe,rear-left,rear-right sink_properties=device.description=sink-sunshine-surround51
[2024:08:07:10:13:21]: Debug: null-sink args: rate=48000 sink_name=sink-sunshine-surround71 format=float channels=8 channel_map=front-left,front-right,front-center,lfe,rear-left,rear-right,side-left,side-right sink_properties=device.description=sink-sunshine-surround71
[2024:08:07:10:13:21]: Info: Setting default sink to: [sink-sunshine-stereo]
[2024:08:07:10:13:21]: Info: Found default monitor by name: sink-sunshine-stereo.monitor
[2024:08:07:10:13:21]: Info: Opus initialized: 48 kHz, 2 channels, 512 kbps (total), LOWDELAY
[2024:08:07:10:13:21]: Debug: Minimum frame time set to 100ms, based on min fps factor of 1.
[2024:08:07:10:13:21]: Debug: Frame 1: IDR Keyframe (AV_FRAME_FLAG_KEY)
[2024:08:07:10:13:21]: Debug: type [IDX_REQUEST_IDR_FRAME]
[2024:08:07:10:13:21]: Debug: Frame 4: IDR Keyframe (AV_FRAME_FLAG_KEY)
[2024:08:07:10:13:28]: Debug: --begin mouse button packet--
action [00000008]
button [01]
--end mouse button packet--
[2024:08:07:10:13:28]: Debug: --begin mouse button packet--
action [00000009]
button [01]
--end mouse button packet--
[2024:08:07:10:13:29]: Debug: --begin relative mouse move packet--
deltaX [5]
deltaY [2]
--end relative mouse move packet--
[2024:08:07:10:13:30]: Debug: --begin mouse button packet--
action [00000009]
button [01]
--end mouse button packet--
[2024:08:07:10:13:30]: Info: App exited with code [0]
[2024:08:07:10:13:30]: Debug: System tray loop
[2024:08:07:10:13:30]: Debug: System tray loop
[2024:08:07:10:13:30]: Info: Process terminated
[2024:08:07:10:13:30]: Debug: System tray loop
[2024:08:07:10:13:30]: Debug: System tray loop
[2024:08:07:10:13:30]: Debug: System tray loop
[2024:08:07:10:13:30]: Info: Setting default sink to: [auto_null]
[2024:08:07:10:13:30]: Error: Couldn't set default-sink [auto_null]: No such entity
[2024:08:07:10:13:30]: Debug: Pulseadio context terminated
[2024:08:07:10:13:30]: Debug: System tray loop
[2024:08:07:10:13:30]: Debug: System tray loop
[2024:08:07:10:13:31]: Debug: [hevc_nvenc @ 0x71f71c043400] Nvenc unloaded
[2024:08:07:10:13:31]: Error: Couldn't release NvFBC context from current thread: 
[2024:08:07:10:13:31]: Debug: Waiting for video to end...
[2024:08:07:10:13:31]: Debug: Waiting for audio to end...
[2024:08:07:10:13:31]: Debug: Waiting for control to end...
[2024:08:07:10:13:31]: Debug: Resetting Input...
[2024:08:07:10:13:31]: Debug: Session ended
[2024:08:07:10:13:31]: Debug: Waiting for main listening thread to end...
[2024:08:07:10:13:31]: Debug: Waiting for main video thread to end...
[2024:08:07:10:13:31]: Debug: Waiting for main audio thread to end...
[2024:08:07:10:13:31]: Debug: Waiting for main control thread to end...
[2024:08:07:10:13:31]: Debug: All broadcasting threads ended
[2024:08:07:10:13:31]: Debug: System tray loop
[2024:08:07:10:13:31]: Debug: System tray loop
[2024:08:07:10:13:31]: Debug: System tray loop
[2024:08:07:10:13:31]: Debug: System tray loop
[2024:08:07:10:13:31]: Debug: System tray loop
[2024:08:07:10:13:31]: Debug: TUNNEL :: NONE
[2024:08:07:10:13:31]: Debug: METHOD :: GET
[2024:08:07:10:13:31]: Debug: DESTINATION :: /serverinfo
[2024:08:07:10:13:31]: Debug: User-Agent -- Mozilla/5.0
[2024:08:07:10:13:31]: Debug: Accept-Language -- en-150,*
[2024:08:07:10:13:31]: Debug: Accept-Encoding -- gzip, deflate
[2024:08:07:10:13:31]: Debug: Connection -- Keep-Alive
[2024:08:07:10:13:31]: Debug: Host -- 192.168.0.107:1612
[2024:08:07:10:13:31]: Debug:  [--] 
[2024:08:07:10:13:31]: Debug: uuid -- 47727e6176b344a4895237a5db3bb83d
[2024:08:07:10:13:31]: Debug: uniqueid -- 0123456789ABCDEF
[2024:08:07:10:13:31]: Debug:  [--] 
[2024:08:07:10:13:31]: Debug: /CN=NVIDIA GameStream Client -- verified
[2024:08:07:10:13:31]: Debug: TUNNEL :: HTTPS
[2024:08:07:10:13:31]: Debug: METHOD :: GET
[2024:08:07:10:13:31]: Debug: DESTINATION :: /serverinfo
[2024:08:07:10:13:31]: Debug: User-Agent -- Mozilla/5.0
[2024:08:07:10:13:31]: Debug: Accept-Language -- en-150,*
[2024:08:07:10:13:31]: Debug: Accept-Encoding -- gzip, deflate
[2024:08:07:10:13:31]: Debug: Connection -- Keep-Alive
[2024:08:07:10:13:31]: Debug: Host -- 192.168.0.107:47984
[2024:08:07:10:13:31]: Debug:  [--] 
[2024:08:07:10:13:31]: Debug: uuid -- ad4558ee33574d0799ea1660b2a8de45
[2024:08:07:10:13:31]: Debug: uniqueid -- 0123456789ABCDEF
[2024:08:07:10:13:31]: Debug:  [--] 
[2024:08:07:10:13:31]: Debug: /CN=NVIDIA GameStream Client -- verified
[2024:08:07:10:13:31]: Debug: TUNNEL :: HTTPS
[2024:08:07:10:13:31]: Debug: METHOD :: GET
[2024:08:07:10:13:31]: Debug: DESTINATION :: /applist
[2024:08:07:10:13:31]: Debug: User-Agent -- Mozilla/5.0
[2024:08:07:10:13:31]: Debug: Accept-Language -- en-150,*
[2024:08:07:10:13:31]: Debug: Accept-Encoding -- gzip, deflate
[2024:08:07:10:13:31]: Debug: Connection -- Keep-Alive
[2024:08:07:10:13:31]: Debug: Host -- 192.168.0.107:47984
[2024:08:07:10:13:31]: Debug:  [--] 
[2024:08:07:10:13:31]: Debug: uuid -- 2f2c4a88d53e4524862daa1db7412d84
[2024:08:07:10:13:31]: Debug: uniqueid -- 0123456789ABCDEF
[2024:08:07:10:13:31]: Debug:  [--] 
[2024:08:07:10:13:34]: Debug: TUNNEL :: NONE
[2024:08:07:10:13:34]: Debug: METHOD :: GET
[2024:08:07:10:13:34]: Debug: DESTINATION :: /serverinfo
[2024:08:07:10:13:34]: Debug: User-Agent -- Mozilla/5.0
[2024:08:07:10:13:34]: Debug: Accept-Language -- en-150,*
[2024:08:07:10:13:34]: Debug: Accept-Encoding -- gzip, deflate
[2024:08:07:10:13:34]: Debug: Connection -- Keep-Alive
[2024:08:07:10:13:34]: Debug: Host -- 192.168.0.107:1612
[2024:08:07:10:13:34]: Debug:  [--] 
[2024:08:07:10:13:34]: Debug: uuid -- 0acabca533a24409ad0a2cd3555744ad
[2024:08:07:10:13:34]: Debug: uniqueid -- 0123456789ABCDEF
[2024:08:07:10:13:34]: Debug:  [--] 
[2024:08:07:10:13:34]: Debug: /CN=NVIDIA GameStream Client -- verified
[2024:08:07:10:13:34]: Debug: TUNNEL :: HTTPS
[2024:08:07:10:13:34]: Debug: METHOD :: GET
[2024:08:07:10:13:34]: Debug: DESTINATION :: /serverinfo
[2024:08:07:10:13:34]: Debug: User-Agent -- Mozilla/5.0
[2024:08:07:10:13:34]: Debug: Accept-Language -- en-150,*
[2024:08:07:10:13:34]: Debug: Accept-Encoding -- gzip, deflate
[2024:08:07:10:13:34]: Debug: Connection -- Keep-Alive
[2024:08:07:10:13:34]: Debug: Host -- 192.168.0.107:47984
[2024:08:07:10:13:34]: Debug:  [--] 
[2024:08:07:10:13:34]: Debug: uuid -- 4ebdf66349684cb9957464d5731f82e0
[2024:08:07:10:13:34]: Debug: uniqueid -- 0123456789ABCDEF
[2024:08:07:10:13:34]: Debug:  [--] 
[2024:08:07:10:13:37]: Debug: /CN=NVIDIA GameStream Client -- verified
[2024:08:07:10:13:37]: Debug: TUNNEL :: HTTPS
[2024:08:07:10:13:37]: Debug: METHOD :: GET
[2024:08:07:10:13:37]: Debug: DESTINATION :: /launch
[2024:08:07:10:13:37]: Debug: User-Agent -- Mozilla/5.0
[2024:08:07:10:13:37]: Debug: Accept-Language -- en-150,*
[2024:08:07:10:13:37]: Debug: Accept-Encoding -- gzip, deflate
[2024:08:07:10:13:37]: Debug: Connection -- Keep-Alive
[2024:08:07:10:13:37]: Debug: Host -- 192.168.0.107:47984
[2024:08:07:10:13:37]: Debug:  [--] 
[2024:08:07:10:13:37]: Debug: uniqueid -- 0123456789ABCDEF
[2024:08:07:10:13:37]: Debug: additionalStates -- 1
[2024:08:07:10:13:37]: Debug: uuid -- bf823982037d43be9e50fde00a5fda99
[2024:08:07:10:13:37]: Debug: appid -- 348528904
[2024:08:07:10:13:37]: Debug: mode -- 3240x2160x60
[2024:08:07:10:13:37]: Debug: sops -- 0
[2024:08:07:10:13:37]: Debug: rikey -- db66eaf5429a2b2ad961fd3f4cf2fb46
[2024:08:07:10:13:37]: Debug: surroundAudioInfo -- 196610
[2024:08:07:10:13:37]: Debug: rikeyid -- -1843073448
[2024:08:07:10:13:37]: Debug: remoteControllersBitmap -- 0
[2024:08:07:10:13:37]: Debug: corever -- 1
[2024:08:07:10:13:37]: Debug: gcmap -- 0
[2024:08:07:10:13:37]: Debug: localAudioPlayMode -- 0
[2024:08:07:10:13:37]: Debug: gcpersist -- 0
[2024:08:07:10:13:37]: Debug:  [--] 
[2024:08:07:10:13:37]: Info: Trying encoder [nvenc]
[2024:08:07:10:13:37]: Info: Screencasting with NvFBC
[2024:08:07:10:13:38]: Debug: TUNNEL :: NONE
[2024:08:07:10:13:38]: Debug: METHOD :: GET
[2024:08:07:10:13:38]: Debug: DESTINATION :: /serverinfo
[2024:08:07:10:13:38]: Debug: User-Agent -- Mozilla/5.0
[2024:08:07:10:13:38]: Debug: Accept-Language -- en-150,*
[2024:08:07:10:13:38]: Debug: Accept-Encoding -- gzip, deflate
[2024:08:07:10:13:38]: Debug: Connection -- Keep-Alive
[2024:08:07:10:13:38]: Debug: Host -- 192.168.0.107:1612
[2024:08:07:10:13:38]: Debug:  [--] 
[2024:08:07:10:13:38]: Debug: uuid -- 9804b443fcaf4b68a780f91796b466ad
[2024:08:07:10:13:38]: Debug: uniqueid -- 0123456789ABCDEF
[2024:08:07:10:13:38]: Debug:  [--] 
[2024:08:07:10:13:38]: Info: SDR color coding [Rec. 601]
[2024:08:07:10:13:38]: Info: Color depth: 8-bit
[2024:08:07:10:13:38]: Info: Color range: [JPEG]
[2024:08:07:10:13:38]: Debug: [h264_nvenc @ 0x71f7187a6a80] Loaded Nvenc version 12.2
[2024:08:07:10:13:38]: Debug: [h264_nvenc @ 0x71f7187a6a80] Nvenc initialized successfully
[2024:08:07:10:13:38]: Debug: Frame 1: IDR Keyframe (AV_FRAME_FLAG_KEY)
[2024:08:07:10:13:38]: Debug: [h264_nvenc @ 0x71f7187a6a80] Nvenc unloaded
[2024:08:07:10:13:38]: Info: SDR color coding [Rec. 601]
[2024:08:07:10:13:38]: Info: Color depth: 8-bit
[2024:08:07:10:13:38]: Info: Color range: [JPEG]
[2024:08:07:10:13:38]: Debug: [hevc_nvenc @ 0x71f7187a6a80] Loaded Nvenc version 12.2
[2024:08:07:10:13:38]: Debug: [hevc_nvenc @ 0x71f7187a6a80] Nvenc initialized successfully
[2024:08:07:10:13:38]: Debug: Frame 1: IDR Keyframe (AV_FRAME_FLAG_KEY)
[2024:08:07:10:13:38]: Debug: [hevc_nvenc @ 0x71f7187a6a80] Nvenc unloaded
[2024:08:07:10:13:38]: Error: Couldn't destroy session handle: The context is not bound
[2024:08:07:10:13:38]: Info: Screencasting with NvFBC
[2024:08:07:10:13:38]: Info: SDR color coding [Rec. 709]
[2024:08:07:10:13:38]: Info: Color depth: 10-bit
[2024:08:07:10:13:38]: Info: Color range: [JPEG]
[2024:08:07:10:13:38]: Debug: [hevc_nvenc @ 0x71f7187a6a80] Loaded Nvenc version 12.2
[2024:08:07:10:13:38]: Debug: [hevc_nvenc @ 0x71f7187a6a80] Nvenc initialized successfully
[2024:08:07:10:13:38]: Error: cuda::cuda_t doesn't support any format other than AV_PIX_FMT_NV12
[2024:08:07:10:13:39]: Debug: [hevc_nvenc @ 0x71f7187a6a80] Nvenc unloaded
[2024:08:07:10:13:39]: Error: Couldn't destroy session handle: The context is not bound
[2024:08:07:10:13:39]: Info: // Testing for available encoders, this may generate errors. You can safely ignore those errors. //
[2024:08:07:10:13:39]: Info: 
[2024:08:07:10:13:39]: Info: // Ignore any errors mentioned above, they are not relevant. //
[2024:08:07:10:13:39]: Info: 
[2024:08:07:10:13:39]: Debug: ------  h264 ------
[2024:08:07:10:13:39]: Debug: PASSED: supported
[2024:08:07:10:13:39]: Debug: REF_FRAMES_RESTRICT: supported
[2024:08:07:10:13:39]: Debug: CBR: supported
[2024:08:07:10:13:39]: Debug: DYNAMIC_RANGE: unsupported
[2024:08:07:10:13:39]: Debug: VUI_PARAMETERS: supported
[2024:08:07:10:13:39]: Debug: -------------------
[2024:08:07:10:13:39]: Info: Found H.264 encoder: h264_nvenc [nvenc]
[2024:08:07:10:13:39]: Debug: ------  hevc ------
[2024:08:07:10:13:39]: Debug: PASSED: supported
[2024:08:07:10:13:39]: Debug: REF_FRAMES_RESTRICT: supported
[2024:08:07:10:13:39]: Debug: CBR: supported
[2024:08:07:10:13:39]: Debug: DYNAMIC_RANGE: unsupported
[2024:08:07:10:13:39]: Debug: VUI_PARAMETERS: supported
[2024:08:07:10:13:39]: Debug: -------------------
[2024:08:07:10:13:39]: Info: Found HEVC encoder: hevc_nvenc [nvenc]
[2024:08:07:10:13:39]: Info: Executing: [/home/pi/secondDisk/Games/Proton/GE-Proton9-11/proton run Heroes3HD.exe] in ["/home/pi/secondDisk/Games/Proton/protonData/pfx/drive_c/games/Heroes of Might and Magic III"]
[2024:08:07:10:13:39]: Debug: /CN=NVIDIA GameStream Client -- verified
[2024:08:07:10:13:39]: Debug: TUNNEL :: HTTPS
[2024:08:07:10:13:39]: Debug: METHOD :: GET
[2024:08:07:10:13:39]: Debug: DESTINATION :: /serverinfo
[2024:08:07:10:13:39]: Debug: User-Agent -- Mozilla/5.0
[2024:08:07:10:13:39]: Debug: Accept-Language -- en-150,*
[2024:08:07:10:13:39]: Debug: Accept-Encoding -- gzip, deflate
[2024:08:07:10:13:39]: Debug: Connection -- Keep-Alive
[2024:08:07:10:13:39]: Debug: Host -- 192.168.0.107:47984
[2024:08:07:10:13:39]: Debug:  [--] 
[2024:08:07:10:13:39]: Debug: uuid -- d261f51697e9458d843435cc4a1c9e22
[2024:08:07:10:13:39]: Debug: uniqueid -- 0123456789ABCDEF
[2024:08:07:10:13:39]: Debug:  [--] 
[2024:08:07:10:13:39]: Debug: handle_read_encrypted_header(): Handle read of size: 24 bytes
[2024:08:07:10:13:39]: Debug: handle_read_encrypted(): Handle read of size: 104 bytes
[2024:08:07:10:13:39]: Debug: type [REQUEST]
[2024:08:07:10:13:39]: Debug: sequence number [1]
[2024:08:07:10:13:39]: Debug: protocol :: RTSP/1.0
[2024:08:07:10:13:39]: Debug: payload :: 
[2024:08:07:10:13:39]: Debug: command :: OPTIONS
[2024:08:07:10:13:39]: Debug: target :: rtspenc://192.168.0.107:48010
[2024:08:07:10:13:39]: Debug: CSeq :: 1
[2024:08:07:10:13:39]: Debug: X-GS-ClientVersion :: 14
[2024:08:07:10:13:39]: Debug: Host :: 192.168.0.107
[2024:08:07:10:13:39]: Debug: ---Begin MessageBuffer---
OPTIONS
---End MessageBuffer---
[2024:08:07:10:13:39]: Debug: ---Begin Response---
RTSP/1.0 200 OK
CSeq: 1

---End Response---
[2024:08:07:10:13:39]: Debug: handle_read_encrypted_header(): Handle read of size: 24 bytes
[2024:08:07:10:13:39]: Debug: handle_read_encrypted(): Handle read of size: 180 bytes
[2024:08:07:10:13:39]: Debug: type [REQUEST]
[2024:08:07:10:13:39]: Debug: sequence number [2]
[2024:08:07:10:13:39]: Debug: protocol :: RTSP/1.0
[2024:08:07:10:13:39]: Debug: payload :: 
[2024:08:07:10:13:39]: Debug: command :: DESCRIBE
[2024:08:07:10:13:39]: Debug: target :: rtspenc://192.168.0.107:48010
[2024:08:07:10:13:39]: Debug: CSeq :: 2
[2024:08:07:10:13:39]: Debug: X-GS-ClientVersion :: 14
[2024:08:07:10:13:39]: Debug: Host :: 192.168.0.107
[2024:08:07:10:13:39]: Debug: Accept :: application/sdp
[2024:08:07:10:13:39]: Debug: If-Modified-Since :: Thu, 01 Jan 1970 00:00:00 GMT
[2024:08:07:10:13:39]: Debug: ---Begin MessageBuffer---
DESCRIBE
---End MessageBuffer---
[2024:08:07:10:13:39]: Debug: ---Begin Response---
RTSP/1.0 200 OK
CSeq: 2

a=x-ss-general.featureFlags:1
a=x-ss-general.encryptionSupported:5
a=x-ss-general.encryptionRequested:1
sprop-parameter-sets=AAAAAU
a=fmtp:97 surround-params=21101
a=fmtp:97 surround-params=21101
a=fmtp:97 surround-params=642012453
a=fmtp:97 surround-params=660012345
a=fmtp:97 surround-params=85301245367
a=fmtp:97 surround-params=88001234567

---End Response---
[2024:08:07:10:13:39]: Debug: handle_read_encrypted_header(): Handle read of size: 24 bytes
[2024:08:07:10:13:39]: Debug: handle_read_encrypted(): Handle read of size: 189 bytes
[2024:08:07:10:13:39]: Debug: type [REQUEST]
[2024:08:07:10:13:39]: Debug: sequence number [3]
[2024:08:07:10:13:39]: Debug: protocol :: RTSP/1.0
[2024:08:07:10:13:39]: Debug: payload :: 
[2024:08:07:10:13:39]: Debug: command :: SETUP
[2024:08:07:10:13:39]: Debug: target :: streamid=audio/0/0
[2024:08:07:10:13:39]: Debug: CSeq :: 3
[2024:08:07:10:13:39]: Debug: X-GS-ClientVersion :: 14
[2024:08:07:10:13:39]: Debug: Host :: 192.168.0.107
[2024:08:07:10:13:39]: Debug: Transport :: unicast;X-GS-ClientPort=50000-50001
[2024:08:07:10:13:39]: Debug: If-Modified-Since :: Thu, 01 Jan 1970 00:00:00 GMT
[2024:08:07:10:13:39]: Debug: ---Begin MessageBuffer---
SETUP
---End MessageBuffer---
[2024:08:07:10:13:39]: Debug: ---Begin Response---
RTSP/1.0 200 OK
CSeq: 3
Session: DEADBEEFCAFE;timeout = 90
Transport: server_port=48000
X-SS-Ping-Payload: A89AACB5A4D63009

---End Response---
[2024:08:07:10:13:39]: Debug: handle_read_encrypted_header(): Handle read of size: 24 bytes
[2024:08:07:10:13:39]: Debug: handle_read_encrypted(): Handle read of size: 212 bytes
[2024:08:07:10:13:39]: Debug: type [REQUEST]
[2024:08:07:10:13:39]: Debug: sequence number [4]
[2024:08:07:10:13:39]: Debug: protocol :: RTSP/1.0
[2024:08:07:10:13:39]: Debug: payload :: 
[2024:08:07:10:13:39]: Debug: command :: SETUP
[2024:08:07:10:13:39]: Debug: target :: streamid=video/0/0
[2024:08:07:10:13:39]: Debug: CSeq :: 4
[2024:08:07:10:13:39]: Debug: X-GS-ClientVersion :: 14
[2024:08:07:10:13:39]: Debug: Host :: 192.168.0.107
[2024:08:07:10:13:39]: Debug: Session :: DEADBEEFCAFE
[2024:08:07:10:13:39]: Debug: Transport :: unicast;X-GS-ClientPort=50000-50001
[2024:08:07:10:13:39]: Debug: If-Modified-Since :: Thu, 01 Jan 1970 00:00:00 GMT
[2024:08:07:10:13:39]: Debug: ---Begin MessageBuffer---
SETUP
---End MessageBuffer---
[2024:08:07:10:13:39]: Debug: ---Begin Response---
RTSP/1.0 200 OK
CSeq: 4
Session: DEADBEEFCAFE;timeout = 90
Transport: server_port=47998
X-SS-Ping-Payload: A89AACB5A4D63009

---End Response---
[2024:08:07:10:13:39]: Debug: handle_read_encrypted_header(): Handle read of size: 24 bytes
[2024:08:07:10:13:39]: Debug: handle_read_encrypted(): Handle read of size: 215 bytes
[2024:08:07:10:13:39]: Debug: type [REQUEST]
[2024:08:07:10:13:39]: Debug: sequence number [5]
[2024:08:07:10:13:39]: Debug: protocol :: RTSP/1.0
[2024:08:07:10:13:39]: Debug: payload :: 
[2024:08:07:10:13:39]: Debug: command :: SETUP
[2024:08:07:10:13:39]: Debug: target :: streamid=control/13/0
[2024:08:07:10:13:39]: Debug: CSeq :: 5
[2024:08:07:10:13:39]: Debug: X-GS-ClientVersion :: 14
[2024:08:07:10:13:39]: Debug: Host :: 192.168.0.107
[2024:08:07:10:13:39]: Debug: Session :: DEADBEEFCAFE
[2024:08:07:10:13:39]: Debug: Transport :: unicast;X-GS-ClientPort=50000-50001
[2024:08:07:10:13:39]: Debug: If-Modified-Since :: Thu, 01 Jan 1970 00:00:00 GMT
[2024:08:07:10:13:39]: Debug: ---Begin MessageBuffer---
SETUP
---End MessageBuffer---
[2024:08:07:10:13:39]: Debug: ---Begin Response---
RTSP/1.0 200 OK
CSeq: 5
Session: DEADBEEFCAFE;timeout = 90
Transport: server_port=47999
X-SS-Connect-Data: 360019478

---End Response---
[2024:08:07:10:13:39]: Debug: handle_read_encrypted_header(): Handle read of size: 24 bytes
[2024:08:07:10:13:39]: Debug: handle_read_encrypted(): Handle read of size: 1632 bytes
[2024:08:07:10:13:39]: Debug: type [REQUEST]
[2024:08:07:10:13:39]: Debug: sequence number [6]
[2024:08:07:10:13:39]: Debug: protocol :: RTSP/1.0
[2024:08:07:10:13:39]: Debug: payload :: v=0
o=android 0 14 IN IPv4 192.168.0.107
s=NVIDIA Streaming Client
a=x-ml-general.featureFlags:3 
a=x-ss-general.encryptionEnabled:5 
a=x-nv-video[0].clientViewportWd:3240 
a=x-nv-video[0].clientViewportHt:2160 
a=x-nv-video[0].maxFPS:60 
a=x-nv-video[0].packetSize:1392 
a=x-nv-video[0].rateControlMode:4 
a=x-nv-video[0].timeoutLengthMs:7000 
a=x-nv-video[0].framesWithInvalidRefThreshold:0 
a=x-nv-video[0].initialBitrateKbps:55200 
a=x-nv-video[0].initialPeakBitrateKbps:55200 
a=x-nv-vqos[0].bw.minimumBitrateKbps:55200 
a=x-nv-vqos[0].bw.maximumBitrateKbps:55200 
a=x-ml-video.configuredBitrateKbps:69000 
a=x-nv-vqos[0].fec.enable:1 
a=x-nv-vqos[0].videoQualityScoreUpdateTime:5000 
a=x-nv-vqos[0].qosTrafficType:5 
a=x-nv-aqos.qosTrafficType:4 
a=x-nv-general.featureFlags:167 
a=x-nv-general.useReliableUdp:13 
a=x-nv-vqos[0].fec.minRequiredFecPackets:2 
a=x-nv-vqos[0].bllFec.enable:0 
a=x-nv-vqos[0].drc.enable:0 
a=x-nv-general.enableRecoveryMode:0 
a=x-nv-video[0].videoEncoderSlicesPerFrame:1 
a=x-nv-clientSupportHevc:1 
a=x-nv-vqos[0].bitStreamFormat:1 
a=x-nv-video[0].dynamicRangeMode:0 
a=x-nv-video[0].maxNumReferenceFrames:0 
a=x-nv-video[0].clientRefreshRateX100:0 
a=x-nv-audio.surround.numChannels:2 
a=x-nv-audio.surround.channelMask:3 
a=x-nv-audio.surround.enable:0 
a=x-nv-audio.surround.AudioQuality:0 
a=x-nv-aqos.packetDuration:5 
a=x-nv-video[0].encoderCscMode:0 
t=0 0
m=video 47998  
[2024:08:07:10:13:39]: Debug: command :: ANNOUNCE
[2024:08:07:10:13:39]: Debug: target :: streamid=control/13/0
[2024:08:07:10:13:39]: Debug: CSeq :: 6
[2024:08:07:10:13:39]: Debug: X-GS-ClientVersion :: 14
[2024:08:07:10:13:39]: Debug: Host :: 192.168.0.107
[2024:08:07:10:13:39]: Debug: Session :: DEADBEEFCAFE
[2024:08:07:10:13:39]: Debug: Content-type :: application/sdp
[2024:08:07:10:13:39]: Debug: Content-length :: 1459
[2024:08:07:10:13:39]: Debug: ---Begin MessageBuffer---
ANNOUNCE
---End MessageBuffer---
[2024:08:07:10:13:39]: Debug: Found Host: 192.168.0.107
[2024:08:07:10:13:39]: Debug: Client configured bitrate is 69000 Kbps
[2024:08:07:10:13:39]: Debug: Final adjusted video encoding bitrate is 54188 Kbps
[2024:08:07:10:13:39]: Debug: Expecting incoming session connections from 192.168.0.215
[2024:08:07:10:13:39]: Debug: System tray loop
[2024:08:07:10:13:39]: Debug: System tray loop
[2024:08:07:10:13:39]: Debug: ---Begin Response---
RTSP/1.0 200 OK
CSeq: 6

---End Response---
[2024:08:07:10:13:39]: Debug: System tray loop
[2024:08:07:10:13:39]: Debug: System tray loop
[2024:08:07:10:13:39]: Debug: System tray loop
[2024:08:07:10:13:39]: Debug: handle_read_encrypted_header(): Handle read of size: 24 bytes
[2024:08:07:10:13:39]: Debug: handle_read_encrypted(): Handle read of size: 96 bytes
[2024:08:07:10:13:39]: Debug: type [REQUEST]
[2024:08:07:10:13:39]: Debug: sequence number [7]
[2024:08:07:10:13:39]: Debug: protocol :: RTSP/1.0
[2024:08:07:10:13:39]: Debug: payload :: 
[2024:08:07:10:13:39]: Debug: command :: PLAY
[2024:08:07:10:13:39]: Debug: target :: /
[2024:08:07:10:13:39]: Debug: CSeq :: 7
[2024:08:07:10:13:39]: Debug: X-GS-ClientVersion :: 14
[2024:08:07:10:13:39]: Debug: Host :: 192.168.0.107
[2024:08:07:10:13:39]: Debug: Session :: DEADBEEFCAFE
[2024:08:07:10:13:39]: Debug: ---Begin MessageBuffer---
PLAY
---End MessageBuffer---
[2024:08:07:10:13:39]: Debug: ---Begin Response---
RTSP/1.0 200 OK
CSeq: 7

---End Response---
[2024:08:07:10:13:39]: Debug: Initialized new control stream session by connect data match [v2]
[2024:08:07:10:13:39]: Debug: Control local address [192.168.0.107]
[2024:08:07:10:13:39]: Debug: Control peer address [192.168.0.215:37079]
[2024:08:07:10:13:39]: Info: CLIENT CONNECTED
[2024:08:07:10:13:39]: Debug: type [IDX_REQUEST_IDR_FRAME]
[2024:08:07:10:13:39]: Debug: type [IDX_START_B]
[2024:08:07:10:13:39]: Debug: RAISE: 192.168.0.215:55185 :: VIDEO
[2024:08:07:10:13:39]: Debug: Received ping [v2] from 192.168.0.215:55185 [0100000039303033364434413542434141393841]
[2024:08:07:10:13:39]: Debug: Start capturing Video
[2024:08:07:10:13:39]: Error: Failed to create session: 
[2024:08:07:10:13:39]: Info: Screencasting with NvFBC
[2024:08:07:10:13:39]: Error: Failed to create session: 
[2024:08:07:10:13:39]: Debug: System tray loop
[2024:08:07:10:13:39]: Debug: System tray loop
[2024:08:07:10:13:39]: Debug: System tray loop
[2024:08:07:10:13:39]: Debug: System tray loop
[2024:08:07:10:13:39]: Debug: RAISE: 192.168.0.215:55183 :: AUDIO
[2024:08:07:10:13:39]: Debug: Received ping [v2] from 192.168.0.215:55183 [0200000039303033364434413542434141393841]
[2024:08:07:10:13:39]: Debug: Start capturing Audio
[2024:08:07:10:13:39]: Debug: Connecting to pulseaudio
[2024:08:07:10:13:39]: Debug: null-sink args: rate=48000 sink_name=sink-sunshine-stereo format=float channels=2 channel_map=front-left,front-right sink_properties=device.description=sink-sunshine-stereo
[2024:08:07:10:13:39]: Debug: null-sink args: rate=48000 sink_name=sink-sunshine-surround51 format=float channels=6 channel_map=front-left,front-right,front-center,lfe,rear-left,rear-right sink_properties=device.description=sink-sunshine-surround51
[2024:08:07:10:13:39]: Debug: null-sink args: rate=48000 sink_name=sink-sunshine-surround71 format=float channels=8 channel_map=front-left,front-right,front-center,lfe,rear-left,rear-right,side-left,side-right sink_properties=device.description=sink-sunshine-surround71
[2024:08:07:10:13:39]: Info: Setting default sink to: [sink-sunshine-stereo]
[2024:08:07:10:13:39]: Info: Found default monitor by name: sink-sunshine-stereo.monitor
[2024:08:07:10:13:39]: Info: Opus initialized: 48 kHz, 2 channels, 512 kbps (total), LOWDELAY
[2024:08:07:10:13:39]: Info: Setting default sink to: [auto_null]
[2024:08:07:10:13:39]: Error: Couldn't set default-sink [auto_null]: No such entity
[2024:08:07:10:13:39]: Debug: Pulseadio context terminated
[2024:08:07:10:13:39]: Debug: Waiting for video to end...
[2024:08:07:10:13:39]: Debug: Waiting for audio to end...
[2024:08:07:10:13:39]: Debug: Waiting for control to end...
[2024:08:07:10:13:39]: Debug: Resetting Input...
[2024:08:07:10:13:39]: Debug: System tray loop
[2024:08:07:10:13:39]: Debug: System tray loop
[2024:08:07:10:13:39]: Debug: Session ended
[2024:08:07:10:13:39]: Debug: Waiting for main listening thread to end...
[2024:08:07:10:13:39]: Debug: Waiting for main video thread to end...
[2024:08:07:10:13:39]: Debug: Waiting for main audio thread to end...
[2024:08:07:10:13:39]: Debug: System tray loop
[2024:08:07:10:13:39]: Debug: Waiting for main control thread to end...
[2024:08:07:10:13:39]: Debug: System tray loop
[2024:08:07:10:13:39]: Debug: System tray loop
[2024:08:07:10:13:39]: Debug: System tray loop
[2024:08:07:10:13:39]: Debug: System tray loop
[2024:08:07:10:13:39]: Debug: All broadcasting threads ended
[2024:08:07:10:13:39]: Debug: System tray loop
[2024:08:07:10:13:39]: Debug: System tray loop
Sidefix commented 3 months ago

image image

ReenigneArcher commented 3 months ago

Try terminating the sessions on Moonlight, instead of pausing them.

You have this set to 3, so I think that's related.

channels = 3

Sidefix commented 3 months ago

I do not think that setting is related. I've tried with it set to 1 and it still occurs after exactly 4 sessions.

The issue also occurs regardless of whether I'm pausing the session, resuming, starting new etc.

gschintgen commented 3 months ago

There are some errors about not being able to release a context or terminate a session. Can you reproduce the same issue with another capture method (kms or x11) or is it exclusive to NvFBC?

Sidefix commented 3 months ago

@gschintgen I've tried feverishly to reproduce the issue with x11 and was unable to.

It seems like it is indeed strictly related to NvFBC.

How can I contribute to further troubleshooting?

gschintgen commented 3 months ago

This seems to be a duplicate of https://github.com/LizardByte/Sunshine/issues/47

Sidefix commented 3 months ago

I don't see the same errors in my log compared to #47.

I'll however try the persistance fix proposed there and report back.

Sidefix commented 3 months ago

Update with some interesting findings.

Unfortunately enabling persistance via nvidia-smi -pm 1 did not fix the issue for me. I did not also try to set the power profile as I had seen on #47 however.

On the other hand, I did identify a way to not reproduce the issue. Let me explain.

If I use the Channels feature to enable a second client to also join an existing session, the behavior of the reproduction changes significantly. To simplify the explanation as much as possible, if there is at least one single ongoing session, it will not matter how many other sessions are started. For example:

The logs for the failure are identical as above so I won't bother posting new ones. All clients are moonlight, one Windows and one Android.

I don't know if this data helps exclude or reinforce the fact that issue #47 seems related. I would like some input from someone with expertise. Again for emphasis, the error logs I see on the server and the ones from 47 are quite different.

I would be happy to troubleshoot further. Currently my mitigation for this is to kill sunshine after every individual session and restart it programatically.

insanemal commented 3 months ago

I have this exact same issue.

It works for 2-3 reconnects and then just fails. It was working consistently until recently, however before that it had always had this issue.

Also terminating sunshine causes X11 to lock up. This can be resolved with an systemctl restart sddm but that is less than ideal. The X11 lockup is new. I want to say it started with the 555 driver.

Oh I've got a 1080GTX on Arch linux, just for reference.

[2024-08-14 15:00:32.261]: Info: CLIENT CONNECTED
[2024-08-14 15:00:32.264]: Error: Failed to create session: 
[2024-08-14 15:00:32.264]: Info: Screencasting with NvFBC
[2024-08-14 15:00:32.264]: Error: Failed to create session: The context is not bound
[2024-08-14 15:00:32.574]: Info: Setting default sink to: [sink-sunshine-stereo]
[2024-08-14 15:00:32.576]: Info: Found default monitor by name: sink-sunshine-stereo.monitor
[2024-08-14 15:00:32.612]: Info: Opus initialized: 48 kHz, 2 channels, 96 kbps (total), LOWDELAY
[2024-08-14 15:00:32.612]: Info: Setting default sink to: [auto_null]
[2024-08-14 15:00:32.613]: Error: Couldn't set default-sink [auto_null]: No such entity
[2024-08-14 15:00:41.267]: Info: // Testing for available encoders, this may generate errors. You can safely ignore those errors. //
[2024-08-14 15:00:41.267]: Info: Trying encoder [nvenc]
[2024-08-14 15:00:41.267]: Info: Screencasting with NvFBC
[2024-08-14 15:00:41.267]: Error: Failed to create session: 
[2024-08-14 15:00:41.467]: Info: Screencasting with NvFBC
[2024-08-14 15:00:41.467]: Error: Failed to create session: 
[2024-08-14 15:00:41.667]: Info: Encoder [nvenc] failed
[2024-08-14 15:00:41.668]: Info: Trying encoder [vaapi]
[2024-08-14 15:00:42.068]: Info: Encoder [vaapi] failed
[2024-08-14 15:00:42.068]: Info: Trying encoder [software]
[2024-08-14 15:00:42.468]: Info: Encoder [software] failed
[2024-08-14 15:00:42.468]: Fatal: Unable to find display or encoder during startup.
[2024-08-14 15:00:42.468]: Fatal: Please check that a display is connected and powered on.
[2024-08-14 15:03:09.497]: Info: Opening UI from system tray
[2024-08-14 15:03:09.509]: Info: Opened url [https://localhost:47990]
[2024-08-14 15:03:09.757]: Info: Web UI: [127.0.0.1] -- not authorized

Prior to this it functions as expected.

Sidefix commented 2 months ago

@insanemal it seems like the symptoms you are encountering are similar, but I don't know if it's the same issue.

I have had no Xorg server issues as a consequence of killing sunshine when my issue occurs. Everything I've described has occurred for a long time, not as a consequence of the newest nvidia drivers.

My issue occurs invariably at the 4th session started / resumed (with the caveat of separate clients I had mentioned in an earlier comment).

Finally it looks like your setup uses automatic detection of encoding and they ALL fail. If your issue were like mine, the issue would only occur for NvFBC which is Nvidia exclusive. I cannot reproduce the issue with other capture methods, so if your setup tried to fall back to vaapi or software, NvFBC is automatically out of the picture.

[[Quick Edit for the above statement]] I may be wrong in regards to the NvFBC exclusivity for capturing data, since it would just be used to send the data to the encoder. But my overall conclusion that our issues are likely separate stands as far as I'm concerned. Try with a different specific capture method and see if you can still reproduce; if yes, definite different issue.

I am almost convinced your issue is something completely different and you may want to open a separate issue for it, or search through older issues for additional help. I suspect maybe it's an Xorg server issue altogether based on your logs and description, but don't take my word for it, my expertise is limited with Xorg.

Good luck!

insanemal commented 2 months ago

@Sidefix

NvFBC works fine until it suddenly doesn't.

That's just the snippet from after it stops working.

It used to always do this and restarting Sunshine was the workaround. In the newest version of sunshine/nvidia driver (both were changed at the same time) it now causes X11 to freeze on restart.

The previous version of both was stable for a while. This was a surprising change as it has been freezing after a session for over a year. More like 2-3 years at this point.

I don't believe it's an X11 issue. I think it relates to the whole not destroying the previous NvFBC context handles correctly.

One of the things I've noticed is, after it does the initial probe and while it's not streaming nvidia-smi lists the application as using GPU Compute resources. This is different behavior to my AMD machine. It shows the GPU usage when it does the probe and while streaming, but not while idle.

[2024-08-14 15:23:28.839]: Info: System tray created
[2024-08-14 15:23:29.105]: Info: Found [1] outputs
[2024-08-14 15:23:29.105]: Info: Virtual Desktop: 1920x1080
[2024-08-14 15:23:29.105]: Info: XrandR: available
[2024-08-14 15:23:29.105]: Info: -- Output --
[2024-08-14 15:23:29.105]: Info:   Resolution: 1920x1080
[2024-08-14 15:23:29.105]: Info:   Offset: 0x0
[2024-08-14 15:23:29.142]: Warning: Gamepad ds5 is disabled due to Permission denied
[2024-08-14 15:23:29.142]: Info: // Testing for available encoders, this may generate errors. You can safely ignore those errors. //
[2024-08-14 15:23:29.142]: Info: Trying encoder [nvenc]
[2024-08-14 15:23:29.142]: Info: Screencasting with NvFBC
[2024-08-14 15:23:29.274]: Info: Creating encoder [h264_nvenc]
[2024-08-14 15:23:29.274]: Info: Color coding: SDR (Rec. 601)
[2024-08-14 15:23:29.274]: Info: Color depth: 8-bit
[2024-08-14 15:23:29.274]: Info: Color range: JPEG
[2024-08-14 15:23:29.429]: Warning: [h264_nvenc @ 0x5e0875f0d880] Multiple reference frames are not supported by the device
[2024-08-14 15:23:29.429]: Error: [h264_nvenc @ 0x5e0875f0d880] Provided device doesn't support required NVENC features
[2024-08-14 15:23:29.740]: Error: Could not open codec [h264_nvenc]: Function not implemented
[2024-08-14 15:23:29.740]: Info: Creating encoder [h264_nvenc]
[2024-08-14 15:23:29.740]: Info: Color coding: SDR (Rec. 601)
[2024-08-14 15:23:29.740]: Info: Color depth: 8-bit
[2024-08-14 15:23:29.740]: Info: Color range: JPEG
[2024-08-14 15:23:30.055]: Info: Creating encoder [hevc_nvenc]
[2024-08-14 15:23:30.055]: Info: Color coding: SDR (Rec. 601)
[2024-08-14 15:23:30.055]: Info: Color depth: 8-bit
[2024-08-14 15:23:30.055]: Info: Color range: JPEG
[2024-08-14 15:23:30.062]: Warning: [hevc_nvenc @ 0x5e0875f0d880] Multiple reference frames are not supported by the device
[2024-08-14 15:23:30.062]: Error: [hevc_nvenc @ 0x5e0875f0d880] Provided device doesn't support required NVENC features
[2024-08-14 15:23:30.312]: Error: Could not open codec [hevc_nvenc]: Function not implemented
[2024-08-14 15:23:30.312]: Info: Creating encoder [hevc_nvenc]
[2024-08-14 15:23:30.312]: Info: Color coding: SDR (Rec. 601)
[2024-08-14 15:23:30.312]: Info: Color depth: 8-bit
[2024-08-14 15:23:30.312]: Info: Color range: JPEG
[2024-08-14 15:23:30.572]: Info: Creating encoder [av1_nvenc]
[2024-08-14 15:23:30.572]: Info: Color coding: SDR (Rec. 601)
[2024-08-14 15:23:30.572]: Info: Color depth: 8-bit
[2024-08-14 15:23:30.572]: Info: Color range: JPEG
[2024-08-14 15:23:30.579]: Warning: [av1_nvenc @ 0x5e0875f0d880] Codec not supported
[2024-08-14 15:23:30.580]: Error: [av1_nvenc @ 0x5e0875f0d880] Provided device doesn't support required NVENC features
[2024-08-14 15:23:30.831]: Error: Could not open codec [av1_nvenc]: Function not implemented
[2024-08-14 15:23:30.831]: Info: Creating encoder [av1_nvenc]
[2024-08-14 15:23:30.831]: Info: Color coding: SDR (Rec. 601)
[2024-08-14 15:23:30.831]: Info: Color depth: 8-bit
[2024-08-14 15:23:30.831]: Info: Color range: JPEG
[2024-08-14 15:23:30.838]: Warning: [av1_nvenc @ 0x5e0875f0d880] Codec not supported
[2024-08-14 15:23:30.838]: Error: [av1_nvenc @ 0x5e0875f0d880] Provided device doesn't support required NVENC features
[2024-08-14 15:23:31.089]: Error: Could not open codec [av1_nvenc]: Function not implemented
[2024-08-14 15:23:31.089]: Error: Couldn't destroy session handle: The context is not bound
[2024-08-14 15:23:31.089]: Info: Screencasting with NvFBC
[2024-08-14 15:23:31.165]: Info: Creating encoder [hevc_nvenc]
[2024-08-14 15:23:31.165]: Info: Color coding: SDR (Rec. 709)
[2024-08-14 15:23:31.165]: Info: Color depth: 10-bit
[2024-08-14 15:23:31.165]: Info: Color range: JPEG
[2024-08-14 15:23:31.177]: Error: cuda::cuda_t doesn't support any format other than AV_PIX_FMT_NV12
[2024-08-14 15:23:31.419]: Error: Couldn't destroy session handle: The context is not bound
[2024-08-14 15:23:31.419]: Info:
[2024-08-14 15:23:31.419]: Info: // Ignore any errors mentioned above, they are not relevant. //
[2024-08-14 15:23:31.419]: Info: 
[2024-08-14 15:23:31.419]: Info: Found H.264 encoder: h264_nvenc [nvenc]
[2024-08-14 15:23:31.419]: Info: Found HEVC encoder: hevc_nvenc [nvenc]
[2024-08-14 15:23:31.428]: Error: Failed to create client: Daemon not running
[2024-08-14 15:23:31.435]: Info: Configuration UI available at [https://localhost:47990]
[2024-08-14 15:27:18.013]: Info: // Testing for available encoders, this may generate errors. You can safely ignore those errors. //
[2024-08-14 15:27:18.013]: Info: Trying encoder [nvenc]
[2024-08-14 15:27:18.013]: Info: Screencasting with NvFBC
[2024-08-14 15:27:18.097]: Info: Creating encoder [h264_nvenc]
[2024-08-14 15:27:18.097]: Info: Color coding: SDR (Rec. 601)
[2024-08-14 15:27:18.097]: Info: Color depth: 8-bit
[2024-08-14 15:27:18.097]: Info: Color range: JPEG
[2024-08-14 15:27:18.101]: Warning: [h264_nvenc @ 0x7a99087d9640] Multiple reference frames are not supported by the device
[2024-08-14 15:27:18.101]: Error: [h264_nvenc @ 0x7a99087d9640] Provided device doesn't support required NVENC features
[2024-08-14 15:27:18.353]: Error: Could not open codec [h264_nvenc]: Function not implemented
[2024-08-14 15:27:18.353]: Info: Creating encoder [h264_nvenc]
[2024-08-14 15:27:18.353]: Info: Color coding: SDR (Rec. 601)
[2024-08-14 15:27:18.353]: Info: Color depth: 8-bit
[2024-08-14 15:27:18.353]: Info: Color range: JPEG
[2024-08-14 15:27:18.617]: Info: Creating encoder [hevc_nvenc]
[2024-08-14 15:27:18.617]: Info: Color coding: SDR (Rec. 601)
[2024-08-14 15:27:18.617]: Info: Color depth: 8-bit
[2024-08-14 15:27:18.617]: Info: Color range: JPEG
[2024-08-14 15:27:18.624]: Warning: [hevc_nvenc @ 0x7a99087d9640] Multiple reference frames are not supported by the device
[2024-08-14 15:27:18.625]: Error: [hevc_nvenc @ 0x7a99087d9640] Provided device doesn't support required NVENC features
[2024-08-14 15:27:18.876]: Error: Could not open codec [hevc_nvenc]: Function not implemented
[2024-08-14 15:27:18.876]: Info: Creating encoder [hevc_nvenc]
[2024-08-14 15:27:18.876]: Info: Color coding: SDR (Rec. 601)
[2024-08-14 15:27:18.876]: Info: Color depth: 8-bit
[2024-08-14 15:27:18.876]: Info: Color range: JPEG
[2024-08-14 15:27:19.141]: Info: Creating encoder [av1_nvenc]
[2024-08-14 15:27:19.141]: Info: Color coding: SDR (Rec. 601)
[2024-08-14 15:27:19.141]: Info: Color depth: 8-bit
[2024-08-14 15:27:19.141]: Info: Color range: JPEG
[2024-08-14 15:27:19.148]: Warning: [av1_nvenc @ 0x7a99087d9640] Codec not supported
[2024-08-14 15:27:19.149]: Error: [av1_nvenc @ 0x7a99087d9640] Provided device doesn't support required NVENC features
[2024-08-14 15:27:19.400]: Error: Could not open codec [av1_nvenc]: Function not implemented
[2024-08-14 15:27:19.400]: Info: Creating encoder [av1_nvenc]
[2024-08-14 15:27:19.400]: Info: Color coding: SDR (Rec. 601)
[2024-08-14 15:27:19.400]: Info: Color depth: 8-bit
[2024-08-14 15:27:19.400]: Info: Color range: JPEG
[2024-08-14 15:27:19.408]: Warning: [av1_nvenc @ 0x7a99087d9640] Codec not supported
[2024-08-14 15:27:19.408]: Error: [av1_nvenc @ 0x7a99087d9640] Provided device doesn't support required NVENC features
[2024-08-14 15:27:19.659]: Error: Could not open codec [av1_nvenc]: Function not implemented
[2024-08-14 15:27:19.659]: Error: Couldn't destroy session handle: The context is not bound
[2024-08-14 15:27:19.659]: Info: Screencasting with NvFBC
[2024-08-14 15:27:19.766]: Info: Creating encoder [hevc_nvenc]
[2024-08-14 15:27:19.766]: Info: Color coding: SDR (Rec. 709)
[2024-08-14 15:27:19.766]: Info: Color depth: 10-bit
[2024-08-14 15:27:19.766]: Info: Color range: JPEG
[2024-08-14 15:27:19.779]: Error: cuda::cuda_t doesn't support any format other than AV_PIX_FMT_NV12
[2024-08-14 15:27:20.025]: Error: Couldn't destroy session handle: The context is not bound
[2024-08-14 15:27:20.025]: Info: 
[2024-08-14 15:27:20.025]: Info: // Ignore any errors mentioned above, they are not relevant. //
[2024-08-14 15:27:20.025]: Info: 
[2024-08-14 15:27:20.025]: Info: Found H.264 encoder: h264_nvenc [nvenc]
[2024-08-14 15:27:20.025]: Info: Found HEVC encoder: hevc_nvenc [nvenc]
[2024-08-14 15:27:20.025]: Info: Executing [Desktop]
[2024-08-14 15:27:20.085]: Info: CLIENT CONNECTED
[2024-08-14 15:27:20.283]: Info: Gamepad 0 will be Xbox One controller (auto-selected by client-reported type)
[2024-08-14 15:27:20.542]: Info: Setting default sink to: [sink-sunshine-stereo]
[2024-08-14 15:27:20.548]: Info: Found default monitor by name: sink-sunshine-stereo.monitor
[2024-08-14 15:27:20.587]: Info: Opus initialized: 48 kHz, 2 channels, 512 kbps (total), LOWDELAY
[2024-08-14 15:27:20.689]: Info: Found [1] outputs
[2024-08-14 15:27:20.689]: Info: Virtual Desktop: 1920x1080
[2024-08-14 15:27:20.689]: Info: XrandR: available
[2024-08-14 15:27:20.689]: Info: -- Output --
[2024-08-14 15:27:20.689]: Info:   Resolution: 1920x1080
[2024-08-14 15:27:20.689]: Info:   Offset: 0x0
[2024-08-14 15:27:20.705]: Info: Screencasting with NvFBC
[2024-08-14 15:27:20.773]: Info: Creating encoder [hevc_nvenc]
[2024-08-14 15:27:20.773]: Info: Color coding: SDR (Rec. 601)
[2024-08-14 15:27:20.773]: Info: Color depth: 8-bit
[2024-08-14 15:27:20.774]: Info: Color range: MPEG

That's a full log of it while working fine.

I have recreated the X11 freeze with some simple code that sets up an NvFBC capture that I sigkill while it's running. It's not even grabbing frames. I have to run and kill it around 3-5 times but doing this will cause X11 to lock up.

So, I'm sorry but I disagree with your assessment. It's definitely related in that the issue is handling of NvFBC cleanup and exhausting some kind of limit/handle count.

Also you've indicated your on 560 driver with a much newer 2000 series card, so it's entirely unsurprising to me that you see slightly different behavior. However we are in the same boat that a reboot is basically required to fix the situation.

Sidefix commented 2 months ago

@insanemal I never mentioned anything about needing a reboot, that is wholly exclusively symptomatic on your side. Whether it's the different gpus or drivers we have, unclear. Of note we also use different linux distros, so there are a ton of differences in our environments.

I emphasized that my issue occurred with older drivers and versions of sunshine as well. This is not a recent issue triggered by any recent driver or sunshine version for me. The behavior on my side has been consistent as far back as me first using sunshine about a year ago. Of note is that at the time I was using a GTX 1050.

I agree that NvFBC may be a connective tissue between the behavior or my side and the behavior on yours, but the logs you provided and the ones I see are too different for me to agree. I maintain that you should file a separate issue.

If we get further developments on this specific issue, I do hope it ends up fixing both of our scenarios.

Quick follow-up edit... if you've managed to reproduce the X11 lockup by using NvFBC in your custom code outside of sunshine, how is your issue related to sunshine? It really sounds like it's related to your environment.

insanemal commented 2 months ago

@Sidefix

So, the code I used to make my test, uses the same method of setting up the NvFBC as is used by sunshine, this is a pretty standard diagnostic method. Isolate where you think the issue is as a reproducer and see if it reproduces the issue. Guess what? It does. I used C not C++ because I didn't need all the extra fluff. But I guess I can literally copy their code and get enough plumbing around it to show 1:1

Second, sorry wrong word. I did not mean of the machine I meant of sunshine. That's on me and my tiredness..

Finally the only differences in our output are due to generational differences in GPU. I can't expect you to be intimately familiar with the output differences across settings and versions of GPU. But that's fine.

Your hostility makes no sense.

EDIT: To clear up one other thing. The difference in behavior in the fail state is that you have yours set to explicitly use NvFBC I have mine set to prefer. So on failure it tries other methods. Which is unrelated to the issue. Seriously my guy.

gschintgen commented 2 months ago

It should be possible to remain friendly, polite, respectful in tone all while disagreeing on technicalities.

I do see some similarities in both reports, so for now it might be best to keep both reports together. This way there are two technically minded reporters digging at possibly related issues.

There is the obvious error message: Error: Couldn't destroy session handle: The context is not bound.

But there is also the yet unmentioned detail that both reports have virtual displays:

Info: Virtual Desktop: 1440x900

It might be interesting to check if there are also issues with a physical display.

I'm also wondering if the contexts are "used up" by one pass of Sunshine's whole detection routine (and the limit is ~4) or if they're used up by each individual encoder check (and the actual limit is higher but used up more quickly). In future tests and in particular for log posting it could be useful to single out the offending configuration: configure NvFBC as capture method in Sunshine, disable unused encodings in Sunshine, request a specific encoding in Moonlight.

(I can't help with fixing the code though.)

insanemal commented 2 months ago

The Virtual Desktop bit is odd to me. I'm Running X11 with only one display attached on my machine. But I can understand that the display resolution and the actual desktop can be vastly different in multi-monitor configurations. So I assume that is where this would come into play.

Also, as part of my testing to see if it was really something in the NvFBC handling that was at fault I tested a rust based Sunshine alternative called Moonshine. It's nowhere as complete as Sunshine (it doesn't properly support non-PS controllers for example) but it does do streaming using NvFBC and h.264 (it doesn't do hvec or AV1)

It does not have the same issue. I did forget to check nvidia-smi while it was running to see if it appears there while idle.

I would ask if @Sidefix was able to test it as well but I am unsure if it builds easily on non-Arch distributions and I believe they use Ubuntu.

Also OBS using the NvFBC plugins doesn't appear to have issues either. However I'm reluctant to use it as a test or reference for anything as it's also not doing the capture->encoder pipeline that Sunshine is. So it's probably not Apples to Apples.

Please let me know if there is anything more I can do, debugging wise to help. Happy to setup kdump and provide kernel dumps or whatever is required to get to the bottom of this issue.

Sidefix commented 2 months ago

@insanemal Thanks for the idea; I'm aware of moonshine but it's unclear whether it supports a headless setup like mine. If it doesn't, I lack the hardware to test. I only have laptops apart from the sunshine machine which is a headless desktop.

I'll give it a shot in the near future and report back.

@gschintgen the log you referenced indeed only occurs when screencasting with NvFBC and does not occur on X11. And in my earlier test of using multiple clients, at least one was maintaining the active session alive, so indeed it seems like something about the session being closed seems to not be handled properly specifically with NvFBC. @insanemal's data helps confirm this.

I'll also have a look at the code for NvFBC sessions and try to figure it out.

Thank you both.

ns6089 commented 2 months ago

The way multi-channel is implemented in sunshine is that all concurrent clients share the same capture surface. If reconnecting secondary clients while the main one is still active doesn't have problems, then NvFBC capture session probably doesn't get destroyed properly. And NVIDIA likely puts some limit on how many NvFBC sessions can be active at the same time.

insanemal commented 2 months ago

I have this issue of I'm running a single stream or multiple streams.

And I do mean by setting the config in Sunshine to 1 or higher values.

Having one stream constantly running prevents any NvFBC teardown so it prevents issues on reconnect as the issue seems to be related to teardown and then restart.

But yes, I think if they aren't correctly torn down correctly then some limit is being hit. It's interesting that quitting Sunshine can allow these "ghost handles" to be freed.

I do believe that the issue I have with X11 freezing is in the NVIDIA drivers somewhere as X11 doesn't notice it's frozen. By that I mean X11 appears in the logs to be happy as a clam as do applications. It's only the video output to screen that is impacted. And restarting Xorg fixes it. VTs continue to work correctly and can be switched to. But switching back to X11 is possible but it's still on the same frame.

From my understanding of how NvFBC works this does feel like it's not properly tearing down the capture as NvFBC does some interesting things with framebuffers and blocking their swap to ensure the whole frame gets captured. An improper teardown could leave that in a stuck condition.

I'm speculating here as this would be deep in NVIDIA code and not easy to debug/confirm

Edit: When sunshine is configured to multiple streams I test with multiple clients. The issue only shows up after all clients have disconnected and the capture is stopped. (not the first time of course. So the capture tear down is where the issue appears to be)

Sidefix commented 2 months ago

NvFBC is detailed quite a bit here:

https://developer.download.nvidia.com/designworks/capture-sdk/docs/7.1/NVIDIA%20Capture%20SDK%20Programming%20Guide.pdf

I had a cursory read, and page 5 has a chart that details when a session should be destroyed. I didn't find the info I was hoping for at the referenced section (2.9) but I did find some at 2.15.2.

I was also not able to find the code references used by Sunshine to begin capture using NvFBC as per this doc so I'm questioning it's usefulness... I posted here, maybe someone else has an idea.

ns6089 commented 2 months ago

Judging from Error: Couldn't destroy session handle: The context is not bound. CUDA contexts may be involved. And sunshine likely does a number of incorrect things there, the corresponding code remained mostly untouched since the original author of sunshine went MIA.

insanemal commented 2 months ago

Yes, I believe that is the case. It also explains the variability with my experience, running Arch I get more changes of CUDA and NVIDIA drivers in a set time period. As well as changes to compilers. The recent switch to GCC 13 might have even been part of the cause of the recent behavior changes.

ns6089 commented 2 months ago
/*!
 * \brief Binds the FBC context to the calling thread.
 *
 * The NvFBC library internally relies on objects that must be bound to a
 * thread.  Such objects are OpenGL contexts and CUDA contexts.
 *
 * This function binds these objects to the calling thread.
 *
 * The FBC context must be bound to the calling thread for most NvFBC entry
 * points, otherwise ::NVFBC_ERR_CONTEXT is returned.
 *
 * If the FBC context is already bound to a different thread,
 * ::NVFBC_ERR_CONTEXT is returned.  The other thread must release the context
 * first by calling the ReleaseContext() entry point.
 *
 * If the FBC context is already bound to the current thread, this function has
 * no effects.
 *
 * \param [in] sessionHandle
 *   FBC session handle.
 * \param [in] pParams
 *   ::NVFBC_DESTROY_CAPTURE_SESSION_PARAMS
 *
 * \return
 *   ::NVFBC_SUCCESS \n
 *   ::NVFBC_ERR_INVALID_HANDLE \n
 *   ::NVFBC_ERR_API_VERSION \n
 *   ::NVFBC_ERR_BAD_REQUEST \n
 *   ::NVFBC_ERR_CONTEXT \n
 *   ::NVFBC_ERR_INTERNAL \n
 *   ::NVFBC_ERR_X
 */
NVFBCSTATUS NVFBCAPI
NvFBCBindContext(const NVFBC_SESSION_HANDLE sessionHandle, NVFBC_BIND_CONTEXT_PARAMS *pParams);

/*!
 * \brief Releases the FBC context from the calling thread.
 *
 * If the FBC context is bound to a different thread, ::NVFBC_ERR_CONTEXT is
 * returned.
 *
 * If the FBC context is already released, this function has no effects.
 *
 * \param [in] sessionHandle
 *   FBC session handle.
 * \param [in] pParams
 *   ::NVFBC_SUCCESS \n
 *   ::NVFBC_ERR_INVALID_HANDLE \n
 *   ::NVFBC_ERR_API_VERSION \n
 *   ::NVFBC_ERR_BAD_REQUEST \n
 *   ::NVFBC_ERR_CONTEXT \n
 *   ::NVFBC_ERR_INTERNAL \n
 *   ::NVFBC_ERR_X
 */
NVFBCSTATUS NVFBCAPI
NvFBCReleaseContext(const NVFBC_SESSION_HANDLE sessionHandle, NVFBC_RELEASE_CONTEXT_PARAMS *pParams);

And sunshine has a helper class

    class ctx_t {
    public:
      ctx_t(NVFBC_SESSION_HANDLE handle) {
        NVFBC_BIND_CONTEXT_PARAMS params { NVFBC_BIND_CONTEXT_PARAMS_VER };

        if (func.nvFBCBindContext(handle, &params)) {
          BOOST_LOG(error) << "Couldn't bind NvFBC context to current thread: " << func.nvFBCGetLastErrorStr(handle);
        }

        this->handle = handle;
      }

      ~ctx_t() {
        NVFBC_RELEASE_CONTEXT_PARAMS params { NVFBC_RELEASE_CONTEXT_PARAMS_VER };
        if (func.nvFBCReleaseContext(handle, &params)) {
          BOOST_LOG(error) << "Couldn't release NvFBC context from current thread: " << func.nvFBCGetLastErrorStr(handle);
        }
      }

      NVFBC_SESSION_HANDLE handle;
    };

But I have no idea if resolving that single error will be enough to fix everything.

insanemal commented 2 months ago

Ok, since I'm the kind of guy who just jumps into things regardless of how well I'm versed in something, I can say a few things after having reviewed the code.

Lots of it is basically just "modified" NVIDIA sample code. So that's good?

However, I have found someone else working on a capture tool that has encountered similar issues. https://forums.developer.nvidia.com/t/nvfbc-spurious-failures-in-nvfbccreatecapturesession/183889

Basically, if their app persists and creates and stops capture sessions, it all falls apart. However if run in isolation, they don't encounter any issues. Which is what we are seeing. So I think were on the right path.

Oh @ns6089 do you have that as a diff/patch file? I'll rebuild with said patch. Or was this just an observation of the code?

insanemal commented 2 months ago

Ok so the error we are seeing is coming from a different place

Error: Couldn't destroy session handle: The context is not bound

The code you have there looks like its from the ctx_t class, where as the error is from the handle_t class.

Could it be the context/handle is getting unwound in the wrong order?

I'll see if I can find where these are used and match up the context/handle ordering with the NVIDIA doco

ns6089 commented 2 months ago

Purely observational, just looked into NVIDIA's NvFBC.h and sunshine's src/platform/linux/cuda.cpp I don't even have a Linux test system ready which I can spin up on short order without rearranging a number of things. Again, purely observation, you probably need to check every instances where handle_t::reset() is getting called, either directly, or indirectly from handle_t::~handle_t(). And make sure context is valid there.

insanemal commented 2 months ago

Specifically that error comes from 'handle_t::stop()'

And the issue isn't the handle at all. I think it's the 'ctx_t' helper.

It's destructor gets called anytime 'ctx_t' moves out of scope, and, if my reading of C++ scopes is not wrong, that happens frequently.

TL;DR I think the code works in-spite of itself.

I think we need to switch it to something more deliberate, or move the scope of the ctx_t object.

It looks like the context needs to be bound after the creation of the handle, but the handle needs to be freed before the context is released.

From my reading of the NVIDIA doco, basically you make some "objects" that get a handle. To use them you need to bind them to thread/context. After that you then need to free the objects and unbind the now empty context.

It looks like the auto destructor of the ctx_t class is going off before the handle can be freed.

Which means the objects associated with said handle are never freed.

I wonder if you can just rebind the context do the free and go on your merry way?

Edit: It is after all the same thread you'd be binding the context to.

Edit 2: No. You can't. Have to debug this the hard way.

insanemal commented 2 months ago

OK I was right (and wrong)

It's freeing the context at weird times.

But IT DOES WORK if you just rebind it before you free it.

It's super awful. The whole lot of code needs fixing.

But this hack of mine actually works around the issue.

insanemal commented 2 months ago

If your feeling adventurous @Sidefix build this and have a crack.

Sidefix commented 2 months ago

it works perfectly now @insanemal ! thank you for your efforts

likewise to @gschintgen and @ns6089