LizardByte / Sunshine

Self-hosted game stream host for Moonlight.
http://app.lizardbyte.dev/Sunshine/
GNU General Public License v3.0

Microstuttering due to wrong capture or encoding rate: 59.94Hz instead of 60Hz #2286

Closed: gschintgen closed this issue 5 months ago

gschintgen commented 6 months ago

Is there an existing issue for this?

Is your issue described in the documentation?

Is your issue present in the nightly release?

Describe the Bug

I'm experiencing microstuttering on my moonlight clients. This can be alleviated using the frame pacing option, but I still decided to investigate further. I found that even though both the sunshine host system and the moonlight client system use 60.00Hz refresh rates, the stream is captured at 59.94Hz, i.e. the old NTSC-based framerate.

According to multiple reports online this discrepancy can induce periodic pacing irregularities. For example, here's a discussion on moonlight: https://www.reddit.com/r/cloudygamer/comments/px88e0/moonlight_how_to_reconcile_host_and_client/ while here's a recent discussion about Playstation Portal with the exact same problem (59.94 vs 60): https://www.reddit.com/r/PlaystationPortal/comments/198mohz/stutter_cause_ps5_outputting_5994_fpshz/ The example video linked in the second reddit discussion above nicely illustrates the kind of microstuttering that I'm also observing.

Expected Behavior

Sunshine captures and transmits at 60.00Hz if that's the frequency asked for by moonlight.

Additional Context

My report is based on moonlight-qt 5.0.1 running on a Pi4 without underlying desktop environment. (I can gladly provide more details.) Since I did not trust my TV to properly distinguish between 59.94Hz and 60Hz in its OSD I checked using the tvservice command:

pi@moonlight:~ $ tvservice --help
Usage: tvservice [OPTION]...
(...)
  -s, --status                      Get HDMI status
  -a, --audio                       Get supported audio information
  -d, --dumpedid <filename>         Dump EDID information to file(...)
pi@moonlight:~ $ tvservice -s
state 0xa [HDMI CUSTOM RGB full 16:9], 1920x1080 @ 60.00Hz, progressive

I also checked the TV's edid:

pi@moonlight:~ $ tvservice -d edid
Written 256 bytes to edid
pi@moonlight:~ $ edid-decode edid | grep 1920
    DMT 0x52:  1920x1080   60.000 Hz  16:9    67.500 kHz 148.500 MHz
    VIC  16:  1920x1080   60.000 Hz  16:9    67.500 kHz 148.500 MHz
    VIC  31:  1920x1080   50.000 Hz  16:9    56.250 kHz 148.500 MHz
    VIC   5:  1920x1080i  60.000 Hz  16:9    33.750 kHz  74.250 MHz
    VIC  20:  1920x1080i  50.000 Hz  16:9    28.125 kHz  74.250 MHz
    VIC  32:  1920x1080   24.000 Hz  16:9    27.000 kHz  74.250 MHz
    VIC  33:  1920x1080   25.000 Hz  16:9    28.125 kHz  74.250 MHz
    VIC  34:  1920x1080   30.000 Hz  16:9    33.750 kHz  74.250 MHz
    VIC  64:  1920x1080  100.000 Hz  16:9   112.500 kHz 297.000 MHz
    VIC  63:  1920x1080  120.000 Hz  16:9   135.000 kHz 297.000 MHz
pi@moonlight:~ $ edid-decode edid | grep 59.9
    DMT 0x04:   640x480    59.940 Hz   4:3    31.469 kHz  25.175 MHz
    DMT 0x3a:  1680x1050   59.954 Hz  16:10   65.290 kHz 146.250 MHz
    VIC   3:   720x480    59.940 Hz  16:9    31.469 kHz  27.000 MHz

Apparently the TV doesn't even support 1920x1080@59.94. So the client should indeed be running at 60.00Hz.

On the host (AMD / Wayland / Gnome), I'm using gnome-randr.py to change my resolution and refresh rate on the fly. This tool is working great for Gnome/Wayland. Here is the relevant output:

$ gnome-randr.py --current | grep '\*'
    1920x1080  164.83     143.85     120.00     119.88     100.00      89.93      60.00*     59.94      50.00
 [x1.0+, x2.0]

The asterisk indicates the active resolution.

Yet, here's an excerpt of moonlight's log:

00:08:46 - SDL Info (0): Global video stats
00:08:46 - SDL Info (0): ----------------------------------------------------------
Incoming frame rate from network: 59.93 FPS
Decoding frame rate: 59.93 FPS
Rendering frame rate: 59.90 FPS
Frames dropped by your network connection: 0.00%
Frames dropped due to network jitter: 0.04%
Average network latency: 1 ms (variance: 0 ms)
Average decoding time: 6.21 ms
Average frame queue delay: 0.27 ms
Average rendering time (including monitor V-sync latency): 10.48 ms

As you can see, Sunshine seems to transmit at the historic NTSC framerate of 59.94 Hz instead of the 60 Hz used by the client and the host and configured in Moonlight. If I leave the stream running for longer, it averages out at exactly 59.94. This can't be a coincidence, right? The same logfile also has these entries (excerpt of grep 60):

00:00:12 - SDL Info (0): Video stream is 1920x1080x60 (format 0x100)
00:00:12 - SDL Info (0): Found display mode with desktop resolution: 1920x1080x60
00:00:13 - SDL Info (0): Frame pacing: target 60 Hz with 60 FPS stream

If you need more info, please tell me.

Host Operating System

Linux

Operating System Version

Ubuntu 22.04.4

Architecture

64 bit

Sunshine commit or version

0.22.2

Package

Linux - deb

GPU Type

AMD

GPU Model

AMD RX6650XT

GPU Driver/Mesa Version

mesa 23.3.6

Capture Method (Linux Only)

KMS

Config

global_prep_cmd = [{"do":"sh -c \"gnome-randr.py --output DP-1 --mode ${SUNSHINE_CLIENT_WIDTH}x${SUNSHINE_CLIENT_HEIGHT} --rate ${SUNSHINE_CLIENT_FPS}\"","undo":"gnome-randr.py --output DP-1 --mode 2560x1440 --rate 165"},{"do":"","undo":""}]
origin_pin_allowed = lan
address_family = both

Apps

No response

Relevant log output

# no framerate mentioned in the log.

[2024:03:22:19:35:08]: Info: Executing Do Cmd: [sh -c "gnome-randr.py --output DP-1 --mode ${SUNSHINE_CLIENT_WIDTH}x${SUNSHINE_CLIENT_HEIGHT} --rate ${SUNSHINE_CLIENT_FPS}"]
[2024:03:22:19:35:08]: Info: Spawning [steam steam://rungameid/275390] in ["/usr/bin"]
[2024:03:22:19:35:08]: Info: Executing [Desktop]
[2024:03:22:19:35:09]: Info: CLIENT CONNECTED
[2024:03:22:19:35:09]: Info: /dev/dri/card0 -> amdgpu
[2024:03:22:19:35:09]: Info: Found display [wayland-0]
[2024:03:22:19:35:09]: Info: Found interface: wl_output(4) version 3
[2024:03:22:19:35:09]: Info: Found interface: zxdg_output_manager_v1(5) version 3
[2024:03:22:19:35:09]: Info: Resolution: 1920x1080
[2024:03:22:19:35:09]: Info: Offset: 0x0
[2024:03:22:19:35:09]: Info: Logical size: 1920x1080
[2024:03:22:19:35:09]: Info: Name: DP-1
[2024:03:22:19:35:09]: Info: Found monitor: GIGA-BYTE TECHNOLOGY CO., LTD. 32"
[2024:03:22:19:35:09]: Info: -------- Start of KMS monitor list --------
[2024:03:22:19:35:09]: Info: Monitor 0 is DP-1: GIGA-BYTE TECHNOLOGY CO., LTD. 32"
[2024:03:22:19:35:09]: Info: --------- End of KMS monitor list ---------
[2024:03:22:19:35:09]: Info: Screencasting with KMS
[2024:03:22:19:35:09]: Info: /dev/dri/card0 -> amdgpu
[2024:03:22:19:35:09]: Info: Found monitor for DRM screencasting
[2024:03:22:19:35:09]: Info: Found connector ID [91]
[2024:03:22:19:35:09]: Info: Found cursor plane [75]
[2024:03:22:19:35:09]: Info: SDR color coding [Rec. 601]
[2024:03:22:19:35:09]: Info: Color depth: 8-bit
[2024:03:22:19:35:09]: Info: Color range: [MPEG]
[2024:03:22:19:35:09]: Error: [hevc_vaapi @ 0x760ccc234100] No usable encoding entrypoint found for profile VAProfileHEVCMain (17).
[2024:03:22:19:35:09]: Info: Retrying with fallback configuration options for [hevc_vaapi] after error: Function not implemented
[2024:03:22:19:35:09]: Warning: [hevc_vaapi @ 0x760cccaa0600] Driver does not support some wanted packed headers (wanted 0xd, found 0x1).
[2024:03:22:19:35:09]: Info: Setting default sink to: [sink-sunshine-stereo]
[2024:03:22:19:35:09]: Info: Found default monitor by name: sink-sunshine-stereo.monitor
[2024:03:22:19:43:43]: Info: CLIENT DISCONNECTED
gschintgen commented 6 months ago

I also increased Sunshine's log verbosity (to debug and even verbose), but the only references to a framerate that I could find are lines like this:

Debug: [hevc_vaapi @ 0x5edd05815e00] RC framerate: 60/1 (60.00 fps).

I have also now found #1998, which has some superficial similarities. But that issue is for Windows with Nvidia, and there are apparently log messages showing that Sunshine reduces its capture rate on purpose. There are no such messages here, and it's a completely different capture and encoding pipeline.

gschintgen commented 6 months ago

Newest moonlight logs:

00:00:16 - SDL Info (0): Received first audio packet after 2300 ms
00:00:16 - SDL Info (0): Initial audio resync period: 500 milliseconds
00:00:17 - SDL Warn (0): Unable to set audio thread to high priority: setpriority() failed
00:15:54 - SDL Info (0): Detected quit gamepad button combo
00:15:54 - SDL Info (0): Quit event received
00:15:54 - SDL Info (0): Global video stats
00:15:54 - SDL Info (0): ----------------------------------------------------------
Incoming frame rate from network: 59.94 FPS
Decoding frame rate: 59.94 FPS
Rendering frame rate: 59.92 FPS
Frames dropped by your network connection: 0.00%
Frames dropped due to network jitter: 0.02%
Average network latency: 1 ms (variance: 0 ms)
Average decoding time: 0.61 ms
Average frame queue delay: 0.09 ms
Average rendering time (including monitor V-sync latency): 2.32 ms

Yet, Sunshine's video stream is still at 59.94 fps instead of the expected 60.00.

gschintgen commented 6 months ago

This time I'm using h264 software encoding on the host, instead of VA-API h265. The resulting stream still has that (almost) 59.94 fps framerate:

00:00:16 - SDL Info (0): Dropping window event during flush: 6 (1920 1080)
00:00:16 - SDL Info (0): IDR frame request sent
00:00:18 - SDL Info (0): Received first audio packet after 2300 ms
00:00:18 - SDL Info (0): Initial audio resync period: 500 milliseconds
00:00:18 - SDL Warn (0): Unable to set audio thread to high priority: setpriority() failed
00:11:14 - SDL Info (0): Audio packet queue overflow
00:11:14 - SDL Info (0): Network dropped audio data (expected 85, but received 114)
00:27:28 - SDL Info (0): Audio packet queue overflow
00:27:28 - SDL Info (0): Network dropped audio data (expected 63759, but received 63788)
00:31:13 - SDL Info (0): Detected quit gamepad button combo
00:31:13 - SDL Info (0): Quit event received
00:31:13 - SDL Info (0): Global video stats
00:31:13 - SDL Info (0): ----------------------------------------------------------
Incoming frame rate from network: 59.93 FPS
Decoding frame rate: 59.93 FPS
Rendering frame rate: 59.90 FPS
Frames dropped by your network connection: 0.00%
Frames dropped due to network jitter: 0.03%
Average network latency: 1 ms (variance: 0 ms)
Average decoding time: 0.53 ms
Average frame queue delay: 0.10 ms
Average rendering time (including monitor V-sync latency): 2.54 ms
00:31:13 - SDL Info (0): Stopping input stream...

I really don't get it. If my host's display were configured to refresh at 60Hz, I'd suspect that it's lying and that somehow the true refresh rate is 59.94Hz and due to some unintended syncing while capturing the stream ends up at 59.94Hz instead of a clean 60. But as in the previous comment the screen currently stays at 165Hz while streaming.

Is it possible that some games or emulators output their frames at 59.94Hz and some unintended(?) locking/synchronizing is happening when sunshine captures the frames? Or is sunshine maybe misconfiguring the encoders somehow, so that they're expecting 59.94 fps but sunshine is giving them 60?

gschintgen commented 6 months ago

In order to rule out an unfortunate choice of streamed games/emulators, I now tested a simple stream of my desktop showing glxgears. The host screen has been manually set to 1920x1080@60Hz. On the host, glxgears runs as perfectly as possible at 60 fps (mostly showing 59.999, 60.000 or 60.001 fps). On the receiving end in moonlight I get these stats after some 10 minutes:

00:11:25 - SDL Info (0): Global video stats
00:11:25 - SDL Info (0): ----------------------------------------------------------
Incoming frame rate from network: 59.95 FPS
Decoding frame rate: 59.95 FPS
Rendering frame rate: 59.93 FPS
Frames dropped by your network connection: 0.00%
Frames dropped due to network jitter: 0.01%
Average network latency: 1 ms (variance: 0 ms)
Average decoding time: 0.61 ms
Average frame queue delay: 0.08 ms
Average rendering time (including monitor V-sync latency): 2.26 ms

(This time the host config was: X11, KMS capture, AMD VA-API h265 encoding)

The difference between framerates of 59.94 Hz and 60 Hz is such that a gap of one whole frame builds up roughly every 17 seconds, i.e. approximately 4 times per minute.

Nonary commented 6 months ago

Here are the best practices for getting a smooth stream:

  1. Disable VSYNC in the game. Some games may actually perform better with VSYNC enabled, but the vast majority will not benefit from it while streaming.

  2. Unlock the game's framerate. If possible, configure the game for unlimited frames. A lot of games have poor frame limiters with bad frame pacing, which may not be noticeable when playing on a GSYNC/FreeSync display, but these technologies do not work for streaming.

  3. Apply a framerate limit that matches your target stream frame rate: if streaming at 60 fps, apply a 60 fps limit. Use software such as RTSS or the GPU drivers. This is preferred over in-game frame limiters, as it is more consistent at keeping the frame pacing as close to perfect as possible.

  4. Configure the PC for High Performance mode. Balanced mode is known to cause microstuttering; as mentioned before, GSYNC/FreeSync does not benefit streaming, so you wouldn't notice these microstutters locally.

  5. Watch GPU utilization and keep it under 95%. Again, this has a lot to do with the fact that streaming does not support GSYNC/FreeSync. If your GPU usage is that high, it is likely to drop frames and cause performance problems while streaming.

  6. Turn on the frame pacing and VSYNC options in the Moonlight client, if applicable.

Before determining that there are issues with your stream, double-check that the game itself is not dropping frames. Use RTSS or other benchmarking software (such as MangoHud on Linux) to display frametimes, so it is obvious whether frame drops are happening on the client side instead of the host side.

Closing this issue as it is not a Sunshine bug or issue

gschintgen commented 6 months ago

Thanks for posting the suggestions.

Though I'm still not convinced that there isn't a capture/encoding issue lurking somewhere: the stream consistently ending up at 59.94 fps is quite fishy in my opinion.

But I do understand that the initial report may have come across as too unspecific or as smelling of user error (which I can't exclude entirely). I won't insist unless I have something major to add.

PS: the smooth-stream suggestions above could make a valuable addition to the official documentation.

peperunas commented 6 months ago

I guess I'm having, if not the same, then a very similar issue, hence I would keep this bug open for further analysis. I posted my issue both on Reddit and on the Discord server, but never opened a report here on GitHub:

Original issue

I've been unable to achieve a smooth video stream, which contrasts sharply with the seamless experience provided by Steam Link.

Issue Summary:

Troubleshooting Steps Taken:

Technical Details:

Configuration:

capture=nvfbc 
encoder=nvenc 
wan_encryption_mode=0 
lan_encryption_mode=0 
origin_web_ui_allowed=lan 
channels=2 
min_threads=8 
nvenc_preset=7 
nvenc_twopass=full_res 
nvenc_vbv_increase=400 
fec_percentage=150

Given the steps I've already taken and the configuration details provided, I'm at a loss for what to try next. Additionally, the Reddit thread shows other users in the same boat.

gschintgen commented 6 months ago

I did some more testing, but I'll spare you the details (tcpdump, computing timestamp differences of image bursts, some basic statistics). Not only is the average frame time slightly higher than the expected 16.6666 ms (for 60 Hz), but the median is also longer. (This was a regular desktop stream from a host in an xorg session.)

I inspected the code in kmsgrab.cpp, in particular this part here: https://github.com/LizardByte/Sunshine/blob/de97eacd309ef6a7b9be776df2a7c69726d97ec3/src/platform/linux/kmsgrab.cpp#L1411-L1425

I'm definitely not an expert on this, but I am wondering how precise that std::this_thread::sleep_for(1ns) is in practice. Because the delay (0.016666666 seconds) is added to now rather than to the previous next_frame, couldn't any minimal overshoot accumulate and lead to a slow but non-zero drift compared to an exact cadence? A few microseconds here, a few there?

Again, not an expert, but wouldn't it be possible to do something like

next_frame += delay;
if (next_frame < now) { // some major slowdown happened; we couldn't keep up
    next_frame = now + delay;
}
ReenigneArcher commented 6 months ago

@gschintgen I think it's worth trying. If you put up a PR it would be easier for people to test.

gschintgen commented 6 months ago

Ok, will do.


gschintgen commented 6 months ago

I'm definitely on to something ;-) Here's with the proposed change:

00:15:55 - SDL Info (0): Global video stats
00:15:55 - SDL Info (0): ----------------------------------------------------------
Incoming frame rate from network: 60.01 FPS
Decoding frame rate: 60.01 FPS
Rendering frame rate: 60.00 FPS
Frames dropped by your network connection: 0.00%
Frames dropped due to network jitter: 0.01%
Average network latency: 1 ms (variance: 0 ms)
Average decoding time: 0.61 ms
Average frame queue delay: 0.08 ms
Average rendering time (including monitor V-sync latency): 5.11 ms

The quality of the stream (frame rate, microstutters) improves subjectively and objectively. Given the absolutely perfect 60Hz, I'm convinced that the one slight stutter was actually just host-side and streamed as-is.

The PR will come over the weekend.

(The 59.94 Hz NTSC frequency was just a coincidence after all.)

@peperunas I'm not sure any of this is related to nvidia-related codepaths, but I'm very grateful that you got the issue reopened!

peperunas commented 6 months ago

@gschintgen That's great news! I did not do anything; you are doing the heavy lifting :-)

Please feel free to contact me; I would be glad to test and assist with the development. I did not have time to dig deeper as you did, but now that we have a place to focus on, I'd gladly help however possible.

Great job and thanks @ReenigneArcher for re-opening the issue!

gschintgen commented 6 months ago

I filed PR #2333. @peperunas feel free to check it out and test it. (I'm not sure if, when, where binaries will be available.)

ReenigneArcher commented 6 months ago

I'm not sure if, when, where binaries will be available

You must be logged into GitHub; then you can grab most Linux artifacts from here: https://github.com/LizardByte/Sunshine/actions/runs/8493409909#artifacts

That will have everything except AppImage and Flatpak.

kstorbakken commented 6 months ago

I recently switched from Windows to Arch and was noticing this stuttering issue every few seconds. I tried a bunch of things listed above I already knew about and it didn't help. I tested the latest build on the related PR to this issue with X11 capture and it fixed all stuttering issues for me. Just thought you might find this feedback helpful.

gschintgen commented 6 months ago

Thank you very much for your feedback! How do you encode the captured frames? In hardware I suppose? AMD/Intel/Nvidia?


kstorbakken commented 6 months ago

Ahh shoot, yes, I meant to mention that. I'm on Nvidia and using NVENC. I haven't played around with patching NvFBC yet, but I'll probably test that and report back if nobody else gets to it before me.

peperunas commented 6 months ago

Great job @gschintgen and thanks for your contribution!

I briefly tested your fork and it seems the microstutters are gone / reduced significantly.

Looks great for me!

samkitty commented 6 months ago

Perhaps a stupid question, but does this affect Windows as well? I've always had microstutters on my Windows 11 Pro build using an Nvidia RTX3080. My performance overlay also shows the dreaded 59.94fps despite using a 60hz display client (Logitech G Cloud), but I haven't looked into this enough to understand what causes it.

gschintgen commented 6 months ago

Perhaps a stupid question, but does this affect Windows as well? I've always had microstutters on my Windows 11 Pro build using an Nvidia RTX3080. My performance overlay also shows the dreaded 59.94fps despite using a 60hz display client (Logitech G Cloud), but I haven't looked into this enough to understand what causes it.

I don't think so. The Windows capture loop (or are there multiple different ones?) seems to be this one: https://github.com/LizardByte/Sunshine/blob/bb7c2d50ef50bf5bca34c2b873c9cf74e0745baa/src/platform/windows/display_base.cpp#L159-L186

It is quite a bit more involved than the one on Linux, but it does seem to compute a specific theoretical target time for each frame that is to be captured. This is done specifically in order not to accumulate sleep overshoot, as it still does on Linux. See PR #1288 (which I only found just now using git blame). So in essence, the present type of bug (accumulating overshoot of waiting times) was fixed for Windows ca. 8 months ago.

Then again, I'm quite new to this entire codebase and my C++ is rusty to say the least.

In my experience the on-screen latency stats fluctuate a bit too much to properly distinguish e.g. 59.94 Hz from 60.00 Hz. The first step would be to ensure that you're not inadvertently synchronizing to 59.94 Hz on your host (e.g. vsync to a display at that refresh rate). I also found the moonlight-qt console output (which I've simply saved as a logfile) quite helpful, since it gives global stats instead of a momentary snapshot.

Once you have noteworthy evidence, you can file a new issue to properly document under what conditions the stream doesn't achieve the expected framerate. If you are constantly updating the screen at a precise 60 fps (via vsync or a framecap) and request a 60 fps stream, then the stream should indeed run at 60.0 fps, as illustrated here. (Caveat: I've read multiple times that on Windows frames are apparently only captured if the actual display content has changed, hence the need to constantly update the screen contents if you want to replicate this testing methodology.)

samkitty commented 6 months ago

So in essence the present type of bug (accumulating overshoot of waiting times) has been fixed for Windows ca. 8 months ago.

Thanks for confirming. I will get some testing done and collect enough evidence to open a new issue if applicable.

ReenigneArcher commented 6 months ago

on Windows frames will only be captured if the actual display content has changed

I think related to https://github.com/LizardByte/Sunshine/pull/754