Unity-Technologies / com.unity.webrtc

WebRTC package for Unity

[BUG]: Input latency is way too high on simple projects #801

Closed doctorpangloss closed 2 years ago

doctorpangloss commented 2 years ago

Package version

2.4.0-exp.10

Environment

* OS: Windows 11
* Unity version: 2021.3

Steps To Reproduce

  1. Run a project with NvEncoder streaming.
  2. Observe input latencies of 60-150ms on a well-specced machine locally connected (0ms RTTs).

Current Behavior

The latency is roughly 60-150ms.

Expected Behavior

The latency should be close to 0ms. This was achievable with 2.4.0-exp.5 (before the encoder refactor).

Anything else?

No response

doctorpangloss commented 2 years ago

Maybe avoiding the copy will help. The profiler measurements for WebRTC.EncodeFrame with NvEncoder show durations as high as 7ms, which is really surprising and suggests it is misconfigured.

7ms is within typical encoding duration - https://parsec.app/blog/nvidia-nvenc-outperforms-amd-vce-on-h-264-encoding-latency-in-parsec-co-op-sessions-713b9e1e048a

doctorpangloss commented 2 years ago

I don't think the copying, which takes like 0.12ms, has anything to do with it.

karasusan commented 2 years ago

@doctorpangloss Thank you for measuring the latency and reporting it. I suspect the latency has increased since PR #650, which moves the encoding load off the rendering thread. PR #728 also affects the latency because it adds a wait for frame timing in order to keep the framerate steady.

I suspect our multi-threaded encoder implementation has a problem that makes the latency worse. I need to check in more detail. One question, I would like to know how you measure the latency.
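To picture why a frame-timing wait can add latency, here is a minimal model, not the package's code. It assumes a fixed-rate pacing tick; the actual mechanism in PR #728 is not shown in this thread, and `pacing_wait_ms` is a hypothetical helper name.

```python
import math


def pacing_wait_ms(ready_ms: float, fps: float) -> float:
    """Return how long a frame that is ready at `ready_ms` waits for the next tick.

    Assumption: ticks fire at multiples of the frame interval (1000 / fps),
    which is a simplification of whatever the real framerate control does.
    """
    interval_ms = 1000.0 / fps
    # Next tick at or after the moment the frame became ready.
    next_tick_ms = math.ceil(ready_ms / interval_ms) * interval_ms
    return next_tick_ms - ready_ms
```

Under this model a frame that just misses a tick waits almost a full interval, so the added latency is bounded by `1000 / fps` milliseconds per frame (about 33ms at 30 fps).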

doctorpangloss commented 2 years ago

> One question, I would like to know how you measure the latency.

I put a game object in the scene whose position is set to whatever the mouse's position is, and I move it around. It's very subjective.
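A less subjective variant of this test would be to log a timestamp when the input event is sent and another when the corresponding frame is displayed, then summarize the differences. The sketch below assumes both timestamp lists come from the same clock and are already paired in order; `latency_stats_ms` is a hypothetical helper, not part of the package.

```python
from statistics import median


def latency_stats_ms(input_times_ms, display_times_ms):
    """Summarize input-to-display latency from paired timestamps (milliseconds)."""
    if len(input_times_ms) != len(display_times_ms):
        raise ValueError("timestamp lists must have the same length")
    if not input_times_ms:
        raise ValueError("at least one sample is required")
    # One latency sample per (input, display) pair.
    deltas = sorted(d - i for i, d in zip(input_times_ms, display_times_ms))
    return {"min": deltas[0], "median": median(deltas), "max": deltas[-1]}
```

For example, `latency_stats_ms([0, 100, 200], [60, 250, 290])` yields a minimum of 60, a median of 90, and a maximum of 150.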

doctorpangloss commented 2 years ago

I am experimenting with improvements.

UnityVideoTrackSource.OnFrameCaptured - what does it do? Why is this so slow? I see encoding separately from this preamble. Blitting + "OnFrameCaptured" (which does not include the encoding step) is as long as 7ms on my 5950x + 3090.

karasusan commented 2 years ago

I turned off framerate control.

Have you modified the native code to improve performance? As you know, UnityVideoTrackSource.OnFrameCaptured passes frames to the encoder asynchronously to keep the encoding framerate. It may be possible to improve the latency if we make this process run synchronously.

doctorpangloss commented 2 years ago

> Have you modified the native code to improve performance?

Only by reverting the framerate control patch.

> It may be possible to improve the latency if we make this process run synchronously.

I think so. The current way you put the NvEnc encoding step on the encoder queue seems right to me. Enabling NvEnc async for Windows only would achieve something similar. What I am trying to figure out is how to prevent the waiting that occurs in OnFrameCaptured:

[Screenshot: Capture custompass]

In this screenshot, observe OnFrameCaptured is 2.6ms on the render thread. Right at the end of that scope, Encoder Queue thread starts its NvEnc work. Sometimes OnFrameCaptured is as long as 7.6ms. All the other work it's doing is fast. It appears that the waiting in OnFrameCaptured is happening on the render thread.
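The hand-off being discussed, where the render thread only enqueues a frame and a separate encoder thread does the work, can be sketched as follows. This is an illustrative model only: `EncoderQueue` and the `encode` callable are hypothetical stand-ins, not the package's classes or the NvEnc API.

```python
import queue
import threading


class EncoderQueue:
    """Runs `encode` on a dedicated worker thread, fed by a FIFO queue."""

    def __init__(self, encode):
        self._encode = encode
        self._frames = queue.Queue()  # unbounded, so submit() never blocks
        self.results = []
        self._worker = threading.Thread(
            target=self._run, name="encoder-queue", daemon=True
        )
        self._worker.start()

    def submit(self, frame):
        """Called from the render thread; only enqueues the frame."""
        self._frames.put(frame)

    def close(self):
        """Signal the worker to stop, then wait for queued frames to finish."""
        self._frames.put(None)
        self._worker.join()

    def _run(self):
        while True:
            frame = self._frames.get()
            if frame is None:  # sentinel pushed by close()
                break
            thread_name = threading.current_thread().name
            self.results.append((thread_name, self._encode(frame)))
```

In this model the caller of `submit` never waits for the encode itself, so any stall seen on the render thread has to come from work done before the hand-off, which matches the observation that the waiting happens inside OnFrameCaptured.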

[Screenshot: Capturex]

Okay I see GpuMemoryBufferPool::CreateFrame is very slow. I will try experimental/direct-renderer.

doctorpangloss commented 2 years ago

Why not map the Unity RenderTexture to a CUDA resource directly? Why copy?

Can we retain the SRP cameraColorBuffer somehow?

doctorpangloss commented 2 years ago

I see there need to be three fixes

doctorpangloss commented 2 years ago

I've documented my conclusions here: https://github.com/Unity-Technologies/com.unity.webrtc/issues/803