secondlife / viewer

🖥️ Second Life's official client
GNU Lesser General Public License v2.1

[webRTC] audio latency on Mac is higher than desired #3002

Open maestrolinden opened 2 weeks ago

maestrolinden commented 2 weeks ago

Environment

Audio output device: Built-in Speakers
Audio recording device: Built-in Microphone

extraFPS 7.1.11.11565212741 (64bit)
Release Notes

You are at 128.0, 128.0, 23.0 in webRTC2 located at simhost-07c23d76c83837d55.aditi
SLURL: secondlife://Aditi/secondlife/webRTC2/128/128/23
(global coordinates 260224.0, 245376.0, 23.0)
WebRTC-Voice 2024-10-29.11582869566
Release Notes

CPU: Apple M1 Pro (2400 MHz)
Memory: 16384 MB
OS Version: macOS 14.7.0 Darwin 23.6.0 Darwin Kernel Version 23.6.0: Wed Jul 31 20:49:39 PDT 2024; root:xnu-10063.141.1.700.5~1/RELEASE_ARM64_T6000 x86_64
Graphics Card Vendor: Apple
Graphics Card: Apple M1 Pro

OpenGL Version: 4.1 Metal - 88.1

Window size: 1168x868
Font Size Adjustment: 96pt
UI Scaling: 0.8
Draw distance: 96m
Bandwidth: 10000kbit/s
LOD factor: 1.125
Render quality: 1
Texture memory: 10922MB
Disk cache: Max size 1638.4 MB (100.0% used)
HiDPI display mode:

J2C Decoder Version: KDU v7.10.4
Audio Driver Version: OpenAL, version 1.1 ALSOFT 1.23.1 / OpenAL Community / OpenAL Soft: OpenAL Soft
Dullahan: 1.14.0.202408091638
CEF: 118.4.1+g3dd6078+chromium-118.0.5993.54
Chromium: 118.0.5993.54
LibVLC Version: 3.0.21
Voice Server Version: Secondlife WebRTC Gateway

Packets Lost: 56/1958 (2.9%)
November 01 2024 11:57:26

Description

The one-way audio latency for webRTC voice on my MacBook Pro M1 is approximately 360ms from my location. In contrast, when I repeat the measurement with the same viewer & server version on Windows 10, end-to-end latency is approximately 200ms, which we consider to be about as good as possible.

The good news is that webRTC audio latency is better than when I perform the same measurement on a spatial Vivox call - I measure 490ms with the same viewer on a Vivox/SLS region.

The higher latency on Mac might be beyond our control, but I believe it can be lower based on my tests with Google Meet calls on the same hardware. I performed a few tests calling myself, and got these results:

While Safari has atrocious audio latency (even with the "Noise cancellation" option disabled), Firefox is pretty quick. Ideally, the SL viewer would be able to shave off ~100ms to get similar results.

My round-trip ICMP ping time to the grid (which has an additive effect on audio latency) is about 33ms.

Reproduction steps

  1. UserA and UserB (who are located in the same physical room): log in to the same location in a webRTC region with Mac viewers
    • An ideal location will not have any in-world sounds playing, and will allow the viewers to run at a high frame rate.
  2. UserA and UserB: In your viewer Preferences, disable these settings (which contribute to latency when enabled):
    • Noise Suppression
    • Echo Cancellation
    • Automatic Gain Control
  3. Using an audio editing app such as Audacity, start recording audio in the room, with the microphone placed so that both UserA’s voice and UserB’s output speaker can be recorded.
  4. UserA: press the ‘Speak’ button in the viewer, and emit a short “click” sound (such as a “tongue click”) - Verify that the sound is heard on UserB’s speaker
  5. UserA: release the ‘Speak’ button, and stop the audio recording
  6. View the audio recording in Audacity (or similar software used), and measure the difference in the timestamps of UserA’s spoken ‘click’ sound and the time the sound is recorded on UserB’s speaker. This is the measured one-way audio latency.
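The timestamp measurement in step 6 can also be done programmatically instead of by eye. Below is a minimal sketch, assuming the recording has already been loaded as a list of float samples in the range -1.0 to 1.0 (for example from a WAV file exported by Audacity). The function names, the 0.5 threshold and the 50ms minimum gap are all hypothetical and would need tuning for a real room recording. A synthetic signal with two clicks placed 360ms apart stands in for real data.

```python
def find_click_onsets(samples, sample_rate, threshold=0.5, min_gap_s=0.05):
    """Return onset times (seconds) of loud bursts in the recording.

    A new onset is counted only when no sample has exceeded the threshold
    during the preceding min_gap_s, so one click is not counted twice.
    """
    min_gap = int(min_gap_s * sample_rate)
    onsets = []
    last_loud = -min_gap  # lets a click at sample 0 count as an onset
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if i - last_loud >= min_gap:
                onsets.append(i / sample_rate)
            last_loud = i
    return onsets


def one_way_latency_ms(samples, sample_rate, **kwargs):
    """Latency between the first click (UserA) and the second (UserB's speaker)."""
    onsets = find_click_onsets(samples, sample_rate, **kwargs)
    if len(onsets) < 2:
        raise ValueError("expected at least two clicks in the recording")
    return (onsets[1] - onsets[0]) * 1000.0


# Synthetic stand-in for a real recording: one second of silence at 48kHz,
# a 1ms click at 0.100s and a quieter copy at 0.460s (assumed values).
SAMPLE_RATE = 48_000
recording = [0.0] * SAMPLE_RATE
for start, amplitude in ((4_800, 0.9), (22_080, 0.6)):
    for i in range(start, start + 48):
        recording[i] = amplitude

latency = one_way_latency_ms(recording, SAMPLE_RATE)
print(f"{latency:.1f} ms")
```

On a real recording, room echo or background noise can trigger extra onsets, so it is worth checking the detected times against the waveform before trusting the number.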

Expected results: Ideally, end-to-end audio latency from my location (~30ms ping to the grid) would be at or near 200ms, which is what I see with the viewer on a Windows machine. But latency similar to what Firefox gets (<300ms) would be a noticeable improvement too.

Actual results: End-to-end audio latency on Mac is about 360ms, which is higher than ideal.
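To put the numbers above side by side, here is a rough latency budget. It is only a sketch: it assumes the voice path from UserA to the grid and back to UserB costs about one ping round trip (roughly 33ms), which may not hold for the real media route. The variable and function names are made up for illustration.

```python
# Figures quoted in this issue (milliseconds).
PING_RTT_MS = 33
MEASURED_ONE_WAY_MS = {"mac": 360, "windows": 200}
FIREFOX_CEILING_MS = 300  # "<300ms" from the expected results


def non_network_ms(measured_ms, rtt_ms=PING_RTT_MS):
    """Latency left after subtracting the assumed network share (one RTT)."""
    return measured_ms - rtt_ms


budget = {name: non_network_ms(ms) for name, ms in MEASURED_ONE_WAY_MS.items()}
gap_vs_windows = MEASURED_ONE_WAY_MS["mac"] - MEASURED_ONE_WAY_MS["windows"]
gap_vs_firefox_ceiling = MEASURED_ONE_WAY_MS["mac"] - FIREFOX_CEILING_MS

print(budget)
print(gap_vs_windows, gap_vs_firefox_ceiling)
```

Under that assumption, most of the latency on both platforms is outside the network, the Mac viewer is about 160ms behind Windows, and it would need to shed more than 60ms to get under the 300ms mark.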

maestrolinden commented 2 weeks ago

@AndrewMeadows suggested that the onboard microphone might not be the lowest-latency input, as it's likely performing noise cancellation against the speakers located right next to it. @roxanneskelly also suggested trying different sampling rates on the devices: webRTC voice is transmitted at 48kHz, so devices running natively at 48kHz should not need resampling.

Taking both of those comments into consideration, I modified my test to use a USB headset mic for recording. The device defaults to a 44.1kHz sampling rate, but can also be configured for 48kHz in the "Audio MIDI Setup" app. I ran several tests with the USB mic set to 44.1kHz and 48kHz. The output device (onboard speakers) was already set to 48kHz, so I didn't touch that setting.

With the USB headset mic set to 48kHz, the measured audio latency ranged from 290 - 420ms in a series of 10 tests. With the USB headset mic set to 44.1kHz, the latency ranged from 310 - 420ms. I'm not sure why there was such a wide range of measured latency, but in any case it seems like 48kHz capture is slightly better. This is not a surprise, as voice data is transmitted at 48kHz and it avoids the need to resample.
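For reference, the resampling step that 48kHz capture avoids can be made concrete with a little arithmetic. This sketch only computes the rate conversion ratio between a device rate and the 48kHz transport rate. It says nothing about how much latency a given resampler adds, and the function name is hypothetical.

```python
from fractions import Fraction

TRANSPORT_RATE_HZ = 48_000  # voice transport rate stated in this thread


def resample_ratio(device_rate_hz, transport_rate_hz=TRANSPORT_RATE_HZ):
    """Output samples produced per input sample, as an exact fraction."""
    return Fraction(transport_rate_hz, device_rate_hz)


for rate in (44_100, 48_000):
    ratio = resample_ratio(rate)
    needs_resampling = ratio != 1
    print(rate, ratio, needs_resampling)
```

A 44.1kHz device needs a 160/147 conversion, while a 48kHz device has a ratio of 1 and can pass samples through unchanged.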