sipsorcery-org / sipsorcery

A WebRTC, SIP and VoIP library for C# and .NET. Designed for real-time communications apps.
https://sipsorcery-org.github.io/sipsorcery

High latency in one direction only when connected to conference call #1099

Open Oudalally opened 5 months ago

Oudalally commented 5 months ago

We have a phone system running on Asterisk, with both a SipSorcery instance and a Cisco IP phone connecting to a conference call.

Audio from the softphone to the handset arrives with minimal latency and works perfectly. However, audio from the Cisco phone back to the softphone has high latency. This was initially around 4 seconds, and adding a buffer reduced it to around 1 second. That is an improvement, but still too high.

Connecting 2 Cisco phones to the conference in the same way gives no noticeable latency (there is some, but it's less than 100ms so nothing to be concerned about).

The implementation uses NAudio (via SipSorceryMedia.Windows) and a VoIPMediaSession to handle the RTP, as shown below. Is there something I've missed here, or is this indicative of a bug?


audioEndpoint = new WindowsAudioEndPoint(encoder, outputDeviceIndex, inputDeviceIndex);

List<AudioFormat> sinkFormats = audioEndpoint.GetAudioSinkFormats();
List<AudioFormat> sourceFormats = audioEndpoint.GetAudioSourceFormats();

AudioFormat sourceAudioFormat = sourceFormats.FirstOrDefault(f => f.FormatName == "PCMA");
audioEndpoint.SetAudioSourceFormat(sourceAudioFormat);

AudioFormat sinkAudioFormat = sinkFormats.FirstOrDefault(f => f.FormatName == "PCMA");      
audioEndpoint.SetAudioSinkFormat(sinkAudioFormat);

MediaEndPoints endpoints = audioEndpoint.ToMediaEndPoints();

voipMediaSession = new VoIPMediaSession(endpoints);
voipMediaSession.AcceptRtpFromAny = true;
voipMediaSession.AudioExtrasSource.AudioSamplePeriodMilliseconds = 500;

TimeSpan packetTimeout = new TimeSpan(0, 0, 0, 40);
voipMediaSession.AudioStream.AddBuffer(packetTimeout);
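
One detail in the snippet above that may be worth checking: the four-argument `TimeSpan` constructor takes days, hours, minutes and seconds, so `new TimeSpan(0, 0, 0, 40)` is 40 seconds, not 40 milliseconds. The thread does not establish whether this value contributes to the remaining latency. The sketch below only shows the units involved, using standard .NET calls.

```csharp
using System;

public static class TimeSpanUnits
{
    public static void Main()
    {
        // Four-argument constructor: (days, hours, minutes, seconds).
        TimeSpan fromSnippet = new TimeSpan(0, 0, 0, 40);

        // The factory methods make the unit visible at the call site.
        TimeSpan fortySeconds = TimeSpan.FromSeconds(40);
        TimeSpan fortyMillis = TimeSpan.FromMilliseconds(40);

        Console.WriteLine(fromSnippet == fortySeconds);    // True
        Console.WriteLine(fromSnippet.TotalMilliseconds);  // 40000
        Console.WriteLine(fortyMillis.TotalMilliseconds);  // 40
    }
}
```

If a short window was intended, `TimeSpan.FromMilliseconds(...)` states that directly.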

sipsorcery commented 5 months ago

4s of latency sounds crazy high. That makes the audio essentially unusable; participants will be having different conversations. Once latency gets over 1s calls can get awkward.

This library, and its siblings, focus on signalling rather than media, so I wouldn't claim it will give you an optimal audio experience. I recall a jitter buffer got added to one of the media libraries, so the first thing I would do is turn that off. It's likely the only thing that could add 4s of latency. Personally I always prefer to have a few dropped packets instead of a jitter buffer.

Oudalally commented 5 months ago

I'd be very happy with some dropped packets if it could get the latency down.... With a bit of tinkering, I managed to get it down to about 900ms (ish), but it's still a little high.

If I'm using NAudio and make changes to its settings before I set up the audio sources, are those settings carried over into SipSorcery? (Sorry, I'm not quite sure how to phrase the question.)

I've been looking for some sample code showing how to make changes to the NAudio settings which are then applied to the sip call, however I'm drawing a blank.

As best as I can work out, if I can avoid the BufferedWaveProvider, this could potentially eliminate the jitter buffer, but I'm unsure how to go about it as yet.
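
The trade-off discussed in this thread (accept dropped packets in exchange for lower latency) can be sketched without any audio library. The class below is a hypothetical helper, not part of SIPSorcery or NAudio: it caps how much audio may wait for playback and discards the oldest frames once the cap is exceeded. The frame size of 160 bytes assumes 20 ms of PCMA at 8 kHz. NAudio's `BufferedWaveProvider` has `BufferDuration` and `DiscardOnBufferOverflow` properties that serve a related purpose, but whether SipSorceryMedia.Windows lets you reach them is not something this thread confirms.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: a playout queue with a hard latency cap.
public sealed class BoundedPlayoutBuffer
{
    private readonly Queue<byte[]> _frames = new Queue<byte[]>();
    private readonly int _maxFrames;

    public int DroppedFrames { get; private set; }
    public int Count => _frames.Count;

    // frameMs: duration of one frame; maxLatencyMs: most audio allowed to queue.
    public BoundedPlayoutBuffer(int frameMs, int maxLatencyMs)
    {
        if (frameMs <= 0) throw new ArgumentOutOfRangeException(nameof(frameMs));
        _maxFrames = Math.Max(1, maxLatencyMs / frameMs);
    }

    // Adds a frame, then drops the oldest frames until the cap is respected.
    public void Enqueue(byte[] frame)
    {
        _frames.Enqueue(frame);
        while (_frames.Count > _maxFrames)
        {
            _frames.Dequeue();
            DroppedFrames++;
        }
    }

    // Returns the oldest queued frame, or false when nothing is waiting.
    public bool TryDequeue(out byte[] frame)
    {
        if (_frames.Count > 0)
        {
            frame = _frames.Dequeue();
            return true;
        }
        frame = Array.Empty<byte>();
        return false;
    }
}

public static class Demo
{
    public static void Main()
    {
        var buffer = new BoundedPlayoutBuffer(frameMs: 20, maxLatencyMs: 100);

        // Simulate a burst of 50 frames (1 second of audio) arriving at once.
        for (int i = 0; i < 50; i++) buffer.Enqueue(new byte[160]);

        Console.WriteLine(buffer.Count);          // 5
        Console.WriteLine(buffer.DroppedFrames);  // 45
    }
}
```

With a 100 ms cap the queue never holds more than five 20 ms frames, so a burst costs audible gaps rather than a growing delay.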