sipsorcery-org / SIPSorceryMedia.FFmpeg

GNU Lesser General Public License v2.1

Using a different codec than PCMU #18

Closed ChristopheI closed 2 years ago

ChristopheI commented 2 years ago

Currently, this library works well with the PCMU audio codec.

If I change to G722, a crash occurs in FFmpegAudioSource.AudioDecoder_OnAudioFrame when _audioEncoder.EncodeAudio() is called. I'm using the sample FFmpegFileAndDevicesTest with AudioCodecsEnum AudioCodec = AudioCodecsEnum.G722;

So I tried to add support for this codec and I succeeded, but with some reservations.

How I modified the code:

  1. _audioEncoder.EncodeAudio() crashes because the clock rate differs between PCMU and G722, so I need to resample the stream.
  2. For this I need the Resample() method of the AudioEncoder class, but it is not exposed by IAudioEncoder.
  3. So I added SIPSorcery as a new package and changed several constructors to take AudioEncoder instead of IAudioEncoder. Yes, it's not good at all, but it was the only way to get access to the Resample() method.
  4. Then I used this code (in FFmpegAudioSource.AudioDecoder_OnAudioFrame):

          // FFmpeg AV_SAMPLE_FMT_S16 will store the bytes in the correct endianness for the underlying platform.
          short[] pcm = buffer.Take(dstSampleCount * 2).Where((x, i) => i % 2 == 0).Select((y, i) => BitConverter.ToInt16(buffer, i * 2)).ToArray();
          if (_audioFormatManager.SelectedFormat.ClockRate != Helper.AUDIO_SAMPLING_RATE_PCMU)
              pcm = _audioEncoder.Resample(pcm, Helper.AUDIO_SAMPLING_RATE_PCMU, _audioFormatManager.SelectedFormat.ClockRate);
          var encodedSample = _audioEncoder.EncodeAudio(pcm, _audioFormatManager.SelectedFormat);
    
          OnAudioSourceEncodedSample?.Invoke((uint)encodedSample.Length, encodedSample);

    instead of

          // FFmpeg AV_SAMPLE_FMT_S16 will store the bytes in the correct endianness for the underlying platform.
          short[] pcm = buffer.Take(dstSampleCount * 2).Where((x, i) => i % 2 == 0).Select((y, i) => BitConverter.ToInt16(buffer, i * 2)).ToArray();
          var encodedSample = _audioEncoder.EncodeAudio(pcm, _audioFormatManager.SelectedFormat);
    
          OnAudioSourceEncodedSample?.Invoke((uint)encodedSample.Length, encodedSample);

It works, but I feel I got lucky, because I'm not sure this line is correct when another codec is used:

OnAudioSourceEncodedSample?.Invoke((uint)encodedSample.Length, encodedSample);
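A note on why the byte count happens to work here, assuming the first argument of OnAudioSourceEncodedSample is meant to be the sample duration in RTP timestamp units. PCMU produces one byte per 8kHz sample, and G722 at 64 kbit/s produces one byte per two 16kHz samples while using an 8000 Hz RTP clock (RFC 3551), so for both codecs the encoded length equals the duration. That would not hold for G729, which packs 10 ms (80 samples) into 10 bytes. The sketch below derives the duration from the PCM input instead; `RtpDurationSketch` and `FromPcm` are hypothetical names, not part of SIPSorcery or this library.

```csharp
using System;

// Hypothetical helper, not part of SIPSorcery or SIPSorceryMedia.FFmpeg.
public static class RtpDurationSketch
{
    // Returns the duration, in RTP timestamp units, of a block of PCM samples.
    //   pcmSampleCount: number of PCM samples handed to the encoder.
    //   pcmSampleRate:  rate of those samples in Hz (e.g. 16000 for G722 input).
    //   rtpClockRate:   RTP timestamp clock of the codec in Hz
    //                   (8000 for PCMU, G722 and G729).
    public static uint FromPcm(int pcmSampleCount, int pcmSampleRate, int rtpClockRate)
    {
        if (pcmSampleCount < 0) throw new ArgumentOutOfRangeException(nameof(pcmSampleCount));
        if (pcmSampleRate <= 0) throw new ArgumentOutOfRangeException(nameof(pcmSampleRate));
        if (rtpClockRate <= 0) throw new ArgumentOutOfRangeException(nameof(rtpClockRate));

        // Use long arithmetic so the intermediate product cannot overflow.
        return (uint)((long)pcmSampleCount * rtpClockRate / pcmSampleRate);
    }
}
```

For a 20 ms frame, `FromPcm(160, 8000, 8000)` (PCMU) and `FromPcm(320, 16000, 8000)` (G722) both give 160, which matches the 160 encoded bytes in each case. A 20 ms G729 frame would also be 160 RTP units but only 20 bytes, so passing `encodedSample.Length` would under-report the duration.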

I also think that the final audio quality is not great, because we first ask FFmpeg to convert the format and rate to AVSampleFormat.AV_SAMPLE_FMT_S16 and Helper.AUDIO_SAMPLING_RATE in FFmpegAudioDecoder.InitialiseSource, and the result is then resampled a second time for G722.

So do you have any idea how to add support for G722, and G729 later?

Thanks

sipsorcery commented 2 years ago

The correct approach would be to improve the IAudioEncoder interface by adding a sample rate parameter so that it can deal with both 8kHz and 16kHz inputs.
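A rough sketch of what such a sample-rate-aware encoder could look like. Every name here (`CodecFormatSketch`, `IRateAwareAudioEncoderSketch`, `ResamplingEncoderSketch`) is hypothetical and only illustrates the idea; it is not the actual IAudioEncoder definition or a planned API. The existing encode and resample routines are passed in as delegates so the example stays self-contained.

```csharp
using System;

// Hypothetical stand-in for the library's audio format type; only what the sketch needs.
public sealed class CodecFormatSketch
{
    public CodecFormatSketch(string name, int clockRate)
    {
        Name = name;
        ClockRate = clockRate;
    }

    public string Name { get; }
    public int ClockRate { get; }   // sampling rate the codec expects, in Hz
}

// Hypothetical interface shape: the caller states the rate of the PCM it passes in.
public interface IRateAwareAudioEncoderSketch
{
    byte[] EncodeAudio(short[] pcm, int pcmSampleRate, CodecFormatSketch format);
}

// Resamples only when the input rate differs from the rate the codec expects.
public sealed class ResamplingEncoderSketch : IRateAwareAudioEncoderSketch
{
    private readonly Func<short[], CodecFormatSketch, byte[]> _encode;
    private readonly Func<short[], int, int, short[]> _resample;

    public ResamplingEncoderSketch(
        Func<short[], CodecFormatSketch, byte[]> encode,
        Func<short[], int, int, short[]> resample)
    {
        _encode = encode ?? throw new ArgumentNullException(nameof(encode));
        _resample = resample ?? throw new ArgumentNullException(nameof(resample));
    }

    public byte[] EncodeAudio(short[] pcm, int pcmSampleRate, CodecFormatSketch format)
    {
        if (pcm == null) throw new ArgumentNullException(nameof(pcm));
        if (format == null) throw new ArgumentNullException(nameof(format));

        short[] input = pcmSampleRate == format.ClockRate
            ? pcm
            : _resample(pcm, pcmSampleRate, format.ClockRate);

        return _encode(input, format);
    }
}
```

With a shape like this, a caller such as FFmpegAudioSource would only need to say which rate it produced, and would not have to depend on the concrete AudioEncoder class to reach Resample().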

And as you've noticed, the audio resampling logic in the sipsorcery library is as crude as it can possibly get. For example, to downsample from 16kHz to 8kHz, every second sample is dropped.
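To illustrate the difference, here is a minimal sketch of that kind of decimation next to a slightly less crude variant that averages each pair of samples first. Both methods are hypothetical illustrations, not the library's actual code. Dropping samples without any low-pass filtering lets content above 4kHz alias into the 8kHz output; averaging pairs is only a very weak filter and is still far from a proper anti-aliasing resampler.

```csharp
using System;

// Hypothetical illustration only; not the resampling code from the sipsorcery library.
public static class DownsampleSketch
{
    // Crude 16kHz -> 8kHz: keep samples 0, 2, 4, ... and discard the rest.
    public static short[] DropEverySecond(short[] pcm16k)
    {
        if (pcm16k == null) throw new ArgumentNullException(nameof(pcm16k));

        var result = new short[pcm16k.Length / 2];
        for (int i = 0; i < result.Length; i++)
        {
            result[i] = pcm16k[2 * i];
        }
        return result;
    }

    // Slightly better: average each pair of samples (a two-tap box filter),
    // then keep one output sample per pair. A trailing odd sample is ignored.
    public static short[] AveragePairs(short[] pcm16k)
    {
        if (pcm16k == null) throw new ArgumentNullException(nameof(pcm16k));

        var result = new short[pcm16k.Length / 2];
        for (int i = 0; i < result.Length; i++)
        {
            // The sum is computed as int, so it cannot overflow a short.
            result[i] = (short)((pcm16k[2 * i] + pcm16k[2 * i + 1]) / 2);
        }
        return result;
    }
}
```

Since the FFmpeg decoder in this library already converts the source to a target format and rate, another option would be to ask it for the codec's own rate directly and avoid the second resampling step altogether.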