Bitrate and sampling rate issues in v4l2rtspserver-master

tssva commented 6 years ago

A couple of issues regarding the audio bitrate and sampling rates

v4l2rtspserver-master allows setting the input (AUDIOINBR) and output (AUDIOOUTBR) sampling rate via the rtspserver.conf file. I assume the BR is supposed to stand for bitrate, but the sampling rate is the frequency of sampling, not the bitrate. It may seem like quibbling to mention it, but it may confuse users as to what exactly they are changing when they modify these values.
Currently AUDIOOUTBR behaves differently for Opus versus MP3. For MP3 it changes the output sampling rate, but for Opus it changes the output bitrate. I assume this is a bug on the Opus side and the intent was to change the output sampling rate for Opus as well, not the bitrate.
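
To make the distinction between the two quantities concrete, here is a minimal Python sketch. The helper name `pcm_bitrate_bps`, the 16-bit mono capture format, and the example encoder bitrate are assumptions for illustration only; none of them comes from the v4l2rtspserver code.

```python
def pcm_bitrate_bps(sample_rate_hz: int, bits_per_sample: int = 16, channels: int = 1) -> int:
    """Bitrate of uncompressed PCM: samples per second * bits per sample * channels."""
    return sample_rate_hz * bits_per_sample * channels


# Sampling rate is measured in Hz, bitrate in bits per second.
# Assumption: 16-bit mono capture (not stated in this thread).
raw_16k = pcm_bitrate_bps(16_000)

# Hypothetical encoder target, chosen only to show that the encoded
# bitrate is set independently of the sampling rate.
encoded_bps = 64_000

print(raw_16k)                # uncompressed bits per second at a 16 kHz sampling rate
print(raw_16k / encoded_bps)  # compression ratio for this example
```

The same sampling rate can be paired with many different bitrates, which is why a setting named "BR" that actually holds a rate in Hz is easy to misread.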

I'm also not sure I understand the reasoning for making the output sampling rate different from the input sampling rate. Usually resampling is only done when the output is intended for a device or format which cannot play the input sampling rate, because resampling, whether up or down, will result in a loss of quality.

The default configuration has the codec set to MP3, the input sampling rate set to 16000 and the output sampling rate set to 44100. The result is that the input audio is resampled up, which causes some loss of quality and some audio delay. In addition, the MP3 stream is set to 64 kbps CBR, lame's default for 44.1 kHz mono audio, which is more bandwidth than is necessary for the default input sampling rate.
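
A rough way to see why upsampling buys nothing here is the Nyquist limit: audio captured at a given sampling rate only contains frequencies up to half that rate, and resampling to a higher rate cannot add the missing content. The sketch below uses hypothetical helper names and only does the arithmetic for the default values quoted above.

```python
def nyquist_hz(sample_rate_hz: int) -> float:
    """Highest frequency that a given sampling rate can represent."""
    return sample_rate_hz / 2


def upsampling_ratio(in_rate_hz: int, out_rate_hz: int) -> float:
    """How many output samples are produced per input sample."""
    return out_rate_hz / in_rate_hz


in_rate, out_rate = 16_000, 44_100  # default input and output rates from the thread

print(nyquist_hz(in_rate))                  # bandwidth actually present in the capture
print(nyquist_hz(out_rate))                 # bandwidth the output format could carry
print(upsampling_ratio(in_rate, out_rate))  # extra samples the encoder has to process
```

The output stream carries more samples per second than the input, but no more audio information.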

My suggestion for the MP3 defaults would be to keep the input and output sample rates the same. This can be accomplished by setting AUDIOOUTBR=0. Lame will use the value of the input rate for the output when the output rate is set to 0. I also suggest setting the output bitrate to preset mode V9. This is the lowest quality preset and is generally recommended for mono voice. It will result in lower CPU usage, lower bandwidth usage and less audio delay due to processing.

For Opus the situation is worse when input and output don't match. The opus library does not perform resampling. So if you feed it 16 kHz sampled audio and have the output set to, say, 48 kHz, opus will treat the data as 48 kHz without resampling, which really hurts your audio quality. Whether intended or not, today v4l2rtspserver doesn't set the output sample rate for opus, and it should probably remain that way. I also recommend not setting the output bitrate. Opus will by default determine the best bitrate to use for the given input. I haven't looked closely enough, and quite frankly it is too late right now for me to check, whether the filter noise suppression library is being applied when the opus codec is being used. If it is, disabling it may be the better option. I haven't compared yet, but opus does have noise suppression filters which are applied when it determines the input is voice.

These are decisions which probably should be made by the maintainer but I will be happy to provide PRs to implement once a decision has been made regarding how you want to move forward with the audio settings.

nik0 commented 6 years ago

@tssva, your comments make sense.

I am OK with changing the name of the variable. I didn't have time to change it everywhere.
For MP3, I quite agree with keeping the in and out sample rates the same. This is a configuration possibility. To be honest, 8k in and out is more than enough for me, but I don't know what other people need.
Opus was a request from someone who is using Janus/WebRTC on Raspberry. He seems to be happy with it. My feeling is that the audio quality was the same as MP3. Your mention of the "non resampling" seems strange to me. And if I understood the API correctly, 48k is mandatory as output.

Feel really free to propose a PR.

tssva commented 6 years ago

Opus supports 5 input sample rates: 8k, 12k, 16k, 24k, and 48k. If you set the encoder to a 48k input sample rate and then feed it 44.1k sampled data, it will accept it and just map it directly into the 48k space, which will cause audio quality issues. By contrast, if you set the input rate to 44.1k and the output rate to 48k in lame, it will resample the input to the 48k rate.
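
As a rough illustration of the mismatch, the Python sketch below checks a rate against the five rates listed above and estimates what happens when samples captured at one rate are interpreted at another. The function names are hypothetical, and the speed and pitch figures assume the samples are simply played back at the assumed rate with no resampling anywhere in the chain.

```python
import math

# The five input sample rates listed above, in Hz.
OPUS_RATES_HZ = (8_000, 12_000, 16_000, 24_000, 48_000)


def is_opus_rate(rate_hz: int) -> bool:
    """True if the rate is one of the rates the encoder can be configured for."""
    return rate_hz in OPUS_RATES_HZ


def mismatch_speed_factor(actual_rate_hz: int, assumed_rate_hz: int) -> float:
    """Playback speed-up when samples are interpreted at the wrong rate."""
    return assumed_rate_hz / actual_rate_hz


def pitch_shift_semitones(speed_factor: float) -> float:
    """Pitch change, in semitones, that corresponds to a speed factor."""
    return 12 * math.log2(speed_factor)


factor = mismatch_speed_factor(44_100, 48_000)
print(is_opus_rate(44_100))           # 44.1k is not in the supported list
print(factor)                         # audio runs faster than real time
print(pitch_shift_semitones(factor))  # and is shifted up in pitch
```

A check like `is_opus_rate` before configuring the encoder would catch the mismatch early; whether the fix is to resample first or to capture at a supported rate is a separate decision.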

Opus does have specific support for voice, which gives it the possibility of delivering better voice intelligibility than MP3. NIST did some studies of various low bandwidth codecs, testing for voice intelligibility in the presence of background noises which would typically be experienced by first responders, for instance alarms and sirens. Opus tested very well under these conditions. Even if it only produces sound quality equivalent to MP3, it will do so while using much lower bandwidth.

I'm wrapping up a PR to upgrade the version of wpa_supplicant because the version shipped with the cameras is vulnerable to KRACK attacks. I'll hopefully get that submitted tomorrow and then, after the weekend, start looking at getting some PRs submitted regarding the audio.

Dafang-Hacks / Main

Bitrate and sampling rate issues in v4l2rtspserver-master #15