layeh / gumble

gumble is a Mumble client implementation in Go (golang)
https://pkg.go.dev/mod/layeh.com/gumble
Mozilla Public License 2.0

Getting choppy sound. #44

Open monirz opened 5 years ago

monirz commented 5 years ago

Hello, we are using Gumble for voice chat on a Murmur server. Everything works fine except that the sound is so choppy that it's hard to hold a conversation, which is a big problem. After reading some of the issues and discussions on this, I tried using AutoBitrate to adjust the bitrate to the server's limit. The server's bitrate is set to 96 KiB/s. I tried different AudioInterval values; 60 ms makes it a little better, but still not good enough to be usable. Any suggestions for improving the sound quality? Thank you!

ghost commented 5 years ago

This is likely because gumble currently does not have a jitter buffer. One was worked on (#34), but I haven't had the bandwidth to review.
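For readers unfamiliar with the term, here is a minimal sketch of what a static (fixed-depth) jitter buffer does. It is not gumble code and not the work from #34; the type `jitterBuffer`, its methods, and the assumption that every frame carries a consecutive sequence number are all hypothetical, chosen only to illustrate the idea of collecting a few frames before playout so that late or reordered packets can still be played in order.

```go
package main

import "fmt"

// jitterBuffer is a hypothetical fixed-depth playout buffer. It is NOT part
// of gumble; all names here are illustrative.
type jitterBuffer struct {
	depth   int               // frames to collect before playout starts
	next    int64             // sequence number due at the next playout tick
	started bool              // true once the prefill phase is over
	frames  map[int64][]int16 // pending PCM frames keyed by sequence number
}

func newJitterBuffer(depth int) *jitterBuffer {
	return &jitterBuffer{depth: depth, frames: make(map[int64][]int16)}
}

// Push stores a frame. Frames that arrive after their playout slot has
// already passed are dropped.
func (b *jitterBuffer) Push(seq int64, pcm []int16) {
	if b.started && seq < b.next {
		return
	}
	b.frames[seq] = pcm
}

// Pop is meant to be called once per playout interval. It returns false
// while prefilling and when the due frame is missing; the caller would then
// play silence or run loss concealment.
func (b *jitterBuffer) Pop() ([]int16, bool) {
	if !b.started {
		if len(b.frames) < b.depth {
			return nil, false
		}
		// Start playout at the lowest sequence number collected so far.
		first := true
		for seq := range b.frames {
			if first || seq < b.next {
				b.next, first = seq, false
			}
		}
		b.started = true
	}
	pcm, ok := b.frames[b.next]
	delete(b.frames, b.next)
	b.next++
	return pcm, ok
}

func main() {
	b := newJitterBuffer(2)
	b.Push(1, []int16{11}) // arrives first, out of order
	b.Push(0, []int16{10}) // arrives second, but before playout starts
	for i := 0; i < 3; i++ {
		pcm, ok := b.Pop()
		fmt.Println(pcm, ok)
	}
}
```

A larger `depth` tolerates more network jitter at the cost of added latency. An adaptive buffer, such as the one in speexdsp, would adjust that depth at run time instead of fixing it.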

athasamid commented 5 years ago

Hi @monirz, how can this issue be fixed?

zrdimetc commented 4 years ago

Regarding the “choppy sound” from Gumble that some users were referring to: Audio encoded at lower sample rates from 48 kHz (24, 16, 12, 8), when using longer frame sizes than 10 ms (20, 40, 60...), from any Mumble client based on Gumble, such as talkiepi and talkkonnect, does not work well with Plumble and, to a lesser extent, with the official Mumble client. Sound from Gumble will be “choppy” and break up to the point where it is practically unusable. In tests between Gumble-based clients (talkiepi, talkkonnect) we ruled out any network issues. The issue is not related to any particular platform or to a lack of computing power either. In the opposite direction, audio from Plumble or the official Mumble client, at any supported sampling rate and frame size, is decoded by Gumble clients without a problem.

What works perfectly between Gumble and Plumble/ official Mumble is 48 kHz sound with a 10 ms frame size. But as soon as longer time intervals are used in Gumble for encoding, the resulting audio breaks up in Plumble and, to a lesser degree, in the official Mumble client. At the same time, audio encoded with Mumble clients based on Gumble (talkiepi, talkkonnect), using any supported audio sample rates (48, 24, 16, 12, 8 kHz) and frame sizes (10, 20, 40, 60 ms), can be decoded near perfectly by all other Gumble based clients. There are no issues between Gumble clients at any audio sample rate or frame size.

For just Gumble to Gumble communication any audio sampling rates / frame sizes seem to work with no issues. But not with Plumble / official Mumble – in the direction from Gumble to them.

The problem could be related to jitter buffering in Plumble/ Mumble. Gumble is using AutoBitrate formula, based on server maximum bitrate (bitrate.go in gumbleutil). Plumble/ official Mumble seem to have a problem decoding bytes from Gumble produced audio frames (as defined in bitrate.go dataBytes formula) when the frames are longer than 10 ms.

Unless a workaround can be found in Gumble or a jitter buffer improved in Plumble (or official Mumble), communication between them and Mumble clients based on Gumble, with lower audio sampling rates and when using longer than 10 ms frames (20, 40, 60 ms) will continue not to work very well. On the other hand, 48 kHz sound with 10 ms works near perfectly between Gumble-like clients and Plumble/ official Mumble clients. But as soon as the time interval is extended over 10 ms, audio from Gumble will start breaking up in Plumble/ official Mumble.

Is there anything that can be done for making Gumble more compatible with Plumble/ official Mumble when it comes to working with longer frames (20, 40, 60 ms) for low bandwidth applications? Or should a fix be sought on Plumble/ official Mumble client side?

ghost commented 4 years ago

> Audio encoded at lower sample rates from 48 kHz (24, 16, 12, 8) ...

Is this correct? Mumble only supports one sample rate: 48,000 Hz.

> The problem could be related to jitter buffering in Plumble/ Mumble. Gumble is using AutoBitrate formula, based on server maximum bitrate (bitrate.go in gumbleutil). Plumble/ official Mumble seem to have a problem decoding bytes from Gumble produced audio frames (as defined in bitrate.go dataBytes formula) when the frames are longer than 10 ms.

The error very well could be in gumble's bitrate calculation. If gumble is sending more audio data per-second than is permitted by the server, the server will drop audio packets, which would result in choppy sound.

> can be decoded near perfectly by all other Gumble based clients.

I find it interesting that you're not experiencing any issues receiving audio in gumble clients. I thought there was a stronger need for a jitter buffer for gumble, so I'm surprised you haven't seen that in your testing.

zrdimetc commented 4 years ago

Thank you for the quick answer. First off, big thanks and congratulations on creating Gumble; such a fantastic project deserves a lot more credit. With 48 kHz and 10 ms you hit the sweet spot for Gumble and other clients like Plumble and the official Mumble client to happily talk to each other.

In our tests we were clearly able to encode sound in Gumble at sample rates lower than 48 kHz (saving CPU) and to use longer time intervals (saving bandwidth). This works great for Gumble-to-Gumble communication, but not so well yet with other Mumble clients. We did this by changing AudioSampleRate in audio.go from 48000 Hz to 24, 16, 12 or 8 kHz. The changes directly lowered CPU utilization. There is not much difference in quality between 48 and 24 kHz; 16 kHz (used in talkiepi) gives usable quality; and there is a noticeable drop at 12 and 8 kHz, although the audio is still usable. Tests were conducted with Gumble clients (mostly the talkkonnect fork) running on Raspberry Pi 3B/3B+, various Orange Pi boards, virtual machines, laptops and NUCs.

It is important to note that we used a different formula for AudioDefaultInterval and AudioDefaultFrameSize (from talkiepi or talkkonnect). For the time intervals (AudioDefaultInterval in audio.go), combinations like these worked:

- 48 kHz / 10 ms (default)
- 24 kHz / 20 ms
- 16 kHz / 60 ms
- 12 kHz / 40 ms
- 8 kHz / 60 ms (also 80, 100, 120 ms)

Please note that AudioDefaultFrameSize needs to be 480, or in some cases 960. Those values work with most combinations of audio sampling rate and time interval; other values did not produce audible sound. AudioMaximumFrameSize can be anything above the default value, say 2880. Interestingly, it was possible to produce audible results (between Gumble clients) even with 80, 100 and 120 ms intervals for 8 kHz sound. Even when one Gumble client was set to 48 kHz / 10 ms and the other to something else, for instance 8 kHz / 60 ms, they could talk to each other.

The Mumble server's max bitrate was set to 72000 bps; we also tried 96000 and 120000. None of these server max bitrates caused any Gumble packets to be dropped. Gumble's 10 ms frames, which require the highest bandwidth, worked perfectly with the server (with either Gumble or Plumble/ Mumble on the other end). Only the longer frames, which in theory take less bandwidth, caused choppy sound, and only with Plumble and, to a lesser degree, the official Mumble client. We also forced the Opus codec on the server.

If you want to assess more closely the effect of Gumble not yet having an adaptive jitter buffer (unlike Plumble and Mumble, which use the jitter buffer from speexdsp), I can give you access to our cloud test server. Once the “sweet spot” is hit for Gumble, audio sounds truly amazing even without an adaptive jitter buffer, with only static buffering. We use the gumble ffmpeg function to run audio streams for testing. I also tested a Gumble client on a higher-latency 3G network by mounting talkkonnect boxes (Gumble Raspberry Pi clients with LCD) in a car and driving around; the quality is more than acceptable. An adaptive jitter buffer would be great, but if we could patch the compatibility between Gumble and Plumble at low bandwidth, it would be very useful to the Gumble user community. Thank you again. Regards.

zrdimetc commented 4 years ago

Hi Tim, I just wanted to let you know that this issue can probably be closed. Audio breaking up can clearly be caused by packet loss and jitter. But if those are ruled out, Gumble has no issue with audio breaking up, thanks to the fantastic work you have done in creating it. Communication from one Gumble-based client to another works perfectly.

In what other cases can audio be observed breaking up? If clients based on Gumble (talkiepi is a popular one) are used to talk to an Android app like Plumble, most people will notice audio breaking up in the Gumble-to-Plumble direction. This will also happen if the Gumble client uses an audio sampling rate lower than 48 kHz (24, 16, 12, 8 kHz), where users may be trying to lower CPU load and bandwidth. talkiepi uses 16 kHz audio, and Plumble cannot decode such frames very well. The reason is that the AudioHandler in Plumble's Jumble library is set for 48 kHz. Plumble's input audio sampling rate can be changed with a slider, but not the output rate.

To a much smaller extent, audio breaking up can also be observed between Gumble-based clients using sampling rates lower than 48 kHz and the official Mumble client, but it is not as bad as with Plumble. Maybe this can be adjusted on the official Mumble side?

Bottom line: if a Gumble-based client uses 48 kHz audio sampling and network issues can be ruled out, audio from the Gumble client to any other client is perfect. In the other cases, where lower sampling rates are used to talk to other Mumble clients and the audio breaks up, the cause is not Gumble; the problem is on the Plumble and official Mumble side. Audio between Gumble-based clients, using any combination of sampling rates supported by Opus, is decoded perfectly.

DauTorsten commented 3 years ago

Hi all,

I'm also experiencing the described problem with my local setup: https://github.com/mnoonan296/talkiepi is sending to a Mumble server (Raspberry Pi 3) and I'm receiving choppy audio on Mumla, with all devices connected to the same WiFi network. I can confirm that the other way around works fine (Mumla -> talkiepi).

zrdimetc, you suggested that the problem is on the Plumble side, but it looks like this is questioned by streaps, who says that Opus uses 48 kHz independently of the sample rate of the incoming audio stream (on GitLab: https://gitlab.com/quite/mumla/-/issues/41#note_347865045).

It seems that the discussion got stuck here, and I am sad that I'm just a user with a problem who lacks the skills to contribute in a more constructive way. I have deep respect for everyone who does this voluntary work and gets things done, and it would be a pity if there are people out there willing to resolve the issue while it remains unclear on which "side" the issue lies. A clarification would be great :)

Thanks and best regards Torsten

zrdimetc commented 3 years ago

Hi Torsten, if your only issue is making talkiepi audio work smoothly with Plumble/ Mumla, then I suggest you use audio.go from Gumble. Just replace talkiepi audio.go and recompile. Your problem will be solved. Did you try talKKonnect? Regards.

DauTorsten commented 3 years ago

Hi zrdimetc,

thanks for your reply. I managed to recompile talkiepi as you described, but it looks like this does not solve the problem in my setup. When I press the tty button, talkiepi disconnects from the server with the following message:

Button is pressed
[255 0 0 0 255 0 0 0 255]
Button is released
ALSA lib pcm_rate.c:1356:(snd_pcm_rate_open) Invalid type for rate converter
[255 0 0 0 255 0 0 0 0]

After that it reconnects.

If you have any idea offhand why this happens, please let me know. I fear it could have multiple causes because of the specific setup (talkiepi with Pi Zero + ReSpeaker 2-Mic HAT), and I suspect that a long search for the bug would be a bit off topic here, especially because my Linux and programming skills are rather poor.

I haven't tried talkKonnect yet. It looks promising, but I suspect it needs a lot more customization to serve my needs than an off-the-shelf talkiepi.

Nevertheless thanks for your support! BR Torsten

zrdimetc commented 3 years ago

Hi Torsten, I can only suggest that the Raspberry Pi Zero is just not a good platform for building talkiepi or other Gumble-based projects. Neither is the Re-Speaker 2-Mic Hat a good part pick. It may sound like a good idea at first, but both pieces of hardware have serious shortcomings for working with Gumble, at least if you want your build to work with Plumble or Mumla too.

The Re-Speaker kernel driver is missing some rate converters. This is why you get the errors. You need to use AudioSampleRate = 48000 in Gumble's audio.go for audio to be decoded smoothly in Plumble. That is why talkiepi breaks up with Plumble (AudioSampleRate = 16000). This problem cannot be solved in Gumble or talkiepi; it is an issue with Plumble. If you go with the higher AudioSampleRate = 48000, the tiny Raspberry Pi Zero will run at nearly 100 % CPU utilization and be prone to crashing. It's just not powerful enough. You will also face the challenge of compiling the OpenAL libraries without Neon support to run on Debian Buster. Although there have been projects that built talkiepi with an RPi Zero and Re-Speaker successfully, they will work poorly with Plumble as far as audio quality goes; they will work fine from one Gumble device to another.

The best course of action I can think of for solving your problem is to consider discontinuing the Raspberry Pi Zero and Re-Speaker project. Instead, use a Raspberry Pi 3B+ and any USB sound card. A Raspberry Pi 3A+ will also work, as will an RPi 4, which is overkill. Alternatively, use an Orange Pi board, for instance the Orange Pi Zero LTS (if you want a small, low-cost build), which comes with a sound codec of its own. There is a USB expansion hat with a mic for the Orange Pi Zero. Keep in mind that GPIO button pins on the Orange Pi need pull-up resistors; Raspberry Pi buttons do not. I did build with an RPi Zero and Re-Speaker and gave up. All my builds are Raspberry Pi 3B+ and Orange Pi boards, and in some cases virtual machines and PCs.

Good luck. I hope you solve your problem with a different build strategy and within the budget for your project.

sensboston commented 3 years ago

@zrdimetc, I disagree with you: I just built two house intercoms on RPiZ + ReSpeaker, and both are working pretty well (btw, I did a small refactor of the talkiepi code so that it works properly with GPIO, and added pin control events for the incoming audio stream - my wife asked especially for this feature because our oldest daughter usually plays Nintendo Switch with headphones on, but now she will be able to see colorful LED strip animations when mom calls her from the kitchen too :wink:). I also set the sample rate for talkiepi to 24000 samples per second.

As for the Plumble Android client: the free version I installed from the Play Store has settings where you can adjust audio parameters too, including sample rate and buffer size. After playing with the audio settings for a little while, I got acceptable, non-choppy sound quality.

To give RPiZ some "rest", I moved mumble server (murmur) to my Windows-based home server but initially it worked on the same RPiZ 😉

Here are my assemblies:

image

I reused the cases of analog intercoms that did not work properly (a lot of noise and radio interference), which I had purchased on eBay (the seller issued a full refund).

zrdimetc commented 3 years ago

Very good @sensboston. Maybe you can offer Torsten some practical advice, if not here then on the talkiepi GitHub or site? He needs to know how to solve the rate converter problem with the Re-Speaker hat and how to compile OpenAL without Neon support on Debian 10, because these issues were a show-stopper. Then he can follow your example to adjust the audio quality.

As far as audio quality goes, I didn't say you cannot use the talkiepi audio sample rate of 16000 or tweak it to 24000 like you did. The supported rates are 8000, 12000, 16000, 24000 and 48000. Use any; I tried them all. For short voice transmissions, it really doesn't matter if the audio jumps and breaks sometimes. The voice message will get through. It's just unpleasant to listen to at times. To spot the problem you need to stream music with gumble ffmpeg or play a continuous audio test tone, let's say a 1 kHz sine wave. Even with a near-lossless network, the audio will evidently break up in the direction of Plumble. If you run a continuous audio stream for some time you will surely notice. Some people will also easily notice quality issues with voice. Most builders won't mind.

Between gumble forks you can use any settings you like without audio quality issues. But there are dozens of issues raised by talkiepi builders about “choppy sound”. Every time, the builders reported the same issue: talkiepi audio quality with Plumble. Why was it breaking? This is why it was raised here as well: to try to understand the logic behind Gumble's audio encoding and decoding, the role of OpenAL, and how it may affect working with other Mumble clients or Gumble forks, so that the Gumble creator can assess whether anything needs to be adjusted in Gumble. For the reported issue, not really. That particular issue is not Gumble's fault. The Gumble creator did a fantastic job, like a true genius!
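The 1 kHz test tone mentioned above could be generated along the following lines. This is only a sketch: the function `sineFrame` is hypothetical, and the 48 kHz rate, 480-sample frame and amplitude of 16000 are illustrative choices. Feeding the resulting 16-bit PCM into a client (for example through a pipe to ffmpeg) is left out.

```go
package main

import (
	"fmt"
	"math"
)

// sineFrame returns n samples of a sine tone as 16-bit PCM. The start
// argument is the index of the first sample within the whole stream, so
// consecutive frames join without a phase jump.
func sineFrame(freqHz float64, sampleRate, start, n int, amplitude float64) []int16 {
	out := make([]int16, n)
	for i := range out {
		phase := 2 * math.Pi * freqHz * float64(start+i) / float64(sampleRate)
		out[i] = int16(math.Round(amplitude * math.Sin(phase)))
	}
	return out
}

func main() {
	// One 10 ms frame of a 1 kHz tone at 48 kHz.
	frame := sineFrame(1000, 48000, 0, 480, 16000)
	fmt.Println(len(frame), frame[:13])
}
```

A steady tone makes dropouts much easier to hear than speech does, since any missing or repeated frame shows up as a click or gap.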

I tested the audio quality of Gumble forks for quite a few hours, over more than two years. Suvir Kumar made considerable changes in talkkonnect, his gumble/talkiepi fork, resulting in amazing improvements in audio quality, functionality and code stability. Maybe some of his changes could be accepted by the creator of Gumble?

As for the audio sample rate adjustments in Plumble, the app's slider only adjusts the rate for encoding and transmitting. Again, it doesn't matter for Plumble talking to Plumble. The issue has always been a Gumble fork talking to Plumble once the audio sample rate in Gumble is set below 48000. I think the Gumble creator could close this issue if nobody has more relevant findings to contribute on the “choppy sound” or wants to argue that it is a Gumble issue. It's good that the question was asked, but the "choppy sound" is not Gumble's fault. Practical build issues for Gumble forks are probably best addressed on the forks' own sites, if everybody agrees?

sensboston commented 3 years ago

> Maybe you can offer Torsten a practical advice

I don't think so: it looks like he did the very same as I did (I fixed only talkiepi's "weak" GPIO part, by completely removing github.com/dchote/gpio and using github.com/stianeikeland/go-rpio instead; I'm ready to open a PR, but it looks like @dchote hasn't appeared here for a long time). I did nothing special about audio quality, but of course I've tested short phrases only (I haven't streamed music). Some audio glitches might still exist, but I don't really care 😜

You said:

> So if you go with higher AudioSampleRate = 48000, then the tiny Raspberry Pi Zero will run with nearly 100 % CPU utilization and be prone to crashing.

and

> The best course of action I can think of for solving your problem is ... consider discontinuing Raspberry Pi Zero and Re-Speaker project.

I disagree with both statements; that's the only reason I left my comment. RPi Z + talkiepi works just fine at 48 kS/s, but that sample rate is too much for voice; 16K or 24K (he-he, we're all humans 😉) is enough.

By the way, when I first found audio glitches (choppy sound) while testing my first RPi intercom from an Android phone, I thought it was an issue in talkiepi's code or algorithm, because I had already found and fixed issues with GPIO. But after reading your comments (above in this thread), I understand that the source of the issue is much deeper and more complicated than simple bugs or an incorrect implementation. It's irrelevant for me now, though; my in-house intercom works just fine, and I finished this project by creating SD card images from both devices (just in case).