Closed (toby63 closed this issue 4 years ago)
How would you detect active pa modules / audio filters? sink or source name might be very unreliable.
If it's not 100% reliable it's worse than doing nothing (IMHO).
@streaps
How would you detect active pa modules / audio filters? sink or source name might be very unreliable.
Well, I guess further testing is necessary.
My first impression was: it would be reliable in being correct (so no false alerts), because if modified names appear in the sink or source, then we can reasonably assume that filters are applied. Examples:
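A minimal sketch of such a name check, assuming the device names are already available as plain strings. The hint substrings and the sample device names below are hypothetical and only illustrate the idea; they are not a verified list of what PulseAudio or any filter tool actually produces.

```python
# Hypothetical substrings that hint at a filter in the device chain.
# These are assumptions for illustration, not a verified or complete list.
FILTER_HINTS = ("echo-cancel", "echocancel", "pulseeffects", "rnnoise")


def find_filter_hints(device_names):
    """Return (name, hint) pairs for device names that contain a known hint."""
    matches = []
    for name in device_names:
        lowered = name.lower()
        for hint in FILTER_HINTS:
            if hint in lowered:
                matches.append((name, hint))
                break  # one hint per device is enough for a notification
    return matches


# Made-up sink/source names, used only to exercise the function.
names = [
    "alsa_input.pci-0000_00_1f.3.analog-stereo",
    "alsa_input.pci-0000_00_1f.3.analog-stereo.echo-cancel",
    "PulseEffects(mic)",
]

for name, hint in find_filter_hints(names):
    print(f"possible external filter ({hint}): {name}")
```

A match only says that a filter is probably present. A device without a matching name proves nothing, which is the reliability concern raised above.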
Open questions:
Do you have echo cancel filters active?
In that case a warning is both good and bad. Good because the user is warned about a situation he was not aware of and bad because the user might not know anything about pulse modules etc.
Alternatives to name check: Maybe there are APIs to check for that. But that could be considered intrusive and too complex for this purpose.
I have the feeling that users that are using this kind of stuff are (as you said) advanced enough to know what they're doing and thus when the audio gets distorted, they probably know what to look for. On the other hand a "normal" user might get very frustrated if Mumble won't let him turn e.g. echo cancellation on because it thinks that some other filter is active. I can already see the reports and complaints from this.
Maybe the best thing would be to create a wiki page about this topic so that advanced users can refer to that to see which effects are built into Mumble and where and how to disable/configure them.
@Krzmbrzl
On the other hand a "normal" user might get very frustrated if Mumble won't let him turn e.g. echo cancellation on because it thinks that some other filter is active.
To clarify, I only requested a notification, no auto-disable etc.
Maybe the best thing would be to create a wiki page about this topic so that advanced users can refer to that to see which effects are built into Mumble and where and how to disable/configure them.
That's agreeable :+1: .
But in addition I propose two things:
information about echo cancel in the audio wizard (or a new startup wizard) would be good
Well atm echo cancellation is disabled by default, so I don't think that's an issue. I wanted to create a new startup-wizard at some point though anyways that'd probably also contain the possibility to enable echo cancellation...
an info icon could be added next to the echo cancel menu to show some information and that could include a link to the wiki page you proposed and/or even a notification about this problem
Uhm I guess that wouldn't hurt...
@Krzmbrzl I created a potential implementation list in the first post. Edit it as you please :wink: .
@toby63 great :+1: Any chance you find the motivation to create such a wiki page? I think you know best what exactly you expect on a page like this :)
Any chance you find the motivation to create such a wiki page? I think you know best what exactly you expect on a page like this :)
Well I could do that, but first we (or I) need more information and that can only be delivered by you (the devs) (or someone with coding skills would need to dig through it all, but that's inconvenient). In addition to that I am no expert at this audio stuff (@streaps for example seems to know more about this :wink:).
Note: I want to clarify this is of course not urgent, but a long-term goal.
Regarding the Info needed:
are there more?
No idea but if we find more, we can always extend the wiki page :point_up:
is rnnoise disabled by default?
you mean "RNNoise", don't you? That option is disabled by default.
are there more (hidden/not implemented) options?
I don't think so... :thinking:
And as for the missing info for the difference between the echo cancellation modes: We can list the available ones for now and add the explanation once we have it :shrug:
(optional) further details:
This info would be very susceptible to getting outdated. That's nothing I want in a wiki entry because if someone changes the code, that person will almost certainly forget to update the wiki...
have you (the devs) any data on this (internal tests or external)?
I don't. Maybe someone else has :shrug:
are there plans to change something?
No actual plans as of now
As for the audio input filtering: while trying to fix the echo canceller I found what I think is the relevant part of the code. It is in AudioInput.cpp#L837, while the configuration is in AudioInput.cpp#L644.
I don't know if there's more stuff that's well hidden, but since I've tried to follow the audio path from the platform dependent code up to the encoder, I believe it's all here, and it's less than 100 lines of code.
So apparently, what happens on the microphone stream is this:
Now, what's rnnoise? I don't know, all I see is a function that's in the audio path.
And concerning the echo cancellation in pulseaudio.
I tried it once. It creates an additional microphone input that's (supposed to be) without echo, and users have to explicitly select that audio source in every application they want to use with it.
However on my machine with mumble it just worsened the echo. No wonder it's disabled by default. Has anyone had any luck with it?
This looks like stuff is just patched together without any concept of how to use these different filters properly. Or am I reading it wrong?
I see rnnoise and then again the speexdsp's DENOISE
Is speexdsp VAD used by the Voice Activity input mode? Or do we have two VADs?
rnnoise before echo cancellation?
...
Not really, to me at least.
The libspeexdsp just looks like someone enabled a reasonable set of dsp stuff.
Rnnoise is a bit patched in without much thought. Especially because it does a cast to float and back while up ahead in the pipeline everything was already float for mixing, and then got converted to short. These continuous casts hurt performance.
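To make the cost of that round trip concrete, here is a minimal sketch of the short → float → short conversion wrapped around a float-only filter. The function names, the scaling by 32768 and the no-op filter are assumptions for illustration; this is not Mumble's code and not the RNNoise API.

```python
def short_to_float(samples):
    """Scale 16-bit integer samples to floats in [-1.0, 1.0)."""
    return [s / 32768.0 for s in samples]


def float_to_short(samples):
    """Scale floats back to 16-bit integers, clamping to the valid range."""
    out = []
    for x in samples:
        value = int(round(x * 32768.0))
        out.append(max(-32768, min(32767, value)))
    return out


def float_only_filter(samples):
    """Placeholder for a filter that only accepts floats; returns a copy."""
    return list(samples)


# Every frame pays for two extra passes over the samples,
# one before and one after the filter.
frame = [0, 1000, -1000, 32767, -32768]
processed = float_to_short(float_only_filter(short_to_float(frame)))
print(processed)
```

If the earlier stages already hold the frame as floats, both conversion passes could in principle be dropped by running the filter before the data is converted to short.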
Does anyone use Rnnoise? What is it good for? Can it even be removed, as it looks redundant compared to libspeexdsp's noise canceller?
For the record, if you want to see the effect of libspeexdsp's noise canceller, do a before/after comparison using --dump-input-streams; it's really quite good.
libspeexdsp's VAD is what gets used if you select Voice activity and then signal to noise in the audio input configuration. See https://github.com/mumble-voip/mumble/blob/master/src/mumble/AudioInput.cpp#L874
However, I may need to add that every time you run mumble you get this message from the command line
warning: The VAD has been replaced by a hack pending a complete rewrite
This message appears to come from libspeexdsp itself, so it's not even Mumble's fault.
Anyway, it's no surprise automagic voice detection sucks and users are forced to carefully set the voice activation thresholds.
Does anyone use Rnnoise? What is it good for? Can it even be removed, as it looks redundant compared to libspeexdsp's noise canceller?
I would say the libspeexdsp noise canceller is redundant to rnnoise ;). It's created by the guy who created CELT and is one of the authors of the Opus RFC.
There was a blog post and a demo on the web with a comparison of libspeexdsp and rnnoise. Unfortunately the server is not reachable anymore.
This demo presents the RNNoise project, showing how deep learning can be applied to noise suppression. The main idea is to combine classic signal processing with deep learning to create a real-time noise suppression algorithm that's small and fast. No expensive GPUs required — it runs easily on a Raspberry Pi. The result is much simpler (easier to tune) and sounds better than traditional noise suppression systems (been there!).
Oh, so that's what rnnoise is.
It could be possible to integrate it better, such as disable libspeexdsp's noise canceller when rnnoise is in use, and maybe avoid the wasteful cast to float and/or move it after the echo canceller.
In any case, I'm not fully convinced. Not a fan of machine learning stuff. Also, saying that it runs on a raspberry isn't enough to say it's fast. How much CPU would it take on a raspberry? 10% or close to 100%?
Would be nice if the demo was still available.
I found some audio samples
original: https://github.com/jagger2048/WebRtc_noise_suppression/blob/master/assets/babble_15dB.wav
processed: https://github.com/jagger2048/WebRtc_noise_suppression/tree/master/assets/test_case
Interesting comparison indeed.
Rnnoise does seem to remove background noise better but when the speaker talks loudly, some of the background noise suddenly gets through and appears to come out of nowhere, which I personally find annoying.
Libspeexdsp's noise canceller is quite good too.
The one I like less is webrtc's one. It muffles the audio and also reduces its amplitude, it doesn't seem as good as the other two.
I agree keeping rnnoise is a good idea, but I wouldn't enable it by default, and it also needs to be integrated better in the Mumble codebase. One point to watch out for is whether disabling libspeexdsp's noise canceller to avoid having both it and rnnoise at the same time impacts other parts of libspeexdsp's dsp chain, such as the agc.
@fedetft
One point to watch out for is whether disabling libspeexdsp's noise canceller to avoid having both it and rnnoise at the same time impacts other parts of libspeexdsp's dsp chain, such as the agc.
So right now libspeexdsp's noise canceller is active when the echo canceller is active? So RNNoise is not necessary (though slightly better, if integrated correctly)?
Regarding topic "echo cancel in pulseaudio" above:
However on my machine with mumble it just worsened the echo. No wonder it's disabled by default. Has anyone had any luck with it?
I tried it two times in recent weeks, I used pulseeffects though (I don't know if that makes a difference, but at least it automatically "bundles" all streams through the program and you have one sink and source named "pulseeffects etc." which I chose in mumble). It seemed to work, but my VoIP-Partners gave me no detailed feedback whether it was really good, so I can't know for sure.
libspeexdsp's noise canceller is unconditionally active in Mumble.
How much CPU would it take on a raspberry? 10% or close to 100%?
"A non-vectorized C implementation of the algorithm requires around 1.3% of a single x86 core (Haswell i7-4800MQ) to perform 48 kHz noise suppression of a single channel. The real-time complexity of the same floating-point code on a 1.2 GHz ARM Cortex-A53 core (Raspberry Pi 3) is 14%."
https://arxiv.org/pdf/1709.08243.pdf
14% is not insignificant. I'm not sure if it's the same for the recent implementation of rnnoise.
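As a rough back-of-the-envelope reading of those two figures: the sketch below converts a CPU share into milliseconds of processing per audio frame and into how many single-channel streams one core could handle in theory. The 10 ms frame length is an assumption, and the helper names are made up for this example.

```python
FRAME_MS = 10.0  # assumed frame length in milliseconds


def cpu_ms_per_frame(cpu_fraction, frame_ms=FRAME_MS):
    """CPU time spent per frame if the filter uses cpu_fraction of one core."""
    return cpu_fraction * frame_ms


def max_realtime_channels(cpu_fraction):
    """Upper bound on channels per core, ignoring all other work on it."""
    return int(1.0 / cpu_fraction)


# CPU shares quoted from the paper above.
for label, fraction in (("Haswell i7-4800MQ", 0.013), ("Cortex-A53 (Pi 3)", 0.14)):
    print(
        f"{label}: {cpu_ms_per_frame(fraction):.2f} ms per {FRAME_MS:.0f} ms frame, "
        f"at most {max_realtime_channels(fraction)} channels per core"
    )
```

These are upper bounds derived only from the quoted percentages. They say nothing about newer RNNoise versions or about the rest of the audio path.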
I agree, the audio signal flow should look more like this:
Mic Input
|
·—— 1) AEC mixed / mono ——.
    2) AEC multi-channel  |
    3) bypass             |
                          |
.—————————————————————————·
|
|   1) rnnoise
·—— 2) speex denoise —————.
    3) bypass             |
                          |
.—————————————————————————·
|
|   1) voice activation
·—— 2) PTT ———————————————.
    3) continuous (bypass)|
                          |
.—————————————————————————·
|
·—— Opus encoder
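The flow above could be sketched as three selectable stages applied in a fixed order, each with a bypass option. Everything here is hypothetical: the stage functions are toy stand-ins, not the real AEC, rnnoise, speex or Opus code, and the names are invented for this example.

```python
def bypass(frame):
    """Pass the frame through unchanged."""
    return frame


def build_pipeline(echo_stage=bypass, denoise_stage=bypass, gate_stage=bypass):
    """Compose the three selectable stages in the order shown in the diagram."""
    stages = [echo_stage, denoise_stage, gate_stage]

    def run(frame):
        for stage in stages:
            frame = stage(frame)
            if frame is None:  # gate closed: nothing reaches the encoder
                return None
        return frame

    return run


def toy_denoise(frame):
    """Stand-in for rnnoise or speex denoise: zero out very small samples."""
    return [0 if abs(s) < 50 else s for s in frame]


def make_voice_gate(threshold):
    """Stand-in for voice activation: drop frames whose peak is too low."""
    def gate(frame):
        peak = max((abs(s) for s in frame), default=0)
        return frame if peak >= threshold else None
    return gate


pipeline = build_pipeline(denoise_stage=toy_denoise, gate_stage=make_voice_gate(500))
print(pipeline([10, -20, 30]))    # quiet frame, dropped by the gate
print(pipeline([10, 2000, -30]))  # loud frame, denoised and passed on
```

Because each slot takes exactly one function, running rnnoise and speex denoise at the same time is not possible in this layout, which is the point of the diagram.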
@fedetft thank you for looking into the audio code. If you feel like you could make improvements to it, please create PRs. We're always looking to improve the system and someone like you who seems to know the audio stuff quite well might be able to improve the respective code areas quite a bit :)
Regarding RNNoise, please read my comment here: https://github.com/mumble-voip/mumble/issues/4181#issuecomment-630147296
I created a wiki page for Audio: https://wiki.mumble.info/wiki/Audio
Feel free to add or change something. If you don't want to edit it yourself, post suggestions here :slightly_smiling_face:.
As I don't see the info icon happening without a UI rewrite (adding info icons to multiple places), I don't think this should be part of this issue here.
And for the startup wizard I added a comment about it to #4137
Therefore I'd say this issue is done. Thanks for adding the info to the wiki page @toby63 :)
Two things:
As I don't see the info icon happening without a UI rewrite
So I am no programmer, but does that really require a major rewrite? I thought it could be as "easy" as adding buttons in Qt Creator (or another Qt (UI)-editor) and then work out the details for the buttons in the code. To clarify: I don't want to push this, but it is also not something "delayed until forever, because it's too complicated".
Because there are multiple other issues still open, that are far less important than this.
That's not really a reason
So I am no programmer, but does that really require a major rewrite?
"Rewrite" was the wrong term here. What I really meant was "redesign". The current UI was just not designed with something like this in mind and if we want these boxes, we should either introduce them everywhere (where it makes sense) or don't add it. But if we have it for a single entry, then that'd be weird.
And for the UI overhaul we have e.g. #3608 (and I think there are others ghosting around as well). You can add your idea about info-icons to that if you wish. In the end though I think that in the near future no one will have time for actually doing the UI overhaul (no one of the core team, that is).
That's not really a reason
But an argument.
The current UI was just not designed with something like this in mind and if we want these boxes, we should either introduce them everywhere (where it makes sense) or don't add it. But if we have it for a single entry, then that'd be weird.
I disagree.
If you imagine a small button like this:
(?)
in the right corner of the section; it is very convenient, not confusing and doesn't need much redesign.
These help-buttons are also known from other programs, so nothing new for most users.
I accept that you don't have time to do that now. Maybe I will try to set up something, to test whether it is as "easy" as I think it is.
And for UI overhaul we have e.g. #3608 (and I think there are others ghosting around as well)
Such a broad topic is not really useful. And it also consists of many smaller issues.
I think the first variant is the best if the button will link to the general audio wiki page.
Now you just have to connect the actions and create a PR :)
I think the first variant is the best if the button will link to the general audio wiki page.
Well, I think it would not be good if it just links to the wiki page, it should contain some information on its own and can link to the wiki for additional information.
Now you just have to connect the actions and create a PR :)
I just wanted to show the possibility for now, but I will think about that. As I am no programmer this will take me forever and I got some other things to do right now.
This topic is not urgent, I just wanted to show that it does not need so many changes.
As I am no programmer this will take me forever and I got some other things to do right now.
The changes should be really simple (depending on what you have in mind). Might be a good opportunity to get into ;)
Update: More Information for users seems to be the preferred option.
Potential Implementation List:
Original Report Title: Check for external audio filters
Context: Audio filters
Description: Check if external audio filters (like echo cancel) are already in use (e.g. in pulseaudio) and show that in the UI.
Alternatives:
Additional Info: In some applications the overlapping use of audio filters can have distorting, disturbing & repealing effects.