[Feature Request] better description for audio filters

toby63 commented 4 years ago

Update: More Information for users seems to be the preferred option.

Potential Implementation List:

[x] Add info to the wiki page about Audio:
- (partly done) What audio filters are included
- (partly done) Use cases (aka when are they useful)
- (partly done) configuration options
- (partly done) How to enable/disable them
- How do they perform quality-wise (imo some are better, for example webrtc (can be activated in pulseaudio))
- (partly done) note about conflict with external audio filters
[ ] add info icon next to the echo cancel-/audio filter- options:
- short description
- link to wiki
- (maybe) notice about conflict with external audio filters
[ ] add section to new startup wizard about audio filters
- short description
- short options; enable/disable
- link to wiki

Original Report Title: Check for external audio filters

Context: Audio filters

Description: Check if external audio filters (like echo cancel) are already in use (e.g. in pulseaudio) and show that in the UI.

Alternatives:

Explain to users (maybe only at first start) that Mumble has built-in audio filters (and which ones), and that they may want to disable them if they already use filters elsewhere.
Maybe only advanced users use such a scenario and will know what to do.

Additional Info: In some applications, overlapping audio filters can distort the signal, add disturbing artifacts, or cancel each other out.

Implementation Details:
- Of course mumble might not be able to check for every possible software that includes audio filters, but at least it could check for some, including:
  - pulseaudio
  - pulseeffects
- Whether audio filters are enabled might be already visible in the names of sink and source, so mumble could just check the names.

streaps commented 4 years ago

How would you detect active pa modules / audio filters? sink or source name might be very unreliable.

If it's not 100% reliable, it's worse than doing nothing (IMHO).

toby63 commented 4 years ago

@streaps

How would you detect active pa modules / audio filters? sink or source name might be very unreliable.

Well, I guess further testing is necessary.

My first impression was: it would be reliable in being correct (so no false alerts), because if modified sink or source names appear, then we can reasonably assume that filters are applied. Examples:

- pulseeffects: if the name contains "pulseeffects", we can assume that filters are applied, because that's what pulseeffects is there for
- pulseaudio: if the name is modified (like echo_cancel or something else), audio filters can be assumed
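
A minimal sketch of the name check described in these examples, written in Python for brevity rather than in Mumble's C++. The hint substrings and the sample device names are assumptions for illustration only; real PulseAudio or PulseEffects names depend on the setup, and nothing here queries an actual audio server.

```python
# Hypothetical substrings that might show up in filtered sink/source names.
FILTER_HINTS = ("pulseeffects", "echo_cancel", "echo-cancel")


def looks_filtered(device_name):
    """Return True if a sink/source name hints at an external audio filter."""
    name = device_name.lower()
    return any(hint in name for hint in FILTER_HINTS)


def filtered_devices(device_names):
    """Keep only the names that match one of the hints, preserving order."""
    return [name for name in device_names if looks_filtered(name)]


if __name__ == "__main__":
    # Made-up example names, not taken from a real system.
    names = ["alsa_input.usb-mic", "PulseEffects(mic)", "null_sink"]
    print(filtered_devices(names))
```

A match only suggests that some filter may be active. It cannot tell which filter it is, and a missing match proves nothing, which is the reliability concern raised above.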

Open questions:

- Are there modules in pulseaudio that modify the name but are not audio filters? Maybe e.g. a null_sink. If that's the case, we would need to specify which names to match.
- If filters are used that do not interfere with Mumble's filters, the notice could be misleading. But that could be handled in the message given to the user, e.g. "Do you have echo cancel filters active?"
- Is there a use case where a user might not be aware of active filters? Like:
  - do some distros activate that by default?
  - does Windows use something by default?

In that case a warning is both good and bad. Good because the user is warned about a situation he was not aware of and bad because the user might not know anything about pulse modules etc.

Alternatives to name check: Maybe there are APIs to check for that. But that could be considered intrusive and too complex for this purpose.

Krzmbrzl commented 4 years ago

I have the feeling that users that are using this kind of stuff are (as you said) advanced enough to know what they're doing and thus when the audio gets distorted, they probably know what to look for. On the other hand a "normal" user might get very frustrated if Mumble won't let him turn e.g. echo cancellation on because it thinks that some other filter is active. I can already see the reports and complaints from this.

Maybe the best thing would be to create a wiki page about this topic so that advanced users can refer to that to see which effects are built into Mumble and where and how to disable/configure them.

toby63 commented 4 years ago

@Krzmbrzl

On the other hand a "normal" user might get very frustrated if Mumble won't let him turn e.g. echo cancellation on because it thinks that some other filter is active.

To clarify, I only requested a notification, no auto-disable etc.

Maybe the best thing would be to create a wiki page about this topic so that advanced users can refer to that to see which effects are built into Mumble and where and how to disable/configure them.

That's agreeable :+1: .

But in addition I propose two things:

- information about echo cancel in the audio wizard (or a new startup wizard) would be good
- an info icon could be added next to the echo cancel menu to show some information and that could include a link to the wiki page you proposed and/or even a notification about this problem

Krzmbrzl commented 4 years ago

information about echo cancel in the audio wizard (or a new startup wizard) would be good

Well atm echo cancellation is disabled by default, so I don't think that's an issue. I wanted to create a new startup-wizard at some point though anyways that'd probably also contain the possibility to enable echo cancellation...

an info icon could be added next to the echo cancel menu to show some information and that could include a link to the wiki page you proposed and/or even a notification about this problem

Uhm I guess that wouldn't hurt...

toby63 commented 4 years ago

@Krzmbrzl I created a potential implementation list in the first post. Edit it as you please :wink: .

Krzmbrzl commented 4 years ago

@toby63 great :+1: Any chance you find the motivation to create such a wiki page? I think you know best what exactly you expect on a page like this :)

toby63 commented 4 years ago

Any chance you find the motivation to create such a wiki page? I think you know best what exactly you expect on a page like this :)

Well, I could do that, but first we (or I) need more information, and that can only be provided by you (the devs); otherwise someone with coding skills would need to dig through it all, but that's inconvenient. In addition, I am no expert at this audio stuff (@streaps, for example, seems to know more about this :wink:).

Note: I want to clarify this is of course not urgent, but a long-term goal.

Regarding the Info needed:

What audio filters are included? And details about them: I see three (are there more?):
- the two echo cancel options (need further explanation, see #4125 )
- rnnoise (no info in the wiki found)
  - is rnnoise disabled by default?
configuration options for the audio filters:
- explain existing/visible ones
- are there more (hidden/not implemented) options?
(optional) further details:
- how is this implemented (maybe there are specific options/functions etc. that were chosen)?
- how recent is the software included?
(additional) How do they perform quality-wise:
- have you (the devs) any data on this (internal tests or external)?
(additional) Future:
- are there plans to change something?

Krzmbrzl commented 4 years ago

are there more?

No idea but if we find more, we can always extend the wiki page :point_up:

is rnnoise disabled by default?

you mean "RNNoise", don't you? That option is disabled by default.

are there more (hidden/not implemented) options?

I don't think so... :thinking:

And as for the missing info for the difference between the echo cancellation modes: We can list the available ones for now and add the explanation once we have it :shrug:

(optional) further details:

This info would be very susceptible to getting outdated. That's nothing I want in a wiki entry because if someone changes the code, that person will almost certainly forget to update the wiki...

have you (the devs) any data on this (internal tests or external)?

I don't. Maybe someone else has :shrug:

are there plans to change something?

No actual plans as of now

fedetft commented 4 years ago

Regarding audio input filtering: while trying to fix the echo canceller, I found what I think is the relevant part of the code. It is in AudioInput.cpp#L837, while the configuration is in AudioInput.cpp#L644.

I don't know if there's more stuff that's well hidden, but since I've tried to follow the audio path from the platform dependent code up to the encoder, I believe it's all here, and it's less than 100 lines of code.

So apparently, what happens on the microphone stream is this:

first, rnnoise is applied if enabled
then, the speexdsp echo cancellation is performed, if enabled
finally, the speexdsp filter pipeline (VAD, AGC, DENOISE, DEREVERB appear to be enabled) is run
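
The order described here can be sketched as follows. This is only an illustration of the sequence as reported in this comment, in Python rather than Mumble's C++; the function name and the stage labels are made up and are not identifiers from AudioInput.cpp.

```python
def active_stages(rnnoise_enabled, echo_cancel_enabled):
    """List the microphone-path stages in the order described above."""
    stages = []
    if rnnoise_enabled:
        stages.append("rnnoise")
    if echo_cancel_enabled:
        stages.append("speexdsp echo cancellation")
    # According to the description, the speexdsp filter pipeline always runs.
    stages.append("speexdsp preprocess (VAD, AGC, DENOISE, DEREVERB)")
    return stages


if __name__ == "__main__":
    for stage in active_stages(rnnoise_enabled=True, echo_cancel_enabled=True):
        print(stage)
```

With both options on, the signal would pass through two noise suppressors, rnnoise first and the speexdsp DENOISE step last.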

Now, what's rnnoise? I don't know, all I see is a function that's in the audio path.

fedetft commented 4 years ago

And concerning the echo cancellation in pulseaudio.

I tried it once. It creates an additional microphone input that's (supposed to be) without echo, and users have to explicitly select that audio source in every application they want to use with it.

However on my machine with mumble it just worsened the echo. No wonder it's disabled by default. Has anyone had any luck with it?

streaps commented 4 years ago

This looks like stuff is just patched together without any concept of how to use these different filters properly. Or am I reading it wrong?

I see rnnoise and then again the speexdsp's DENOISE

Is speexdsp VAD used by the Voice Activity input mode? Or do we have two VADs?

rnnoise before echo cancellation?

...

fedetft commented 4 years ago

Not really, to me at least.

The libspeexdsp just looks like someone enabled a reasonable set of dsp stuff.

Rnnoise, though, does look patched in without much thought, especially because it casts to float and back, while earlier in the pipeline everything was already float for mixing and then got converted to short. These repeated conversions hurt performance.

Does anyone use Rnnoise? What is it good for? Could it even be removed, since it looks redundant compared to libspeexdsp's noise canceller?

For the record, if you want to see the effect of libspeexdsp's noise canceller, do a before/after comparison using --dump-input-streams, it's really quite good.

fedetft commented 4 years ago

libspeexdsp's VAD is what gets used if you select Voice activity and then signal to noise in the audio input configuration. See https://github.com/mumble-voip/mumble/blob/master/src/mumble/AudioInput.cpp#L874

However, I may need to add that every time you run mumble you get this message from the command line

warning: The VAD has been replaced by a hack pending a complete rewrite

This message appears to come from libspeexdsp itself, so it's not even Mumble's fault.

Anyway, it's no surprise automagic voice detection sucks and users are forced to carefully set the voice activation thresholds.

streaps commented 4 years ago

Does anyone use Rnnoise? What is it good for? Could it even be removed, since it looks redundant compared to libspeexdsp's noise canceller?

I would say the libspeexdsp noise canceller is redundant to rnnoise ;). RNNoise was created by the guy who created CELT, who is also one of the authors of the Opus RFC.

There was a blog post and a demo on the web with a comparison of libspeexdsp and rnnoise. Unfortunately the server is not reachable anymore.

This demo presents the RNNoise project, showing how deep learning can be applied to noise suppression. The main idea is to combine classic signal processing with deep learning to create a real-time noise suppression algorithm that's small and fast. No expensive GPUs required — it runs easily on a Raspberry Pi. The result is much simpler (easier to tune) and sounds better than traditional noise suppression systems (been there!).

https://jmvalin.dreamwidth.org/15210.html

fedetft commented 4 years ago

Oh, so that's what rnnoise is.

It should be possible to integrate it better, e.g. by disabling libspeexdsp's noise canceller when rnnoise is in use, and maybe by avoiding the wasteful cast to float and/or moving it after the echo canceller.

In any case, I'm not fully convinced. Not a fan of machine learning stuff. Also, saying that it runs on a raspberry isn't enough to say it's fast. How much CPU would it take on a raspberry? 10% or close to 100%?

Would be nice if the demo was still available.

streaps commented 4 years ago

I found some audio samples

original: https://github.com/jagger2048/WebRtc_noise_suppression/blob/master/assets/babble_15dB.wav

processed: https://github.com/jagger2048/WebRtc_noise_suppression/tree/master/assets/test_case

fedetft commented 4 years ago

Interesting comparison indeed.

Rnnoise does seem to remove background noise better but when the speaker talks loudly, some of the background noise suddenly gets through and appears to come out of nowhere, which I personally find annoying.

Libspeexdsp's noise canceller is quite good too.

The one I like least is webrtc's. It muffles the audio and also reduces its amplitude; it doesn't seem as good as the other two.

I agree keeping rnnoise is a good idea, but I wouldn't enable it by default, and it also needs to be integrated better into the Mumble codebase. One point to watch out for is whether disabling libspeexdsp's noise canceller (to avoid running both it and rnnoise at the same time) impacts other parts of libspeexdsp's DSP chain, such as the AGC.

toby63 commented 4 years ago

@fedetft

One point to watch out for is whether disabling libspeexdsp's noise canceller (to avoid running both it and rnnoise at the same time) impacts other parts of libspeexdsp's DSP chain, such as the AGC.

So right now libspeexdsp's noise canceller is active when the echo canceller is active? So RNNoise is not necessary (though slightly better, if integrated correctly)?

Regarding topic "echo cancel in pulseaudio" above:

However on my machine with mumble it just worsened the echo. No wonder it's disabled by default. Has anyone had any luck with it?

I tried it two times in recent weeks, I used pulseeffects though (I don't know if that makes a difference, but at least it automatically "bundles" all streams through the program and you have one sink and source named "pulseeffects etc." which I chose in mumble). It seemed to work, but my VoIP-Partners gave me no detailed feedback whether it was really good, so I can't know for sure.

fedetft commented 4 years ago

libspeexdsp's noise canceller is unconditionally active in Mumble.

streaps commented 4 years ago

How much CPU would it take on a raspberry? 10% or close to 100%?

"A non-vectorized C implementation of the algorithm requires around 1.3% of a single x86 core (Haswell i7-4800MQ) to perform 48 kHz noise suppression of a single channel. The real-time complexity of the same floating-point code on a 1.2 GHz ARM Cortex-A53 core (Raspberry Pi 3) is 14%."

https://arxiv.org/pdf/1709.08243.pdf

14% is not insignificant. I'm not sure if it's the same for the recent implementation of rnnoise.
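
To put the quoted percentages into per-frame terms, here is a small worked calculation. The 10 ms frame length is an assumption made for the arithmetic, and the helper names are made up; only the 48 kHz rate and the 1.3% and 14% figures come from the quote above.

```python
SAMPLE_RATE_HZ = 48_000  # from the quoted paper: 48 kHz, single channel
FRAME_MS = 10            # assumption: 10 ms processing frames


def samples_per_frame(sample_rate_hz=SAMPLE_RATE_HZ, frame_ms=FRAME_MS):
    """Number of samples in one frame of the given length."""
    return sample_rate_hz * frame_ms // 1000


def cpu_ms_per_frame(core_fraction, frame_ms=FRAME_MS):
    """CPU time per frame if the filter uses `core_fraction` of one core."""
    return core_fraction * frame_ms


if __name__ == "__main__":
    print(f"{samples_per_frame()} samples per frame")
    for label, fraction in (("Haswell i7-4800MQ", 0.013), ("Raspberry Pi 3", 0.14)):
        ms = cpu_ms_per_frame(fraction)
        print(f"{label}: {ms:.2f} ms of CPU per {FRAME_MS} ms frame")
```

Under that assumption, 14% of a core means roughly 1.4 ms of processing for every 10 ms of audio on the Raspberry Pi 3, before any other stage of the chain runs.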

streaps commented 4 years ago

I agree, the audio signal flow should look more like this:

 Mic Input
 |
 ·—— 1) AEC mixed / mono ——.
     2) AEC multi-channel  |
     3) bypass             |
                           |
 .—————————————————————————·
 |
 |   1) rnnoise
 ·—— 2) speex denoise —————.
     3) bypass             |
                           |
 .—————————————————————————·
 |
 |   1) voice activation
 ·—— 2) PTT ———————————————.
     3) continuous (bypass)|
                           |
 .—————————————————————————·
 |
 ·—— Opus encoder
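
One way to read this diagram is that each stage takes exactly one of its listed options. The sketch below models that idea in Python; the class and member names are hypothetical and do not correspond to Mumble settings or code.

```python
from dataclasses import dataclass
from enum import Enum


class EchoCancel(Enum):
    MIXED = "AEC mixed / mono"
    MULTICHANNEL = "AEC multi-channel"
    BYPASS = "bypass"


class Denoise(Enum):
    RNNOISE = "rnnoise"
    SPEEX = "speex denoise"
    BYPASS = "bypass"


class Transmit(Enum):
    VOICE_ACTIVATION = "voice activation"
    PTT = "PTT"
    CONTINUOUS = "continuous (bypass)"


@dataclass(frozen=True)
class InputChain:
    """One choice per stage, in the order shown in the diagram."""
    echo_cancel: EchoCancel
    denoise: Denoise
    transmit: Transmit

    def stages(self):
        return [
            "Mic Input",
            self.echo_cancel.value,
            self.denoise.value,
            self.transmit.value,
            "Opus encoder",
        ]


if __name__ == "__main__":
    chain = InputChain(EchoCancel.MIXED, Denoise.RNNOISE, Transmit.PTT)
    print(" -> ".join(chain.stages()))
```

Because the denoise stage holds a single value, rnnoise and speex denoise cannot both be selected, and echo cancellation always comes before noise suppression.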

Krzmbrzl commented 4 years ago

@fedetft thank you for looking into the audio code. If you feel like you could make improvements to it, please create PRs. We're always looking to improve the system and someone like you who seems to know the audio stuff quite well might be able to improve the respective code areas quite a bit :)

TredwellGit commented 4 years ago

Regarding RNNoise, please read my comment here: https://github.com/mumble-voip/mumble/issues/4181#issuecomment-630147296

toby63 commented 4 years ago

I created a wiki page for Audio: https://wiki.mumble.info/wiki/Audio

Feel free to add or change something. If you don't want to edit it yourself, post suggestions here :slightly_smiling_face:.

Krzmbrzl commented 4 years ago

As I don't see the info icon happening without a UI rewrite (adding info icons to multiple places), I don't think this should be part of this issue here.

And for the startup wizard I added a comment about it to #4137

Therefore I'd say this issue is done. Thanks for adding the info to the wiki page @toby63 :)

toby63 commented 4 years ago

Two things:

I would leave it open, as a reminder for icons, or open another issue for that. Because there are multiple other issues still open, that are far less important than this.
As I don't see the info icon happening without a UI rewrite

So I am no programmer, but does that really require a major rewrite? I thought it could be as "easy" as adding buttons in Qt Creator (or another Qt (UI)-editor) and then work out the details for the buttons in the code. To clarify: I don't want to push this, but it is also not something "delayed until forever, because it's too complicated".

Krzmbrzl commented 4 years ago

Because there are multiple other issues still open, that are far less important than this.

That's not really a reason

So I am no programmer, but does that really require a major rewrite?

"Rewrite" was the wrong term here. What I really meant was "redesign". The current UI was just not designed with something like this in mind and if we want these boxes, we should either introduce them everywhere (where it makes sense) or don't add it. But if we have it for a single entry, then that'd be weird.

And for a UI overhaul we have e.g. #3608 (and I think there are others ghosting around as well). You can add your idea about info icons to that if you wish. In the end, though, I think that in the near future no one will have time to actually do the UI overhaul (no one from the core team, that is).

toby63 commented 4 years ago

That's not really a reason

But an argument.

The current UI was just not designed with something like this in mind and if we want these boxes, we should either introduce them everywhere (where it makes sense) or don't add it. But if we have it for a single entry, then that'd be weird.

I disagree. If you imagine a small button like this: (?) in the right corner of the section; it is very convenient, not confusing and doesn't need much redesign. These help-buttons are also known from other programs, so nothing new for most users.

I accept that you don't have time to do that now. Maybe I will try to set up something, to test whether it is as "easy" as I think it is.

And for UI overhaul we have e.g. #3608 (and I think there are others ghosting around as well)

Such a broad topic is not really useful. And it also consists of many smaller issues.

toby63 commented 4 years ago

To show a concept I added a push button with a question mark: Example 1 Example 2 Example 3

Krzmbrzl commented 4 years ago

I think the first variant is the best if the button will link to the general audio wiki page.

Now you just have to connect the actions and create a PR :)

toby63 commented 4 years ago

I think the first variant is the best if the button will link to the general audio wiki page.

Well, I think it would not be good if it just links to the wiki page, it should contain some information on its own and can link to the wiki for additional information.

Now you just have to connect the actions and create a PR :)

I just wanted to show the possibility for now, but I will think about that. As I am no programmer this will take me forever and I got some other things to do right now.

This topic is not urgent, I just wanted to show that it does not need so many changes.

Krzmbrzl commented 4 years ago

As I am no programmer this will take me forever and I got some other things to do right now.

The changes should be really simple (depending on what you have in mind). Might be a good opportunity to get into it ;)

mumble-voip / mumble

[Feature Request] better description for audio filters #4127