wenet-e2e / wesignal

Production first, nn-based on-device signal processing toolkit.
Apache License 2.0

hope to see the support of Howling Suppression #4

Open zuowanbushiwo opened 1 year ago

zuowanbushiwo commented 1 year ago

Thanks for such great work. Howling suppression is very important in video conferencing systems, and it is a relatively difficult problem. At present, there are few open-source implementations of acoustic howling suppression (AHS). I hope WeNet can provide a better implementation. Thanks!

StuartIanNaylor commented 1 year ago

Usually it's called acoustic echo cancellation (AEC), and yeah, that is listed.

robin1001 commented 1 year ago

> Usually it's called acoustic echo cancellation (AEC), and yeah, that is listed.

Yes!

zuowanbushiwo commented 1 year ago

Thanks. As I understand it, they are not the same: I haven't seen a paper showing that a model built for AEC can eliminate howling. But for me, it is enough if wesignal adds support for dealing with howling, and I will keep following the project.

It is worth noting that acoustic howling is different from acoustic echo even though inappropriately handled acoustic echo (leakage) could also result in howling. The major differences between them are: 1. Both of them are essentially playback signals while howling is generated gradually. 2. The playback signal that leads to howling is generated from the same source as that of the target signal. While acoustic echo is usually generated from a different source (far-end speaker), which makes the suppression of howling more challenging.

Deep AHS: A Deep Learning Approach to Acoustic Howling Suppression

Deep Learning for Joint Acoustic Echo and Acoustic Howling Suppression in Hybrid Meetings
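To make the "generated gradually" point concrete, here is a minimal sketch of a closed mic-to-loudspeaker loop. It is a toy model, not anything from wesignal or the papers above: the function name `simulate_loop`, the single-delay feedback path, and the gain, delay, and sample-rate values are all assumptions chosen for illustration. The loudspeaker replays the mic signal, which re-enters the mic after a fixed delay; once the loop gain exceeds 1, a short burst keeps growing even after the talker stops.

```python
import math

def simulate_loop(source, loop_gain, delay):
    """Toy closed loop: mic[n] = source[n] + loop_gain * mic[n - delay].

    `loop_gain` lumps together amplifier gain and acoustic path loss
    (an assumption; a real room has a frequency-dependent response).
    """
    mic = []
    for n, s in enumerate(source):
        feedback = loop_gain * mic[n - delay] if n >= delay else 0.0
        mic.append(s + feedback)
    return mic

def rms(x):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

fs = 8000                      # sample rate in Hz (illustrative)
delay = 80                     # 10 ms loop delay (illustrative)
# A 50 ms, 500 Hz tone burst followed by silence: the "target" source.
burst = [math.sin(2 * math.pi * 500 * n / fs) for n in range(400)]
source = burst + [0.0] * 3600

open_loop = simulate_loop(source, 0.0, delay)   # no feedback at all
stable = simulate_loop(source, 0.8, delay)      # loop gain < 1: ringing dies out
howling = simulate_loop(source, 1.2, delay)     # loop gain > 1: level keeps growing

for name, sig in [("open", open_loop), ("stable", stable), ("howling", howling)]:
    print(name, "tail RMS:", rms(sig[-400:]))
```

In this sketch the feedback is built from the same signal as the target source, so there is no separate far-end reference to subtract, which is the difficulty the quoted passage describes.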

StuartIanNaylor commented 1 year ago

It would seem the only difference from AEC is that howling is AEC with a constant feedback loop, like on a telephone. With speech recognition you don't have a constant feedback loop, as the mic input is processed by ASR and is not transmitted again. AEC will cancel whatever might be playing from TTS or a music skill, as that is known not to be voice. On telephone systems it used to be called line cancellation, which is also just AEC with a very short or no tail.

Howling, from what I can tell, is what I would call feedback or a feedback loop, which doesn't exist in a speech recognition system because the mic is not looped to a speaker output but fed to ASR.
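For contrast, the open-loop AEC case described above, where the playback signal is known, can be sketched with a textbook NLMS adaptive filter. This is only an illustration under stated assumptions: `nlms_cancel`, the 4-tap `echo_path`, and the step size are hypothetical and not taken from wesignal, and the mic here contains echo only (no near-end speech, no noise).

```python
import random

def nlms_cancel(reference, mic, taps=8, mu=0.5, eps=1e-8):
    """Normalized LMS: estimate the echo of `reference` in `mic` and subtract it.

    Returns the residual signal and the learned filter weights.
    """
    w = [0.0] * taps
    residual = []
    for n in range(len(mic)):
        # Most recent `taps` reference samples, newest first.
        x = [reference[n - k] if n >= k else 0.0 for k in range(taps)]
        echo_estimate = sum(wk * xk for wk, xk in zip(w, x))
        e = mic[n] - echo_estimate
        norm = sum(xk * xk for xk in x) + eps
        w = [wk + mu * e * xk / norm for wk, xk in zip(w, x)]
        residual.append(e)
    return residual, w

random.seed(0)
# Known playback signal (e.g. TTS or music), modeled as white noise.
far_end = [random.uniform(-1.0, 1.0) for _ in range(4000)]
# Hypothetical loudspeaker-to-mic impulse response.
echo_path = [0.0, 0.6, 0.3, -0.2]
mic = [
    sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n >= k)
    for n in range(len(far_end))
]

residual, w = nlms_cancel(far_end, mic)
print("learned taps:", [round(v, 3) for v in w[:4]])
```

The filter works here because the reference is available and independent of the talker. In a howling loop the "reference" is the talker's own amplified signal, so this setup does not carry over directly.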

zuowanbushiwo commented 1 year ago

@robin1001 What is the status of this project now? Has it stopped?

robin1001 commented 1 year ago

It's still under development, but progress is slow.

zuowanbushiwo commented 1 year ago

@robin1001 Thanks, that's great news. Is there a timeline for when the first version will be released?

robin1001 commented 1 year ago

Sorry, there is no timeline.

zuowanbushiwo commented 1 year ago

Ok, got it, thanks!

zuowanbushiwo commented 4 months ago

@robin1001 What is the status of this project now? Has it stopped? Thanks!