KathyReid commented 6 years ago

NOTE: This issue supersedes Issue #57

Problem statement

The current audio bus on the Mark 1 and Picroft images does not eliminate the speaker audio from the microphone. This leads to undesirable device behavior, most noticeably when an audio stream is playing and the user is unable to “barge in” easily with a Hey Mycroft.

The device is aware of what audio is being output from the speaker. The essential idea desired is to subtract the speaker audio-out from the microphone audio-in using an appropriate approach - such as time-shifting the outbound audio and matching it to the audio in from the microphone.
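One common way to realise the "subtract the speaker audio" idea is an adaptive filter such as NLMS (normalised least mean squares), which learns the delay and attenuation between audio-out and audio-in and subtracts its estimate of the echo. The following is only a minimal sketch on synthetic data, not a proposed implementation: the function name, tap count, step size, and the 5-sample delay and 0.6 gain of the simulated echo are all assumptions chosen for illustration.

```python
import random


def nlms_echo_cancel(mic, speaker, taps=32, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the speaker echo from the mic signal."""
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent speaker samples, newest first
    cleaned = []
    for mic_sample, spk_sample in zip(mic, speaker):
        history = [spk_sample] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        error = mic_sample - echo_estimate  # what is left after subtraction
        power = sum(x * x for x in history) + eps
        step = mu * error / power  # normalised step size
        weights = [w + step * x for w, x in zip(weights, history)]
        cleaned.append(error)
    return cleaned


def energy(samples):
    return sum(s * s for s in samples)


# Synthetic demo (assumed values): the echo is the speaker signal,
# delayed by 5 samples and attenuated to 60 %.
random.seed(0)
speaker = [random.uniform(-1.0, 1.0) for _ in range(4000)]
delay, gain = 5, 0.6
mic = [gain * speaker[n - delay] if n >= delay else 0.0
       for n in range(len(speaker))]

cleaned = nlms_echo_cancel(mic, speaker)

# Ratio of residual echo energy to original echo energy once the filter has adapted.
residual_ratio = energy(cleaned[-1000:]) / energy(mic[-1000:])
print(f"residual echo energy ratio: {residual_ratio:.2e}")
```

On a real device the microphone also contains the user's voice and room noise, and the echo path is much longer than 32 taps, so this only illustrates the principle.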

Acceptance criteria

The solution must work on a Mark 1 reference hardware device. Picroft is OK for testing or proof of concept, but the solution must work in a Mark 1 enclosure acoustic environment.
The solution must work with an audio stream that is being played at 3/4 volume, such as Pandora, Spotify, Mopidy or other streaming audio.
The solution must work with the default Precise Wake Word detection software.
A user must be able to interrupt the audio input/output stream by speaking the Wake Word, i.e. ‘Hey Mycroft’, at normal volume (i.e. not shouting).
The solution must work within the CPU limitations of RPi 3 hardware (the hardware used for both Mark 1 and Picroft). Specifically, the load average reported by the top command must not exceed 3.0.
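The CPU criterion can be checked from a script as well as from top. A minimal sketch, assuming that all three load averages (1, 5 and 15 minutes) should stay at or below 3.0; the criterion does not say which of the three figures is meant, and the function name is made up for this example.

```python
import os

LOAD_LIMIT = 3.0  # threshold taken from the acceptance criteria


def within_load_budget(load_averages, limit=LOAD_LIMIT):
    """Return True if none of the given load averages exceeds the limit."""
    return all(value <= limit for value in load_averages)


if __name__ == "__main__":
    # os.getloadavg() returns the 1, 5 and 15 minute load averages,
    # the same three figures that top shows in its header line.
    averages = os.getloadavg()
    verdict = "OK" if within_load_budget(averages) else "over budget"
    print(averages, verdict)
```

Running this while a stream is playing and the echo canceller is active gives a quick pass/fail reading for this criterion.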

Useful information

Key technical contact - Steve Penrod (@penrods) (@steve-mycroft at https://chat.mycroft.ai)

Bounty

The Bounty for this feature request is USD $1,000, as well as a free Mark 1 and a Gold Mycroft Challenge Coin.

stephanelpaul commented 6 years ago

I'm going to take a look at this shortly

ekjswim commented 6 years ago

Info that may be helpful re: OSS DSP: http://www.audioxpress.com/news/the-linux-foundation-adopts-sound-open-firmware-project-enabling-developers-to-adapt-operating-systems-for-audio-devices

pcwii commented 6 years ago

More helpful information: PulseAudio provides an echo cancellation module (module-echo-cancel). More information here: https://arunraghavan.net/2016/05/improvements-to-pulseaudios-echo-cancellation/

el-tocino commented 6 years ago

Some hopefully useful links about the pulse module: https://www.freedesktop.org/wiki/Software/PulseAudio/Documentation/User/Modules/#index45h3 https://wiki.archlinux.org/index.php/PulseAudio/Troubleshooting#Enable_Echo.2FNoise-Cancelation The echo cancellation module can also do beamforming...

pcwii commented 6 years ago

@KathyReid @penrods Has anyone explored this option (pulse audio echo cancelation) previously? I am willing to give it a go although I only have a picroft to work with.

forslund commented 6 years ago

I believe it was tried a couple of years ago but the cpu strain was quite high. (This is what I've heard so no personal experience on the Pi). The pulse audio echo cancellation works great on my workstation so it'd be cool if it could work on the Pi as well. If it's too intensive on the hardware maybe there are tweaks that can be made.

Give it a try, and see what the result is!

roadriverrail commented 6 years ago

I've worked on projects using a Broadcom chipset not unlike that of the BCM2837 (which is used in RPi3) and we'd seen good success using the Opus echo canceler. It does take CPU to do, but it wasn't particularly bad. Unfortunately, I don't have the necessary free time to contribute to the bounty hunt, but I thought perhaps suggesting this would help someone else.

KathyReid commented 6 years ago

Thanks for your feedback, @roadriverrail - great suggestion!

el-tocino commented 6 years ago

Potentially interesting: https://github.com/xiph/rnnoise and based on that: https://github.com/werman/noise-suppression-for-voice (the above are significantly slower than viable, alas: ~8:1 increase in processing)

tlc commented 6 years ago

@forslund, When working on a workstation with the mycroft source, does pulse echo cancellation get loaded automatically or do we have to do that ourselves?

Do USB speakerphone devices such as the Jabra 410 (popular in the forums) do echo cancellation? I'm using one with a RPi 3B+ and "Hey Mycroft, stop" seems to work. Although, I'm not sure if it works "well" at "normal volume".

el-tocino commented 6 years ago

Currently, no distros load the pulse echo cancellation (that I know of).
Per https://www.jabra.com/business/speakerphones/jabra-speak-series/jabra-speak-410: "Digital Signal Processing (DSP) technology Crystal clear sound without echoes or distorted sounds even at max volume level", which sounds a lot like it has some sort of echo canceling.

forslund commented 6 years ago

@tlc as @el-tocino states the echo cancellation isn't loaded by default. Loading it creates a virtual microphone that you need to set as default to use with mycroft. (basically selecting it in the pulse audio volume control)
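The same steps can be scripted rather than clicked through in the volume control. A rough sketch, assuming pactl is on the PATH and a PulseAudio server is running: the device names ec_mic and ec_speaker and the helper names are made up for this example, and the choice of webrtc as the method is only one of the options. The module also creates a virtual sink, and playback has to go through that sink for the canceller to see the reference signal, so the sketch sets both defaults.

```python
import subprocess


def echo_cancel_commands(source_name="ec_mic", sink_name="ec_speaker",
                         aec_method="webrtc"):
    """Build the pactl calls that load the module and select its virtual devices."""
    return [
        ["pactl", "load-module", "module-echo-cancel",
         f"aec_method={aec_method}",
         f"source_name={source_name}",
         f"sink_name={sink_name}"],
        # Make the echo-cancelled virtual microphone the default input.
        ["pactl", "set-default-source", source_name],
        # Route playback through the virtual sink so it serves as the reference.
        ["pactl", "set-default-sink", sink_name],
    ]


def apply_commands(commands, run=subprocess.run):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        run(command, check=True)


if __name__ == "__main__":
    # Printed rather than executed, so nothing changes until you opt in.
    for command in echo_cancel_commands():
        print(" ".join(command))
```

Call apply_commands(echo_cancel_commands()) to actually load the module; the change lasts until the PulseAudio server restarts unless it is also added to the server configuration.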

KathyReid commented 6 years ago

How are we all going with this one - any questions? Any information we could provide to help?

j1nx commented 6 years ago

Not my work, but just ran into it;

https://github.com/voice-engine/ec

It looks interesting and seems to tick the boxes.

domcross commented 6 years ago

I have experimented with voice-engine/ec (which is basically a wrapper for speex) and with PulseAudio's echo-cancel module (for that you have to install PA 7.1 from the Debian Jessie backports), using the "webrtc" and "speex" algorithms ("adrian" is not usable at all), but have had no luck so far. I see two main reasons: 1) When music is played over the Mark 1 speaker, the Mark 1 mic picks up almost nothing but the music (this is caused by the physical construction); in addition, the mic/preamp picks up a lot of electrical/radio noise. This makes it really tough for any noise or echo cancellation algorithm. 2) The timing of the RPi 3's internal clock is not stable enough for this kind of real-time processing; the constant time drift confuses the echo cancellation algorithms as well. I will give "rnnoise" a try shortly (I have already compiled it for the RPi but have some problems configuring it for PA), but I don't have very high expectations, for the reasons above.
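To give a feel for why drift matters: if the playback and capture sides run from clocks that disagree slightly, the apparent echo delay keeps moving, and the canceller has to keep re-adapting. A back-of-the-envelope sketch; the 100 ppm error and 16 kHz sample rate below are purely illustrative assumptions, not measurements from the RPi 3 or the Mark 1.

```python
def drift_in_samples(clock_error_ppm, sample_rate_hz, seconds):
    """Samples by which the echo delay shifts for a given relative clock error."""
    return clock_error_ppm * 1e-6 * sample_rate_hz * seconds


# Illustrative numbers only (assumed, not measured):
shift = drift_in_samples(clock_error_ppm=100, sample_rate_hz=16000, seconds=60)
print(f"delay shift after one minute: about {shift:.0f} samples")
```

With these assumed figures the delay moves by roughly 96 samples (about 6 ms at 16 kHz) every minute, which is a large shift compared with the sample-level alignment an echo canceller relies on.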

penrods commented 6 years ago

I'd be willing to consider a solution that requires a minor and cheap add-on or modification to the Mark 1, e.g. acoustic foam separating the mic and speaker or wire rerouting. But not board level changes.

el-tocino commented 6 years ago

Beamforming based on the mic position plus a cheapo USB mic might be an option. One or two of these mini mics (search "overfly portable usb 2.0 mic") set in the ports, combined with the audio from the existing mic and run through a beamformer, should be able to do AEC and improve listening. I haven't tried it myself yet, alas.

domcross commented 6 years ago

After some more experimenting I have a configuration with the PulseAudio echo-cancel module that works reasonably^* with volume levels up to 5 (Mark-1's maximum is 11) within a distance of approx. 4 feet. There is some more room for tweaking parameters that might increase reliability. I didn't try the hardware tweaking (acoustic foam) yet. In addition I am considering changes in Mycroft Audioservices, e.g. duck/mute music as soon as wake-word is detected in order to get a clean utterance...

^*depends on the music material, the more compressed (see "loudness war") the less reliable it works.

j1nx commented 6 years ago

I believe @forslund already did some work on the ducking part. I believe it is already in the PR / Issue section somewhere.

I agree with you that AEC has to be combined with audio ducking.

el-tocino commented 5 years ago

I used some door/window insulating foam (similar: https://www.homedepot.com/p/Frost-King-3-4-in-x-5-16-in-x-10-ft-Black-Rubber-Foam-Weatherseal-Tape-R534H/202262324) to make a barrier around the front of the mic between the face circuit board and the faceplate. In addition, I covered the back of the speaker with foam as well.

forslund commented 2 months ago

Closing Issue since we're archiving the repo

MycroftAI / mycroft-core

Bounty: Implement noise cancellation on RPi-3 based hardware devices (Mark 1 and Picroft) #1478
