fixie-ai / ai-jsx

The AI Application Framework for JavaScript
https://docs.ai-jsx.com
MIT License
1.04k stars 79 forks

Noise cancellation on top of Voice ASR #488

Open anunayajoshi opened 9 months ago

anunayajoshi commented 9 months ago

Love what @juberti is building with fixie voice, and the open-source benchmarking sites for ASR and TTS. Something I personally want to test is how noise cancellation algorithms like Picovoice's Koala affect these ASR systems' transcriptions, and I think the benchmarking site would be a good place for me to contribute and test that out.

Just wanted to check with @juberti whether that would be a welcome addition to the site. I'm not sure exactly how you'd imagine the layout, but I'd like to compare the transcriptions with and without noise cancellation. Would you prefer a separate page, or a checkbox on top of the ASR to enable noise cancellation? Happy to discuss!
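
One way to make the with/without comparison concrete is to score each transcript against a known reference using word error rate (WER). The following is only a rough sketch, not anything from the benchmarking site: `tokenize` and `wordErrorRate` are hypothetical helper names, and the tokenizer assumes English text (lowercase ASCII letters, digits, and apostrophes).

```typescript
// Hypothetical helpers for comparing ASR output with and without noise cancellation.

// Lowercase, strip punctuation, and split on whitespace (English-only assumption).
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9'\s]/g, " ")
    .split(/\s+/)
    .filter((w) => w.length > 0);
}

// WER = (substitutions + deletions + insertions) / reference word count,
// computed with a word-level Levenshtein distance.
function wordErrorRate(reference: string, hypothesis: string): number {
  const ref = tokenize(reference);
  const hyp = tokenize(hypothesis);
  if (ref.length === 0) {
    // WER is undefined for an empty reference; treat any extra words as unbounded error.
    return hyp.length === 0 ? 0 : Infinity;
  }
  // prev[j] holds the edit distance between the first i-1 reference words
  // and the first j hypothesis words.
  let prev: number[] = Array.from({ length: hyp.length + 1 }, (_, j) => j);
  for (let i = 1; i <= ref.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost // match or substitution
      );
    }
    prev = curr;
  }
  return prev[hyp.length] / ref.length;
}
```

Running the same noisy clip through an ASR twice, once raw and once after the noise cancellation stage, and comparing the two WER values against a hand-written reference would give a single number per provider for the site to display.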

juberti commented 9 months ago

Sure, that would be an interesting thing to try. Note that the browser is already performing some ANR (via either WebRTC or built-in device NR as on iPhone) so the effects of an additional NR stage may be modest.
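
To isolate the effect of an extra NR stage, the browser's own processing can be requested off through the standard `getUserMedia` audio constraints (`noiseSuppression`, `echoCancellation`, `autoGainControl`). A minimal sketch follows; `micConstraints` is a hypothetical helper name, and the constraints are only requests, so the browser or device may still apply its own processing.

```typescript
// Shape of the audio constraints used here. The three property names are
// standard MediaTrackConstraints fields; the interface itself is declared
// locally so the sketch does not depend on DOM typings.
interface AudioProcessingConstraints {
  echoCancellation: boolean;
  noiseSuppression: boolean;
  autoGainControl: boolean;
}

interface MicConstraints {
  audio: AudioProcessingConstraints;
  video: false;
}

// Hypothetical helper: build capture constraints with the browser's built-in
// audio processing requested on or off as a group.
function micConstraints(browserProcessing: boolean): MicConstraints {
  return {
    audio: {
      echoCancellation: browserProcessing,
      noiseSuppression: browserProcessing,
      autoGainControl: browserProcessing,
    },
    video: false,
  };
}

// Browser-only usage (not executed here):
//   const stream = await navigator.mediaDevices.getUserMedia(micConstraints(false));
//   const settings = stream.getAudioTracks()[0].getSettings();
//   // settings.noiseSuppression reports what the browser actually applied.
```

Checking `getSettings()` on the captured track is worthwhile because a device with built-in NR may not honor the request, in which case the "without noise cancellation" baseline is not truly unprocessed.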

juberti commented 9 months ago

One area where additional research could be quite useful is on VAD algorithms. Currently we use the WebRTC VAD in the ASR component, but it's pretty old and quite prone to false positives. I bet we would get much better perf from something like Silero VAD.
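
As a rough illustration of how false positives might be reduced regardless of which VAD is used, the sketch below post-processes per-frame speech probabilities (the kind of output a model such as Silero VAD produces) with two thresholds, a hangover period, and a minimum segment length. This is not the ASR component's code or Silero's API: `segmentSpeech`, `VadOptions`, and every threshold value are assumptions chosen for illustration.

```typescript
// A detected speech segment in frame indices; endFrame is exclusive.
interface Segment {
  startFrame: number;
  endFrame: number;
}

// Hypothetical tuning knobs for the post-processing step.
interface VadOptions {
  startThreshold: number; // probability needed to open a segment
  endThreshold: number; // probability below which a frame counts as silence
  minSpeechFrames: number; // segments shorter than this are discarded
  hangoverFrames: number; // silent frames tolerated before closing a segment
}

function segmentSpeech(probs: number[], opts: VadOptions): Segment[] {
  const segments: Segment[] = [];
  let start = -1; // start frame of the open segment, or -1 if none is open
  let silenceRun = 0; // consecutive silent frames inside the open segment

  // Close the open segment at `end`, keeping it only if it is long enough.
  const close = (end: number): void => {
    if (end - start >= opts.minSpeechFrames) {
      segments.push({ startFrame: start, endFrame: end });
    }
    start = -1;
    silenceRun = 0;
  };

  for (let i = 0; i < probs.length; i++) {
    const p = probs[i];
    if (start < 0) {
      // Not in speech: require the higher threshold to open a segment.
      if (p >= opts.startThreshold) {
        start = i;
        silenceRun = 0;
      }
    } else if (p < opts.endThreshold) {
      // In speech, silent frame: close only once the hangover is exceeded.
      silenceRun++;
      if (silenceRun > opts.hangoverFrames) {
        close(i - silenceRun + 1);
      }
    } else {
      // In speech and at or above the lower threshold: still speech.
      silenceRun = 0;
    }
  }
  if (start >= 0) {
    // Input ended mid-segment: trim any trailing silent frames.
    close(probs.length - silenceRun);
  }
  return segments;
}
```

With `minSpeechFrames` set above one, an isolated high-probability frame (a door slam, a keyboard click) never becomes a segment, which is the kind of false positive described above. The right values would need to be tuned against real recordings.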

juberti commented 9 months ago

Regardless, please feel free to contribute any interesting improvements!