breizhn / DTLN

TensorFlow 2.x implementation of the DTLN real-time speech denoising model, with TF-Lite, ONNX and real-time audio processing support.
MIT License

Google research #41

Closed StuartIanNaylor closed 3 years ago

StuartIanNaylor commented 3 years ago

https://github.com/google-research/google-research/tree/master/kws_streaming

In their models they have embedded the MFCC, so you just feed in the 16 kHz audio stream in chunks and the streaming KWS works.

There is an FFT in the Python ops here, and I wondered whether, for the TFLite models, a look at what they did above could improve performance. This is all beyond me, but after some testing the embedded MFCC seems roughly 2x faster than an external routine using Librosa.
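To make the "just feed it chunks" idea concrete, here is a rough pure-Python sketch, not code from kws_streaming or DTLN. The class `StreamingFrontEnd`, the helper `stream_features`, and the 25 ms / 10 ms frame sizes are all hypothetical, and a log-energy value stands in for the real MFCC. The only point is that the feature step lives behind the same call that takes the raw audio, so the caller never frames or pre-processes anything.

```python
import math

SAMPLE_RATE = 16_000  # 16 kHz stream, as mentioned above
FRAME_LEN = 400       # assumption: 25 ms analysis window
FRAME_STEP = 160      # assumption: 10 ms hop between frames


class StreamingFrontEnd:
    """Hypothetical wrapper that hides framing and feature extraction."""

    def __init__(self, frame_len=FRAME_LEN, frame_step=FRAME_STEP):
        self.frame_len = frame_len
        self.frame_step = frame_step
        self._buffer = []  # samples carried over between calls

    def _feature(self, frame):
        # Placeholder for an MFCC: log of the mean energy of the frame.
        energy = sum(s * s for s in frame) / len(frame)
        return math.log(energy + 1e-10)

    def process(self, chunk):
        """Accept a chunk of any length, return one feature per completed frame."""
        self._buffer.extend(chunk)
        features = []
        while len(self._buffer) >= self.frame_len:
            features.append(self._feature(self._buffer[:self.frame_len]))
            # Drop only frame_step samples so consecutive frames overlap.
            del self._buffer[:self.frame_step]
        return features


def stream_features(samples, chunk_size):
    """Feed `samples` to a fresh front end in chunks of `chunk_size`."""
    front_end = StreamingFrontEnd()
    out = []
    for start in range(0, len(samples), chunk_size):
        out.extend(front_end.process(samples[start:start + chunk_size]))
    return out


# One second of a 440 Hz tone as stand-in microphone input.
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
features = stream_features(tone, chunk_size=160)
```

Because the state sits inside the front end, the chunk size the caller picks does not change the result. In an embedded-MFCC model the same role would be played by ops inside the graph rather than by Python code.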

Dunno if the above is any use to you. @breizhn