Normalize / cleanup audio

jeremyandrews commented 4 years ago

Audio received from the Kakaia client is often too quiet for DeepSpeech to process properly, so normalization could help a lot. However, there is often a loud sound at the very end of the clip (probably from the finger touching the screen to stop the recording). That sound would first have to be detected and trimmed; otherwise the spike is already the loudest sample in the clip and normalization barely changes anything.
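To make the trim-then-normalize idea concrete, here is a minimal sketch using only the standard library. It assumes the clip has already been decoded to `f32` samples in the range -1.0 to 1.0. The function names, the fixed tail length, and the ratio threshold are all hypothetical choices for illustration, not anything Kakaia currently does.

```rust
/// Returns the largest absolute sample value (0.0 for an empty slice).
fn peak(samples: &[f32]) -> f32 {
    samples.iter().fold(0.0_f32, |max, s| max.max(s.abs()))
}

/// Drops the last `tail_len` samples when their peak is more than `ratio`
/// times the peak of everything before them. This is a crude, hypothetical
/// heuristic for the "finger tap" at the end of a recording.
fn trim_trailing_spike(samples: &[f32], tail_len: usize, ratio: f32) -> &[f32] {
    if samples.len() <= tail_len {
        return samples;
    }
    let (body, tail) = samples.split_at(samples.len() - tail_len);
    if peak(tail) > ratio * peak(body) {
        body
    } else {
        samples
    }
}

/// Peak normalization: scales every sample by the same gain so that the
/// loudest one lands on `target_peak`. Silent input is returned unchanged.
fn normalize_peak(samples: &[f32], target_peak: f32) -> Vec<f32> {
    let p = peak(samples);
    if p == 0.0 {
        return samples.to_vec();
    }
    let gain = target_peak / p;
    samples.iter().map(|s| s * gain).collect()
}
```

Calling `normalize_peak(trim_trailing_spike(&clip, tail_len, ratio), 0.9)` would trim first and then normalize. If the spike is left in, the computed gain stays close to 1.0, which matches what I'm seeing. A real implementation would probably want a smarter detector than a fixed-length tail.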

We need to select one or more libraries, ideally supporting multiple audio formats (at least FLAC and WAV). The plan is to handle real-time streams, so perhaps we can do chunked normalization (or is real-time normalization a thing?), which should also solve the problem of the loud sound at the end.
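A rough sketch of what chunked normalization could look like, again with only the standard library and `f32` samples in the range -1.0 to 1.0. `ChunkNormalizer` and its parameters are hypothetical names for illustration. The idea is to compute a gain per chunk, raise it gradually between chunks, and drop it immediately when a loud chunk arrives so the output does not clip.

```rust
/// Hypothetical streaming normalizer: one gain value carried across chunks.
struct ChunkNormalizer {
    target_peak: f32, // peak level each chunk is steered towards
    max_gain: f32,    // upper bound so near-silence is not amplified into noise
    smoothing: f32,   // 0.0..=1.0, how quickly the gain rises between chunks
    gain: f32,        // gain applied to the most recent chunk
}

impl ChunkNormalizer {
    fn new(target_peak: f32, max_gain: f32, smoothing: f32) -> Self {
        ChunkNormalizer { target_peak, max_gain, smoothing, gain: 1.0 }
    }

    /// Normalizes one chunk in place.
    fn process(&mut self, chunk: &mut [f32]) {
        let peak = chunk.iter().fold(0.0_f32, |max, s| max.max(s.abs()));
        if peak > 0.0 {
            let desired = (self.target_peak / peak).min(self.max_gain);
            if desired < self.gain {
                // Loud chunk: lower the gain right away to avoid clipping.
                self.gain = desired;
            } else {
                // Quiet chunk: move only part of the way towards the new gain.
                self.gain += self.smoothing * (desired - self.gain);
            }
        }
        for s in chunk.iter_mut() {
            *s *= self.gain;
        }
    }
}
```

With this approach a loud sound at the end only lowers the gain of its own chunk, since the earlier chunks have already been amplified and sent on. The gain still changes in steps at chunk boundaries, which may be audible; interpolating the gain across each chunk would be one way to soften that.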

jeremyandrews commented 4 years ago

I requested help/ideas here: https://www.reddit.com/r/rust/comments/etc5sf/library_for_sound_normalization/

jeremyandrews commented 4 years ago

I started trying to get the rnnoise denoising library working, but haven't managed to get a basic example running yet: https://github.com/RustAudio/rnnoise-c/pull/1

jeremyandrews commented 4 years ago

Here's a basic working example with rnnoise, although the end result is raw audio, which I'm not yet sure how to use: https://github.com/RustAudio/rnnoise-c/pull/2
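One possible way to make the raw output usable is to wrap it in a WAV container. The sketch below writes a standard 44-byte PCM WAV header followed by the samples, using only the standard library. It assumes the raw audio is 16-bit signed PCM; the sample rate and channel count are passed in by the caller, because I haven't confirmed what the example in the PR actually produces. `write_wav` is a hypothetical helper, not part of rnnoise-c.

```rust
use std::io::{self, Write};

/// Writes 16-bit PCM samples as a minimal WAV file (RIFF header + data chunk).
/// Assumption: `pcm` holds interleaved signed 16-bit samples.
fn write_wav<W: Write>(
    mut out: W,
    pcm: &[i16],
    sample_rate: u32,
    channels: u16,
) -> io::Result<()> {
    let bits_per_sample: u16 = 16;
    let block_align = channels * bits_per_sample / 8; // bytes per frame
    let byte_rate = sample_rate * u32::from(block_align);
    let data_len = (pcm.len() * 2) as u32; // two bytes per sample

    out.write_all(b"RIFF")?;
    out.write_all(&(36 + data_len).to_le_bytes())?; // size of the rest of the file
    out.write_all(b"WAVE")?;

    out.write_all(b"fmt ")?;
    out.write_all(&16u32.to_le_bytes())?; // fmt chunk size for plain PCM
    out.write_all(&1u16.to_le_bytes())?; // audio format 1 = uncompressed PCM
    out.write_all(&channels.to_le_bytes())?;
    out.write_all(&sample_rate.to_le_bytes())?;
    out.write_all(&byte_rate.to_le_bytes())?;
    out.write_all(&block_align.to_le_bytes())?;
    out.write_all(&bits_per_sample.to_le_bytes())?;

    out.write_all(b"data")?;
    out.write_all(&data_len.to_le_bytes())?;
    for sample in pcm {
        out.write_all(&sample.to_le_bytes())?; // WAV stores samples little-endian
    }
    Ok(())
}
```

Passing a `std::fs::File` or a `Vec<u8>` as `out` both work, since each implements `Write`. If the denoised audio is at a different sample rate than the speech model expects, it would still need resampling before being handed over.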
