Audio preprocessing - Githubissues

predestination commented 5 years ago

Hey, what audio pre-processing steps can be used to improve transcript quality? Is there a Python library for denoising or audio enhancement that doesn't use deep learning (it takes a long time even for a small audio clip)?

tonanhngo commented 5 years ago

Hi, if you expect most of your input to be noisy or distinctive in some way (such as speaker accent or background noise), then it's better to train a custom acoustic model on that type of audio. IBM Debater uses this approach and was able to reduce the error rate to ~5%. If you only have a few audio clips and want to do noise reduction, I did a quick search and found a few options:

https://pypi.org/project/noisereduce/
https://pypi.org/project/logmmse/
https://docs.scipy.org/doc/scipy/reference/tutorial/signal.html

But it appears you would need to have the right reference noise audio to process against.
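
To illustrate what "processing against reference noise" can look like, here is a minimal sketch of basic spectral subtraction built on scipy.signal. It is not how noisereduce or logmmse work internally; the function name `spectral_subtract` and the `nperseg` and `floor` parameters are made up for illustration, and it assumes mono float audio plus a separate noise-only clip recorded under the same conditions.

```python
import numpy as np
from scipy.signal import stft, istft


def spectral_subtract(noisy, noise_ref, sr, nperseg=512, floor=0.05):
    """Subtract the average noise magnitude spectrum from every frame.

    noisy     : 1-D float array, the recording to clean
    noise_ref : 1-D float array, a noise-only clip (assumed available)
    floor     : fraction of the original magnitude kept as a lower bound,
                which limits "musical noise" artifacts (illustrative value)
    """
    _, _, noisy_spec = stft(noisy, fs=sr, nperseg=nperseg)
    _, _, noise_spec = stft(noise_ref, fs=sr, nperseg=nperseg)

    # Average noise magnitude per frequency bin, shape (n_bins, 1).
    noise_mag = np.abs(noise_spec).mean(axis=1, keepdims=True)

    mag = np.abs(noisy_spec)
    phase = np.angle(noisy_spec)

    # Remove the noise estimate, but never go below a small spectral floor.
    cleaned_mag = np.maximum(mag - noise_mag, floor * mag)

    # Rebuild the waveform using the original (noisy) phase.
    _, cleaned = istft(cleaned_mag * np.exp(1j * phase), fs=sr, nperseg=nperseg)
    return cleaned[: len(noisy)]


# Synthetic usage example: a 440 Hz tone buried in white noise.
sr = 16000
rng = np.random.default_rng(0)
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 440 * t)
noisy = tone + 0.1 * rng.standard_normal(sr)
noise_ref = 0.1 * rng.standard_normal(sr)  # separate noise-only recording
cleaned = spectral_subtract(noisy, noise_ref, sr)
```

If the reference clip does not match the noise in the recording, the subtraction removes the wrong frequencies, which is one reason this kind of denoising may not help transcription.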

predestination commented 5 years ago

Thank you for the reply. I tried noisereduce and logmmse earlier, but they didn't improve the transcript quality. I will check scipy.signal.
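
For reference, one scipy.signal option that needs no reference noise is a plain band-pass filter. The sketch below is only an illustration: the helper name `bandpass_speech` is hypothetical, and the 300-3400 Hz band and filter order are assumed values (roughly the telephone speech band), not settings recommended anywhere in this thread. Whether it helps transcription depends on the audio the speech model was trained on.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def bandpass_speech(audio, sr, low_hz=300.0, high_hz=3400.0, order=4):
    """Keep roughly the speech band and attenuate hum and high-frequency hiss.

    audio : 1-D float array (mono)
    sr    : sample rate in Hz; must be more than twice high_hz
    """
    # Second-order sections are numerically more stable than (b, a) coefficients.
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=sr, output="sos")
    # Forward-backward filtering gives zero phase shift.
    return sosfiltfilt(sos, audio)


# Synthetic usage example: 1 kHz tone plus 50 Hz mains hum.
sr = 16000
t = np.arange(sr) / sr
signal_in = np.sin(2 * np.pi * 1000 * t) + np.sin(2 * np.pi * 50 * t)
signal_out = bandpass_speech(signal_in, sr)
```

The filtered array can then be written back to a WAV file and sent for transcription in place of the original clip.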

IBM / Train-Custom-Speech-Model

Audio preprocessing #76