rphill299 closed this 4 months ago
I've tested dewhitening and conclude it is probably worth pursuing.
I used a 3-minute audio clip with 2 minutes 40 seconds of silence and 20 seconds of speech. The trimmed 20 seconds of speech is 9x shorter than the full 3 minutes.
Transcribing the trimmed 20 seconds is about 3x faster than transcribing the full 3-minute clip.
Need to update the testing to include the time needed to denoise, since the 3x figure above only covers transcription.
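A rough sketch of how such a test could be structured, so that the trimming cost counts against the speedup. This is not the code in the attached archive. The names `trim_silence`, `timed`, and `compare` are hypothetical, the `min_silence_len` and `silence_thresh` values are guesses that would need tuning per clip, and `transcribe` stands in for whatever transcription call the project uses. The pydub import is kept inside `trim_silence` so the timing helpers can be used without it.

```python
import time


def trim_silence(audio, min_silence_len=500, silence_thresh=-40):
    """Keep only the non-silent parts of a pydub AudioSegment.

    The threshold values are assumptions, not tuned settings.
    """
    from pydub import AudioSegment
    from pydub.silence import detect_nonsilent

    # detect_nonsilent returns [start_ms, end_ms] pairs
    ranges = detect_nonsilent(
        audio, min_silence_len=min_silence_len, silence_thresh=silence_thresh
    )
    trimmed = AudioSegment.empty()
    for start_ms, end_ms in ranges:
        trimmed += audio[start_ms:end_ms]
    return trimmed


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def compare(clip, trim, transcribe):
    """Time transcription of the full clip against trim + transcribe."""
    _, baseline_s = timed(transcribe, clip)
    trimmed, trim_s = timed(trim, clip)
    _, trimmed_s = timed(transcribe, trimmed)
    return {
        "baseline_s": baseline_s,
        "trim_s": trim_s,
        "transcribe_trimmed_s": trimmed_s,
        # Speedup only counts if the trimming time is included.
        "end_to_end_speedup": baseline_s / (trim_s + trimmed_s),
    }
```

Calling `compare(audio, trim_silence, transcribe)` with an already loaded `AudioSegment` leaves the file-loading time out of the measurement. An `end_to_end_speedup` at or below 1 would mean trimming costs as much as it saves.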
The updated testing proved fruitless, even when ignoring the time needed to load the file into pydub. Attachment: Archive.zip
Test speedup with and without dewhitening on a clip with lots of silence