Gadersd / whisper-burn

A Rust implementation of OpenAI's Whisper model using the burn framework
MIT License
268 stars, 33 forks

No output with 8 second clip? #4

Closed by n8henrie 1 year ago

n8henrie commented 1 year ago

This project fails to provide any output on a different test file (the test file works with whisper, sounds normal when I listen to it, and was created from an m4a via ffmpeg according to the whisper instructions):

$ ffmpeg -i ../whisper/20220922\ 084923.m4a -ar 16000 -ac 1 -c:a pcm_s16le output.wav
$ cargo run --release output.wav tiny_en
    Running `target/release/whisper output.wav tiny_en`
<|notimestamps|>
Transcribed text: <|notimestamps|>

GitHub refuses to allow me to upload a wav file (even base64 encoded as .txt). Not sure what the best way to share is.

Gadersd commented 1 year ago

Check the bit depth of your wav file. The code currently assumes a bit depth of 16 bits.
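One way to check the bit depth is to read it out of the file's `fmt ` chunk. The following is a minimal sketch using only the standard library, not the code in this repository; the names `WavFormat` and `read_wav_format` are hypothetical, and it assumes a plain RIFF/WAVE file with a standard 16-byte (or longer) `fmt ` chunk.

```rust
/// Fields read from a WAV file's `fmt ` chunk (hypothetical helper type).
#[derive(Debug, PartialEq)]
pub struct WavFormat {
    pub audio_format: u16, // 1 = integer PCM, 3 = IEEE float
    pub channels: u16,
    pub sample_rate: u32,
    pub bits_per_sample: u16,
}

/// Walk the RIFF chunks in `bytes` and return the `fmt ` chunk's fields.
/// Returns `None` if the data is not a RIFF/WAVE file or has no usable `fmt ` chunk.
pub fn read_wav_format(bytes: &[u8]) -> Option<WavFormat> {
    if bytes.len() < 12 || &bytes[0..4] != b"RIFF" || &bytes[8..12] != b"WAVE" {
        return None;
    }
    let mut pos: usize = 12;
    // Each chunk starts with a 4-byte id and a 4-byte little-endian size.
    while bytes.len().saturating_sub(pos) >= 8 {
        let id = &bytes[pos..pos + 4];
        let size = u32::from_le_bytes([
            bytes[pos + 4],
            bytes[pos + 5],
            bytes[pos + 6],
            bytes[pos + 7],
        ]) as usize;
        let body = pos + 8;
        if id == b"fmt " {
            if size < 16 || bytes.len().saturating_sub(body) < 16 {
                return None;
            }
            let f = &bytes[body..body + 16];
            return Some(WavFormat {
                audio_format: u16::from_le_bytes([f[0], f[1]]),
                channels: u16::from_le_bytes([f[2], f[3]]),
                sample_rate: u32::from_le_bytes([f[4], f[5], f[6], f[7]]),
                // Bytes 8..14 hold the byte rate and block align; skipped here.
                bits_per_sample: u16::from_le_bytes([f[14], f[15]]),
            });
        }
        // Chunk bodies are padded to an even number of bytes.
        pos = body.saturating_add(size).saturating_add(size & 1);
    }
    None
}
```

Passing the result of `std::fs::read("output.wav")` to `read_wav_format` and printing `bits_per_sample` would show whether the file matches the 16-bit assumption.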

Gadersd commented 1 year ago

I just added support for other bit depths. It should hopefully work now.
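For illustration, handling several bit depths usually comes down to decoding each sample at its own width and dividing by that width's full-scale value. The sketch below shows one way to do it and is not the change that was actually committed; `decode_pcm` is a hypothetical name, and it assumes little-endian signed integer PCM at 16, 24, or 32 bits (8-bit WAV is unsigned and float WAV needs no scaling, so both are left out).

```rust
/// Decode little-endian signed integer PCM bytes into f32 samples in [-1.0, 1.0).
/// Returns `None` for bit depths this sketch does not handle.
pub fn decode_pcm(data: &[u8], bits_per_sample: u16) -> Option<Vec<f32>> {
    let width: usize = match bits_per_sample {
        16 => 2,
        24 => 3,
        32 => 4,
        _ => return None,
    };
    // Full scale is 2^(bits - 1): 32768 for 16-bit, 8388608 for 24-bit.
    let full_scale = (1u64 << (bits_per_sample - 1)) as f32;
    let samples = data
        .chunks_exact(width) // any trailing partial sample is ignored
        .map(|c| {
            let raw: i32 = match width {
                2 => i16::from_le_bytes([c[0], c[1]]) as i32,
                // Put the 24-bit value in the top three bytes, then shift
                // right arithmetically so the sign bit is extended.
                3 => i32::from_le_bytes([0, c[0], c[1], c[2]]) >> 8,
                _ => i32::from_le_bytes([c[0], c[1], c[2], c[3]]),
            };
            raw as f32 / full_scale
        })
        .collect();
    Some(samples)
}
```

Dividing by a depth-dependent full scale keeps the output range the same regardless of the source file, so code that consumes the samples afterwards does not need to know the original bit depth.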

n8henrie commented 1 year ago

Works! Thank you, and sorry for the long response time.