Bklieger / ScribeWizard

ScribeWizard: Generate organized notes from audio using Groq, Whisper, and Llama3
https://wizard.benjamin.sh
MIT License
459 stars, 101 forks

App should automatically split audio files >25 MB and transcribe each part #8

Open: Bklieger opened 5 months ago

Bklieger commented 5 months ago

Currently, the app can only handle audio files smaller than 25 MB, because the Whisper API accepts a maximum input file size of 25 MB. We can work around this by splitting larger audio files into several parts, each small enough to be transcribed by the API, and then combining the results into one transcript.
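
A minimal sketch of the split-and-combine idea, not the app's actual implementation. It assumes a roughly constant bitrate, so that file size scales with duration, and it only plans the time spans; the names `plan_chunks`, `transcribe_in_parts`, and the `transcribe_span` callback are hypothetical, and the real audio slicing and API call would live inside that callback.

```python
import math
from typing import Callable, List, Tuple

MAX_CHUNK_BYTES = 25 * 1024 * 1024  # per-file limit cited in this issue


def plan_chunks(
    total_bytes: int,
    duration_s: float,
    max_bytes: int = MAX_CHUNK_BYTES,
    safety: float = 0.9,  # assumed headroom for container overhead
) -> List[Tuple[float, float]]:
    """Return (start_s, end_s) spans whose estimated size stays under max_bytes.

    Assumes a roughly constant bitrate, so size is proportional to duration.
    """
    if total_bytes <= 0 or duration_s <= 0:
        raise ValueError("total_bytes and duration_s must be positive")
    budget = max_bytes * safety
    n_chunks = max(1, math.ceil(total_bytes / budget))
    span = duration_s / n_chunks
    return [(i * span, min((i + 1) * span, duration_s)) for i in range(n_chunks)]


def transcribe_in_parts(
    spans: List[Tuple[float, float]],
    transcribe_span: Callable[[float, float], str],  # hypothetical callback
) -> str:
    """Transcribe each span in order and join the pieces into one transcript."""
    parts = [transcribe_span(start, end).strip() for start, end in spans]
    return " ".join(part for part in parts if part)
```

Cutting at fixed time offsets can split a word across two parts, so a real implementation might prefer to cut at silences or overlap the parts slightly.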

I believe there should still be an upper limit on the total file size to keep Whisper API costs under control. In addition, if the transcript becomes too large (in number of tokens), Groq API rate limits may cause Groq API calls to fail. There should be a check on transcript size as well.
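
The two checks described above could look roughly like this. Every number here is a placeholder assumption, not a value taken from the app or from Groq's documentation, and the four-characters-per-token estimate is only a rough heuristic; the function names are hypothetical as well.

```python
import math

MAX_UPLOAD_BYTES = 200 * 1024 * 1024  # hypothetical overall upload cap
MAX_TRANSCRIPT_TOKENS = 6000          # hypothetical token budget per request
CHARS_PER_TOKEN = 4                   # rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count from the character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def check_upload_size(n_bytes: int, limit: int = MAX_UPLOAD_BYTES) -> None:
    """Reject uploads above the overall cap before any transcription cost is incurred."""
    if n_bytes > limit:
        raise ValueError(f"File is {n_bytes} bytes; the limit is {limit} bytes")


def check_transcript_size(text: str, limit: int = MAX_TRANSCRIPT_TOKENS) -> int:
    """Return the estimated token count, or raise if it exceeds the budget."""
    tokens = estimate_tokens(text)
    if tokens > limit:
        raise ValueError(f"Transcript is about {tokens} tokens; the limit is {limit}")
    return tokens
```

Running the upload check before splitting and the transcript check before the first Groq call would let the app fail early with a clear message instead of partway through a request.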

Bklieger commented 4 months ago

@mentatbot

mentatbot[bot] commented 4 months ago

I will start working on this issue