Bklieger / ScribeWizard

ScribeWizard: Generate organized notes from audio using Groq, Whisper, and Llama3
https://wizard.benjamin.sh
MIT License

Add support for splitting and transcribing large audio files #20

mentatbot[bot] closed 4 months ago

mentatbot[bot] commented 4 months ago

This update fixes the app's inability to handle audio files larger than 25 MB, which is the maximum input file size Whisper accepts. Audio files over 25 MB are now split into smaller chunks, each small enough to be transcribed by the API, and the per-chunk results are combined into a single transcript.

Changes include:

This ensures that larger audio files can be processed without hitting the size limit, while also considering API cost and rate limits.
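
The split-and-combine approach described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `plan_chunks`, `transcribe_in_chunks`, and the `transcribe_window` callback are hypothetical names, and the sketch assumes the file's bytes are spread evenly over its duration (roughly constant bitrate), which real audio may not satisfy.

```python
from typing import Callable, List, Tuple

MAX_BYTES = 25 * 1024 * 1024  # the 25 MB limit mentioned in the PR description


def plan_chunks(total_bytes: int, duration_ms: int,
                max_bytes: int = MAX_BYTES) -> List[Tuple[int, int]]:
    """Return (start_ms, end_ms) windows whose estimated size is <= max_bytes.

    Assumption: bytes are distributed evenly over the duration, so the
    sizes are estimates; a real implementation may want extra headroom.
    """
    if total_bytes <= max_bytes:
        return [(0, duration_ms)]
    # Longest window (in ms) that is estimated to stay under the limit.
    chunk_ms = max(1, max_bytes * duration_ms // total_bytes)
    return [(start, min(start + chunk_ms, duration_ms))
            for start in range(0, duration_ms, chunk_ms)]


def transcribe_in_chunks(windows: List[Tuple[int, int]],
                         transcribe_window: Callable[[int, int], str]) -> str:
    """Transcribe each window in order and join the text into one transcript.

    `transcribe_window` is a hypothetical callback that would cut the
    audio for [start_ms, end_ms) and send it to the transcription API.
    """
    parts = [transcribe_window(start, end).strip() for start, end in windows]
    return " ".join(part for part in parts if part)
```

Keeping the planning step separate from the API call makes it possible to check the number of requests, and therefore the likely cost and rate-limit impact, before any audio is uploaded.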

Closes #8

Bklieger commented 4 months ago

@mentatbot Please review this PR. Running the code now fails when uploading an .m4a file that previously worked, with this error: Error code: 400 - {'error': {'message': 'file must be one of the following types: [flac mp3 mp4 mpeg mpga m4a ogg opus wav webm]', 'type': 'invalid_request_error'}}
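
One way to catch this kind of rejection before a request is sent is to check the name of each file or chunk against the types listed in the error message. The sketch below is only an illustration: `has_supported_extension` is a hypothetical helper, and it assumes the API judges the type from the filename extension, which the error message alone does not confirm. The root cause of the failure reported here is not established by this thread.

```python
from pathlib import Path

# File types copied from the API error message quoted above.
ALLOWED_TYPES = {"flac", "mp3", "mp4", "mpeg", "mpga",
                 "m4a", "ogg", "opus", "wav", "webm"}


def has_supported_extension(filename: str) -> bool:
    """Return True if the filename ends in one of the listed types.

    Hypothetical pre-flight check; the comparison is case-insensitive.
    """
    extension = Path(filename).suffix.lower().lstrip(".")
    return extension in ALLOWED_TYPES
```

For example, `has_supported_extension("lecture.m4a")` is true, while a chunk saved under a name with no extension, such as `"chunk_0"`, would be flagged before upload.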