Dadangdut33 / Speech-Translate

A realtime speech transcription and translation application using OpenAI Whisper and free translation APIs. The interface is built with Tkinter, and the code is written entirely in Python.
MIT License
436 stars 55 forks

Feature Request: Batch Uploading of audio files for processing. #4

MRAWAY77 closed 1 year ago

MRAWAY77 commented 1 year ago

I would like to check whether there is an option to batch upload a zip folder of multiple audio files for processing in a single run.

Dadangdut33 commented 1 year ago

Processing multiple audio files in a single run is now possible, along with SRT export. Thanks for submitting the request :)
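
For readers wondering what a batch run with SRT export might look like, here is a minimal sketch. It is not taken from Speech-Translate's code: the `plan_batch` helper and the `AUDIO_EXTENSIONS` set are hypothetical names chosen for illustration. It only pairs each audio file in a folder with the `.srt` path it would be exported to; the transcription step itself is left out.

```python
from pathlib import Path

# Hypothetical set of extensions treated as audio; adjust as needed.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac", ".m4a", ".ogg"}


def plan_batch(input_dir, output_dir):
    """Return (audio_path, srt_path) pairs for every audio file in input_dir."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    jobs = []
    for path in sorted(input_dir.iterdir()):
        # Skip sub-folders and non-audio files; match extensions case-insensitively.
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS:
            jobs.append((path, output_dir / (path.stem + ".srt")))
    return jobs
```

Each pair could then be passed to whatever transcription function is in use, one file at a time.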

MRAWAY77 commented 1 year ago

Hi, thanks! I see that you have included a new feature in the export. Am I right to say that you have added speaker diarization to the transcription as an enhancement? Or does it originally come from the https://github.com/openai/whisper repo that you reference?

Dadangdut33 commented 1 year ago

I think it's not speaker diarization, just the time frame of each segment, because it is based on the audio segmentation in the Whisper result itself. I also made a wrong reference there; it should actually be from this repo. I found that method in a discussion in the official Whisper repository.
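
To make the distinction concrete, here is a rough sketch of how per-segment time frames can become SRT entries. It assumes segments shaped like the ones in a Whisper transcription result, i.e. dictionaries with `start` and `end` in seconds plus `text`. The `format_timestamp` and `segments_to_srt` helpers are illustrative names, not the functions used in this project or in the referenced discussion. Note that nothing here identifies who is speaking, which is what diarization would add.

```python
def format_timestamp(seconds):
    """Convert seconds (float) to an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Build SRT text from a list of {'start', 'end', 'text'} segments."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        # Each SRT entry: running index, time range, then the segment text.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Entries are separated by a blank line.
    return "\n".join(blocks)


# Made-up sample data for illustration, not real transcription output.
sample = [
    {"start": 0.0, "end": 2.5, "text": " Hello"},
    {"start": 2.5, "end": 4.0, "text": " World"},
]
print(segments_to_srt(sample))
```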