fedirz / faster-whisper-server

https://hub.docker.com/r/fedirz/faster-whisper-server
MIT License

Support vtt and srt response types #16

Closed fedirz closed 2 months ago

fedirz commented 4 months ago

https://platform.openai.com/docs/api-reference/audio/createTranscription#audio-createtranscription-response_format

cichy3000 commented 3 months ago

This would be great, if the model would support srt/vtt 🤞

Edit: I hadn't noticed the segments, so with some additional code processing we can build our own file format.
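A minimal sketch of that post-processing, assuming each segment is a dict with `start` and `end` in seconds plus a `text` field (as in a verbose JSON transcription response). The function names here are hypothetical and are not part of faster-whisper-server:

```python
def format_timestamp(seconds: float, separator: str) -> str:
    """Render seconds as HH:MM:SS<separator>mmm (SRT uses ',', VTT uses '.')."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Build SRT text: numbered cues, comma as the millisecond separator."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"], ",")
        end = format_timestamp(seg["end"], ",")
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def segments_to_vtt(segments: list[dict]) -> str:
    """Build WebVTT text: 'WEBVTT' header, dot as the millisecond separator."""
    blocks = ["WEBVTT\n"]
    for seg in segments:
        start = format_timestamp(seg["start"], ".")
        end = format_timestamp(seg["end"], ".")
        blocks.append(f"{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


# Assumed segment shape; real responses may carry extra fields.
example_segments = [
    {"start": 0.0, "end": 2.5, "text": " Hello"},
    {"start": 2.5, "end": 3661.042, "text": "world "},
]
srt_text = segments_to_srt(example_segments)
vtt_text = segments_to_vtt(example_segments)
```

The two formats differ mainly in the `WEBVTT` header, the cue numbering, and the millisecond separator, so one timestamp helper can serve both.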

fedirz commented 3 months ago

Yep, you should be able to do that.

May I ask what your use case is for the VTT/SRT response types? I'm genuinely curious, since I haven't seen these formats before, besides in OpenAI's API reference.

Ki-er commented 2 months ago

May I ask what's your use case for VTT/SRT response types

VTT is the file format that Zoom audio transcripts come in; I see them during my interviews on Zoom. It is a text file, though, not an audio file format, as far as I know. I am not familiar with SRT.

csernikmarton commented 2 months ago

Hi @fedirz, when is this feature expected to become available in a tagged version and then in the Docker image? Thanks!

fedirz commented 1 month ago

Hey, I know it's pretty late, but I just pushed a new tag.