Open madroidmaq opened 1 year ago
Hey @madroidmaq! Distil-Whisper runs entirely in the 🤗 Transformers library, where we assume a minimal set-up of `audio input -> text output`. In this regard, the `text output` is a pure Python string containing the inferred transcription. To convert this to another file format, we could apply some additional post-processing steps. We could certainly add some code examples showing how to do this. It shouldn't be too hard to port the OpenAI Whisper post-processing to standalone Python examples: https://github.com/openai/whisper/blob/b38a1f20f4b23f3f3099af2c3e0ca95627276ddf/whisper/utils.py#L188
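As a rough illustration of what such post-processing could look like, here is a minimal sketch that turns timestamped chunks into SRT and WebVTT text. It assumes the transcription was produced by the Transformers ASR pipeline with `return_timestamps=True`, which returns a dict with a `chunks` list of `{"timestamp": (start, end), "text": ...}` entries. The helper names (`format_timestamp`, `chunks_to_srt`, `chunks_to_vtt`) and the sample data are hypothetical: they are not part of Transformers or Distil-Whisper, and this is not a port of the OpenAI Whisper writers.

```python
# Hypothetical helpers (not part of Transformers or Distil-Whisper).
# Input is assumed to be the "chunks" list returned by the ASR pipeline
# when called with return_timestamps=True.


def format_timestamp(seconds: float, decimal_marker: str = ".") -> str:
    """Render a time in seconds as HH:MM:SS<marker>mmm."""
    milliseconds = round(seconds * 1000)
    hours, milliseconds = divmod(milliseconds, 3_600_000)
    minutes, milliseconds = divmod(milliseconds, 60_000)
    secs, milliseconds = divmod(milliseconds, 1_000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_marker}{milliseconds:03d}"


def chunks_to_srt(chunks) -> str:
    """Build SRT text: numbered cues with comma-separated milliseconds."""
    blocks = []
    for index, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]
        if end is None:  # assumption: the final chunk may lack an end time
            end = start
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start, ',')} --> {format_timestamp(end, ',')}\n"
            f"{chunk['text'].strip()}\n"
        )
    return "\n".join(blocks)


def chunks_to_vtt(chunks) -> str:
    """Build WebVTT text: 'WEBVTT' header, cues with dot-separated milliseconds."""
    lines = ["WEBVTT", ""]
    for chunk in chunks:
        start, end = chunk["timestamp"]
        if end is None:  # same assumption as in chunks_to_srt
            end = start
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(chunk["text"].strip())
        lines.append("")
    return "\n".join(lines)


# Made-up sample shaped like a pipeline result, for illustration only.
example_output = {
    "text": " Hello world. This is a test.",
    "chunks": [
        {"timestamp": (0.0, 2.5), "text": " Hello world."},
        {"timestamp": (2.5, 5.04), "text": " This is a test."},
    ],
}

if __name__ == "__main__":
    print(chunks_to_srt(example_output["chunks"]))
    print(chunks_to_vtt(example_output["chunks"]))
```

The resulting strings can be written to `.srt` or `.vtt` files with a plain `open(path, "w", encoding="utf-8")`. The `txt` case is just the `text` field, and `json` could be covered by `json.dump` on the whole result.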
Whisper supports outputting multiple file formats, currently `txt,vtt,srt,tsv,json,all`. Does this project have the ability to output multiple formats? Here is the whisper command line help; you can specify the output format with `--output_format`.