Const-me / Whisper

High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model
Mozilla Public License 2.0

hope support arguments :--initial_prompt and batch... #207

Open lforlgg opened 3 months ago

lforlgg commented 3 months ago

Const-me Whisper's GUI runs very efficiently. Love it! Some things I hope to see next:

  1. Put all the options on one panel, so there is no need to step forward and backward between screens. At present, the options are not complicated.
  2. Batch conversion of multiple files from a list.
  3. Transcription (subtitles in the original language) and translation (subtitles in English) generated in one run.
  4. Insanely Fast Whisper, using OpenAI's Whisper Large v3, has been released.
  5. Support for the --initial_prompt argument: https://platform.openai.com/docs/guides/speech-to-text/prompting

Many thanks to Kosta.

For your reference: [screenshots: prompt, Whisper_arguments2, Whisper_arguments3]
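
To make requests 3 and 5 concrete, here is a minimal sketch of how they look with the reference openai/whisper Python command-line tool, not with Const-me's GUI. It only builds the two command lines (one per task) and shares one initial prompt between them. The function name `build_whisper_commands`, the `subs` output folder, and the sample prompt are assumptions made for illustration. The two runs write to separate folders because the reference CLI names its output after the audio file, so both tasks would otherwise write the same `.srt` file.

```python
from pathlib import Path


def build_whisper_commands(audio_path, initial_prompt,
                           model="large-v3", output_root="subs"):
    """Return one argv list per task for the reference `whisper` CLI.

    Hypothetical helper: it only assembles arguments and runs nothing.
    """
    commands = {}
    for task in ("transcribe", "translate"):
        commands[task] = [
            "whisper", str(audio_path),
            "--model", model,
            "--task", task,                      # original language vs. English
            "--initial_prompt", initial_prompt,  # vocabulary / style hint
            "--output_format", "srt",
            # Separate folder per task so the two .srt files do not collide.
            "--output_dir", str(Path(output_root) / task),
        ]
    return commands


if __name__ == "__main__":
    cmds = build_whisper_commands("talk.mp3", "Const-me, Whisper, GPGPU")
    for task, argv in cmds.items():
        print(task, argv)
        # To actually run it (requires openai-whisper to be installed):
        # import subprocess; subprocess.run(argv, check=True)
```

Running the two commands one after another still decodes the audio twice; producing both subtitle files in a single pass, as requested above, would need support inside the application itself.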

lforlgg commented 3 months ago

By the way, openai/whisper released large-v3 in November 2023.

Insanely Fast Whisper: transcribe 150 minutes (2.5 hours) of audio in less than 98 seconds with OpenAI's Whisper Large v3.

The large-v3 model shows improved performance over a wide variety of languages, with a 10% to 20% reduction of errors compared to Whisper large-v2. Transformers uses a chunked algorithm to transcribe long-form audio files, which in practice is 9x faster than the sequential algorithm proposed by OpenAI.

https://github.com/chenxwh/insanely-fast-whisper
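
The chunked idea mentioned above can be sketched roughly as follows: cut the long recording into fixed-length, slightly overlapping windows, then group the windows into batches that a GPU can decode together instead of one after another. This is a simplified illustration, not the actual Transformers implementation. The names `chunk_spans` and `batches`, the 30-second window, the 5-second overlap, and the batch size are all assumptions chosen for the example.

```python
def chunk_spans(total_s, chunk_s=30.0, overlap_s=5.0):
    """Split [0, total_s] into windows of chunk_s seconds.

    Consecutive windows share overlap_s seconds so that words cut at a
    boundary appear whole in at least one window.
    """
    if chunk_s <= 0 or not 0 <= overlap_s < chunk_s:
        raise ValueError("need chunk_s > 0 and 0 <= overlap_s < chunk_s")
    step = chunk_s - overlap_s
    spans = []
    start = 0.0
    while start < total_s:
        end = min(start + chunk_s, total_s)
        spans.append((start, end))
        if end >= total_s:  # last window reached the end of the audio
            break
        start += step
    return spans


def batches(items, batch_size):
    """Group items into lists of at most batch_size elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


if __name__ == "__main__":
    spans = chunk_spans(70.0)
    print(spans)               # the windows, in seconds
    print(batches(spans, 2))   # windows grouped for parallel decoding
```

The sequential approach has to finish one window before it knows where the next one starts, while the windows here are known up front, which is what allows them to be decoded in parallel. Merging the overlapping transcripts back into one text is the harder part and is left out of this sketch.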

RickArcher108 commented 3 months ago

You mention batch converting multiple files by list. How does one do that?
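
Nothing in this thread shows the GUI doing this today; the item above reads as a feature request. As a rough sketch of what "batch by list" could mean when scripted around any command-line transcriber: read a text file with one audio path per line, then call a transcriber for each path and keep going if one file fails. The list format (blank lines and `#` comments ignored) and the names `parse_file_list`, `run_batch`, and `transcribe_one` are assumptions for illustration. `transcribe_one` is a placeholder for whatever tool is actually used.

```python
def parse_file_list(text):
    """Return the audio paths in a list file's text, one path per line.

    Assumed format: blank lines and lines starting with '#' are skipped.
    """
    paths = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            paths.append(line)
    return paths


def run_batch(paths, transcribe_one):
    """Call transcribe_one(path) for each path.

    Returns (results, errors): two dicts keyed by path, so a single bad
    file does not stop the rest of the batch.
    """
    results, errors = {}, {}
    for path in paths:
        try:
            results[path] = transcribe_one(path)
        except Exception as exc:  # record the failure and continue
            errors[path] = str(exc)
    return results, errors


if __name__ == "__main__":
    listing = "# meeting recordings\nmonday.wav\n\ntuesday.mp3\n"
    # Stand-in transcriber; replace with a call to a real tool.
    done, failed = run_batch(parse_file_list(listing),
                             lambda p: f"(transcript of {p})")
    print(done)
    print(failed)
```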