Open taorui-plus opened 1 month ago
@yuekaizhang
I found that the faster-whisper and Triton whisper prompts differ as follows:
faster-whisper:<|transcribe|>{prompt_text}<|startoftranscript|><|startofprev|><|startoflm|>
Triton whisper: <|startoftranscript|> <|{language}|> <|transcribe|> <|notimestamps|>
faster-whisper uses the <|startofprev|><|startoflm|> tokens instead of the <|notimestamps|> token.
This OpenAI discussion explains that the <|startoflm|> token is not used in OpenAI's API.
Triton whisper's prompt is therefore the same as the one used by OpenAI's API. This solution also has better recognition performance.
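To make the comparison concrete, here is a minimal sketch that extracts the special tokens from the two templates quoted above and lists the ones unique to each. The helper name `special_tokens` is hypothetical and only for illustration; the templates are copied as written in this post, not read from either library.

```python
import re
from typing import List

# Prompt templates copied verbatim from the comparison above.
FASTER_WHISPER = "<|transcribe|>{prompt_text}<|startoftranscript|><|startofprev|><|startoflm|>"
TRITON_WHISPER = "<|startoftranscript|> <|{language}|> <|transcribe|> <|notimestamps|>"


def special_tokens(template: str) -> List[str]:
    """Return every <|...|> marker in the template, in order of appearance."""
    return re.findall(r"<\|[^|]+\|>", template)


fw_tokens = special_tokens(FASTER_WHISPER)
tr_tokens = special_tokens(TRITON_WHISPER)

# Tokens that appear in one template but not in the other.
only_faster_whisper = [t for t in fw_tokens if t not in tr_tokens]
only_triton = [t for t in tr_tokens if t not in fw_tokens]

print("only in faster-whisper:", only_faster_whisper)
print("only in Triton whisper:", only_triton)
```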
I completed a concurrency test on a TensorRT + Triton server deployment, and the concurrency was roughly double that of faster-whisper.
I am now testing its accuracy, but the Chinese transcription always comes out in Traditional Chinese characters (繁体字). I want to solve this by adding hot words to the prompt, but I have run into some problems.
I tried both of the following formats, with no success:
<|startofpref|>{prompt}<|startoftranscript|><|zh|><|transcribe|><|notimestamps|>
<|startoftranscript|><|zh|><|transcribe|><|notimestamps|>{prompt}<|endoftext|>
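For reference, here is a minimal sketch of how a text prefix could be assembled, assuming the layout that, to my understanding, the reference openai/whisper decoder uses: the prompt text follows `<|startofprev|>` (spelled with a "v") and precedes `<|startoftranscript|>`. The function `build_prompt` is a hypothetical helper, not part of any library, and whether the Triton backend accepts a prefix in this form is an assumption that still needs to be checked against its tokenizer.

```python
def build_prompt(language: str, prompt_text: str = "", timestamps: bool = False) -> str:
    """Assemble a Whisper-style text prefix (hypothetical helper, string level only)."""
    # Start-of-transcript sequence: language tag, task, optional no-timestamps flag.
    sot_sequence = f"<|startoftranscript|><|{language}|><|transcribe|>"
    if not timestamps:
        sot_sequence += "<|notimestamps|>"
    # Assumed layout: previous-context text goes BEFORE the SOT sequence,
    # introduced by <|startofprev|>.
    if prompt_text:
        return f"<|startofprev|>{prompt_text}{sot_sequence}"
    return sot_sequence


# Illustrative prompt text in Simplified Chinese; any hot words could be used instead.
print(build_prompt("zh", "以下是普通话的句子。"))
```

With an empty `prompt_text`, the helper returns the plain prefix without `<|startofprev|>`, which matches the format the Whisper documentation gives.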
Here is some information I referenced:
The Whisper documentation only gives this format:
<|startoftranscript|> <|en|> <|transcribe|> <|notimestamps|>
The README of triton/whisper shows: