YiLee01 closed this issue 5 months ago
Prompting Whisper is a little tricky and quite different from prompting an LLM: the prompt is not an instruction, it is text the model assumes is the transcription of the previous audio window. Here is a more in-depth guide: https://cookbook.openai.com/examples/whisper_prompting_guide
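To make the "previous window" idea concrete, here is a minimal, self-contained sketch of how prompt tokens are typically laid out ahead of the decoder's usual start tokens. It is not WhisperKit's actual code: `SpecialTokens`, `buildDecoderInput`, and the token IDs are hypothetical names and values for illustration. The 223-token cap is an assumption based on the reference Whisper implementation's half-context limit.

```swift
// Illustrative token IDs only; real values depend on the model's vocabulary.
struct SpecialTokens {
    let specialTokenBegin = 50257  // assumed: first ID in the special-token range
    let startOfPrevious = 50361    // assumed: <|startofprev|>
    let startOfTranscript = 50258  // assumed: <|startoftranscript|>
}

/// Lays out decoder input as: <|startofprev|> + prompt text tokens + initial tokens.
func buildDecoderInput(
    encodedPrompt: [Int],
    initialTokens: [Int],
    special: SpecialTokens = SpecialTokens(),
    maxPromptTokens: Int = 223     // assumed cap: about half the decoder context
) -> [Int] {
    // Drop any special tokens the encoder may have wrapped around the text.
    let textOnly = encodedPrompt.filter { $0 < special.specialTokenBegin }
    // With no usable prompt text, fall back to the normal start sequence.
    guard !textOnly.isEmpty else { return initialTokens }
    // Keep only the most recent tokens if the prompt is too long.
    let trimmed = Array(textOnly.suffix(maxPromptTokens))
    return [special.startOfPrevious] + trimmed + initialTokens
}

let special = SpecialTokens()
// Made-up encoder output with special tokens at both ends.
let encoded = [50258, 1012, 2034, 307, 50257]
let input = buildDecoderInput(
    encodedPrompt: encoded,
    initialTokens: [special.startOfTranscript]
)
print(input)  // [50361, 1012, 2034, 307, 50258]
```

Because the prompt sits where earlier transcript text would be, the model continues in its style rather than obeying it as an instruction. If the encoded prompt still contains special tokens, the decoder may see a malformed prefix, so filtering them out first is worth checking.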
Put simply, you're usually best off giving a really good example of the text you want as output, and the model will try to follow that format. These unit tests show an example: https://github.com/argmaxinc/WhisperKit/blob/5572cd63c763c82c973077659c34a20e90d2afed/Tests/WhisperKitTests/UnitTests.swift#L681-L710
Thank you for your help, you solved my problem!
I changed the prefillDecoderInputs method for debugging, but it isn't working. Here is a snippet of my code. Is there something wrong with it?
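// Prompt text (Chinese): "The following are sentences in Mandarin; please output in Simplified Chinese."
let promptTokens = tokenizer.encode(text: "以下是普通话的句子,请以简体中文输出")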