777sfdf closed this issue 9 months ago
Hello, I'm not sure about this and I haven't tested whether it works, but based on the original whisper discussion, prompting might help in this scenario: https://github.com/openai/whisper/discussions/277
If you can test it and report back here on whether it works, that would be great.
Thank you very much for your answer, and sorry for the delay in replying. I have already read that discussion, but because I am not very proficient in .NET, I could not work out how to use the initial_prompt option in a project. If you have time to revisit this question and provide an answer, I would greatly appreciate it. Thank you.
To use the initial prompt, call the WithPrompt method on the whisperBuilder:
https://github.com/sandrohanea/whisper.net/blob/454ad43043e3b5cd920e5e3a1cb309861c21d158/Whisper.net/WhisperProcessorBuilder.cs#L286C36-L286C46
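For readers unfamiliar with the builder, here is a minimal sketch of where WithPrompt fits in a typical Whisper.net pipeline. The file names are the ones quoted elsewhere in this thread. The prompt text is only an assumption (a short sentence written in Simplified Chinese to nudge the output style) and is not something the thread specifies. It also assumes the audio is a 16 kHz WAV file and the Whisper.net package is referenced.

```csharp
using System;
using System.IO;
using Whisper.net;

// Paths taken from the thread; adjust them to your own model and audio.
string modelFilePath = "ggml-base.bin";
string wavFilePath = "mqfwn-f344o.wav";

// Assumption: an example prompt written in Simplified Chinese
// ("The following are sentences in Mandarin."). Any Simplified Chinese text may work.
string prompt = "以下是普通话的句子。";

using var factory = WhisperFactory.FromPath(modelFilePath);

// WithPrompt passes the text to whisper as the initial prompt.
using var processor = factory.CreateBuilder()
    .WithLanguage("auto")
    .WithPrompt(prompt)
    .Build();

using var audio = File.OpenRead(wavFilePath);

// Each segment carries its start/end timestamps and the transcribed text.
await foreach (var segment in processor.ProcessAsync(audio))
{
    Console.WriteLine($"{segment.Start} -> {segment.End}: {segment.Text}");
}
```

The prompt only biases the decoder toward a writing style, so results can vary with the model size and the audio.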
Okay, thank you again for your prompt response. I will report back with the results after testing.
Hey @777sfdf,
Any news on how effective the prompt was for your use case?
I'm very sorry for not replying sooner. The main reason is that when I tested a while ago, the results were not very good. Today I tested again and finally got good results: the transcribed text is now in Simplified Chinese.
The code has only one addition, a call to WithPrompt. Everything else, including the model and the audio, is unchanged.
The rendering is as follows
In addition, I would like to ask about the accuracy ranking of the four ggml models. Is the order tiny < base < small < medium? I hope to receive your reply. Thank you.
Awesome, I'm glad to hear that "WithPrompt" is working as expected in the given scenario.
Also, as a suggestion: if you already know the language, you can use zh instead of auto, so that language detection is skipped and you get transcripts faster.
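As a rough sketch of that suggestion, the only change is the argument passed to WithLanguage on the builder. The model path is the one quoted in this thread, and the variable names are illustrative.

```csharp
using Whisper.net;

using var factory = WhisperFactory.FromPath("ggml-base.bin");

// "zh" fixes the language to Chinese, so no language detection is needed.
// With "auto", whisper has to detect the language first.
using var processor = factory.CreateBuilder()
    .WithLanguage("zh")
    .Build();
```

WithLanguage can be combined with WithPrompt on the same builder.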
About the model sizes, they are: tiny < base < small < medium < large
When testing on Windows, I found that the text (subtitles) transcribed from Chinese audio comes out in Traditional Chinese. How can I get Simplified Chinese output instead? If you have time to answer my question, I would greatly appreciate it. Thank you!
The core code is as follows:
var segments = new List<SegmentData>();
var encoderBegins = new List<EncoderBeginData>();
string ModelFilePath = "ggml-base.bin";
string txtFilePath = "mqfwn-f344o.wav";