EtienneAb3d / WhisperHallu

Experimental code: sound file preprocessing to optimize Whisper transcriptions without hallucinated texts

SRT and Translation options #7

Closed orophix closed 1 year ago

orophix commented 1 year ago

Can srt subtitles and translation be added?

EtienneAb3d commented 1 year ago

@orophix,

SRT output is already available through the "addSRT=True" input option.

You may translate by setting the output language "lng=XX" to something different from the "lngInput" parameter. Even if it works in many cases, translation into target languages other than English is not supposed to be a well-supported Whisper feature.

orophix commented 1 year ago

Thanks for the response! So can I just add addSRT=True lng=JA to the command line as arguments? Also, while I have you, can I ask how this part works?

##### Need to be adapted for each language.
##### For prompt examples, see transcribeHallu.py getPrompt(lng:str)
lng="en"
prompt= "Whisper, Ok. "\
    +"A pertinent sentence for your purpose in your language. "\
    +"Ok, Whisper. Whisper, Ok. "\
    +"Ok, Whisper. Whisper, Ok. "\
    +"Please find here, an unlikely ordinary sentence. "\
    +"This is to avoid a repetition to be deleted. "\
    +"Ok, Whisper. "

EtienneAb3d commented 1 year ago

@orophix,

Currently, there is no command-line argument parsing. You have to set these values in the code provided as an example on the main ReadMe page: https://github.com/EtienneAb3d/WhisperHallu/tree/main#code
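If command-line control is wanted anyway, a small wrapper can be written around the ReadMe example. The sketch below is hypothetical and not part of WhisperHallu: it only maps command-line flags to the option names mentioned in this thread ("addSRT", "lng", "lngInput"). The positional "path" argument and the function name parse_options are assumptions made for illustration.

```python
import argparse


def parse_options(argv=None):
    """Parse command-line flags into a dict of option values.

    Hypothetical wrapper: the option names mirror those discussed in
    this thread; `path` is an assumed name for the audio file argument.
    """
    parser = argparse.ArgumentParser(
        description="Hypothetical wrapper around the ReadMe example")
    parser.add_argument("path", help="audio file to transcribe")
    parser.add_argument("--lngInput", default="en",
                        help="language spoken in the audio file")
    parser.add_argument("--lng", default=None,
                        help="output language (defaults to --lngInput)")
    parser.add_argument("--addSRT", action="store_true",
                        help="also request SRT subtitles")
    options = dict(vars(parser.parse_args(argv)))
    # Without an explicit output language, transcribe rather than translate.
    if options["lng"] is None:
        options["lng"] = options["lngInput"]
    return options


# Example with an explicit argument list instead of sys.argv.
options = parse_options(
    ["talk.wav", "--lngInput", "en", "--lng", "ja", "--addSRT"])
print(options)
```

The resulting values would then be assigned to the corresponding variables in the ReadMe code. Check the current ReadMe for the exact parameter names before relying on this.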

Whisper treats the prompt value as a hint about what may have been said just before the file you want to transcribe, which raises the probability of similar words appearing in the transcription. It serves two purposes here:

  1. Improve recognition of the "Whisper, Ok." and "Ok, Whisper." markers (if available and used) in your language. You therefore need to translate them if they are not written this way in your language.
  2. Provide a pertinent example of text matching what will be said in the file to transcribe (vocabulary, expressions, phrases, etc.).

You will find some translated examples here: https://github.com/EtienneAb3d/WhisperHallu/blob/319dba323b30adda3b227da2d122e5263fd19e73/transcribeHallu.py#L120

The prompt value should not be too long: I think fewer than 100 to 200 words.

At the end of the prompt, leave a few words that will certainly not be the start of the file to be transcribed. Otherwise, Whisper may treat the start as a repetition to be deleted, and the beginning of your transcription will be missing.
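The advice above can be sketched as a small helper that assembles a prompt in the same shape as the ReadMe example: markers, a pertinent sentence, then a tail that is unlikely to open the recording. This is only an illustration. The function name build_prompt and its parameters are hypothetical, and the default limit of 100 words simply reflects the rough estimate given in this thread, not a documented Whisper limit.

```python
def build_prompt(context_sentence, tail_sentence,
                 marker_open="Whisper, Ok. ",
                 marker_close="Ok, Whisper. ",
                 max_words=100):
    """Assemble a prompt shaped like the ReadMe example.

    context_sentence: text typical of what will be said in the file.
    tail_sentence: words that surely will not start the recording, so
        the real beginning is not mistaken for a repetition.
    The markers should be translated into the transcription language.
    """
    prompt = (
        marker_open
        + context_sentence.strip() + " "
        + (marker_close + marker_open) * 2
        + tail_sentence.strip() + " "
        + marker_close
    )
    # Rough length check, counting whitespace-separated words.
    n_words = len(prompt.split())
    if n_words > max_words:
        raise ValueError(
            f"prompt has {n_words} words, more than max_words={max_words}")
    return prompt


prompt = build_prompt(
    "A pertinent sentence for your purpose in your language.",
    "Please find here, an unlikely ordinary sentence. "
    "This is to avoid a repetition to be deleted.",
)
print(prompt)
```

For another language, pass translated markers and sentences, using the examples in getPrompt(lng:str) in transcribeHallu.py as a reference.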