English language output is default, with no indication on how to change it

JuliaNeuralGraphics / Whisper.jl

MIT License


English language output is default, with no indication on how to change it #5

Open FredrikKarlssonSpeech opened 2 weeks ago

FredrikKarlssonSpeech commented 2 weeks ago

The transcription system works, and the output is fine content-wise, but it is always in English. Unless I am missing some way of forcing the output language, I suspect that I am not alone in assuming that if I give this command

Whisper.transcribe(
    "/Users/frkkan96/Desktop/kaa_yw_pb_16000.flac",
    "/Users/frkkan96/Desktop/kaa_yw_pb_16000.srt";
    model_name="large-v3", language="swedish", dev=cpu, precision=f32)

I would get an .srt file containing the Swedish-language transcription?

pxl-th commented 2 weeks ago

Yes, by default the model transcribes to English, and if the audio is non-English, it also translates it to English.

To actually transcribe to a non-English language, you have to specify the language keyword argument.

To do this automatically, we'd have to run language detection before transcribing.
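
For illustration only, here is a minimal Julia sketch of what such a detection step could look like. It assumes the decoder has already produced one score (logit) per language token for the first audio window. The function name `detect_language` and the example logits are hypothetical and not part of Whisper.jl. The sketch only applies a softmax and picks the most probable language, whose name could then be passed as the language keyword argument.

```julia
# Hypothetical helper, not part of Whisper.jl: choose the most probable
# language from per-language logits (assumed to come from the decoder).
function detect_language(logits::Dict{String, Float64})
    isempty(logits) && throw(ArgumentError("no language logits given"))
    langs = collect(keys(logits))
    scores = [logits[l] for l in langs]
    # Numerically stable softmax over the language scores.
    exps = exp.(scores .- maximum(scores))
    probs = exps ./ sum(exps)
    best = argmax(probs)
    return langs[best], probs[best]
end

# Made-up logits for illustration only.
example_logits = Dict("english" => 1.2, "swedish" => 4.7, "norwegian" => 2.9)
lang, prob = detect_language(example_logits)
println("detected language: ", lang)
```

How the logits would actually be obtained from the model is left open here, since it depends on Whisper.jl internals that this thread does not describe.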

FredrikKarlssonSpeech commented 2 weeks ago

> To actually transcribe to a non-English language, you have to specify the language keyword argument.

Sure, and I did (see my post above). When you transcribe audio, Swedish in my case, you usually also want the output text in Swedish, not the English text that I got in the SRT.