McCloudS / subgen

Autogenerate subtitles using OpenAI Whisper Model via Jellyfin, Plex, Emby, Tautulli, or Bazarr
MIT License

Any way to force the language in case of wrong Whisper autodetection? #47

Closed ndx1905-github closed 5 months ago

ndx1905-github commented 5 months ago

Hi

Thanks for the great work and very useful tool (I'm amazed that it works on my 2018 NAS)

faster-whisper sometimes picks the wrong language. Is there a way to force the language?

Here is an example: French audio wrongly detected as English (score of 0.56).

image

I would need to pass --language fr to faster-whisper. One way could be to add "FR" at the end of the file name, right before the file extension? See https://github.com/openai/whisper/discussions/529

McCloudS commented 5 months ago

There is a way, but I haven’t exposed it. I can look at that. Whisper only uses the first 30 seconds of audio to detect the language. So if there is little audio in that window, or it’s mixed with English, it can guess wrong.
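
To make the 30-second point concrete: Whisper models take 16 kHz audio in 30-second windows, so by the description above only the first 480,000 samples would inform the language guess. The sketch below is illustrative only. `detection_window` and the synthetic audio are hypothetical and are not part of Whisper, faster-whisper, or subgen.

```python
SAMPLE_RATE = 16_000   # Whisper models expect 16 kHz mono audio
WINDOW_SECONDS = 30    # length of one Whisper input window


def detection_window(samples):
    """Return the leading slice of audio that language detection would see.

    Hypothetical helper for illustration; not a Whisper or subgen function.
    """
    return samples[: SAMPLE_RATE * WINDOW_SECONDS]


# Hypothetical file: 40 s of silence (e.g. a title sequence), then 60 s of speech.
audio = [0.0] * (40 * SAMPLE_RATE) + [0.5] * (60 * SAMPLE_RATE)

window = detection_window(audio)
has_signal = any(window)  # False here: the window contains only the silent intro
print(len(window), has_signal)
```

In this made-up case the detector never sees any speech at all, which is one way a guess with a low score such as 0.56 could come about.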


McCloudS commented 5 months ago

Take a look at FORCE_DETECTED_LANGUAGE_TO in the readme and try re-pulling the image. Let me know if it works!
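
For readers wondering how a variable like this is typically consumed, here is a minimal sketch. It assumes the value is a language code such as "fr" and that an empty or unset value means "autodetect". The helper `forced_language` and the normalisation to lowercase are assumptions for illustration, not subgen's actual code.

```python
import os


def forced_language(env=os.environ):
    """Return the forced language code, or None to let Whisper autodetect.

    Hypothetical helper: only the variable name comes from the subgen readme.
    """
    value = env.get("FORCE_DETECTED_LANGUAGE_TO", "").strip().lower()
    return value or None


# A plain dict stands in for the container's environment.
print(forced_language({"FORCE_DETECTED_LANGUAGE_TO": "FR"}))
print(forced_language({}))
```

With Docker, the variable would be set on the container itself (for example with `-e FORCE_DETECTED_LANGUAGE_TO=fr`), and the container has to be recreated from the new image for both the code and the setting to take effect.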

ndx1905-github commented 5 months ago

Hey, it didn't change anything. Here is the screenshot after adding the FORCE_DETECTED_LANGUAGE_TO variable and restarting the Docker container.

image

However, I watched the video and the subtitles still worked. I'm not sure what this "detected language" notification does in practice, since the output was correct. The log says "English" was detected, but in practice the generated srt is still the right "French to English" one.

(EDIT: to be more precise, subtitles were correct even before the patch, and despite the wrong language detection)

McCloudS commented 5 months ago

Good to know that even with the misdetected language, it produces something passable. Did you actually pull the new docker image instead of just restarting it?

ndx1905-github commented 5 months ago

Yes, I redownloaded the image. I just did it again, same result: it still detects English. But does the --language option change the language that is detected? Or does it still detect English, write that to the console, and then the --language option forces faster-whisper to use FR even though EN was detected?

McCloudS commented 5 months ago

It's passed the way that stable-ts and faster-whisper document. Per https://github.com/SYSTRAN/faster-whisper/blob/f144e4c83d54f3c3304b6a75a3f563e5f84de6cf/faster_whisper/transcribe.py#L344, if a language is manually defined, it shouldn't try to detect the language unless it's an English-only model. Any chance you're using the .en models? Like medium.en instead of medium.
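
The behaviour described above can be sketched roughly as follows. This is a simplified illustration of the decision, not faster-whisper's actual source: `resolve_language`, its parameters, and the stub detector are all hypothetical names.

```python
def resolve_language(forced, is_multilingual, detect):
    """Pick the transcription language as (code, probability).

    forced: language code supplied by the user, or None.
    is_multilingual: False for English-only models such as medium.en.
    detect: callable returning (code, probability); only called when needed.
    """
    if forced is None:
        if not is_multilingual:
            return "en", 1.0   # English-only model: nothing to detect
        return detect()        # multilingual model: autodetect
    if not is_multilingual and forced != "en":
        return "en", 1.0       # English-only model cannot honour another language
    return forced, 1.0         # forced language wins; detection is skipped


calls = []


def stub_detect():
    """Stand-in detector that reproduces the wrong guess from this issue."""
    calls.append("detect")
    return "en", 0.56


print(resolve_language(None, True, stub_detect))   # autodetect path
print(resolve_language("fr", True, stub_detect))   # forced path, no detection
print(resolve_language("fr", False, stub_detect))  # English-only model
```

In faster-whisper itself, the counterpart of `--language fr` is the `language="fr"` argument to `WhisperModel.transcribe`. If the console still reports a detected language while a forced value is set, that suggests the value is not reaching that call.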

ndx1905-github commented 5 months ago

Yes, I'm sure I've selected medium, not medium.en. Also, medium.en would probably produce garbage, but my output is still good.

McCloudS commented 5 months ago

I'm a little puzzled as to why it's ignoring the language. I'll play with it later if I have time, but it's comforting to know that it may not actually matter?

McCloudS commented 5 months ago

This is now fixed and I confirmed it will show as the correct forced language. You can also navigate to http://subgenip:8090/docs in a browser and use the Batch option.