morpheus65535 / bazarr

Bazarr is a companion application to Sonarr and Radarr. It manages and downloads subtitles based on your requirements. You define your preferences by TV show or movie and Bazarr takes care of everything for you.
https://www.bazarr.media
GNU General Public License v3.0

Whisper fails on some files, but works fine on others #2474

Closed kubiokyay135 closed 6 months ago

kubiokyay135 commented 6 months ago

Describe the bug: With the Whisper provider set up, searching for subtitles in Bazarr repeatedly fails for specific files, and Bazarr then throttles Whisper as a subtitle provider.

To Reproduce: Steps to reproduce the behavior:

  1. Configure Whisper as a provider
  2. Click on search on an affected TV Show or Movie (This only affects specific files, other files work fine)
  3. After 30 seconds to a minute the Whisper provider will be throttled, and the subtitle is not generated

Expected behavior: When clicking on search for subtitles for a specific movie or show, it should communicate with Whisper properly and generate subtitles.

Software (please complete the following information):

Additional context: This is my first ever bug report so I apologize if there's anything I've done incorrectly. I read through all related issues and I found one semi-similar, but it was closed without a fix as the original submitter never responded. Also searched Reddit/Internet and did a fair amount of troubleshooting with no fix.

McCloudS commented 6 months ago

Your language in the file shows as 'an' or Aragonese (from FFProbe), which is not supported by Whisper, which is why you're getting an error. I imagine that's wrong, but it's the source of truth feeding into the Whisper Provider. Probably needs to be an update on the Whisper Provider side to gracefully handle it.

@ayancey
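For readers following along, the "gracefully handle it" idea could look roughly like the sketch below. This is not Bazarr's or Subgen's actual code: the function name `resolve_language`, the `SUPPORTED` set (an illustrative subset, not Whisper's real language list), and the choice to fall back to `None` are all assumptions made for illustration.

```python
from typing import FrozenSet, Optional

# Illustrative subset only; NOT Whisper's actual supported-language list.
SUPPORTED: FrozenSet[str] = frozenset({"en", "fr", "de", "es", "ja"})


def resolve_language(
    tag: Optional[str],
    supported: FrozenSet[str] = SUPPORTED,
    fallback: Optional[str] = None,
) -> Optional[str]:
    """Return the normalized tag if it is supported, otherwise the fallback.

    A fallback of None is meant as "no usable language hint"; what the
    caller does with that (skip, auto-detect, ask the user) is up to it.
    """
    if tag is None:
        return fallback
    code = tag.strip().lower()
    return code if code in supported else fallback
```

With this shape, a file tagged "an" would yield the fallback instead of raising an error, so the provider would not need to be throttled for a bad tag. Whether the fallback should be auto-detection or a skip is a design decision this thread does not settle.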

kubiokyay135 commented 6 months ago

Ah I see, that makes sense. I can confirm that the file is indeed English and not Aragonese, so Whisper must be recognizing it wrong.

I've done a little further troubleshooting with the settings of the Whisper container, including raising the detection interval from the default 30 seconds to 60 and attempting to force detection as English.

After making these changes I'm still unable to generate subtitles for the particular file in the logs I attached, but I was able to generate them for a few other files that failed prior. So it seemed to help, but not entirely fix the issue.

I'm not positive but I think when Bazarr passes the data to Whisper to transcribe, Whisper successfully takes into account the increased detection time, but ignores the forced language setting (this particular transcription still fails due to mis-recognizing the language).

Subgen (this particular whisper provider) does have the option to transcribe directly from a folder. I tested pointing it to my TV library and it looks like the forced language setting did work then, but there wasn't any way for me to force transcription of this specific file and it sort of started transcribing whatever it felt like. When I have a moment I'll see if I can set up a dedicated docker vol with this one particular file in it to see if it can transcribe directly with and without the forced language setting enabled.

Two additional things I've tried since the original report are:

1) Updating Subgen container to latest version (no difference)

2) Setting up another docker container with the official Whisper-asr-webservice and setting it as the Whisper provider in Bazarr (same exact Aragonese mis-detection issue, so the issue isn't Subgen specific)

One last thing to note, in all of these tests, the model in use is the "medium - FasterWhisper" model. Have not tried other models yet.

Thanks!

Editing to say that when I wrote the above, I didn't realize you were the dev of Subgen! That's really cool, and I love the app!

McCloudS commented 6 months ago

It isn’t a mis-detection issue from Whisper. Your file has the audio stream set as AN. You can find a program to modify the file to change it back to the appropriate language or download a new file that has the audio set correctly.

Even if the Bazarr Whisper Provider is updated to gracefully handle your error, your file still won't be able to have a subtitle generated because the FILE has the wrong language set.
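To check what language a file's audio streams are tagged with, something like the following sketch may help. It assumes `ffprobe` is installed and on the PATH; the helper names `parse_audio_languages` and `probe_audio_languages` are made up for this example and are not part of Bazarr or Subgen.

```python
import json
import subprocess
from typing import Dict, Optional


def parse_audio_languages(ffprobe_json: str) -> Dict[int, Optional[str]]:
    """Map each stream index in ffprobe's JSON output to its language tag.

    Streams without a language tag map to None.
    """
    data = json.loads(ffprobe_json)
    return {
        stream["index"]: stream.get("tags", {}).get("language")
        for stream in data.get("streams", [])
    }


def probe_audio_languages(path: str) -> Dict[int, Optional[str]]:
    """Run ffprobe on `path` and return the language tag of each audio stream."""
    result = subprocess.run(
        [
            "ffprobe", "-v", "error",
            "-select_streams", "a",  # audio streams only
            "-show_entries", "stream=index:stream_tags=language",
            "-of", "json",
            path,
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_audio_languages(result.stdout)
```

Calling `probe_audio_languages("episode.mkv")` (a hypothetical path) on an affected file should show the unexpected tag on the audio stream, which would confirm that the tag comes from the file and not from Whisper's detection.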

hnorgaar commented 6 months ago

This subject has come up several times, and there is no way to force Bazarr to use anything other than detection. There is a way to specify or change the audio language in Sonarr, which Bazarr sees but does not take into account. Maybe make a request for that to be recognized and prioritized, so it's possible to give Bazarr a pointer.

kubiokyay135 commented 6 months ago

This makes sense, and upon more careful inspection with ffprobe I can confirm that this particular file does have "an" named as the language even though it's English when played.

I'll see if I can change it to "en", and if I can't I'll just redownload a working file. I think we are good to close this as it's essentially a broken file. The other files I was originally having issues with were able to be transcribed after changing the detection interval to 60 seconds.
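One way to change the tag without re-encoding is to remux with `ffmpeg` and set the audio stream's language metadata. The sketch below assumes `ffmpeg` is installed; the helper names and file paths are hypothetical, and the right language code for a given container and for Bazarr (for example a two-letter versus a three-letter code) is something to verify, not something this example decides.

```python
import subprocess
from typing import List


def build_retag_command(
    src: str, dst: str, language: str, audio_index: int = 0
) -> List[str]:
    """Build an ffmpeg command that copies every stream from `src` to `dst`
    and sets the language tag on one audio stream (no re-encoding)."""
    return [
        "ffmpeg", "-i", src,
        "-map", "0",    # keep all streams from the input
        "-c", "copy",   # stream copy, no re-encode
        f"-metadata:s:a:{audio_index}", f"language={language}",
        dst,
    ]


def retag_audio_language(
    src: str, dst: str, language: str, audio_index: int = 0
) -> None:
    """Run the remux; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(
        build_retag_command(src, dst, language, audio_index), check=True
    )
```

Writing to a new output file and keeping the original until the result has been checked with ffprobe is the safer route, since the remux does not modify the file in place.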

I'll consider putting in a feature request for a forced audio language in Bazarr if this is something that ends up coming up a lot.

Thanks!