morpheus65535 / bazarr

Bazarr is a companion application to Sonarr and Radarr. It manages and downloads subtitles based on your requirements. You define your preferences by TV show or movie and Bazarr takes care of everything for you.
https://www.bazarr.media
GNU General Public License v3.0

Bazarr marks manually uploaded sub as HI even though it is not selected #1776

Closed. Ruvonn closed this issue 2 years ago

Ruvonn commented 2 years ago

Describe the bug: pretty much the title. I select a movie (Some Like It Hot, 1959), press 'upload', select my file, then select English, no HI, no forced, press 'upload' and voila, for some reason Bazarr tags it as 'English HI'.

To Reproduce

  1. have 'Some like it hot' in your library, because it's an awesome film
  2. Click on it in the Bazarr list
  3. Click on 'upload' in top right
  4. Select subtitle file
  5. Select just english and press 'upload'
  6. ...
  7. profit

Expected behavior: it should be tagged as English.

Screenshots: can do, but I don't think it's necessary.

Software (please complete the following information):

Additional context: checking through the file, it doesn't have anything special, just some \ italic \ tags and some ♪ lyrics ♪, and nowhere any [...], which would be used for closed captions, I think? Attached the file anyway: 3_English.txt

morpheus65535 commented 2 years ago

When indexing subtitles, Bazarr parses the content of those not already tagged as hearing-impaired in the filename (.hi.srt, .cc.srt, .sdh.srt). For that purpose, we use this regex: [*¶♫♪].{3,}[*¶♫♪]|[\[\(\{].{3,}[\]\)\}](?<!{\\an\d}). In your specific case, there are 77 matches, mostly for music lyrics. This isn't a bug: this is an HI subtitles file and it must be indexed properly.
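To make the check described above concrete, here is a minimal Python sketch. The regex and the filename tags are copied from the comment; the function names, the assumption that the tag sits directly before the final extension, and the "any match means HI" rule are illustrative guesses, not Bazarr's actual code.

```python
import re
from pathlib import Path

# Regex quoted in the comment above, copied verbatim.
HI_REGEX = re.compile(r'[*¶♫♪].{3,}[*¶♫♪]|[\[\(\{].{3,}[\]\)\}](?<!{\\an\d})')

# Filename tags named in the comment above (.hi.srt, .cc.srt, .sdh.srt).
HI_FILENAME_TAGS = {".hi", ".cc", ".sdh"}


def is_tagged_hi_by_name(filename: str) -> bool:
    """True if the suffix right before the final extension is an HI tag.

    Assumption: the tag is the second-to-last suffix, e.g. movie.en.hi.srt.
    """
    suffixes = [s.lower() for s in Path(filename).suffixes]
    return len(suffixes) >= 2 and suffixes[-2] in HI_FILENAME_TAGS


def count_hi_matches(text: str) -> int:
    """Count regex matches; '.' does not cross newlines, so matches are per line."""
    return sum(1 for _ in HI_REGEX.finditer(text))


def looks_hearing_impaired(filename: str, text: str) -> bool:
    """Hypothetical decision rule: a filename tag or at least one content match."""
    return is_tagged_hi_by_name(filename) or count_hi_matches(text) > 0
```

Under this sketch a line such as `♪ la la la la ♪` or `[door slams]` counts as a match, while an ASS/SSA position tag such as `{\an8}` on its own is excluded by the trailing negative lookbehind.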

Ruvonn commented 2 years ago

Well, I will disagree with it being an HI sub, because it's not. They sing in the movie, therefore there are subs for the songs, since they are relevant to the story... there is no actual HI text like descriptions of tone of voice, ambiance etc. But thank you for explaining how the parsing is implemented! I would suggest that actual, manual user input should override the parser, though. Maybe you could consider that at some point (:

morpheus65535 commented 2 years ago

Although I respect your right to disagree, I would suggest some reading regarding HI/CC/SDH subtitles: https://en.wikipedia.org/wiki/Subtitles#Closed_captions https://en.wikipedia.org/wiki/Closed_captioning#Syntax

Both links clearly state that song lyrics are subtitles.

As for your suggestion, this would require a new process for remembering user input for specific subtitles. Currently, we simply replace the indexed subtitles when required, without taking previous values into account. I don't see this happening in the near future.
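For illustration only, the kind of "remembering" this would need could look like the sketch below. Everything here (`SubtitleIndex`, `record_upload`, `reindex`, the in-memory dictionaries) is hypothetical and not taken from Bazarr's code base; it only shows a stored user choice taking precedence over the detected value when an entry is replaced.

```python
from typing import Dict, Optional


def resolve_hi_flag(detected_hi: bool, user_choice: Optional[bool]) -> bool:
    """Prefer an explicit user choice; fall back to the detected value."""
    return detected_hi if user_choice is None else user_choice


class SubtitleIndex:
    """Hypothetical in-memory index keyed by subtitle path."""

    def __init__(self) -> None:
        self.entries: Dict[str, bool] = {}    # path -> HI flag currently indexed
        self.overrides: Dict[str, bool] = {}  # path -> HI flag chosen at upload

    def record_upload(self, path: str, user_hi: bool) -> None:
        # Remember what the user selected in the upload dialog.
        self.overrides[path] = user_hi

    def reindex(self, path: str, detected_hi: bool) -> bool:
        # Replace the entry, but let a remembered user choice win.
        self.entries[path] = resolve_hi_flag(detected_hi, self.overrides.get(path))
        return self.entries[path]
```

The cost the comment points at is visible even in this toy version: the override map has to be persisted and kept in sync with files that are renamed, replaced, or deleted.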

Ruvonn commented 2 years ago

thank you for providing the links (: I understand your point, and also from the standpoint of implementing it I understand the need for generalisation and simplification.

However, I would be curious to hear your thoughts on this: suppose you are watching a musical. All that's happening is singing. So what's gonna be in a non-HI subtitle file? Nothing? Well, that's a shitty UX if you ask me. And is it sufficient for an HI person if only the lyrics are provided, with no ambiance, no tone of voice, no descriptive text of the auditory experience whatsoever? Also pretty shitty UX imo.

I fully understand that lyrics will in most cases be for a song playing in the background and therefore fall precisely into the category of descriptive text that I was just talking about. However, that's not always the case. When characters talk in a singing voice, subtitle files might already put ♪, but you couldn't just leave that part out of the normal subtitles and only put it in the HI ones; that would make absolutely no sense.

Anyway, I don't think people are gonna die because Bazarr labels subtitles as HI when they're not, just thought I would give my input on the matter (: Cheers, and have a nice rest of the week.