Open fernandog opened 8 years ago
Too low for what? If the video doesn't carry the release group, it'll be the highest non-hash match for that video, no need to fake the hash match. If you're not happy with the score computation you can implement your own, it's as simple as overriding a function for library users.
Too low for "perfect matches". The min_score we use is:
    if sickbeard.SUBTITLES_PERFECT_MATCH:
        return episode_scores['hash'] - (episode_scores['resolution'] +
                                         episode_scores['video_codec'] +
                                         episode_scores['audio_codec'])
It's just a suggestion; maybe it can be improved in another way. It's not faking: the subtitle has exactly the same name as the video file.
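To make the threshold concrete, here is a minimal sketch of that min_score computation next to the best score a filename-only match can reach. The weights in `episode_scores` are assumed for illustration (they are not quoted in this thread and differ between subliminal versions), and both helper names are hypothetical.

```python
# Illustrative weights only: assumed values, not quoted from this thread.
episode_scores = {
    'hash': 359, 'series': 180, 'year': 90, 'season': 30, 'episode': 30,
    'release_group': 15, 'format': 7, 'audio_codec': 3,
    'resolution': 2, 'video_codec': 2,
}


def perfect_match_min_score(scores):
    """Hypothetical helper: the "perfect match" threshold described above."""
    return scores['hash'] - (scores['resolution'] +
                             scores['video_codec'] +
                             scores['audio_codec'])


def best_filename_score(matched_properties, scores):
    """Sum of the weights of the properties that matched, without a hash match."""
    return sum(scores[name] for name in matched_properties)


# Every property except release_group matches (the video name carries no group).
without_group = ['series', 'year', 'season', 'episode', 'format',
                 'audio_codec', 'resolution', 'video_codec']
threshold = perfect_match_min_score(episode_scores)
best = best_filename_score(without_group, episode_scores)
print(threshold, best, best >= threshold)
```

Under these assumed weights, a subtitle that matches everything except the release group still falls below the threshold, which is the situation being reported.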
What you can do is scale the hash score down to the maximum score possible (the sum of the scores from the filename tokens). However, with a really poor filename a hash match would normally rank higher, and with this change it no longer would.
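A minimal sketch of that scaling idea, with its drawback visible. The weights and both function names are assumptions made for illustration; they are not taken from subliminal.

```python
# Illustrative weights only: assumed values, not quoted from this thread.
SCORES = {
    'hash': 359, 'series': 180, 'year': 90, 'season': 30, 'episode': 30,
    'release_group': 15, 'format': 7, 'audio_codec': 3,
    'resolution': 2, 'video_codec': 2,
}


def max_filename_score(video_properties, scores=SCORES):
    """Best score reachable from the tokens the video filename actually carries."""
    return sum(weight for name, weight in scores.items()
               if name != 'hash' and name in video_properties)


def scaled_hash_score(video_properties, scores=SCORES):
    """Hash score capped at what a filename-only match could reach."""
    return min(scores['hash'], max_filename_score(video_properties, scores))


# A rich filename that only lacks a release group.
rich = {'series', 'year', 'season', 'episode', 'format',
        'audio_codec', 'resolution', 'video_codec'}
# A poor filename such as "castle.s08e22.mkv".
poor = {'series', 'season', 'episode'}

print(scaled_hash_score(rich))  # hash now ties with the best filename match
print(scaled_hash_score(poor))  # hash is capped very low for a poor filename
```

With the poor filename, a hash match is worth no more than a match on series, season and episode alone, so it loses the advantage it would otherwise have.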
I was taking a look at this question once more... and I realized one thing: If you decide to build a custom score computation then you don't have direct access to guess dict which is used in the actual logic:
https://github.com/Diaoul/subliminal/blob/master/subliminal/subtitle.py#L184
def guess_matches(video, guess, partial=False):
You only have access to the subtitle and video: https://github.com/Diaoul/subliminal/blob/master/subliminal/score.py#L66
def compute_score(subtitle, video, hearing_impaired=None):
Of course we can wrap the subtitle and intercept the guess_matches function and then implement any logic with it, but it's an extra layer and an extra step.
guess_matches is a utility function; you can choose to reuse it or not in your logic. What compute_score does is extract a guess from video.name and call various utility functions on that guess dict. You can choose to use guessit for that, build a guess dict yourself, or not use a guess at all. You're free to do whatever you want with the given subtitle for the given video.
The video file is not the issue; the subtitle is. Subtitles can be of several types: LegendasTvSubtitle, OpenSubtitles, AddictedSubtitles, etc. Each of them has different fields, and some of them use a filename and/or release name to create a guessit dict.
If I want to keep everything the same and only customise the release group scoring computation, then I'll need to create a new scoring function and handle every single subtitle type again. That's possible, as you said. I'm just wondering if it could be simplified and have more granular extension points for the score computation.
Just to have some concrete examples:
For the given video file without a release_group: Castle.S08E22.Crossfire.1080p.WEB-DL.DD5.1.H.264.mkv
This is the list of subtitles from legendastv, sorted by score (descending):
score = 344 for castle.s08e22.crossfire.1080p.web.dl.dd5.1.h-ddltv.srt
score = 344 for castle.s08e22.1080p.web-dl.dd5.1.h264-btn.srt
score = 344 for castle.s08e22.crossfire.1080p.web-dl.dd5.1.h.264.srt
score = 344 for castle.s08e22.crossfire.1080p.web-dl.dd5.1.h.264-hkd.srt
score = 344 for castle.s08e22.crossfire.1080p.web-dl.dd5.1.hevc.x265-rmteam.srt
score = 342 for castle.s08e22.720p.web-dl.dd5.1.h264-btn.srt
score = 342 for castle.s08e22.crossfire.720p.web-dl.dd5.1.h.264.srt
score = 339 for castle.s08e22.web-dl.x264-rarbg.srt
score = 339 for castle.s08e22.crossfire.480p.web-dl.x264-rmteam.srt
score = 339 for castle.s08e22.crossfire.720p.web-dl.hevc.x265-rmteam.srt
score = 332 for castle.2009.s08e22.hdtv.x264-lol.srt
score = 332 for castle.2009.s08e22.hdtv.xvid-afg.srt
score = 332 for castle.2009.s08e22.720p.hdtv.x264-dimension.srt
Since there's no release_group in the video file, there's no release group match and 5 subtitles scored the same: 344. The chosen subtitle is castle.s08e22.crossfire.1080p.web.dl.dd5.1.h-ddltv.srt, which has release group ddltv and has the same score as 4 other subtitles.
The best subtitle match would be the 3rd in this list: castle.s08e22.crossfire.1080p.web-dl.dd5.1.h.264.srt, which has no release_group.
You're right, you cannot tweak just that part, which lives in get_matches of the legendastv provider: https://github.com/Diaoul/subliminal/blob/master/subliminal/subtitle.py#L147
You can still use get_matches and override the specific part about release_group for legendastv subtitles in your compute_score.
I don't see how to expose that in any other clean way.
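A rough sketch of that approach: start from the set of matches (in real use it would come from the subtitle's get_matches), adjust only the release_group entry, then sum the weights. The weights, both function names, and the rule "no release group on either side counts as a match" are assumptions for illustration, and the scoring is deliberately simpler than what subliminal's compute_score does.

```python
# Illustrative weights only: assumed values, not quoted from this thread.
SCORES = {
    'hash': 359, 'series': 180, 'year': 90, 'season': 30, 'episode': 30,
    'release_group': 15, 'format': 7, 'audio_codec': 3,
    'resolution': 2, 'video_codec': 2,
}


def adjust_release_group(matches, video_group, subtitle_group):
    """Hypothetical rule: when neither side names a release group, count it as matched."""
    adjusted = set(matches)
    if video_group is None and subtitle_group is None:
        adjusted.add('release_group')
    return adjusted


def custom_score(matches, video_group, subtitle_group, scores=SCORES):
    """Simplified scoring: sum the weights of the adjusted matches."""
    adjusted = adjust_release_group(matches, video_group, subtitle_group)
    return sum(scores.get(name, 0) for name in adjusted)


# Matches for a subtitle that agrees with the video on everything but the group.
base_matches = {'series', 'year', 'season', 'episode', 'format',
                'audio_codec', 'resolution', 'video_codec'}

print(custom_score(base_matches, None, None))     # subtitle without a group
print(custom_score(base_matches, None, 'ddltv'))  # subtitle tagged ddltv
```

With this rule, the subtitle without a release group outranks the ones that carry a group the video cannot confirm.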
If the property matching could be refactored to accept a dictionary of functions, where each key is a property name and each value is a function that accepts two values and returns True/False, then subliminal could ship a default set of functions to compute the match for each property, and the API could allow a different dict to be used:
Something like this: https://github.com/Diaoul/subliminal/blob/master/subliminal/subtitle.py#L199-L244
    if match_functions['series'](video.series, guess.get('title')):
        matches.add('series')
    if match_functions['title'](video.title, guess.get('episode_title')):
        matches.add('title')
    if match_functions['season'](video.season, guess.get('season')):
        matches.add('season')
    if match_functions['episode'](video.episode, guess.get('episode')):
        matches.add('episode')
    if match_functions['year'](video.year, guess.get('year')):
        matches.add('year')
    if match_functions['release_group'](video.release_group, guess.get('release_group')):
        matches.add('release_group')
That way it would be possible to tweak just part of the matching.
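A self-contained sketch of how such a dictionary of match functions could look. None of this is subliminal's API: `guess_matches_sketch`, `equal_if_present` and the use of plain dicts for the video and the guess are stand-ins chosen for illustration, and the default rule is an assumption.

```python
def equal_if_present(video_value, guess_value):
    """Assumed default rule: match only when the video has a value equal to the guess."""
    return video_value is not None and video_value == guess_value


# Video property -> key in the guess dict, following the proposal above.
PROPERTY_TO_GUESS_KEY = {
    'series': 'title',
    'title': 'episode_title',
    'season': 'season',
    'episode': 'episode',
    'year': 'year',
    'release_group': 'release_group',
}

DEFAULT_MATCH_FUNCTIONS = {name: equal_if_present for name in PROPERTY_TO_GUESS_KEY}


def guess_matches_sketch(video, guess, match_functions=None):
    """Return the set of matched properties, letting callers override single rules."""
    functions = dict(DEFAULT_MATCH_FUNCTIONS)
    functions.update(match_functions or {})
    matches = set()
    for name, guess_key in PROPERTY_TO_GUESS_KEY.items():
        if functions[name](video.get(name), guess.get(guess_key)):
            matches.add(name)
    return matches


video = {'series': 'Castle', 'title': 'Crossfire', 'season': 8, 'episode': 22,
         'year': None, 'release_group': None}
guess = {'title': 'Castle', 'episode_title': 'Crossfire', 'season': 8, 'episode': 22}

default_matches = guess_matches_sketch(video, guess)
# Override only the release_group rule: "both missing" now counts as a match.
custom_matches = guess_matches_sketch(
    video, guess, {'release_group': lambda v, g: v == g})
print(sorted(default_matches))
print(sorted(custom_matches))
```

Only the release_group rule changes between the two calls; every other property keeps the default behaviour.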
Another way to handle https://github.com/Diaoul/subliminal/issues/652#issuecomment-233577530 is to ensure that the subtitles are always sorted by:
For the case where there's no release_group and several subtitles score the same, the one with no release group will be first in the list:
score = 344 for castle.s08e22.crossfire.1080p.web-dl.dd5.1.h.264.srt (no release group)
score = 344 for castle.s08e22.1080p.web-dl.dd5.1.h264-btn.srt (btn)
score = 344 for castle.s08e22.crossfire.1080p.web.dl.dd5.1.h-ddltv.srt (ddltv)
score = 344 for castle.s08e22.crossfire.1080p.web-dl.dd5.1.h.264-hkd.srt (hkd)
score = 344 for castle.s08e22.crossfire.1080p.web-dl.dd5.1.hevc.x265-rmteam.srt (rmteam)
So, if we scale down the max score because of the absence of a property, we can ensure that the first subtitle in the list is most likely the one we're searching for.
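The ordering above could be produced with a tie-breaking sort key. The exact sort criteria are not spelled out in this thread, so the key below (score descending, then subtitles without a release group first, then group name) is an assumption inferred from the listing, and `rank_subtitles` is a hypothetical name.

```python
def rank_subtitles(scored, video_release_group=None):
    """Sort (score, name, release_group) tuples, best candidate first.

    Assumed tie-break: when the video has no release group, subtitles
    without a release group come before those that carry one.
    """
    def key(item):
        score, _name, group = item
        if video_release_group is None:
            return (-score, group is not None, group or '')
        return (-score, False, '')
    return sorted(scored, key=key)


scored = [
    (344, 'castle.s08e22.crossfire.1080p.web.dl.dd5.1.h-ddltv.srt', 'ddltv'),
    (344, 'castle.s08e22.1080p.web-dl.dd5.1.h264-btn.srt', 'btn'),
    (344, 'castle.s08e22.crossfire.1080p.web-dl.dd5.1.h.264.srt', None),
    (344, 'castle.s08e22.crossfire.1080p.web-dl.dd5.1.h.264-hkd.srt', 'hkd'),
    (342, 'castle.s08e22.720p.web-dl.dd5.1.h264-btn.srt', 'btn'),
]

ranked = rank_subtitles(scored)
for score, name, group in ranked:
    print(score, name, group or '(no release group)')
```

Because the tie-break only applies when the video has no release group, videos that do carry one keep the plain score ordering.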
This can be fixed with a release match: #572
As there is no release group in either the filename or the subtitle, release_group doesn't match and the score gets too low.
Maybe in LTV, when subtitle name == filename, we could return a score of 215 (hash)? Or is there another solution?
Release: Castle.S08E18.Backstabber.1080p.WEB-DL.DD5.1.H.264.mkv (no release group)
@ratoaq2