Closed ratoaq2 closed 9 years ago
The `similarity` option in `legendastv.ini` is used only to identify the correct movie title, as title search on the Legendas.tv website is a free-text search. So a similarity _threshold_ is needed to make sure we're not picking a completely unrelated movie: if a title is not similar enough, we consider it not to be the movie we are looking for, and _discard_ it.
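To make the threshold idea concrete, here is a minimal sketch of how such a title filter could look. It is not the project's actual code: the helper names `similarity()` and `choose_title()` are hypothetical, and using `difflib.SequenceMatcher` as the metric is an assumption made only for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0..1 ratio between two strings, ignoring case.

    Assumption: difflib is used here only for illustration; the real
    metric used by the tool is not shown in this thread.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def choose_title(wanted, candidates, threshold):
    """Return the most similar candidate, or None if it is below threshold."""
    if not candidates:
        return None
    best = max(candidates, key=lambda title: similarity(wanted, title))
    # The threshold is absolute: a "best" title that is still too
    # different from what we asked for is discarded, not accepted.
    return best if similarity(wanted, best) >= threshold else None


titles = ["The Truth and Nothing But the Truth", "Nothing But the Truth"]
print(choose_title("Nothing But The Truth", titles, threshold=0.94))
print(choose_title("Some Other Movie", titles, threshold=0.94))
```

The first call returns the movie title, because the two strings are identical once case is ignored; the second returns `None`, because no candidate is close enough to be trusted.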
Once we have the `idfilme` from the chosen title, subtitles are fetched by this ID, so all subtitles in the list are _known_ to be for this title. The `similarity` option is no longer used (nor needed) from now on.
Sure, we also use a similarity _rating_ (not a threshold) when ordering the subtitles list, to find the subtitle release that best matches the filename. But there is no need for a threshold: subtitles with poor similarity tend to have a poor overall ranking. Similarity is not the only criterion for ranking, but it has a high weight. Actually, similarity alone accounts for half the score!
Remember that all candidates are guaranteed to be for your movie title (per ID), and wrong episodes were already discarded. And the chosen subtitle is _the best_ candidate among all the others.
Can it still have a poor similarity and be for the wrong release? Sure. But it's still better than the others. It's the best subtitle the legendas.tv website had to offer for that movie.
The similarity index needs context: when searching for a movie, we have a reference string to compare to, usually from OSDB. We want movie XYZ, similarity is our only criterion, and no other title would suit us. In that context, it makes perfect sense to set an absolute threshold to avoid picking the wrong movie.
For subtitles, context is completely different: similarity is compared against the other candidates, and the best wins. Their absolute index is irrelevant. A subtitle registered as "Movie.XYZ.2004.all bluray releases from DIMENSION/YIFI/FNQ/PQP/VTNC. com ressync do INSUBS" will have a very poor similarity but may be exactly the one you want.
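The difference can be sketched as follows: for subtitles the comparison is relative, so the best candidate is returned whatever its absolute value. The function names are hypothetical and `difflib.SequenceMatcher` is again only an assumed stand-in for the real metric.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """0..1 ratio between two strings, ignoring case (illustrative metric)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_release(filename, releases):
    """Pick the release closest to the filename.

    No minimum similarity is enforced: candidates only compete
    against each other, so a lone poor match still wins.
    """
    return max(releases, key=lambda release: similarity(filename, release))


filename = "Nothing But The Truth 2008 720p BluRay x264-ARiGOLD"
releases = [
    "Faces.Da.Verdade.DVDRip.Dual.XviD.MP3-ZAMENGO",
    "Nothing.But.The.Truth.2008.LiMiTED.720p.BluRay.x264-ARiGOLD",
]
print(best_release(filename, releases))
print(best_release(filename, releases[:1]))
```

With both candidates the BluRay release wins; with only the first one available, it is returned anyway, which is the "better than none" behaviour described in this thread.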
Even if it was selected not because it was a great match but because it was the only subtitle found for that title, there's nothing we can do. Better to have a subtitle for that movie than to have none. (The same does not apply to movie titles.)
Last but not least, subtitle release strings are a very loose format. There is no standard, as opposed to a movie title, which has an "official" name. We have no way to tell if a poorly similar subtitle is a good fit or not.
By the way, `0.94` looks like an extremely high threshold. Aren't you getting too many (valid) titles discarded because of that? Is the built-in filename parser able to extract titles with that level of similarity from your files? I'm impressed! Either your filenames have a very standard format (did you manually rename them?), or my humble parser is better than I thought. Also, it indicates the titles in Legendas.TV database are more trustworthy (and matching OSDB's) than I expected. Good :)
Now back to your particular issue:
`rankSubtitles()` weights could require some tuning. Current values are somewhat arbitrary, and I need feedback to improve them. How many `.srt`s does the chosen archive have? Can you paste the extraction log for me to analyze why that file was chosen? Also, which one do you think should have been chosen?
2015-02-28 15:53:00,321 DEBUG Target: /home/osmc/projetos/legendastv/Nothing But The Truth 2008 720p BluRay x264-ARiGOLD
2015-02-28 15:53:00,322 DEBUG Guessed title info: 'Nothing But The Truth 2008 720p BluRay x264-ARiGOLD' -> {'release': u'Nothing But The Truth 2008 720p BluRay x264 ARiGOLD', 'title': u'Nothing But The Truth', 'year': u'2008'}
2015-02-28 15:53:00,423 DEBUG OSDB.LogIn(u'', u'***', u'', u'Legendas.TV v1.0') -> {'status': '200 OK', 'seconds': 0.015, 'token': '3eor9lrgk6q327i5gohqvd8s45'}
2015-02-28 15:53:00,424 ERROR File '/home/osmc/projetos/legendastv/Nothing But The Truth 2008 720p BluRay x264-ARiGOLD' must be at least 65536 bytes
2015-02-28 15:53:00,424 DEBUG 0 OpenSubtitles titles found:
2015-02-28 15:53:00,452 NOTIFY Logging in Legendas.TV
2015-02-28 15:53:00,457 INFO Logging in http://legendas.tv/login as *****
2015-02-28 15:53:01,683 NOTIFY Searching titles for 'Nothing But The Truth'
2015-02-28 15:53:01,684 DEBUG loading /legenda/sugestao/Nothing+But+The+Truth
2015-02-28 15:53:01,957 DEBUG Titles found for 'Nothing But The Truth':
{'title_br': u'The 4400.S04E04.HDTV.XviD-BiA.The Truth and Nothing But the Truth', 'thumb': None, 'title': u'The Truth and Nothing But the Truth', 'season': u'4', 'imdb_id': u'1049219', 'year': u'2007', 'type': u'episode', 'id': u'12926'}
{'title_br': u'Nothing But the Truth', 'thumb': u'http://i.legendas.tv/poster/tt1073241.jpg', 'title': u'Nothing But the Truth', 'season': None, 'imdb_id': u'1073241', 'year': u'2008', 'type': u'movie', 'id': u'14841'}
2015-02-28 15:53:01,967 NOTIFY 2 titles found
2015-02-28 15:53:01,968 DEBUG Chosen best for 'Nothing But The Truth' in 'search': {'best': {'title_br': u'Nothing But the Truth', u'search': u'Nothing But the Truth', 'thumb': u'http://i.legendas.tv/poster/tt1073241.jpg', 'title': u'Nothing But the Truth', 'season': None, 'imdb_id': u'1073241', 'year': u'2008', 'type': u'movie', 'id': u'14841'}, 'similarity': 1.0}
2015-02-28 15:53:01,978 NOTIFY Searching subs for 'Nothing But the Truth'
2015-02-28 15:53:01,978 DEBUG loading /util/carrega_legendas_busca_filme/14841/1
2015-02-28 15:53:02,372 DEBUG Subtitles found for 14841:
{'rating': 10, 'hash': u'c7d50e660aafc1ddca1cf3b79bdfcca4', u'language': 'pb', 'title': u'Nothing_But_the_Truth', 'downloads': 256, 'release': 'Faces.da.Verdade.Dual.ptbr.eng.DvdRip.Xvid.Ac3.Brazilinjapan.by.cinefila', 'flag': 'http://i.legendas.tv/idioma/icon_brazil.png', 'date': datetime.datetime(2010, 2, 25, 13, 19), 'highlight': False, 'user_name': 'cinefala', 'pack': False}
{'rating': 10, 'hash': u'13d7775dd3075a6c59fc0b52ba3b0aa1', u'language': 'pb', 'title': u'Nothing_But_the_Truth', 'downloads': 389, 'release': 'Nothing.But.The.Truth.2008.BRRip.H264.AAC-SecretMyth.(Kingdom-Release)', 'flag': 'http://i.legendas.tv/idioma/icon_brazil.png', 'date': datetime.datetime(2010, 2, 12, 13, 10), 'highlight': False, 'user_name': 'gamobra', 'pack': False}
{'rating': 10, 'hash': u'46e2c9179ecc6e0fdd75912fd37b9814', u'language': 'pb', 'title': u'Nothing_But_the_Truth', 'downloads': 388, 'release': 'Faces.Da.Verdade.DVDRip.Dual.XviD.MP3-ZAMENGO', 'flag': 'http://i.legendas.tv/idioma/icon_brazil.png', 'date': datetime.datetime(2010, 2, 3, 21, 50), 'highlight': False, 'user_name': 'jcbandeira', 'pack': False}
{'rating': 10, 'hash': u'1959971213897a145437ca7423670e67', u'language': 'pb', 'title': u'Nothing_But_the_Truth', 'downloads': 975, 'release': 'Nothing.But.The.Truth.2008.LiMiTED.720p.BluRay.x264-ARiGOLD', 'flag': 'http://i.legendas.tv/idioma/icon_brazil.png', 'date': datetime.datetime(2010, 1, 20, 19, 43), 'highlight': False, 'user_name': 'acnBR', 'pack': False}
{'rating': 10, 'hash': u'b429ce5ebcea1c42755bd43fa6ed68ae', u'language': 'pb', 'title': u'Nothing_But_the_Truth', 'downloads': 1732, 'release': 'Nothing.But.The.Truth.LIMITED.DVDRip.XviD.AC3-DEViSE', 'flag': 'http://i.legendas.tv/idioma/icon_brazil.png', 'date': datetime.datetime(2009, 4, 28, 23, 0), 'highlight': False, 'user_name': 'ampg4', 'pack': False}
{'rating': 10, 'hash': u'70fcfefd6b4ae9dba758f742d1744017', u'language': 'pb', 'title': u'Nothing_But_the_Truth', 'downloads': 4082, 'release': 'Nothing.But.The.Truth.2008.DvdRip-FxM', 'flag': 'http://i.legendas.tv/idioma/icon_brazil.png', 'date': datetime.datetime(2009, 4, 17, 12, 52), 'highlight': False, 'user_name': 'gunca', 'pack': False}
{'rating': 10, 'hash': u'50f437e78ba63b578f3f816eee520d5d', u'language': 'pb', 'title': u'Nothing_But_the_Truth', 'downloads': 1249, 'release': 'Nothing.But.The.Truth.LiMiTED.DVDRip.XviD-ARiGOLD', 'flag': 'http://i.legendas.tv/idioma/icon_brazil.png', 'date': datetime.datetime(2009, 4, 17, 0, 5), 'highlight': False, 'user_name': 'daniellce', 'pack': False}
{'rating': 10, 'hash': u'4f1b1fe567978e6b55c4847732b69468', u'language': 'pb', 'title': u'Nothing_But_the_Truth', 'downloads': 176, 'release': 'Nothing.But.The.Truth.2008.DVDRip.XVID.AC3-TST', 'flag': 'http://i.legendas.tv/idioma/icon_brazil.png', 'date': datetime.datetime(2009, 4, 16, 22, 19), 'highlight': False, 'user_name': 'ricklaferla', 'pack': False}
{'rating': 10, 'hash': u'24739179e3148c0b4ce70f005d0906ab', u'language': 'pb', 'title': u'Nothing_But_the_Truth', 'downloads': 1033, 'release': 'Nothing.But.The.Truth.LiMiTED.DVDRip.XviD-ARiGOLD', 'flag': 'http://i.legendas.tv/idioma/icon_brazil.png', 'date': datetime.datetime(2009, 4, 15, 14, 58), 'highlight': False, 'user_name': 'j708', 'pack': False}
{'rating': 10, 'hash': u'fe1a037d75cf46a3c27d82e3e0fe22d6', u'language': 'pb', 'title': u'Nothing_But_the_Truth', 'downloads': 5431, 'release': 'Nothing.But.The.Truth.2008.DVDSCR.XviD-ARiGOLD', 'flag': 'http://i.legendas.tv/idioma/icon_brazil.png', 'date': datetime.datetime(2009, 2, 26, 9, 49), 'highlight': True, 'user_name': 'alcobor', 'pack': False}
2015-02-28 15:53:02,383 NOTIFY 10 subtitles found
2015-02-28 15:53:02,386 DEBUG Ranked subtitles for {'title_br': u'Nothing But the Truth', u'search': u'Nothing But the Truth', u'episode': u'', 'thumb': u'http://i.legendas.tv/poster/tt1073241.jpg', 'title': u'Nothing But the Truth', u'season': None, u'filename': u'Nothing But The Truth 2008 720p BluRay x264-ARiGOLD', 'imdb_id': u'1073241', 'year': u'2008', 'release': u'Nothing But The Truth 2008 720p BluRay x264 ARiGOLD', u'dirname': u'legendastv', u'type': u'movie', 'id': u'14841'}:
{'rating': 10, 'hash': u'fe1a037d75cf46a3c27d82e3e0fe22d6', u'language': 'pb', 'title': u'Nothing_But_the_Truth', 'downloads': 5431, 'release': 'Nothing.But.The.Truth.2008.DVDSCR.XviD-ARiGOLD', 'flag': 'http://i.legendas.tv/idioma/icon_brazil.png', u'score': 8.458762886597938, 'date': datetime.datetime(2009, 2, 26, 9, 49), 'highlight': True, 'user_name': 'alcobor', 'pack': False}
{'rating': 10, 'hash': u'1959971213897a145437ca7423670e67', u'language': 'pb', 'title': u'Nothing_But_the_Truth', 'downloads': 975, 'release': 'Nothing.But.The.Truth.2008.LiMiTED.720p.BluRay.x264-ARiGOLD', 'flag': 'http://i.legendas.tv/idioma/icon_brazil.png', u'score': 8.270104895104895, 'date': datetime.datetime(2010, 1, 20, 19, 43), 'highlight': False, 'user_name': 'acnBR', 'pack': False}
{'rating': 10, 'hash': u'13d7775dd3075a6c59fc0b52ba3b0aa1', u'language': 'pb', 'title': u'Nothing_But_the_Truth', 'downloads': 389, 'release': 'Nothing.But.The.Truth.2008.BRRip.H264.AAC-SecretMyth.(Kingdom-Release)', 'flag': 'http://i.legendas.tv/idioma/icon_brazil.png', u'score': 7.62079831932773, 'date': datetime.datetime(2010, 2, 12, 13, 10), 'highlight': False, 'user_name': 'gamobra', 'pack': False}
{'rating': 10, 'hash': u'70fcfefd6b4ae9dba758f742d1744017', u'language': 'pb', 'title': u'Nothing_But_the_Truth', 'downloads': 4082, 'release': 'Nothing.But.The.Truth.2008.DvdRip-FxM', 'flag': 'http://i.legendas.tv/idioma/icon_brazil.png', u'score': 7.273226773226773, 'date': datetime.datetime(2009, 4, 17, 12, 52), 'highlight': False, 'user_name': 'gunca', 'pack': False}
{'rating': 10, 'hash': u'50f437e78ba63b578f3f816eee520d5d', u'language': 'pb', 'title': u'Nothing_But_the_Truth', 'downloads': 1249, 'release': 'Nothing.But.The.Truth.LiMiTED.DVDRip.XviD-ARiGOLD', 'flag': 'http://i.legendas.tv/idioma/icon_brazil.png', u'score': 7.268681318681319, 'date': datetime.datetime(2009, 4, 17, 0, 5), 'highlight': False, 'user_name': 'daniellce', 'pack': False}
{'rating': 10, 'hash': u'24739179e3148c0b4ce70f005d0906ab', u'language': 'pb', 'title': u'Nothing_But_the_Truth', 'downloads': 1033, 'release': 'Nothing.But.The.Truth.LiMiTED.DVDRip.XviD-ARiGOLD', 'flag': 'http://i.legendas.tv/idioma/icon_brazil.png', u'score': 7.265934065934066, 'date': datetime.datetime(2009, 4, 15, 14, 58), 'highlight': False, 'user_name': 'j708', 'pack': False}
{'rating': 10, 'hash': u'4f1b1fe567978e6b55c4847732b69468', u'language': 'pb', 'title': u'Nothing_But_the_Truth', 'downloads': 176, 'release': 'Nothing.But.The.Truth.2008.DVDRip.XVID.AC3-TST', 'flag': 'http://i.legendas.tv/idioma/icon_brazil.png', u'score': 7.218165854763792, 'date': datetime.datetime(2009, 4, 16, 22, 19), 'highlight': False, 'user_name': 'ricklaferla', 'pack': False}
{'rating': 10, 'hash': u'b429ce5ebcea1c42755bd43fa6ed68ae', u'language': 'pb', 'title': u'Nothing_But_the_Truth', 'downloads': 1732, 'release': 'Nothing.But.The.Truth.LIMITED.DVDRip.XviD.AC3-DEViSE', 'flag': 'http://i.legendas.tv/idioma/icon_brazil.png', u'score': 6.992931825456097, 'date': datetime.datetime(2009, 4, 28, 23, 0), 'highlight': False, 'user_name': 'ampg4', 'pack': False}
{'rating': 10, 'hash': u'c7d50e660aafc1ddca1cf3b79bdfcca4', u'language': 'pb', 'title': u'Nothing_But_the_Truth', 'downloads': 256, 'release': 'Faces.da.Verdade.Dual.ptbr.eng.DvdRip.Xvid.Ac3.Brazilinjapan.by.cinefila', 'flag': 'http://i.legendas.tv/idioma/icon_brazil.png', u'score': 6.528455284552845, 'date': datetime.datetime(2010, 2, 25, 13, 19), 'highlight': False, 'user_name': 'cinefala', 'pack': False}
{'rating': 10, 'hash': u'46e2c9179ecc6e0fdd75912fd37b9814', u'language': 'pb', 'title': u'Nothing_But_the_Truth', 'downloads': 388, 'release': 'Faces.Da.Verdade.DVDRip.Dual.XviD.MP3-ZAMENGO', 'flag': 'http://i.legendas.tv/idioma/icon_brazil.png', u'score': 6.179487179487179, 'date': datetime.datetime(2010, 2, 3, 21, 50), 'highlight': False, 'user_name': 'jcbandeira', 'pack': False}
2015-02-28 15:53:02,400 NOTIFY Downloading 'Nothing.But.The.Truth.2008.DVDSCR.XviD-ARiGOLD' from 'alcobor'
2015-02-28 15:53:02,400 DEBUG Downloading archive for subtitle from /downloadarquivo/fe1a037d75cf46a3c27d82e3e0fe22d6
2015-02-28 15:53:04,107 DEBUG Using cached file
2015-02-28 15:53:04,107 DEBUG Archive saved as '/home/osmc/.cache/legendastv/archives/alcoborc87ce56448f623c506eb6f3e6bf4b030.rar'
2015-02-28 15:53:04,108 DEBUG 2 files in archive 'alcoborc87ce56448f623c506eb6f3e6bf4b030.rar': [u'Nothing But The Truth 2008 DVDSCR XviD-ARiGOLD.srt', u'Legendas.tv.txt']
2015-02-28 15:53:04,108 INFO 1 extracted files in '/home/osmc/.cache/legendastv/archives/alcoborc87ce56448f623c506eb6f3e6bf4b030.rar', filtered by [u'srt']
u'/home/osmc/.cache/legendastv/archives/alcoborc87ce56448f623c506eb6f3e6bf4b030/Nothing But The Truth 2008 DVDSCR XviD-ARiGOLD.srt'
2015-02-28 15:53:04,109 DEBUG Arguments: Namespace(backup=True, blacklistfile=u'/home/osmc/.config/legendastv/srtclean_blacklist.txt', encoding=None, fallback='windows-1252', in_place=True, loglevel=20, output_encoding=u'UTF-8', paths=[u'/home/osmc/.cache/legendastv/archives/alcoborc87ce56448f623c506eb6f3e6bf4b030/Nothing But The Truth 2008 DVDSCR XviD-ARiGOLD.srt'], rebuild_index=True, recursive=False)
[DEBUG] Arguments: Namespace(backup=True, blacklistfile=u'/home/osmc/.config/legendastv/srtclean_blacklist.txt', encoding=None, fallback='windows-1252', in_place=True, loglevel=20, output_encoding=u'UTF-8', paths=[u'/home/osmc/.cache/legendastv/archives/alcoborc87ce56448f623c506eb6f3e6bf4b030/Nothing But The Truth 2008 DVDSCR XviD-ARiGOLD.srt'], rebuild_index=True, recursive=False)
2015-02-28 15:53:04,109 INFO Processing subtitle: '/home/osmc/.cache/legendastv/archives/alcoborc87ce56448f623c506eb6f3e6bf4b030/Nothing But The Truth 2008 DVDSCR XviD-ARiGOLD.srt'
[INFO ] Processing subtitle: '/home/osmc/.cache/legendastv/archives/alcoborc87ce56448f623c506eb6f3e6bf4b030/Nothing But The Truth 2008 DVDSCR XviD-ARiGOLD.srt'
2015-02-28 15:53:04,116 DEBUG Auto-detected encoding: 'iso-8859-1'
[DEBUG] Auto-detected encoding: 'iso-8859-1'
2015-02-28 15:53:04,171 NOTIFY DONE!
[NOTIFY] DONE!
I have experimented with some changes that solve the issue, but right now I have no time to describe/discuss them. I'll keep you informed.
No, it had enough information to pick a better candidate before the download. In your case the 2nd candidate should've been chosen. It scored really close to the 1st, but the 1st scored a few extra points because it is a highlighted subtitle.
The solution is simple: fine-tune the weights in `rankSubtitles()`. Try promoting similarity from 5 to 6 and demoting highlight from 2 to 1 and see if it picks the right candidate.
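A rough sketch of why such a small change in weights can flip the order. This is not the real `rankSubtitles()`: the weighted-sum formula, the `score()` and `rank()` helpers, and the similarity values 0.70 and 0.88 are all assumptions chosen for illustration, and the other ranking criteria are left out. Only the weights (5 and 2, then 6 and 1) come from this thread.

```python
def score(sub, weights):
    """Weighted sum of two criteria, each normalized to 0..1.

    Assumption: the real formula has more criteria and may combine
    them differently; this only shows the effect of the weights.
    """
    highlight = 1.0 if sub["highlight"] else 0.0
    return weights["similarity"] * sub["similarity"] + weights["highlight"] * highlight


def rank(subs, weights):
    """Return the subtitles ordered from best to worst score."""
    return sorted(subs, key=lambda sub: score(sub, weights), reverse=True)


subs = [
    # Similarity values below are made up for the example.
    {"release": "Nothing.But.The.Truth.2008.DVDSCR.XviD-ARiGOLD",
     "similarity": 0.70, "highlight": True},
    {"release": "Nothing.But.The.Truth.2008.LiMiTED.720p.BluRay.x264-ARiGOLD",
     "similarity": 0.88, "highlight": False},
]

current = {"similarity": 5, "highlight": 2}
tuned = {"similarity": 6, "highlight": 1}

print(rank(subs, current)[0]["release"])  # highlighted DVDSCR release wins
print(rank(subs, tuned)[0]["release"])    # BluRay release wins
```

With the current weights the highlight bonus outweighs the similarity gap (5.5 against 4.4); with the tuned weights the closer release comes out ahead (5.28 against 5.2).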
It's easier to give an example:
Given my configuration:
similarity = 0.94
And for the given input file:
Deliver us From Evil 2014 SWESUB 720p BluRay x264 Mr Stiffy
I'm getting a subtitle that's only 0.5 similar:
dict: {'best': {'compare': u'Deliver Us From Evil LIMITED DVDRip XviD iMBT', 'full': u'/home/osmc/.cache/legendastv/archives/UnitedTeam-correcaobafb93ccc6c52925a9581b3e882ee403/Deliver.Us.From.Evil.LIMITED.DVDRip.XviD-iMBT By UNITED4EVER/Legendas Comuns/Com It\xe1licos/Deliver.Us.From.Evil.LIMITED.DVDRip.XviD-iMBT.srt', 'original': u'Deliver.Us.From.Evil.LIMITED.DVDRip.XviD-iMBT.srt'}, 'similarity': 0.5}