Closed jhenstridge closed 1 year ago
Hello @jhenstridge
It's a known issue... https://github.com/xbmc/xbmc/issues/19845
With a fix merged a couple of weeks ago... https://github.com/xbmc/metadata.tvshows.themoviedb.org.python/pull/88
Is it the same fix as yours?
My `tvshow.nfo` file consisted of a plain URL rather than being XML, so I don't think that bug is related. And removing the `lid` parameter seemed to fix the problem.
Here is an example of the regexp matching the URL I mentioned:
>>> import re
>>> url = 'http://thetvdb.com/?tab=series&id=204781&lid=7'
>>> match = re.search(r'(thetvdb)\.com[\w=&\?/]+id=(\d+)', url)
>>> match.groups()
('thetvdb', '7')
>>>
There's no series with id=7, so it fails to match the series.
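To see which parameter the digits came from, here is a quick check (a sketch that only reuses the URL and pattern from the session above). It prints the four characters just before the captured group. The `+` on the character class is greedy, so the match backtracks only as far as the last `id=` in the URL, which is the tail of `lid=`:

```python
import re

# URL and pattern copied from the session above.
url = 'http://thetvdb.com/?tab=series&id=204781&lid=7'
pattern = r'(thetvdb)\.com[\w=&\?/]+id=(\d+)'

match = re.search(pattern, url)

# Text immediately before the captured digits shows which "id=" was used.
start = match.start(2)
prefix = url[start - 4:start]
print(prefix, match.group(2))  # lid= 7
```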
Looking closer at the themoviedb scraper, it looks like it is using a different regexp to extract the ID from this style of URL compared to this plugin:
https://github.com/xbmc/metadata.tvshows.themoviedb.org.python/blob/matrix/libs/data_utils.py#L52
It seems to handle URLs with the lid parameter correctly:
>>> re.search(r'(thetvdb)\.com.+&id=(\d+)', url).groups()
('thetvdb', '204781')
>>>
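One caveat, as far as I can tell from the pattern alone: it requires a literal `&` before `id`, so it would not match a URL where `id` is the first query parameter. The second URL below is made up for illustration; I don't know whether such URLs occur in real NFO files.

```python
import re

# Pattern quoted above from the themoviedb scraper.
tmdb_style = r'(thetvdb)\.com.+&id=(\d+)'

urls = [
    'http://thetvdb.com/?tab=series&id=204781&lid=7',  # URL from this report
    'http://thetvdb.com/?id=204781&lid=7',             # hypothetical: id comes first
]

results = {}
for u in urls:
    m = re.search(tmdb_style, u)
    # Store the captured id, or None when the pattern does not match.
    results[u] = m.group(2) if m else None
    print(u, '->', results[u])
```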
Oh, sorry. You were reporting an NFO parsing issue. I must have been half asleep and just jumped to a known similar issue.
@pkscout will check it out.
This is the TVDB scraper, right? That's not the one I'm maintaining. I know we just updated the scrapers that use The Movie Database with some different regex parsing to handle old TVDB URLs better, and I think the comments here indicate the TVDB team need to update their scraper as well to deal with older URL formats from their site.
Yep. I am using the "The TVDB v4" scraper. This looked like the right place to file the bug report, and the code here seems to match the behaviour I observed.
This has been internally ticketed for review - https://mediamorph.atlassian.net/browse/TVD-3391
I was testing out the plugin with my library, and found it failed to scan some folders with `tvshow.nfo` files in them. These included old thetvdb.com URLs that carry an `lid` parameter after the `id` (for example, ending in `&lid=7`).
These seemed to work with the old scraper maintained by the Kodi team. Having a look through the source code, the URI seems to match this regexp:
https://github.com/thetvdb/metadata.tvshows.thetvdb.com.v4.python/blob/10cb449edbc76d3e85b383b2dea5cbf8302600b8/metadata.tvshows.thetvdb.com.v4.python/resources/lib/nfo.py#L20
... but it extracts the `lid` value rather than `id`. Editing the file and removing the `&lid=7` bit at the end of the URL allows the scraper to recognise the series.
I think changing the regexp to require a `?` or `&` immediately before the `id` parameter would work.
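A minimal sketch of that idea, assuming the change is simply to add `[?&]` before `id=` in the expression quoted earlier in this thread. The pattern and the `extract_series_id` helper are illustrative names of my own, not the scraper's code, and whatever fix ends up in the scraper may look different:

```python
import re

# Assumed pattern: the existing expression with [?&] added before "id=".
# This is a guess at the intended change, not the scraper's actual code.
PROPOSED = re.compile(r'(thetvdb)\.com[\w=&\?/]+[?&]id=(\d+)')


def extract_series_id(url):
    """Return the series id as a string, or None if the URL does not match."""
    match = PROPOSED.search(url)
    return match.group(2) if match else None


print(extract_series_id('http://thetvdb.com/?tab=series&id=204781&lid=7'))  # 204781
print(extract_series_id('http://thetvdb.com/?tab=series&id=204781'))        # 204781
print(extract_series_id('http://thetvdb.com/?tab=series&lid=7'))            # None
```

Because `&lid=` never contains `?id=` or `&id=`, the `lid` value can no longer be mistaken for the series id, wherever it appears in the query string.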