Closed NeilBetham closed 8 years ago
For the moment I've had to shutdown my post processor since it can't identify anything TV related while TVRage is down.
You could just disable the TVRage postprocessing in the config, that's what I've done.
While that is an option for the short term, in the long term another solution will be needed to fix the problem. Additionally, if you disable TVRage post-processing, the indexer becomes useless for any sort of TV release, since most searchers depend on the rage ID from the newznab API spec.
I'll investigate this later today. As you said, most of the downloaders rely on the tvr id, so we might have to wait and see what they swap to.
Incidentally, sonarr uses name only, I think.
Indeed, I was suggesting it as a short term fix, to allow other postprocessing to continue. We will definitely need an alternative if TVRage does not return.
I did look into Sonarr a bit, and it looks like they will use a TVRage ID if they have one for the show; if they don't, they will search based on title, season and episode. Relevant code: https://github.com/Sonarr/Sonarr/blob/master/src/NzbDrone.Core/Indexers/Newznab/NewznabRequestGenerator.cs. When they have a rage ID and when they don't, I'm still not sure yet.
We can also use thexem.de in the short term. It can also help us translate elsewhere.
Yeah, thexem might work.
I've moved on to SickRage (a fork of SickBeard) as it doesn't use TVRage anymore (it uses tvDB) and searches on title only. Doesn't work for "new releases, as in S01E01" though as the pynab api needs an "episodes" entry in releases/episodes tables. I got around that by tweaking tvrage.py to add them. I won't post the code as Murodese will do a better job of it. As a side note, Python 3.4 can't use import db.regex as regex_data in util.py. I've commented out all use of it for now as it's only used to import pynab's own regexes, I think. I've also been banned from the pre IRC channel, so be careful how you access it. I'm working on another one.
Doesn't work for "new releases, as in S01E01" though as the pynab api needs an "episodes" entry in releases/episodes tables.
Can you elaborate on that? Episode details should already exist as part of post-proc as per the episode table.
Ukharley, you can use the pre import script in the short term if you still need em. They are updated once a day usually.
Hmm, thexem doesn't let you do show name lookups. Would have to combine tvdb or something with thexem to get tvrage IDs until the downloaders are updated to work with something new. Let me look around and see what the downloaders are doing first, though.
@brookesy2: I am importing them for now but a second choice irc would be worthwhile, for me anyway. @Murodese: If there is no rage it doesn't process episodes.
Definitely agreed!! I really need to do more research on requests and how they work, so we can replicate that ourselves.
Also if you need help implementing stuff I would be game to contribute where needed.
TVRage seems to have an interim page up saying they will be back, so this may resolve itself. Though given the instability of TVRage's platform and how long they have been down, people may just contribute to another website like TVDB or TVMaze; adding another source of info may still be worthwhile for post-processing. As far as downloaders go, it looks like at least Sonarr was planning on falling back to just doing title searches for shows that had no rage ID.
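For anyone following along, the fallback described above (use the rage ID when there is one, otherwise search by title, season and episode) can be sketched roughly like this. This is only an illustration, not Sonarr's or pynab's actual code: the function name and the placeholder API key are made up, and the parameter names (t, rid, q, season, ep, apikey) are the ones I understand newznab-style tvsearch requests to use.

```python
from urllib.parse import urlencode


def build_tvsearch_query(title, season, episode, rage_id=None, api_key="KEY"):
    """Build a newznab-style tvsearch query string (hypothetical helper).

    Prefers the TVRage ID when one is known, otherwise falls back to a
    plain title search, mirroring the behaviour described in the thread.
    """
    params = {"t": "tvsearch", "apikey": api_key, "season": season, "ep": episode}
    if rage_id is not None:
        params["rid"] = rage_id  # ID-based search
    else:
        params["q"] = title  # title-based fallback
    return urlencode(params)
```

With a rage ID the query carries rid and no q; without one it carries q only, so an indexer that has lost its TVRage data can still answer title searches.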
@brookesy2: Unfortunately, you can't get to the pre IRC channels without an invite and key. I did find one that I've used in my supplementary prebot script here: ..... found a bug. Will update when done! It only gathers pres for active groups (much more to my liking). If that's not what's wanted, comment out the relevant lines in the on_pubmsg function, or add a flag in config.py and this script to process either active groups only or all of them. For the moment, SickRage is working and being actively developed (in Python as well) so I'll be sticking with it; I get an update notification almost every day.
@ukharley Weird. I just restarted my bot and it rejoined and was fine. Mostly irrelevant, as you are right, we definitely need a backup! Thanks for putting some time into it. Total legend :)
2015-10-01 06:40:49 INFO pre: Inserted/Updated - Penn.Zero.Part-Time.Hero.S01E20.720p.HDTV.x264-W4F
2015-10-01 06:40:52 INFO pre: Inserted/Updated - Sick_Of_It_All-Scratch_The_Surface-WEB-1994-ENTiTLED_iNT
Sorry, had to rewrite a work project almost entirely so I've been busy as heck for a few weeks.
The only reason I went with TVR to begin with was SB's insistence on using tvrids as a data source. Sonarr's always fallen back onto title searches, I think.
I'd be totally amenable to modifying the schema so that any kind of metadata can be associated to a release (so each can have a tvr, tvdb, anidb, imdb, whatever ID attached to it). Will require a fairly chunky rework of postprocessing, but a lot of what I've been doing lately has been massive-load distributed processing so I have some new optimisations that I can work into pynab to speed it up quite a lot. I also want to do some more work on better recognition of movie releases, since that seems to be something that pynab's not great at.
@ukharley let me know when the IRC script's done and I'll merge the two (or just submit a PR, either way). From what I saw yesterday it looked pretty good!
I think the "nuked" bug is sorted in the backup irc bot. New Gist: https://gist.github.com/ukharley/13251d904b7cfcac2e59
Does this effectively replace the old prebot?
I'm using it to replace the existing one as for some reason I was banned from the other, probably restarting the bot too often when no updates were seen. Both bots suffer from a reconnect problem. I think it's down to only having one server in the list. When that one drops it just sits there doing nothing. It's capable of accepting a list [ ] of irc servers it can rotate through but the way it's implemented now it can't. I haven't had time to look any deeper into it.
As @ukharley said, it's part connection, part server rotation. I have a pretty good connection so it stays online for months sometimes. But a disconnect does hurt :( The library does attempt to reconnect, but sometimes it doesn't work.
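A rough sketch of what rotating through a server list on reconnect could look like. It is not taken from either prebot: the connect callable, the host names in the usage note and the attempt limit are all hypothetical, and a real bot would plug in whatever its IRC library uses to open a connection.

```python
import itertools


def connect_with_rotation(servers, connect, max_attempts=6):
    """Try each (host, port) in turn, cycling through the list until one works.

    `connect` is a caller-supplied function (hypothetical here) that raises
    OSError when a server cannot be reached.
    """
    if not servers:
        raise ValueError("no servers configured")
    for attempt, (host, port) in enumerate(itertools.cycle(servers), start=1):
        try:
            return connect(host, port)
        except OSError:
            if attempt >= max_attempts:
                raise  # give up after max_attempts failures in total
```

Called with something like [("irc-a.example.net", 6667), ("irc-b.example.net", 6667)], a dead first server just means the second one gets tried next instead of the bot sitting there doing nothing.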
Obviously the more the merrier in this instance, redundancy! :)
Just noticed that the newznab+ team have committed a change replacing TVRage with TVMaze.
Looks good. As I said, I'll replace it with something more generic that supports multiple providers, I just don't have time at the moment (PhD thesis is due in a little over 3 months).
@Murodese I am going to attempt to write this, but it may take longer than you to finish your thesis :)
Before I start, any preference on what I should use for the JSON returned? Do you have any library you would use? Or should I just load it up using the standard json library from Python?
Haha, we'll see :) have a look at what I used for the other libraries - I think it was either simplejson or just the standard python json library?
@Murodese Looks like just standard json library. Will use that for now :)
@brookesy2, have a look here: https://github.com/srob650/pytvmaze See if you can adapt that one.
@ukharley This could definitely help, thanks!
The longest part will be figuring out wtf all this means :)
The hardest part will be converting the existing TVRage data and creating the new TVMaze db.
@ukharley For now I'm just going to work on stuffing TVMaze IDs into the old rage box :)
tvmaze does have a lookup based off rageid though. So in cases where things can't be found, we can fall back.
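As a rough illustration of that fallback, here is how the lookup URL could be built. The helper name is made up, and the /lookup/shows?tvrage= endpoint is how I understand the public TVMaze API to expose lookups by TVRage ID, so double-check it against their docs before relying on it.

```python
from urllib.parse import urlencode

TVMAZE_BASE = "http://api.tvmaze.com"


def tvmaze_lookup_url(tvrage_id):
    """Return the TVMaze lookup URL for a known TVRage ID (hypothetical helper)."""
    return "{}/lookup/shows?{}".format(TVMAZE_BASE, urlencode({"tvrage": tvrage_id}))
```

The idea would be to try a name search first and only hit this URL for releases that already have a rage ID stored but can't be matched by name.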
@ukharley python3 support for this isn't looking good :( Sad times.
Edit: Looking at the code, I can probably update it for python3. But it's lunchtime now :)
Here's one i was playing with: https://gist.github.com/ukharley/d9ea6146fec8378e644d
@ukharley fortunately the other library was a 2 second fix. Will make a pull request for it later.
Edit: Perhaps I was too cocky! It doesn't work, I will remove it and work on it :)
@ukharley This seems to be working, in terms of just the lib. I need to make a pull request for the guy, cus I probably shouldn't just steal this!
https://github.com/brookesy2/pynab/blob/development-postgres/lib/tvmazelib.py
Still messing with postprocessing.
Well, I accidentally ran this on my prod db. Will see how that goes....
@Murodese @ukharley So I ran tvmaze across a bunch of releases and it's not too bad. We will need to modify the regex that splits out the names though, as the endpoint will NOT return results for things like "Flash 2014".
It is partly due to the endpoint I used, which is the fuzzy match one. We could use the more expansive one, which returns multiple results, and keep the existing regex. We would then need to do a secondary search. Bit more of a pain, but if you think it's preferred I can attempt it!
Maybe try it and see how much of a pain it is to determine which of the multiple is the one we want?
@Murodese I think the bigger pain is splitting off the years for things like "Flash 2014", as that won't return any results on any endpoints for tvmaze. We would need to search for Flash, then match the year against the result set.
Here is the example:
http://api.tvmaze.com/search/shows?q=flash
http://api.tvmaze.com/search/shows?q=flash%202014
Edit: Do we just assume that tv shows usually have a year on the end?
Just had a quick look at the TVMaze forum and they will never add the year or country to the title. That has to be pulled from the JSON data, using either or both of:
"premiered":"2014-10-07"
"network":{"id":5,"name":"The CW","country":{"name":"United States","code":"US"
for the Flash example above.
@ukharley Yeah, I found the same :(
So does this mean we want to use the less restrictive search and match on premiered? We will need to modify the regex to extract show names without dates. How is your regex? :)
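To make the "search loosely, then match on premiered" idea concrete, here is a minimal sketch. It assumes the multi-result search returns a list of entries, each with a "show" object carrying a "premiered" date string; the function name is hypothetical, and the sample data is made up apart from the 2014-10-07 date quoted above.

```python
def pick_show_by_year(results, year):
    """Return the first show whose premiere year matches, or None.

    `results` is assumed to look like the parsed JSON from a multi-result
    show search: a list of {"score": ..., "show": {...}} entries.
    """
    for entry in results:
        show = entry.get("show", {})
        premiered = show.get("premiered") or ""  # can be missing or null
        if premiered.startswith(str(year)):
            return show
    return None


# Illustrative data only; the first premiere date is a placeholder.
SAMPLE_RESULTS = [
    {"score": 1.0, "show": {"name": "The Flash", "premiered": "1990-01-01"}},
    {"score": 1.0, "show": {"name": "The Flash", "premiered": "2014-10-07"}},
    {"score": 0.5, "show": {"name": "Some Other Show", "premiered": None}},
]
```

So the release name would be split into "Flash" plus 2014, the search run on "Flash" alone, and the year used only to choose between the hits.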
Sorry, I'm not working on it. Just read your comment and went for a quick gander at their forum. The only free time I have for programing is at weekends and my 65 yr old brain isn't as fast as it used to be so that slows things down as well :)
@ukharley Ha! don't worry about it at all mate. Such is life :) I will do my best, can't promise any speed though!
Had a few minutes spare in the office and the best I could come up with is:
(?P<name>.*?)(?P<year>(?:19|20)\d{2})
Unfortunately, this will only work from 1900 to 2099 ;)
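A small sketch of how a pattern along these lines could be used in Python to split a trailing year off a show name. It is a variant, not pynab's actual regex: it uses Python named groups, groups the century alternation so that the full four digits are required, anchors the year at the end of the string, and tolerates separators and brackets around the year. The function name is made up.

```python
import re

# Name, optional separators, optional brackets, then a year from 1900 to 2099.
NAME_YEAR = re.compile(r"^(?P<name>.*?)[ ._]*\(?(?P<year>(?:19|20)\d{2})\)?$")


def split_name_year(title):
    """Split 'Flash 2014' into ('Flash', 2014); return (title, None) if no year."""
    title = title.strip()
    match = NAME_YEAR.match(title)
    if not match or not match.group("name"):
        # No trailing year, or the whole title is just a number like "1984".
        return title, None
    return match.group("name"), int(match.group("year"))
```

The name part can then go to the search endpoint on its own, with the year kept aside for matching against the premiered date.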
@ukharley Nice work mate :) I am on holiday next week, if I can find some time I will try bust this out!
@Murodese If you get time, think you can chuck in an alembic migration for what you are after? tvmaze returns tvrage and thetvdb IDs, so I can build that into my code. I am not sure which areas are touched by tvrage IDs though. Does the API get queried by them?
I'll do this over the next couple days. TVRage (and by extension IMDB) IDs are presented as part of the API, which is the only place they get used iirc. I think Sickbeard is the only application to directly query the TVR ID, as well. Sonarr and I think CouchPotato both query by name.
Wrote the generic ID table and migration tonight, needs more testing. Also needs tying in with post-processing, obviously. If you want to test the migration (USE A TEST DB, DO NOT RUN ON LIVE DB), it's in the genericid branch.
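For readers wondering what a generic ID table might look like, here is a toy sketch using an in-memory SQLite database. It is not the schema from the genericid branch (pynab itself runs on a real database with migrations); the table and column names, the provider labels and the ID values are all invented for illustration.

```python
import sqlite3

# Hypothetical schema: one row per (release, provider) pair.
SCHEMA = """
CREATE TABLE releases (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE release_ids (
    release_id INTEGER NOT NULL REFERENCES releases(id),
    provider TEXT NOT NULL,
    external_id TEXT NOT NULL,
    PRIMARY KEY (release_id, provider)
);
"""


def attach_id(conn, release_id, provider, external_id):
    """Attach (or overwrite) one provider's ID for a release."""
    conn.execute(
        "INSERT OR REPLACE INTO release_ids VALUES (?, ?, ?)",
        (release_id, provider, str(external_id)),
    )


def ids_for(conn, release_id):
    """Return {provider: external_id} for a release."""
    rows = conn.execute(
        "SELECT provider, external_id FROM release_ids WHERE release_id = ?",
        (release_id,),
    )
    return dict(rows.fetchall())


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO releases (id, name) VALUES (1, 'Example.Show.S01E01')")
attach_id(conn, 1, "tvmaze", 111)  # made-up IDs
attach_id(conn, 1, "tvdb", 222)
```

The point of the shape is that adding a new provider means adding rows, not columns, so the API can look up a release by whichever ID type a downloader happens to send.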
Awesome, thanks mate! Will run it on my test DB when I get back from holidays. Will update tvmaze.py to get all the ID's. Then will have to look at everywhere else that references it :)
I'll do most of that, I have some spare time for a week or so :) I'll let you handle the TVMaze integration, though.
Total legend :) hopefully can wrap it up not too long after I get back from holiday!
With TVRage looking like it's down for good, would it be possible to integrate a different TV release identification API? OMDb now supports TV shows, or series. And TVMaze also has an API for show searching. The latter is looking into adding TVRage IDs to any shows that it can find data for. This would likely also require some modification on the API side of pynab in order to support searching by different ID types. Also, I know this won't affect most of the automated downloaders, since they will likely still depend on TVRage IDs, but if there is an indexer that supports the new ID set then the downloaders could follow suit. Truth be told, a new API interface for downloader-to-indexer communication, other than newznab, needs to be defined. For the moment I've had to shut down my post processor since it can't identify anything TV related while TVRage is down.