XMLTV / xmltv

Utilities to obtain, generate, and post-process TV listings data in XMLTV format
GNU General Public License v2.0

tv_grab_zz_sdjson issues after Schedules Direct's changes #181

Closed kyl416 closed 10 months ago

kyl416 commented 2 years ago

Even after manually applying the patch in #180 to handle the 303 redirects, it seems something changed in how Schedules Direct's JSON data is formatted, because all of my channels now have the wrong programs associated with them.

For example, the listings for my Fox affiliate WNYW are showing random programs from my CBS affiliate, History, A&E, Univision, TLC, MeTV, etc. The weird thing is that the values you get from the "schedules" fetch, like start, stop, and "dd_progid", are correct, while the rest of the details you get from the "programs" fetch, like sub-title, title, desc, credits, length, etc., belong to a completely different program. So it looks like there's some issue matching the correct program in the write_programme code.

e.g. at 10pm-11pm ET Fox has "The 10 O'Clock News". Only the timeslot and dd_progid SH037385140000 are correct in my output; the title, desc, credits, categories, length and rating are those of a random episode of "Animal Outtakes".

kyl416 commented 2 years ago

Also, it might be a good idea to fix the code so it doesn't wipe the existing xmltv file until it's ready to overwrite it with the final output. With all the current server issues causing the grabber to give up, people will be left with a blank xmltv file until they apply the patch to handle 303 redirects and the servers stabilize.
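The "don't wipe the file until the output is ready" idea is usually done by writing to a temporary file and renaming it over the old one only on success. The following is a minimal Python sketch of that pattern, not the grabber's actual code (the grabber is written in Perl); the function name `write_atomically` and the temporary-file prefix are made up for illustration.

```python
import os
import tempfile


def write_atomically(path, data):
    """Replace `path` with `data` only after the new content is fully written.

    Hypothetical helper for illustration; not part of tv_grab_zz_sdjson.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the final rename stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".xmltv-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # The old file is only replaced here, after the write succeeded.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure, discard the partial output and keep the old file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the fetch or the write fails partway through, the previous listings file is left as it was, so downstream tools keep whatever data they last had.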

garybuhrmaster commented 2 years ago

I'll note that even with the patch in #180 the grabber tends to produce a number of uninitialized variable messages. Some of those suggest insufficient data validation, which could result in various issues.
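As a rough illustration of what validating the data before use could look like, here is a small Python sketch (the grabber itself is Perl, so this is not its code). The field names in `REQUIRED_FIELDS` are assumptions about what a schedule entry contains, and `validate_entry` is a hypothetical helper.

```python
# Assumed field names for a schedule entry; adjust to the real API response.
REQUIRED_FIELDS = ("programID", "airDateTime", "duration")


def validate_entry(entry):
    """Return a list of problems found in one entry; empty means usable."""
    if not isinstance(entry, dict):
        return ["entry is not a JSON object"]
    problems = []
    for field in REQUIRED_FIELDS:
        # Treat absent, null, and empty-string values as missing.
        if entry.get(field) in (None, ""):
            problems.append(f"missing or empty field: {field}")
    return problems
```

Skipping or logging entries that fail such a check, instead of using undefined values, would make incomplete server responses show up as explicit warnings.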

In another forum someone also mentioned that the schedule was wrong with the zz_sdjson grabber, but seemed to be correct (with very minimal testing) with the zz_sdjson_sqlite grabber. You may want to try that.

kyl416 commented 2 years ago

The problem is that zz_sdjson_sqlite is not a one-for-one replacement that can work for everyone.

For example, any time you need to enable or disable a channel, you have to go through a time-consuming manage lineups/channels process to choose channels one by one, and start over if you make a mistake, while on zz_sdjson all you need to do is add or remove the stationId from the .conf file in a text editor (e.g. if a premium channel like HBO has a free preview, you might want to temporarily enable those channels). It also doesn't have an option to use backwards-compatible channel ids in the I*.labs.zap2it.com format, so long-time users from the DataDirect era or people who migrated from tv_grab_na_dd would need to reconfigure their entire setup to switch over.

So hopefully someone can look at the code to figure out what's wrong, or Schedules Direct can document what changed in the JSON output so others have a starting point to try to submit a patch to fix it.

garybuhrmaster commented 2 years ago

> So hopefully someone can look at the code to figure out what's wrong, or Schedules Direct can document what changed in the JSON output so others have a starting point to try to submit a patch to fix it.

Schedules Direct stated what they did, and that it is API compliant. Someone (thanks for volunteering!) will need to review the API, the code, and what the code is doing, and determine where zz_sdjson is not following the API. We will await your patch.

kyl416 commented 2 years ago

Thanks, but unfortunately that's beyond my current skillset to do from scratch.

I was hoping someone could say what specifically changed in the JSON output, including what new fields were added, that might have ended up breaking a grabber that, while potentially non-compliant, has worked since 2014 and has been part of the xmltv project since 2016. That way people who are not familiar with their API, but are familiar with xmltv, Perl and/or JSON data, would have a starting point. Alternatively, someone who is familiar with their API could take a look at the code to see what the original developer did wrong.

It could be a string vs integer issue or a case sensitivity issue that they are now enforcing. It could be something more complex, where the grabber needs to keep track of additional fields when matching schedules to programs because the response no longer sends data back in the sequence/order that the code incorrectly expected. One of the new fields in the JSON data might match something that the grabber's code uses as a variable. Or it could have been caused by the server load issues from yesterday, such as incomplete data that the grabber's code couldn't handle correctly.

The original author @kgroeneveld was active on GitHub as recently as February, so hopefully someone can get in touch with him (I sent an e-mail to him last night to see if he can take a look), but he hasn't contributed code to the grabber since 2017, and his code has mostly been left untouched except for the rename from sd_json to zz_sdjson, bulk commits related to xmltv versioning in 2019, and a commit you made related to previously-shown/new back in 2020.
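To make the ordering hypothesis concrete: if program details are paired with schedule entries by position, any change in response order attaches the wrong details to a timeslot, whereas pairing by program ID does not depend on order. The Python sketch below shows the ID-keyed approach. It is not the grabber's code, and the `programID` and `title` keys, the `join_by_program_id` name, and the second program ID are assumptions for illustration.

```python
def join_by_program_id(schedule_entries, program_details):
    """Pair each schedule entry with its details by ID, ignoring order.

    Returns (joined, missing): joined is a list of (entry, detail) tuples,
    missing is a list of program IDs that had no details in the response.
    """
    details_by_id = {p["programID"]: p for p in program_details}
    joined, missing = [], []
    for entry in schedule_entries:
        detail = details_by_id.get(entry["programID"])
        if detail is None:
            # Report the gap instead of borrowing another program's details.
            missing.append(entry["programID"])
            continue
        joined.append((entry, detail))
    return joined, missing


if __name__ == "__main__":
    # "EP000000000001" is a made-up ID standing in for the unrelated episode.
    schedule = [{"programID": "SH037385140000"}, {"programID": "EP000000000001"}]
    # Details deliberately arrive in a different order than requested.
    details = [
        {"programID": "EP000000000001", "title": "Animal Outtakes"},
        {"programID": "SH037385140000", "title": "The 10 O'Clock News"},
    ]
    for entry, detail in join_by_program_id(schedule, details)[0]:
        print(entry["programID"], detail["title"])
```

Tracking the missing IDs separately also covers the incomplete-data case: a program absent from the response is reported rather than silently filled with another program's details.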

kgroeneveld commented 2 years ago

I was unaware there was any change to the SD service happening until these posts on GitHub yesterday. I have done almost no work on this grabber since I wrote it, as it just seems to keep working in my MythTV setup. I still seem to have 14 days of valid data in my MythTV system, but the grabber did fail to run last night. Trying to test it now, it seems the whole SD service is down at the moment.

I do have some motivation to look into this further as, like the users mentioned above, I use it in my own MythTV setup. But it is hard to say how much time I will have to work on it in the coming days. And currently SD seems to be offline, so I cannot even work on it now.

Thank you to @garybuhrmaster who actually seems to spend more time on my grabber now than I do.

garybuhrmaster commented 2 years ago

> Thank you to @garybuhrmaster who actually seems to spend more time on my grabber now than I do.

To be fair, I have only touched trivial edge cases, as I am primarily interested in (and develop) the alternative SD-JSON grabber, but once in a while I see something trivial to fix, so I offer a PR.

In this case, there is something more basic going on. But as the service is currently offline, testing/debugging and resolution is going to have to wait.

honir commented 2 years ago

This, from the SD forum, may be relevant:

> 2022-05-22T16:07Z - AWS Database Migration Service mis-identified primary and unique keys and did not transfer all data from source database to destination. Analyzing which tables are affected.

Perhaps kyl416's problem is a result of an SD database issue?

rmeden commented 10 months ago

I think this was a temporary SD problem. tv_grab_zz_sdjson is working for me.