garybuhrmaster / tv_grab_zz_sdjson_sqlite

XMLTV grabber for Schedules Direct JSON service
GNU General Public License v2.0

Problems when a channel's xmltvid changes #15

Closed bennettpeter closed 4 years ago

bennettpeter commented 4 years ago

Around October 31st, channels 10 and 710 changed xmltvids in my area. MythTV did not know this and kept loading from the old xmltvids for those two, which showed a bunch of Spanish titles. I don't know where they came from, since there should not have been another channel with those ids. While trying to fix it I came across some problems:

  1. In the sqlite database, channels 10 and 710 were no longer selected (the "selected" column was set to 0).
  2. After setting the selected column to 1 (update channels set selected = 1 where channum = [number]), and fixing the xmltvids in MythTV, I ran mythfilldatabase. No new data came for those channels. However, the next run the following morning got all the listings for those. This is not really a huge problem, just confusing as to why it happens this way, and what is the best approach for these problems.

garybuhrmaster commented 4 years ago

Thanks for the report.

  1. In the sqlite database, channels 10 and 710 were no longer selected (the "selected" column was set to 0).

Have you, for one reason or another, used the manage-lineup options to set future (new) channels to be unselected? If the channel changed, the code considers that a new channel, and would set the selection to whatever you have specified.

  2. After setting the selected column to 1 (update channels set selected = 1 where channum = [number]), and fixing the xmltvids in MythTV, I ran mythfilldatabase. No new data came for those channels. However, the next run the following morning got all the listings for those. This is not really a huge problem, just confusing as to why it happens this way, and what is the best approach for these problems.

This is likely due to optimization (and a bug?). The code will only download schedules for stations selected (to reduce the royal overhead), and it will only download schedules if they have changed (to reduce the royal overhead) so there was likely no data in the database for those channels to feed outwards until the next day's schedule changes for your lineup. The code in the app does attempt to force a download when you use it to change channel selections, but if it did not work, there must be a bug there, and I will have to look at the code again to see why it did not do what I intended it to do.
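A minimal sketch of the two filters described above (selected stations only, and only when the schedule has changed). This is not the grabber's code: the function name, the dictionaries, and the use of a per-station hash are assumptions made for illustration.

```python
def stations_to_download(selected, cached_hash, remote_hash):
    """Return the station IDs whose schedules should be fetched.

    selected    -- set of station IDs the user has selected
    cached_hash -- dict: station ID -> hash of the schedule stored locally
    remote_hash -- dict: station ID -> hash currently advertised upstream
    (all three structures are hypothetical)
    """
    needed = []
    for station in sorted(selected):
        if station not in remote_hash:
            continue  # upstream offers nothing for this station
        if cached_hash.get(station) != remote_hash[station]:
            needed.append(station)  # missing locally, or changed upstream
    return needed


# Example: one selected station is already up to date, the other is not.
pending = stations_to_download(
    {"112743", "91446"},
    {"112743": "aaa"},
    {"112743": "aaa", "91446": "bbb"},
)
```

The sketch only illustrates the two filters; it does not attempt to reproduce the one-day delay reported in this issue.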

bennettpeter commented 4 years ago

Thanks for the report.

  1. In the sqlite database, channels 10 and 710 were no longer selected (the "selected" column was set to 0).

Have you, for one reason or another, used the manage-lineup options to set future (new) channels to be unselected? If the channel changed, the code considers that a new channel, and would set the selection to whatever you have specified.

I did set future channels to be unselected. However this channel is an existing one, which changed its xmltvid. I assume your code treats a new xmltvid as a new channel, which it was not, in this case.

  2. After setting the selected column to 1 (update channels set selected = 1 where channum = [number]), and fixing the xmltvids in MythTV, I ran mythfilldatabase. No new data came for those channels. However, the next run the following morning got all the listings for those. This is not really a huge problem, just confusing as to why it happens this way, and what is the best approach for these problems.

This is likely due to optimization (and a bug?). The code will only download schedules for stations selected (to reduce the royal overhead), and it will only download schedules if they have changed (to reduce the royal overhead) so there was likely no data in the database for those channels to feed outwards until the next day's schedule changes for your lineup. The code in the app does attempt to force a download when you use it to change channel selections, but if it did not work, there must be a bug there, and I will have to look at the code again to see why it did not do what I intended it to do.

Maybe it is my fault because I used sqlite3 to set the selected flag, rather than your app. I do it this way because of the tedium of going through all of the channels one at a time, when I only need to change two of them. Next time I will try using your app to set the channel selection.
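For reference, a sketch of the direct update I mean, done from Python rather than the sqlite3 shell. Only the table name `channels` and the columns `channum` and `selected` come from the statement quoted above; the column types, the in-memory database, and the helper name are assumptions for the demo. As discussed, changing the flag this way bypasses whatever the app does when a selection changes.

```python
import sqlite3


def select_channels(conn, channums):
    """Set selected = 1 for the given channel numbers; return rows changed."""
    with conn:  # commits on success, rolls back on error
        cur = conn.executemany(
            "UPDATE channels SET selected = 1 WHERE channum = ?",
            [(c,) for c in channums],
        )
    return cur.rowcount


# Demo against a throwaway in-memory database with an assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE channels (channum TEXT, selected INTEGER)")
conn.executemany(
    "INSERT INTO channels VALUES (?, ?)",
    [("10", 0), ("710", 0), ("2", 1)],
)
changed = select_channels(conn, ["10", "710"])
rows = conn.execute(
    "SELECT channum, selected FROM channels ORDER BY channum"
).fetchall()
```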

garybuhrmaster commented 4 years ago

I did set future channels to be unselected. However this channel is an existing one, which changed its xmltvid. I assume your code treats a new xmltvid as a new channel, which it was not, in this case.

When changes happen, one must treat it as a new channel (there is no perfect answer here, as I have seen lineups use existing channels for entirely new content: new name, new xmltvid, new callsign, but same channel number). For OTA broadcasters the FCC has exercised a bit more control regarding reuse, but cable systems sometimes do the strangest things.
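A minimal sketch of that "any change means new" rule. The `Channel` tuple and the `new_channels` helper are hypothetical names for illustration, not the grabber's actual code or schema, and the sample values are placeholders.

```python
from typing import NamedTuple


class Channel(NamedTuple):
    channum: str
    callsign: str
    name: str
    xmltvid: str


def new_channels(previous, current):
    """Return channels in `current` that have no exact match in `previous`."""
    known = set(previous)
    return [ch for ch in current if ch not in known]


# Same channel number and callsign, but the name and xmltvid changed,
# so the entry no longer matches and is reported as new.
previous = [Channel("10", "AAAA", "Old Name", "I1.example")]
current = [Channel("10", "AAAA", "New Name", "I2.example")]
added = new_channels(previous, current)
```

Because the comparison is on the whole tuple, a renamed channel and a genuinely new channel on a reused number look the same, which is why the selection default for new channels applies to both.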

That is why I nominally recommend against using new channel selection. Let the app deal with new/delete/changes because it may have the needed intelligence to prompt the user to make any decisions regarding what is what (again, some cable systems do the darnedest things).

Maybe it is my fault because I used sqlite3 to set the selected flag, rather than your app.

Yes, if you bypass the application logic, you own the breakage (although, as you noted, the code does try to eventually fix things up and get back in sync; of course, there is no guarantee that it will not take a bit of time). You can pass --force-download on a manual invocation of the grabber to force it to download everything all over again by invalidating and deleting most of the cached content (it is sort of a sledgehammer approach, but sometimes you need a sledgehammer).

btw, for MythTV (and cable lineups), I nominally recommend using MythUtil-Channel-XMLTV-getLineup (instructions in the MythTV wiki) and consider running mythfilldatabase with --only-update-guide to have more control (and visibility) regarding the changes to your lineup. I personally run it semi-regularly (to catch the strays) and whenever I see a notice in my cable bill that something is changing (and sometimes I have to open a ticket with Schedules Direct because they do not yet have the updates). I will also note that for MythTV one of the developers has publicly discussed on (one of) the mailing lists their intent to improve some of the mythfilldatabase processes (as I recall, in his case, life happened, so I am not sure of the current implementation schedule). TBH, I don't know if those changes will help your use case.

garybuhrmaster commented 4 years ago

Around October 31st, channels 10 and 710 changed xmltvids in my area.

What zipcode (and if you can check, what lineup name) are you using? That date range suggests potential fallout from one of the phases of the FCC OTA repack. While there were efforts to minimize the impact on cable customers' channel numbers (cable channel 10 is still 10 after the repack), the guide providers have, for almost every phase, had to deal with at least a few changes at the OTA level that end up changing some internal information that falls down to the cable lineups, such as name changes that end up making the channels "different" (i.e. changed). That flows down to this grabber, and eventually to MythTV (as it uses the data; MythTV has great flexibility in its channel setups to handle all the worldwide distribution schemes, but that flexibility can cause artifacts for certain, what may seem to be simpler, use cases). For example, changing the FCC callsign associated with the broadcaster from WABC-SD to WABC-LT or WABC-DT, or WABC-DT1, or ..., which does not change the actual scheduled shows, does represent a change (and therefore "new") and (usually) results in new xmltvids along the way. It has been my observation that the OTA repack has resulted in some (long overdue?) cleanup in the guide providers' database(s), but that cleanup work can result in issues downstream ("no good deed goes unpunished"?).

If you have a sufficiently old mythtv database backup from some time ago it would be interesting to see the rows for channels 10 and 710 (although that might be more work than it is worth) to see if it is possible to derive the changes that may have been introduced.

And FWIW, one of the features of the aforementioned MythUtil-Channel-XMLTV-getLineup program is that it shows those (callsign, name, xmltvid) changes more explicitly than the current mythfilldatabase code does.

bennettpeter commented 4 years ago

What zipcode (and if you can check, what lineup name) are you using?

My zipcode is 02532 and the lineup is "USA Comcast Abington - Digital USA-MA20453-X SD-JSON"

If you have a sufficiently old mythtv database backup from some time ago it would be interesting to see the rows for channels 10 and 710 (although that might be more work than it is worth) to see if it is possible to derive the changes that may have been introduced.

chanid, channum, freqid, sourceid, callsign, name, icon, finetune, videofilters, xmltvid, recpriority, contrast, brightness, colour, hue, tvformat, visible, outputfilters, useonairguide, mplexid, serviceid, tmoffset, atsc_major_chan, atsc_minor_chan, last_record, default_authority, commmethod, iptvid
'3010', '10', '10', '3', 'WBTSSD', 'WBTS-LD (SD Feed)', 's10991_h3_aa.png', '0', '', 'I101663.json.schedulesdirect.org', '0', '32768', '32768', '32768', '32768', 'Default', '1', '', '0', '0', '0', '0', '10', '0', '2018-03-22 01:59:57', '', '-1', NULL
'3710', '710', '710', '3', 'WBTSLD', 'WBTSLD (WBTS-LD)', 's101517_h3_aa.png', '0', '', 'I101517.json.schedulesdirect.org', '20', '32768', '32768', '32768', '32768', 'Default', '1', '', '0', NULL, '0', '0', '710', '0', '2019-08-15 02:00:01', '', '-1', NULL

This is the data in the latest xml file for those channels:

  <channel id="I112743.json.schedulesdirect.org">
    <display-name lang="en">WBTS-CD (SD Feed)</display-name>
    <display-name lang="en">WBTSSD</display-name>
    <display-name lang="en">10</display-name>
    <icon src="https://s3.amazonaws.com/schedulesdirect/assets/stationLogos/s10991_h3_aa.png" width="360" height="270" />
  </channel>
  <channel id="I91446.json.schedulesdirect.org">
    <display-name lang="en">WBTSCD (WBTS-CD)</display-name>
    <display-name lang="en">WBTSCD</display-name>
    <display-name lang="en">710</display-name>
    <icon src="https://s3.amazonaws.com/schedulesdirect/assets/stationLogos/s10991_h3_aa.png" width="360" height="270" />
  </channel>

So they have changed from WBTSSD, WBTS-LD (SD Feed) to WBTSSD, WBTS-CD (SD Feed) and from WBTSLD, WBTSLD (WBTS-LD) to WBTSCD, WBTSCD (WBTS-CD).
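The same comparison can be done mechanically. This is a rough sketch, not a tool from MythTV or the grabber: it assumes the display-name order seen in the excerpt (name, callsign, channel number), wraps the two channel elements in a `<tv>` root so they parse, omits the icon elements, and takes the old values from the database rows shown earlier. The helper names are made up.

```python
import xml.etree.ElementTree as ET

XMLTV = """<tv>
  <channel id="I112743.json.schedulesdirect.org">
    <display-name lang="en">WBTS-CD (SD Feed)</display-name>
    <display-name lang="en">WBTSSD</display-name>
    <display-name lang="en">10</display-name>
  </channel>
  <channel id="I91446.json.schedulesdirect.org">
    <display-name lang="en">WBTSCD (WBTS-CD)</display-name>
    <display-name lang="en">WBTSCD</display-name>
    <display-name lang="en">710</display-name>
  </channel>
</tv>"""

# Old MythTV rows: channum -> (callsign, name, xmltvid)
OLD = {
    "10": ("WBTSSD", "WBTS-LD (SD Feed)", "I101663.json.schedulesdirect.org"),
    "710": ("WBTSLD", "WBTSLD (WBTS-LD)", "I101517.json.schedulesdirect.org"),
}


def parse_channels(xml_text):
    """Map channum -> (callsign, name, xmltvid) from XMLTV channel elements."""
    result = {}
    for ch in ET.fromstring(xml_text).iter("channel"):
        # Assumed order of display-name entries: name, callsign, channum.
        name, callsign, channum = [d.text for d in ch.findall("display-name")]
        result[channum] = (callsign, name, ch.get("id"))
    return result


def diff_channels(old, new):
    """For channel numbers present in both, list fields that differ."""
    fields = ("callsign", "name", "xmltvid")
    changes = {}
    for channum in sorted(old.keys() & new.keys()):
        changed = {
            f: (o, n)
            for f, o, n in zip(fields, old[channum], new[channum])
            if o != n
        }
        if changed:
            changes[channum] = changed
    return changes


changes = diff_channels(OLD, parse_channels(XMLTV))
```

For channel 10 only the name and xmltvid differ; for 710 the callsign differs as well.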

Strangely, rather than showing "NO DATA" for those channels, since October 31st I have been seeing a bunch of Spanish language programs listed. Looking at the latest xml file there is nothing listed for I101517.json.schedulesdirect.org or I101663.json.schedulesdirect.org .

garybuhrmaster commented 4 years ago

Strangely, rather than showing "NO DATA" for those channels, since October 31st I have been seeing a bunch of Spanish language programs listed. Looking at the latest xml file there is nothing listed for I101517.json.schedulesdirect.org or I101663.json.schedulesdirect.org .

This is just a guess, but on/about August 8th, 2019, WBTS and WYCN swapped callsigns, and then sometime in October WYCN moved its transmitter location and switched its affiliation to Telemundo. I would not be at all surprised if this resulted in some interesting guide provider opportunities(1), and as some channels may have schedule data quite a number of days into the future, there may be some legacy data in the program tables until it ages out. I suppose I could ask for a copy of your current and legacy program tables and try to figure it out, but TBH, I am really not all that interested in trying to reverse engineer what happened upstream at Gracenote (and/or the broadcaster itself), so unless you think it is important, I am not going to even consider going there.

At this point I believe we understand the two issues you initially reported: the channel is considered new when the upstream changes the name(2), you specified that new channels are not selected, and you manually updated the database, bypassing the application logic (I have verified, in a couple of test scenarios, that the code does the right thing when you use the app itself). So, while the results were not exactly perfect for your use case, other than perhaps trying to improve the documentation, I do not really see something to change in the grabber itself(3). Do you concur?

(1) And this is (in essence) all fallout from the entire mess caused when the owner of WHDH did not see the writing on the wall back in 2016 and sell to Comcast when the selling was good, and then the intermediary attempts by NBC to still cover the (greater) Boston area with channel sharing agreements and swapping of facilities. It is not as if the owner of WHDH could not have seen what happened with KRON, which did not sell to Comcast in an almost equivalent situation.

(2) I understand that in this particular case with sufficient fuzzy compare it might be possible to decide the two names are close enough, but while that might work for FCC names, there are more than FCC names in the entire Gracenote database, even just for the US (and Gracenote also has worldwide guide data, which have their own challenges of changes and naming).

(3) Although I do look forward to changes in MythTV (and mythfilldatabase) itself. I would love to be able to retire my external adjustment scripts and have it all work auto-magically.

bennettpeter commented 4 years ago

Thank you Gary for your detailed analysis.

I disable the option for automatically adding new channels because I record some old series that are in rerun. For example "Midsomer Murders" is in rerun on PBS as well as on Ovation. I recently downgraded my Comcast plan to one that excludes Ovation. If Ovation got automatically added, MythTV would try to record Midsomer Murders from there and fail.

For cable lineups, it would suit me better if the criterion for detecting a new channel was channel number rather than callsign. However, for other situations, such as over the air lineups, that may not work. Of course, those setups probably require a scan and manual work in MythTV when there is a change.
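To illustrate what I mean by keying on channel number, here is a hypothetical sketch (the function and variable names are mine, not anything in the grabber): the selection is carried over whenever the channel number already existed, and only truly unseen numbers get the default for new channels.

```python
def carry_over_selection(old_selected, new_channums, default=False):
    """Decide the selected flag for each channel number after a lineup change.

    old_selected -- dict: channum -> bool, the state before the change
    new_channums -- channel numbers present after the change
    default      -- selection applied to numbers never seen before
    """
    return {num: old_selected.get(num, default) for num in new_channums}


# 10 and 710 existed before, so they stay selected; 999 is new.
old_selected = {"10": True, "710": True}
after = carry_over_selection(old_selected, ["10", "710", "999"])
```

The same rule would also keep a channel selected when a cable system reuses the number for entirely different content, which is the case Gary describes above.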

I don't think that a fuzzy compare on callsigns will be a good idea. There seems to be little logic behind the naming. Also for some reason they have different xmltvids for the SD and HD versions of a channel and the call signs are similar.
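A quick sketch of why, using Python's standard `difflib` as a stand-in for whatever fuzzy compare might be chosen (the `similarity` helper and the choice of `SequenceMatcher` are my assumptions; nothing like this exists in the grabber as far as this thread shows):

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity ratio between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a, b).ratio()


# The same channel before and after the rename: we would want a match here.
renamed = similarity("WBTS-LD (SD Feed)", "WBTS-CD (SD Feed)")

# Two different channels in my lineup (10 and 710) with similar callsigns:
# we would NOT want a match here.
distinct = similarity("WBTSSD", "WBTSCD")
```

Both pairs score high, so any threshold loose enough to accept the rename is close to also merging two channels that are really separate.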

I don't think any further action is needed on this.