garybuhrmaster / tv_grab_zz_sdjson_sqlite

XMLTV grabber for Schedules Direct JSON service
GNU General Public License v2.0

Incorrect category_type for generic episodes #9

Closed bennettpeter closed 7 years ago

bennettpeter commented 7 years ago

I don't know whether this issue is in this grabber or the json source. Feel free to reject it back at me if it is in the source data feed so that I can open a ticket with SD.

I have noticed that the category_type for generic episodes is always being set as "tvshow", while identifiable episodes of the same show are set as "series". The result is that everything in the program table has generic = 0 in spite of numerous generic episodes existing. The scheduler flag "identifiable episodes only" has no effect.

Refer to mythfilldatabase/main.cpp:452, where it marks generic episodes.
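
To make the symptom concrete, here is a minimal Python sketch of the kind of rule that would tie the generic flag to category_type. It is an assumption for illustration, not a copy of the code in mythfilldatabase/main.cpp: it supposes a program is flagged generic only when its category_type is "series" and its programid ends in "0000". The function name and the program IDs are hypothetical.

```python
def is_generic(programid: str, category_type: str) -> bool:
    """Return True when a listing would be treated as a generic episode.

    Assumed rule (illustration only): the listing must be categorised as
    "series" and carry a programid ending in "0000".
    """
    return category_type == "series" and programid.endswith("0000")


# Hypothetical listings for one show: an identifiable episode, a generic
# placeholder categorised as "series", and the same placeholder as "tvshow".
listings = [
    ("EP012345670042", "series"),
    ("SH012345670000", "series"),
    ("SH012345670000", "tvshow"),
]
flags = [is_generic(pid, cat) for pid, cat in listings]
```

Under that assumed rule, the third listing is never flagged, which matches what I see: a generic episode delivered as "tvshow" ends up with generic = 0.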

garybuhrmaster commented 7 years ago

I'll take a look, hopefully in the next few days; I have some other things I have to get done first. If I remember correctly (and I probably do not, as there is far too much code there to remember), the entityType defined for what you know as generics is the same as for any other single, standalone TV show: it is marked as a Show and not as an Episode, which is what turns into the MythTV series. There may be a way to derive the information from other source fields. I'll have to look at the Schedules Direct API docs, which are not as good as they should be; I sometimes have to go to the source Gracenote API docs. Of course, fixing one thing may result in breaking another.

By the way, although I do not think it is related to this issue, it may end up affecting some scheduling: Schedules Direct has been having issues with the upstream Gracenote data for the past few days, so the grabber may be getting "interesting" data. In particular, I am aware that some channel, schedule, and program data is simply missing or incomplete. Gracenote is "working" the problem, and the claim is that Gracenote management is engaged, which often slows things down but at least makes the problem visible.

garybuhrmaster commented 7 years ago

I believe I have (somewhat) improved the category calculation in 15877e9 using the show subtype. Please test when you have the opportunity. There are still cases where the information available is insufficient to fully categorize the show.
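
As a rough illustration of the idea (not the actual logic in 15877e9), the sketch below maps an entityType plus a show subtype to a category_type. The function name, the "showType" field name, and the specific subtype values are assumptions made for the example; only the category names come from this thread.

```python
from typing import Optional

# Assumed subtype values that indicate an ongoing series (hypothetical set).
SERIES_SHOW_TYPES = {"Series", "Miniseries"}


def category_type(entity_type: str, show_type: Optional[str] = None) -> str:
    """Pick a category_type from entityType, falling back to the subtype."""
    if entity_type == "Movie":
        return "movie"
    if entity_type == "Sports":
        return "sports"
    if entity_type == "Episode":
        return "series"
    # entityType "Show" is ambiguous on its own, so consult the subtype.
    if show_type in SERIES_SHOW_TYPES:
        return "series"
    # Specials, paid programming, "to be announced", or no subtype at all.
    return "tvshow"
```

With a mapping like this, a generic listing delivered as a Show with a series subtype becomes "series", while anything without a usable subtype stays "tvshow".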

bennettpeter commented 7 years ago

I ran a test and it looks reasonable, but I am only getting 7 days' worth of data from SD. The generic episodes that were concerning me are further out, so I am waiting for SD to sort out their issues before testing any further.

garybuhrmaster commented 7 years ago

Thanks for the update. It is perfectly reasonable to wait for SD to get their data corrected before doing more complete testing. In some cases (with some lineups?) the sqlite cache should have more than 7 days of data, but I am not at all sure that any of the cached data should be considered accurate at this point; it looked possibly OK, but I cannot tell. So I agree that waiting is the right call.

Please try to remember to post an update when you have more complete test results.

garybuhrmaster commented 7 years ago

Just as an FYI, with the patch around 70% of the programs previously marked "tvshow" now get a more appropriate category (often "series", but sometimes "sports" or "movie"), which drops the "tvshow" count from ~134K to ~36K programs for my particular lineup. A random check of a few of the remaining "tvshow" entries indicated most were "to be announced", shopping (paid programming), and programs marked as specials. MythTV currently shows around 100K generics. Whether those are reasonable numbers, or will lead to better scheduling, I will await your results.

bennettpeter commented 7 years ago

Schedules Direct has caught up and I have moved the latest tv_grab_zz_sdjson_sqlite to my primary master backend. Everything looks good now. The generic episodes that were incorrectly scheduled are now gone from upcoming. Looking at the database, the generic flag looks appropriate. From my point of view everything is correct now. Thank you.

garybuhrmaster commented 7 years ago

Thanks for testing. I am closing this issue as resolved.