garybuhrmaster / tv_grab_zz_sdjson_sqlite

XMLTV grabber for Schedules Direct JSON service
GNU General Public License v2.0
6 stars 8 forks source link

"new" element not generated #10

Closed rpcameron closed 6 years ago

rpcameron commented 7 years ago

I've noticed that the grabber doesn't seem to be generating elements for schedules that have {"new": true} in their responses.

I've written a patch that changes how the element is generated, suppressing its creation for a schedule that has {"new": true} and generating elements instead.

(It also looked like the if/elsif/elsif structure of the code that generates the would never reach the check for whether the schedule was new or not. However, I'm not certain about this as I'm unfamiliar with perl.) tv_grab_zz_sdjson_sqlite-patch.txt [Edit: I've also embedded the patch below]

--- a/tv_grab_zz_sdjson_sqlite  2017-08-17 23:39:21.319147939 -0700
+++ b/tv_grab_zz_sdjson_sqlite  2017-08-19 22:18:27.781095306 -0700
@@ -1775,7 +1775,13 @@

         # XMLTV uses their standardized dates, while Schedules
         # Direct uses YYYY-MM-DD
-        if (defined($programDetails->{'originalAirDate'}))
+        # If "new", then do not generate previously-shown
+        if (defined($scheduleDetails->{'new'}))
+          {
+            my $new = $scheduleDetails->{'new'};
+            $w->emptyTag('new');
+          }
+        elsif (defined($programDetails->{'originalAirDate'}))
           {
             my $originalAirDate = $programDetails->{'originalAirDate'};
             my $offset = ' +0000';
@@ -1791,12 +1797,6 @@
             my $start = substr($originalAirDate, 0, 4) . substr($originalAirDate, 5, 2) . substr($originalAirDate, 8, 2) . $offset;
             $w->emptyTag('previously-shown', start => $start);
           }
-        else
-          {
-            my $new = 0;
-            $new = $scheduleDetails->{'new'} if (defined($scheduleDetails->{'new'}));
-            $w->emptyTag('previously-shown') if (!$new);
-          }

         # XMLTV premiere/last-chance is sort of arbitrarily
         # defined, so we decide on our own mapping (while
garybuhrmaster commented 7 years ago

Thanks for the report.

This is one of those cases where I think the xmltv dtd can not properly handle all real world desired definitional cases (and original air date and previously shown are not even always the same since the previously shown element can be channel specific as the dtd specifies), and there are known applications that depend on certain format of information for other side effects. That of course also ignores the reality that the Schedules Direct data does not always have a one-to-one translation to the xmltv dtd for many data elements, so the code does have to choose some heuristics, and I agree it may not always choose perfectly for all cases.

A "new" element (when the schedule is so marked) should be able to be added, but not all applications may process that element, and I am not currently convinced that suppressing previously-shown is the correct solution even if one adds a "new" (because it is not the same thing to say something is new for this channel, but was not previously shown (some cable channels are notorious for this delusion {"First time on TV!" when what they mean is first time on their channel))). I know of at least one application which actually treats previously-shown as originalAirDate (because that was as close of an element as the XMLTV dtd allowed, and it already had originalAirDate in the schema), and then adjusts new later.

What application are you using, and (if possible) do you have a specific schedule example (lineup/station/schedule/time/program), and what is not working well for you? It may also be necessary for the application to be updated.

==

btw, the if/elseif/else looks to be the correct construct in the original code (but that is from a 10 second scan. I'll look closer later).

rpcameron commented 7 years ago

I'm using Tvheadend. However, I noticed that the "new" element isn't being generated at all, even for items that have '{"new": true}' in their schedule. (I verified this by comparing the "details" blob in the "schedules" table of the SQLite database with the generated XMLTV. None of the generated programme elements contained a new element, regardless of the JSON. (Hence why I moved the check for "new" before any of the "originalAirDate" checks/generation.)

I am certain there are some deficiencies in how Tvheadend processes "previously-shown", as I believe I had problems with it under tv_grab_zz_sdjson, too. (I believe part of it is locale related, as I'm UTC-0700, so a UTC-based previously-shown may equate to the following day, especially for prime time programming.) Under the other grabber I made a similar change to the source to suppress previously-shown elements for new items.

I have most of my Autorecs (series/rule-based recordings) set to "New" only, which would not include those with a previously-shown element, even if the air dated matched it.

I only just switched to your grabber this week, so I figured I'd see what I could to make my transition smoother.

garybuhrmaster commented 7 years ago

I'm using Tvheadend. However, I noticed that the "new" element isn't being generated at all, even for items that have '{"new": true}' in their schedule.

First, thanks for bring the issue to my attention, and for the patch (although it seems you used some previous revision of the grabber to generate it; your upstream, or your upstreams upstream should consider upgrading, since the newer revisions have additional capabilities).

It turns out, that I found a completely different dtd compliance issue while looking at the "new" case, which was separately patched (so your request was good for others).

After doing some review (and even doing some completely different preliminary coding to work with the current master code), I more carefully read the xmltv specification. I am going to copy below so I can comment on it.

<!-- New.  This is the first screened programme from a new show that
has never been shown on television before - if not worldwide then at
least never before in this country.  After the first episode or
programme has been shown, subsequent ones are no longer 'new'.
Similarly the second series of an established programme is not 'new'.

Note that this does not mean 'new season' or 'new episode' of an
existing show.  You can express part of that using the episode-num
stuff.
-->

That explicitly states that a new episode of an existing program should not have the tag. Yet at least a small sample of the cases I looked at the Schedules DIrect data has '"new": true' are used for new episodes of an existing program.

So a Schedules Direct "new" is not usable for xmltv "new" (completely different things), and since that would explicitly violate the xmltv specification, it will not be implemented (I am a standards stickler).

I would suggest that you can perhaps propose to xmltv project to expand the dtd to add additional fields (a "new-showing" field?), at which point (if it is accepted) I can provide a dtd specification compliant field.

I'll note that another application "fudges" its internal use of the previously-shown dates to sort of mean "recent", and marks all programs whose previously-shown date is within a few days of the stated program start to mean a new showing of an episode. Perhaps tvheadend needs to use such logic. Here is a completely untested patch to tvheadend that might do what you want tvheadend to do. You should, perhaps, propose it to them via their bug tracking solution (likely the range (hard-coded at 10 days in this patch) should be user customizable), where I presume their experts can review the concept, and the implementation, and perhaps suggest a better way (in particular, this patch only deals with xmltv data, and there is also potentially eit data to consider, so a more general solution somewhere might be appropriate). I have little knowledge of tvheadend or its project(s) approaches.

--- a/src/epggrab/module/xmltv.c
+++ b/src/epggrab/module/xmltv.c
@@ -510,6 +510,9 @@ static int _xmltv_parse_programme_tags
       htsmsg_get_map(tags, "new"))
     save |= epg_broadcast_set_is_new(ebc, 1, &changes);

+  if (abs(difftime(start, first_aired)) < 864000)
+    save |= epg_broadcast_set_is_new(ebc, 1, &changes);
+
   /*
    * Episode/Series info
    */

Lastly, if you know of any xmltv grabber which fills in the "new" field incorrectly (for new episodes) please let the project know so the author can correct the errors.

Thanks. Sorry if this is not exactly the result you might have been hoping for.

rpcameron commented 7 years ago

Thank you for your efforts into this. I know the problem is actually two-fold.

Firstly, the DTD is ambiguous in several issues. This is most evident in the previously-shown element, but also in the (semi-)related premiere. Plus add in the issues of locales/timezones, and how this data is presented or interpreted can vary across implementations.

Secondly, the problem is that Schedules Direct's data points do not map directly 1:1 to the XMLTV elements. While it is nice to have a standard of sorts, XMLTV is not quite as robust as most users need, nor what a PVR needs. What is truly needed is an internal EPG module for Tvheadend that works directly with Schedules Direct, rather than trying to use the limited XMLTV format as an intermediary; that however is outside the scope of your project.

I want to thank you for taking the time to investigate my issue. I'll look into the sources for the patch you mentioned, and probably tweak it a little to better suit my needs. (This won't be a huge issue, since I basically already maintain a small patchset against their sources to better support my tuner that isn't quite supported upstream.) As far as my patches against your sources, they were made against your master branch directly when I made them. If there are additional capabilities, perhaps I need to rebase. (The current release of the xmltv-utils—or whatever the package may be called—is using a rather old version of your sources from nearly 12 months ago; that was the reason I based my patches against your sources directly.)

garybuhrmaster commented 6 years ago

As it is agreed that this is an XMLTV (lack of feature) specification issue (and outside the scope of this grabber) I am going to close this issue.

rpcameron commented 6 years ago

No problem. I understand, and agree that the issue is with XMLTV's vague standard. Thanks for the consideration.

garybuhrmaster commented 3 years ago

Tangentially related, but the latest grabber versions now have an option to provide some additional metadata (including a "new-episode" flag for the "schedulesdirect" system) that some apps have chosen to utilize if the grabber has been configured to emit those additional elements/attributes (via the episode-num element). I am aware that MythTV master now can utilize at least some of the values for (potentially) better scheduling.