XMLTV / xmltv

Utilities to obtain, generate, and post-process TV listings data in XMLTV format
GNU General Public License v2.0

Follow up to #170 tv_grab_uk_tvguide timezone error during grabbing #172

Closed misar1 closed 2 years ago

misar1 commented 2 years ago

There seems to be something odd with the fix for this error.

Until last week my 7 day EPG downloads started at Midnight on the morning of the download and finished at Midnight on the evening 7 days later. Today's download using the revised script started at 6am this morning but still finishes at Midnight in 7 days' time.

I saw honir's comment about the website day starting at 6am but this does not explain the sudden change. It seems very unlikely that tvguide.co.uk changed their EPG day because of the move to BST. Even if the assumption were correct the current 7th day should end at 6am on Mon 4 Apr not Midnight on Sun 3 Apr as it does.

honir commented 2 years ago

I can't reproduce your problem. If I ask for "--days 1" then I get today's listing from 00:00 - 23:59.

Without any knowledge of exactly what fetch command you are running it is hard to be precise.

A MWE [1] would be helpful.

[1] Minimum Working Example: a cut-down representative example for, ideally, one channel that is consistently repeatable

misar1 commented 2 years ago

This is my regular command line: XMLTV.exe tv_grab_uk_tvguide --days %1 --nodetailspage | XMLTV.exe tv_sort --by-channel --output D:\VB6\WinTV\WinTV6\EPG\tvguide_XMLTV-%1.xml

I have attached a download for a channel which broadcasts a full sequence of programmes all night and used a 2 day EPG to show that only the current (first) day starts at 6am. After that all programmes for each 24 hour period are recovered. tvguide_XMLTV-2.xml.txt

honir commented 2 years ago

In another post you said you were running XMLTV 1.0.0.

1) When did you upgrade to XMLTV 1.1.1?

2) How did you install the 'timezone fix' version of tv_grab_uk_tvguide?

misar1 commented 2 years ago
  1. I originally tried the timezone fixed script yesterday with Ver 1.0.0 but received an error message that a newer XMLTV version is required. It may have specified 1.0.1 but I cannot be sure. I then downloaded and ran the latest Windows xmltv.exe which is Ver 1.1.1.

  2. I obtained the updated script from GitHub. I used that version of the script to replace the version which running XMLTV 1.1.1 for the first time had left in the cache (\AppData\Local\Temp\par-6d696b6573\cache-0403651db53b5d86433fa2980270b3bd52f1e717\inc\script).

On reflection that was probably when I looked for script version information. I assume the first run of XMLTV.exe downloads the cache so it might already have obtained the 27 March script. In fact it found a 39.4 KB version which is much smaller than the current one (49 KB).

honir commented 2 years ago

You might have multiple ...\Temp\par-XXXXX directories (I think each different xmltv.exe will make its own).

In any event you now have a version number ;) so xmltv.exe tv_grab_uk_tvguide --version will tell you if it is running the correct one.

Run it without the sort filter to avoid any confounding issues. E.g. XMLTV.exe tv_grab_uk_tvguide --days 1 --nodetailspage --debug --output tvguide_test.xml

misar1 commented 2 years ago

The new xmltv.exe did create a new cache but I deleted the old one.

xmltv.exe tv_grab_uk_tvguide --version
Timezone is +0100
XMLTV module version 1.1.1
This program version : tv_grab_uk_tvguide 2022-03-29 15:00:00

I tried your command line:
XMLTV.exe tv_grab_uk_tvguide --days 1 --nodetailspage --debug --output tvguide_test.xml
Timezone is +0100
Fetching https://www.tvguide.co.uk/channellistings.asp?ch=559&cTime=03%2F28%2F2022%2000%3A00%3A00 from server.
Fetching https://www.tvguide.co.uk/channellistings.asp?ch=559&cTime=03%2F29%2F2022%2000%3A00%3A00 from server.
Exiting without warnings.

The .xml file is as before, starting at 6am: tvguide_test.xml.txt

Also you originally suggested the sort because it adds the stop times which are otherwise omitted by the --nodetailspage option. Even if it fixed the download this would be a problem. I have two TV guide programs (only one written by me) which don't work without the stop times. I would need to write a pre-processor to put them back in the .xml file. Possible but I would prefer to avoid that complication.
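As an illustration of the pre-processor mentioned above, here is a minimal Python sketch that fills in missing stop times by giving each programme the start time of the next programme on the same channel. It is not how tv_sort is implemented; the function name `fill_missing_stops` is hypothetical, and it assumes start times use the `YYYYMMDDHHMMSS +ZZZZ` form that the grabber writes.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from datetime import datetime

XMLTV_TIME = "%Y%m%d%H%M%S %z"  # e.g. "20220329060000 +0100"


def fill_missing_stops(root):
    """Give each programme without a stop the start of the next one on its channel."""
    by_channel = defaultdict(list)
    for prog in root.iter("programme"):
        by_channel[prog.get("channel")].append(prog)

    for progs in by_channel.values():
        # Order each channel's programmes by their parsed start time.
        progs.sort(key=lambda p: datetime.strptime(p.get("start"), XMLTV_TIME))
        for current, following in zip(progs, progs[1:]):
            if current.get("stop") is None:
                current.set("stop", following.get("start"))
    return root


# Small inline listing (made-up titles) to show the effect.
sample = ET.fromstring(
    "<tv>"
    '<programme start="20220329070000 +0100" channel="559"><title>B</title></programme>'
    '<programme start="20220329060000 +0100" channel="559"><title>A</title></programme>'
    "</tv>"
)
fill_missing_stops(sample)
for prog in sorted(sample.iter("programme"), key=lambda p: p.get("start")):
    print(prog.findtext("title"), prog.get("start"), prog.get("stop"))
```

The last programme on each channel has no successor, so it is left without a stop time. A gap in the listings would also make the preceding programme appear to run until the next listed start.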

misar1 commented 2 years ago

Update. I tried the same command line without the --nodetailspage option. That fixes the problem: tvguide_test2.xml.txt

Unfortunately downloading a 7 day EPG for a reasonable number of channels (I usually have 60) is so slow that it needs to run overnight. When I tried it last year I could not find anything useful in the .xml that is missing with the --nodetailspage option other than the stop times.

honir commented 2 years ago

MWE:

create tvguide.conf

cachedir=c:\temp\cache
channel=559

run xmltv.exe tv_grab_uk_tvguide --config-file tvguide.conf --days 1 --nodetailspage --output tvguide.xml

honir commented 2 years ago

The software works fine for me. I get a complete day starting at midnight.

Is this a real Windows machine, or is it a virtual container?

If you want to prove that the timezone change is NOT the source of your problem you can use the previous version of this grabber from here.

misar1 commented 2 years ago

Using a real Windows machine. Will try the suggestions later, at present my regular grab without --nodetailspage is running. It appears it will finish around 13.00 UTC.

misar1 commented 2 years ago

I tried the MWE exactly as you described it AND used a second Windows PC with XMLTV installed. It was updated on Monday to 1.1.1 with the first fixed script (committed on 27 March) but not used since.

The result is the same as before - grab starts at 6am this morning tvguide.xml.txt

I then replaced the grabber with the previous version you linked above and repeated exactly the same command line. This time the grab starts at 00:00 this morning tvguide2.xml.txt

This seems to link the problem to the fixed script (committed on 27 March).

rmeden commented 2 years ago

I updated the alpha-exe today.

honir commented 2 years ago

There is little I can do, since I can't replicate your problem.

The change you are referencing involved changing the order of two lines of code. Nothing more. No new code.

@rmeden Are you able to try the MWE above and see if you can replicate the issue @misar1 is experiencing? (It doesn't happen on my Windows 7 box.) I don't know if the website is accessible from outside UK.

misar1 commented 2 years ago

The first PC finally completed the grab without --nodetailspage (it took 8 hours instead of 20 min!) so I also reverted its grabber back to the one you linked above. Using my original command line (with sort) the .xml was back to normal with this morning's programmes starting at 00:00. Nothing else on the PC was changed.

I will continue with 1.1.1 and that grabber. Am I correct in assuming it is the second one you committed on 24 February?

Thanks for your support and a great grabber regardless of the odd niggle. If you ever get bored it would be nice to have an option which runs almost as fast as --nodetailspage but includes Stop times without needing to sort (and then fix the missing stop for the final programme per channel).

honir commented 2 years ago

What timezone does your PC say it is?

misar1 commented 2 years ago

Time Zone GMT Summer Time

misar1 commented 2 years ago

My Perl experience is zero but I looked at the small change you made to the code on Sunday. I have marked in bold two lines that stood out to me although this may be totally irrelevant.

  1. You moved the key 06:00-06:00 line from after the time set to before it which could have the effect I observed.
  2. The comment line about no prog 'stop' time available suggests this section of code is used with --nodetailspage.

                        $showtime = $theday->clone;
                        $showtime->add(days => 1) if ($h < 6);      # site runs from 06:00-06:00 so anything <06:00 is for tomorrow
                        **$showtime->set(hour => $h, minute => $i, second => 0);**
                        $showtime->add (days => 1) if ($h < 6);     # site runs from 06:00-06:00 so anything <06:00 is for tomorrow
                        $prog{'start'} = $showtime->strftime("%Y%m%d%H%M%S %z");
                        **# no prog 'stop' time available**
                    }

honir commented 2 years ago

Does your code really look like you posted? Viz.:

$showtime = $theday->clone;
$showtime->add(days => 1) if ($h < 6);      # site runs from 06:00-06:00 so anything <06:00 is for tomorrow
$showtime->set(hour => $h, minute => $i, second => 0);
$showtime->add (days => 1) if ($h < 6);     # site runs from 06:00-06:00 so anything <06:00 is for tomorrow
$prog{'start'} = $showtime->strftime("%Y%m%d%H%M%S %z");
# no prog 'stop' time available

If so, then you can see you are performing " add (days => 1) " twice. This will certainly cause the error you are seeing.
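To make the effect concrete, here is a small Python sketch of the same date arithmetic (the grabber itself is Perl; this is only an illustration). The function `start_time` and its `shifts` parameter are hypothetical. It assumes, as the code comment says, that a listing page for a given day runs from 06:00 that day to 06:00 the next, so a programme shown at 01:30 on the 28 March page really airs on 29 March.

```python
from datetime import datetime, timedelta


def start_time(listing_day, hour, minute, shifts=1):
    """Start time of a programme found on a 06:00-06:00 listing page.

    `shifts` is how many times the "+1 day if hour < 6" line runs:
    1 for the intended code, 2 for a copy containing the line twice.
    """
    showtime = listing_day
    for _ in range(shifts):
        if hour < 6:  # site day runs 06:00-06:00, so <06:00 belongs to tomorrow
            showtime += timedelta(days=1)
    return showtime.replace(hour=hour, minute=minute, second=0)


page = datetime(2022, 3, 28)  # listing page fetched for 28 March
print(start_time(page, 1, 30))            # line present once
print(start_time(page, 1, 30, shifts=2))  # line present twice
print(start_time(page, 7, 0, shifts=2))   # 06:00 or later is never shifted
```

With the line present once, the 01:30 programme lands on 29 March. With it present twice, the programme lands on 30 March, so the 00:00-05:59 slots for the requested day are pushed a day too far and the listing appears to start at 06:00.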

misar1 commented 2 years ago

Sorry, should have explained I copied it from the March 27 commit details on here but the -/+ red/green highlighting disappeared when I pasted it! I wanted to highlight that the move was after/before the $showtime set. I have not touched your scripts (apart from adding a commit date comment to the older ones like 24 Feb).

honir commented 2 years ago

Can anyone reproduce this issue?

Specifically by using the latest Windows program.

The MWE above should download a complete 24 hours' worth of programmes starting at 00:00. (The issue is the programmes from 00:00-05:59 are displaced by one day.)
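Anyone trying to reproduce this could check the output file mechanically rather than by eye. The following Python sketch reports channels whose earliest programme starts at 06:00 or later. The function names are hypothetical, and it assumes the output contains `programme` elements with `channel` and `start` attributes in the `YYYYMMDDHHMMSS +ZZZZ` form.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

XMLTV_TIME = "%Y%m%d%H%M%S %z"  # e.g. "20220329060000 +0100"


def earliest_starts(root):
    """Map each channel id to the earliest programme start in the listing."""
    earliest = {}
    for prog in root.iter("programme"):
        channel = prog.get("channel")
        start = datetime.strptime(prog.get("start"), XMLTV_TIME)
        if channel not in earliest or start < earliest[channel]:
            earliest[channel] = start
    return earliest


def channels_starting_late(root):
    """Return channels whose first listed programme begins at 06:00 or later."""
    return {ch: t for ch, t in earliest_starts(root).items() if t.hour >= 6}


# Inline sample; for a real file use ET.parse("tvguide.xml").getroot().
sample = ET.fromstring(
    "<tv>"
    '<programme start="20220329060000 +0100" channel="559"/>'
    '<programme start="20220329073000 +0100" channel="559"/>'
    "</tv>"
)
print(channels_starting_late(sample))
```

A channel that is genuinely off air overnight would also be flagged, so this is only a quick indicator for a channel known to broadcast all night.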

misar1 commented 2 years ago

I have identified the problem and it was my fault.

I checked the downloaded grabber version that caused the problem and it DOES have the moved line in twice. I deleted the original copy of the line and the EPG now starts at 00:00 as it should.

The explanation is the same as for my post with the code extract. The only way I have found to save these scripts from my browser is to copy and paste into an editor. Normally I do that from the raw version but for some reason last Monday I must have used the initial view with the commit details. Hence the paste had the same effect as the one you noticed.
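A quick way to spot this kind of paste error is to count how often the day-shift line appears in a saved script and compare that with a copy saved from the raw view. The Python sketch below does this for the exact statement quoted earlier in this thread. The function name `count_day_shifts` is hypothetical, and the expected count for a clean copy of the full script is not known from this thread.

```python
import re

# Matches the statement quoted in the thread, with or without a space after "add".
DAY_SHIFT = re.compile(
    r"\$showtime->add\s*\(\s*days\s*=>\s*1\s*\)\s*if\s*\(\s*\$h\s*<\s*6\s*\)"
)


def count_day_shifts(script_text):
    """Count occurrences of the '+1 day if hour < 6' statement in Perl source text."""
    return len(DAY_SHIFT.findall(script_text))


# The fragment as pasted from the commit view, containing both old and new lines.
pasted = r"""
$showtime = $theday->clone;
$showtime->add(days => 1) if ($h < 6);
$showtime->set(hour => $h, minute => $i, second => 0);
$showtime->add (days => 1) if ($h < 6);
"""
print(count_day_shifts(pasted))
```

For a real file, read it with `open(path, encoding="utf-8").read()` and pass the text to `count_day_shifts`.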

My abject apologies for wasting your time. I will be more careful in future!

honir commented 2 years ago

Thanks for posting the resolution. I appreciate your honesty - I thought I was going mad. (Which is, of course, still possible anyway.)

I'm with you: it's not simple to update individual files from GitHub browser. The 'recommended' method is to right-click the 'Raw' link and click 'Save Link As…'.

On my machine I then need to delete the '.txt' extension that seems to get added by my browser. But that seems a fudge: there should be a simple Download button IMHO.