edit4ever / script.module.zap2epg

zap2epg - EPG grabber for USA/Canada
GNU General Public License v3.0

EPG not updating past 10/27 except for 1 channel #45

Open squirtbrnr opened 2 years ago

squirtbrnr commented 2 years ago

My EPG data is not updating. It was working fine for the longest time. I fetch 2 weeks of EPG data, however lately every channel was able to fetch up to 10/27, then after that there's nothing. Except for one channel, I have data for that channel out the full 2 weeks. I verified Zap2it has EPG data that far out for all of the channels. It seems either something is not being parsed or is malformed and screwing up the EPG.

Let me know what logs or data is needed.

squirtbrnr commented 2 years ago

So I did a little more digging. Here's the url the python script accesses. I've pared it down and removed some of the blank options (userid, device, etc). Also the script calls "http://" but the website is actually "https://"

https://tvlistings.zap2it.com/api/grid?lineupId=&timespan=3&headendId=lineupId&country=USA&postalCode=53051&time=1634908457

the max you can set timespan= is 6, for 6 hours. I'm wondering if they "fixed" the output of this url. It used to give the full 2 weeks of data like the website, and now it looks like you are limited to the time span of 1-6 hours.

EDIT: but you can increase the epoch timestamp at the end of the url by 6 hours (21600 seconds) and grab the next 6 hours of data.
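
The window-stepping idea described above can be sketched roughly as follows. This is only an illustration, not the code in zap2epg.py: the function name `grid_urls` and the constants are made up for the example, the lineup and headend parameters are left out, and the 6-hour limit is simply what was observed in this thread.

```python
from urllib.parse import urlencode

GRID_URL = "https://tvlistings.zap2it.com/api/grid"
WINDOW_HOURS = 6                      # largest timespan reported to work in this thread
WINDOW_SECONDS = WINDOW_HOURS * 3600  # 21600 seconds


def grid_urls(start_epoch, days, postal_code="53051", country="USA"):
    """Yield one grid URL per 6-hour window covering `days` days (hypothetical helper)."""
    windows = days * 24 // WINDOW_HOURS
    for i in range(windows):
        params = {
            "timespan": WINDOW_HOURS,
            "country": country,
            "postalCode": postal_code,
            # Each request starts 21600 seconds after the previous one.
            "time": start_epoch + i * WINDOW_SECONDS,
        }
        yield GRID_URL + "?" + urlencode(params)


urls = list(grid_urls(1634908457, days=14))
print(len(urls))  # number of requests needed for two weeks
print(urls[0])
print(urls[1])
```

Two weeks at 6 hours per request works out to 56 requests, so a single bad window would only affect a small slice of the guide.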

squirtbrnr commented 2 years ago

And now I see in the zap2epg.py script it does the increasing epoch time, but for some reason this does not appear to be working because it's not getting the future listings.

edit4ever commented 2 years ago

Can you supply the zap2epg log? It is in the userdata folder for zap2epg.

edit4ever commented 2 years ago

Also - I was working on some other issues before - but haven't pushed a new release - but if you like, you can test this version to see if it works for you:

https://www.dropbox.com/s/pjpzi18vuldu0bj/script.module.zap2epg-2.04-TEST.zip?dl=0

squirtbrnr commented 2 years ago

script.module.zap2epg.zip

Attached is not just the log but also the xml file, the channels file, etc.

I run this script slightly differently. I run TVHeadend in a docker container on my Synology, and this script runs in that docker container. My Kodi boxes are all RPi and only run Kodi (LibreELEC). I originally used this add-on to configure the settings, but then took the settings file and moved it and the script to the appropriate folders on my TVHeadend install. TVHeadend is still Python 2, not Python 3. What I've done is take your zap2epg.py v1.3.1 and merge the few changes not related to Python 3 from v2+ into my version of the script. Anyway, I'll merge your v2.0.4 into mine and test it out if you want me to.

th0ma7 commented 2 years ago

@squirtbrnr just so you know, the SynoCommunity package version I maintain comes with a fork of this project that I did, specifically for this use case. I'm actually about to update the package over the week-end where tv_grab_zap2epg is directly built into the "EPG Grabber Modules" page (accessible in expert mode) within tvheadend.

Otherwise you can look into using https://github.com/th0ma7/script.module.zap2epg/tree/Python3-th0ma7-updates instead.

Perhaps someday we might merge things up but the use-case was sort of different than what @edit4ever was looking into covering.

squirtbrnr commented 2 years ago

@th0ma7 thanks. I'm not interested in changing my TVH setup right now. Not to mention I can't get to the SynoCommunity (it's down or can't access the repo and it's blank in package center and DNS lookups are failing). EDIT: apparently I need to update to the latest DSM 6 because a trust certificate has expired.

Anyway, sounds like what I've done is similar to how your modified script works. Just for some reason, mine is not grabbing, parsing, whatever the full EPG.

edit4ever commented 2 years ago

The log and xmltv file look good - are you not seeing it import correctly in tvheadend?

What does the tvheadend log show when you run the grabber?

squirtbrnr commented 2 years ago

Correct, TVHeadend imports up through 10/27 for all channels then stops. Except for 1 channel, CW 18 (18.1) which has the full EPG for 2 weeks. TVH log seems to be ok and shows it imported data, but every time it runs the grabber (I have it scheduled to run at 0:04 and 12:04 every day) it says it only imported a certain number of shows that gets smaller every time it runs.

edit4ever commented 2 years ago

It's possible there was bad data somewhere that stopped the processing loop - but that should have shown in the log.

You could try to delete all the xxxxxx.json.gz files in the cache - not the SH or MV files - and see if that resets the import.
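
A rough sketch of that cache cleanup, for anyone who would rather script it than delete files by hand. The helper name `clear_grid_cache` and the file names in the demo are invented for illustration; the only rule taken from the comment above is "remove the .json.gz files, keep the SH and MV files". Point it at your own cache folder at your own risk.

```python
import os
import tempfile


def clear_grid_cache(cache_dir):
    """Delete *.json.gz files, keeping anything whose name starts with SH or MV."""
    removed = []
    for name in sorted(os.listdir(cache_dir)):
        if name.endswith(".json.gz") and not name.startswith(("SH", "MV")):
            os.remove(os.path.join(cache_dir, name))
            removed.append(name)
    return removed


# Demo against a throwaway directory with made-up file names.
demo_dir = tempfile.mkdtemp()
for name in ("1634904000.json.gz", "1634925600.json.gz", "SH0123.json", "MV0456.json"):
    open(os.path.join(demo_dir, name), "w").close()

removed = clear_grid_cache(demo_dir)
remaining = sorted(os.listdir(demo_dir))
print(removed)
print(remaining)
```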

edit4ever commented 2 years ago

BTW - your xmltv.xml file you sent has data past the 27th for other channels...so something on import into tvh must be getting messed up.

squirtbrnr commented 2 years ago

I cleared the entire cache and ran the epg grabber again and got the same results. I did notice I had data past the 27th in the xmltv file but wasn’t sure if it was valid or invalid data. It does sound like this is most likely a TVHeadend issue with import. It could still be a problem with the data in the xmltv file.

Kensit commented 2 years ago

Note that I suspect this user's problem is related to the issue I opened. Just guessing, as the bad data in my Zap2IT file was also dated the 27th. Check the programming in their listings for "DC's Legends of Tomorrow" and see if the episode title contains some gibberish including an xml open and close bracket (<>). I'd lay money this is your problem.

squirtbrnr commented 2 years ago

@Kensit looks like that is the case. Here’s a screenshot of that episode epg in tvheadend. And that show is on the one channel that actually has the full 2 weeks of epg, but all other stations show blank.

[screenshot: 715EFBEA-5F34-4B49-AC91-ED7DD8D764DE]

edit4ever commented 2 years ago

Yep - they messed up that guide listing as it shows as wvrdr_error_100<oest-of-th3-gs.gid30n> not found on the zap2it listing site.

Since a show should never have a < or > in the title - the code doesn't deal with that.
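
To see why a single title like that can upset an XML importer, here is a small check using Python's standard library parser. It is not how TVHeadend parses the file; it only illustrates that the unescaped title is not well-formed XML while the escaped one is. The helper `parses` is made up for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

bad_title = "wvrdr_error_100<oest-of-th3-gs.gid30n> not found"


def parses(xml_text):
    """Return True if xml_text is well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The raw title opens a bogus <oest-of-th3-gs.gid30n> element that is never closed.
raw_xml = '<sub-title lang="en">' + bad_title + '</sub-title>'
# Escaping turns < and > into &lt; and &gt;, so the title is plain text again.
safe_xml = '<sub-title lang="en">' + escape(bad_title) + '</sub-title>'

raw_ok = parses(raw_xml)
safe_ok = parses(safe_xml)
print(raw_ok, safe_ok)
```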

If you're using the old 1.31 version as a base and want a temp fix for this case - look for these lines:

if edict['eptitle'] is not None:
    fh.write('\t\t<sub-title lang=\"'+ lang + '\">' + re.sub('&','&amp;', edict['eptitle']) + '</sub-title>\n')

and change them to this:

if edict['eptitle'] is not None:
    fh.write('\t\t<sub-title lang=\"'+ lang + '\">' + re.sub('>','&gt;',re.sub('<','&lt;',re.sub('&','&amp;', edict['eptitle']))) + '</sub-title>\n')

That should fix this temporarily. Not the cleanest way to do this - but the shortest change for you.

edit4ever commented 2 years ago

The better fix - if you're running the older 1.31 version ... would be to change this whole section:

if edict['eptitle'] is not None:
    fh.write('\t\t<sub-title lang=\"'+ lang + '\">' + re.sub('&','&amp;', edict['eptitle']) + '</sub-title>\n')

To this:

if edict['eptitle'] is not None:
    showTitle = re.sub('&','&amp;', edict['epshow'])
    safeTitle = re.sub('[\\/*?:"<>|]', "_", showTitle)
    fh.write('\t\t<title lang=\"' + lang + '\">' + safeTitle + '</title>\n')

That should fix most title errors in the future. The actual entire zap2epg needs a rewrite - but I don't have time to work on this...and fortunately some others have started to pick up the work.

Kensit commented 2 years ago

I think a better improvement to this situation (and the way I chose to resolve the issue for the future) was add the following routine:

# Adjusting text for symbols before writing to xml file
# (fix syntax errors)
def xmlText(s):
    fxdxml = re.sub('&', '&amp;', s)
    fxdxml = re.sub('<', '&lt;', fxdxml)
    fxdxml = re.sub('>', '&gt;', fxdxml)
    return fxdxml

And then, everywhere in the code that had re.sub('&','&amp;', x) was replaced with xmlText(x).

This is how any programmer would handle making sure that all text written to an xml file is clean syntactically.
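
One detail worth spelling out in a routine like this is the order of the replacements: the ampersand has to go first. The sketch below uses plain `str.replace` rather than `re.sub`, and the names `xml_text` and `wrong_order` are only for illustration. It also compares the result with `xml.sax.saxutils.escape` from the standard library, which escapes the same three characters.

```python
from xml.sax.saxutils import escape


def xml_text(s):
    # '&' must be replaced first; otherwise the '&' in '&lt;' and '&gt;' gets escaped again.
    return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def wrong_order(s):
    # Deliberately wrong: escaping '&' last double-escapes the entities.
    return s.replace("<", "&lt;").replace(">", "&gt;").replace("&", "&amp;")


sample = "Tom & Jerry <pilot>"
print(xml_text(sample))     # Tom &amp; Jerry &lt;pilot&gt;
print(wrong_order(sample))  # Tom &amp; Jerry &amp;lt;pilot&amp;gt;
print(escape(sample) == xml_text(sample))  # True
```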

jbreen60030 commented 2 years ago

Actually, to solve for my older version (0.7.4 Python 2) I included in the "include block"

import cgi 
from cgi import escape

and then everywhere that re.sub('&','&amp;',x) existed, I changed it to x = cgi.escape(x, quote=True)

This will handle <, >, and &, along with any other HTML chars that need to be escaped, including the double quote. For titles, I also got rid of the backslash escaping of the double quote char; otherwise you end up with \&quot; which just looks funny, e.g.

xthisTitle = edict['eptitle']
xthisTitle = xthisTitle.replace('\\\"','\"')
fh.write('\t\t<sub-title lang=\"'+ lang + '\">' + cgi.escape(xthisTitle, quote=True ) + '</sub-title>\n')
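
A note for anyone on Python 3: `cgi.escape` was deprecated and then removed in Python 3.8, and `html.escape` is the standard library replacement. A minimal sketch of the same idea follows, using a made-up title string. It is not taken from any version of zap2epg.py.

```python
import html

# Made-up title containing backslash-quote pairs, as described above.
title = 'He said \\"hi\\" & left <early>'

# Escaping directly keeps the stray backslashes in front of the quote entities.
unclean = html.escape(title, quote=True)

# Dropping the backslashes first gives a cleaner result.
cleaned = title.replace('\\"', '"')
escaped = html.escape(cleaned, quote=True)

print(unclean)  # He said \&quot;hi\&quot; &amp; left &lt;early&gt;
print(escaped)  # He said &quot;hi&quot; &amp; left &lt;early&gt;
```

Unlike `cgi.escape`, `html.escape` escapes quotes by default and also converts the single quote.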

squirtbrnr commented 2 years ago

This is great. I will implement the changes into my version and see what happens. One other thing I noticed, and this may or may not be related to escaping characters in text, is in the descriptions for an episode I see “\u007C” instead of “|” (vertical pipe). But there are other places where the vertical pipe is shown. Specifically it seems to happen when it appends either the audio stream (Stereo) or the episode date in the description.

squirtbrnr commented 2 years ago

Made the modifications to the script, cleared the log and xmltv file, cleared the cache folder, deleted and recreated my TVH container... nothing changed. Still stuck with the bad description. What am I doing wrong, or where is TVH getting this old/stale data from?

Rippert commented 2 years ago

@squirtbrnr, I followed @Kensit's instructions, and it seemed to fix the identical problem I was having. Which instructions did you follow? They should all work, but I can only personally vouch for Kensit's.

It seems like you are doing a lot of extra steps which I am not sure are helping. I only needed to clear the cache folder, edit the zap2epg.py script and re-run my internal grabbers in TVH. I'm really not clear what you mean by recreating the TVH container, sounds like starting from scratch with some kind of Docker container, but that doesn't make sense to me for this problem.

I should note that I am still getting the sub-title:

wvrdr_error_100 not found

but the description now contains the proper season and episode (rather than e104) and TVH has it scheduled for recording as it is supposed to every week for me. Also all other channels have proper guide data for the next 2 weeks again. Here is what my description looks like for this episode now:

DC's Legends of Tomorrow NEW "wvrdr_error_100 not found" | Season 7 - Episode 3 | First aired: October 27, 2021 Gideon becomes overwhelmed by her new human choices, sending her into a catatonic state; Astra and Spooner combine their powers to enter Gideon's mindscape and discover that a virus is trying to erase all of Gideon's memories. TV-PG | STEREO | CC

Yours may look different depending on how you have your zap2epg options set up.

Rippert commented 2 years ago

Sorry, just noticed you are using a Docker container, so I guess you're editing whatever you use to create that and then recreating the container.

It did occur to me that you might be looking at your Kodi Guide data rather than the TVH EPG from the TVH web interface. If so, try going into the Live TV and PVR settings in Kodi and clearing the guide and cache data. Kodi is supposed to automatically reload it all at that point, but I always have to restart Kodi to get the reload to happen, so you may too. That should sync up Kodi with TVH.

squirtbrnr commented 2 years ago

My TVH guide data is now blank. I made the changes @jbreen60030 recommended and I get nothing in guide data. The log didn't have any errors in it, but the xmltv file was very small and missing everything except episode numbers. I will need to restore a backup copy and try @Kensit's changes. I'm not a python programmer, but I can work my way through reading and modifying the code.

squirtbrnr commented 2 years ago

Ok I just implemented @Kensit changes. I now have guide data. However I am still getting the \u007C instead of vertical pipe in the description text.

squirtbrnr commented 2 years ago

I fixed the vertical pipe issue I was having. Lines 619-626 of zap2epg.py used to have a "u" prefix on the strings containing \u007C (or any \uXXXX character escape). In the GitHub master branch the prefix is there, but in releases 2.0.0-2.0.3 it is not.

EDIT: this must be a difference between python2 and python3.
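
That matches how the two Python versions treat string literals. In Python 2, a plain '\u007C' is a byte string and \u is not an escape there, so the six characters are written out literally unless the literal has the u prefix. In Python 3 every str literal is Unicode, so the prefix is optional. The snippet below runs on Python 3 and uses a raw string to stand in for what Python 2 would have produced without the prefix.

```python
plain = '\u007C'      # Python 3: str literals are Unicode, so this is '|'
prefixed = u'\u007C'  # the u prefix is still accepted in Python 3.3+ and means the same
raw = r'\u007C'       # six literal characters: what an unprefixed Python 2 literal holds

print(plain == '|', prefixed == '|')  # True True
print(len(raw), raw)                  # 6 \u007C
```

So a script written for Python 3 without the u prefixes will print \u007C when run under Python 2, which would explain the descriptions seen above.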

goddahavit commented 2 years ago

I have been good up until today; now I get nothing for the EPG. It has been working with the occasional reboot and rerun of the EPG update. In what file, and where in it, do I add Kensit's fix? I am a bit older, but I can update to 1.3.1 if necessary.