edit4ever / script.module.zap2epg

zap2epg - EPG grabber for USA/Canada
GNU General Public License v3.0

Direct import to tvheadend? #34

Closed by th0ma7 3 years ago

th0ma7 commented 3 years ago

Would it be possible to run this Python script directly on the tvheadend server side, with a username/password file, in order to drop an xmltv output file locally and then import it with the internal EPG grabber, using something similar to https://github.com/Rigolo/tv-grab-file?

edit4ever commented 3 years ago

It can run server side...it was built to be installed directly in Kodi running on the same system as the tvheadend server - like a LibreELEC system.

It's just a matter of getting a few files in the right place and then it can run. There is a grabber that is part of this that will show up in tvheadend...but some code may need to be customized for the install location.

Is your tvheadend running on a standalone server or NAS?

th0ma7 commented 3 years ago

I'm the tvheadend maintainer for SynoCommunity packages (including ffmpeg) for Synology NAS and am currently planning the migration path towards the DSM7 major upgrade.

As such, I'm currently evaluating options to migrate away from the older (and unmaintained) Perl parser, which is rather tricky to install because various CPAN modules need to be compiled on the target.

I've created a zap2it package (pending release) with the objective of packaging a python3-compatible zap2it importer (https://github.com/SynoCommunity/spksrc/pull/4608). I was focusing on https://github.com/daniel-widrick/zap2it-GuideScraping but it has major limitations that I was trying to fix (PR currently pending merge).

Your project looks way more promising and hey, I prefer integrating software to writing code :)

I would gladly replace the current scraper with yours in my pending package if this is deemed feasible.

edit4ever commented 3 years ago

OK - it should be possible - the biggest challenge will be how to set it up so people can change the configuration. Currently, the kodi addon has the ability to change the options for zap2epg.py - the grabber only tells tvheadend to run zap2epg.py and then import the xmltv.xml file when finished.

So - we'll have to consider how your users can manage the zap2epg options.

Also, what path are you installing the grabber in?

th0ma7 commented 3 years ago

I will have to play with your scripts to better understand how they work. I believe your project is much more complete than what I'm using, but on the other hand integration might be a little trickier (but hopefully feasible).

The current wrapper is rather simple:

The way I've set things up currently, using that other wrapper and the Synology package toolkit:

  1. at installation time (and upon reinstalling) the web GUI asks the user for: username, password, # of days, zip or postal code, etc... I can ask for whatever is needed. For instance I could ask for the TVH port, username & password as well if needed.
  2. from there, at post-installation, the script adjusts the default configuration file based on user input
  3. then it automatically installs a cron job to update the EPG on a regular basis
  4. the output file is dropped into a location accessible to TVH
  5. TVH in turn grabs it for processing using an internal EPG grabber

The path on Synology is rather different from what is usual on Linux. The directory structure is set so that each application lives under its own directory tree, such as:

In the case of the zap2it package:

All in all, I would need some hands-on time with your code to get a good grip on it. Any advice on where to start to manually generate an xmltv output file using your project? Or does it work differently, such that it can be used directly as a TVH internal EPG grabber?

SawKyrom commented 3 years ago

How difficult would it be to strip down the current Kodi version of zap2EPG to a bare-bones Linux version? I have TVHeadend running in an LXC container and would like to use your program to scrape XML data. I've used zap2it for years, but the Perl version of zap2xml does not work in the Debian environment. I was hoping your cleaner zap2EPG would do the trick. I've tried running it but have gotten numerous Traceback errors:


Executing "/usr/bin/tv_grab_zap2epg"
2021-06-30 00:21:46.298 spawn: Traceback (most recent call last):
2021-06-30 00:21:46.298 spawn:   File "/opt/script.module.zap2epg-2.0.3/script.module.zap2epg/zap2epg.py", line 815, in <module>
2021-06-30 00:21:46.298 spawn:     logging.basicConfig(filename=log, filemode='w', format='%(asctime)s %(message)s', datefmt='%Y/%m/%d %H:%M:%S', level=logging.DEBUG)
2021-06-30 00:21:46.298 spawn:   File "/usr/lib/python2.7/logging/__init__.py", line 1554, in basicConfig
2021-06-30 00:21:46.300 spawn:     hdlr = FileHandler(filename, mode)
2021-06-30 00:21:46.300 spawn:   File "/usr/lib/python2.7/logging/__init__.py", line 920, in __init__
2021-06-30 00:21:46.300 spawn:     StreamHandler.__init__(self, self._open())
2021-06-30 00:21:46.300 spawn:   File "/usr/lib/python2.7/logging/__init__.py", line 950, in _open
2021-06-30 00:21:46.300 spawn:     stream = open(self.baseFilename, self.mode)
2021-06-30 00:21:46.301 spawn: IOError: [Errno 13] Permission denied: '/opt/script.module.zap2epg-2.0.3/script.module.zap2epg/zap2epg.log'
2021-06-30 00:21:46.305 spawn: cat: /opt/script.module.zap2epg-2.0.3/script.module.zap2epg: Is a directory

Any feedback would be much appreciated. Thanks!

th0ma7 commented 3 years ago

@SawKyrom you can give the change I am proposing as part of PR #37 a shot.

It's currently available under my repo until it hopefully gets merged back here in one form or another. Have a look at https://github.com/th0ma7/script.module.zap2epg/tree/Python3-th0ma7-updates

It may need an adjustment to the BaseDir detection for LXC, as it is currently planned to work on a RasbPi or Synology NAS.

SawKyrom commented 3 years ago

@th0ma7 thank you for the quick reply and the work you've done on the program. I did review your proposals and merges in https://github.com/edit4ever/script.module.zap2epg/pull/37 and in https://github.com/th0ma7/script.module.zap2epg/tree/Python3-th0ma7-updates. I will try the newest branch and check the program for any conflicts with the Debian 10 base system.

The previous script (original) appears to have crashed in the basicConfig hdlr and StreamHandler, but I'm not certain why based on the message. Any thoughts on this error?
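
For what it's worth, the last line of that traceback reads as a file-permission problem rather than a fault in the logging module: the process spawned by TVHeadend could not open zap2epg.log for writing in the install directory (Errno 13). A minimal Python sketch of one way around it, assuming a hypothetical helper `pick_log_path` (not part of zap2epg) that falls back to the system temp directory when the preferred log directory is missing or not writable:

```python
import os
import tempfile


def pick_log_path(preferred_dir, filename="zap2epg.log"):
    """Return a log path inside preferred_dir if it is writable, else inside the temp dir.

    Hypothetical helper for illustration; zap2epg itself does not define it.
    """
    # A directory that does not exist, or that the current user cannot write to,
    # is skipped in favour of the system temp directory.
    if os.path.isdir(preferred_dir) and os.access(preferred_dir, os.W_OK):
        return os.path.join(preferred_dir, filename)
    return os.path.join(tempfile.gettempdir(), filename)
```

Passing the result to `logging.basicConfig(filename=...)` instead of a fixed path would keep the script running; giving the TVHeadend user ownership of the log directory is the other obvious route.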

Also, could you tell me a little more on why the BaseDir needs adjusting? If all else fails, I'll create a separate VM for Debian, install Kodi and TVHeadend and then add your Kodi version of zap2epg. I've spent 20 hrs trying to make this work, so I might just be at a point where I adopt this structure instead, but I'm also quite stubborn and refuse defeat. I was hoping to avoid having to create a VM just to scrape XMLTV data.

Thanks again. I'll post any solutions if discovered.

SawKyrom commented 3 years ago

@th0ma7 I was able to get the script to work using your recent fork/merger!!! Thank you for the contributions! I updated the Python version and changed the BaseDir to the correct path for my application. The previous failure was either a result of Python 2.7, older code, or both. Now I need to figure out why the output was empty:

2021-06-30 03:27:45.889 xmltv: /usr/bin/tv_grab_zap2epg: grab /usr/bin/tv_grab_zap2epg
2021-06-30 03:27:45.889 spawn: Executing "/usr/bin/tv_grab_zap2epg"
2021-06-30 03:27:45.896 xmltv: /usr/bin/tv_grab_zap2epg: grab took 0 seconds
2021-06-30 03:27:45.896 xmltv: /usr/bin/tv_grab_zap2epg: parse took 0 seconds
2021-06-30 03:27:45.896 xmltv: /usr/bin/tv_grab_zap2epg:  channels   tot=    0 new=    0 mod=    0
2021-06-30 03:27:45.896 xmltv: /usr/bin/tv_grab_zap2epg:  brands     tot=    0 new=    0 mod=    0
2021-06-30 03:27:45.896 xmltv: /usr/bin/tv_grab_zap2epg:  seasons    tot=    0 new=    0 mod=    0
2021-06-30 03:27:45.896 xmltv: /usr/bin/tv_grab_zap2epg:  episodes   tot=    0 new=    0 mod=    0
2021-06-30 03:27:45.896 xmltv: /usr/bin/tv_grab_zap2epg:  broadcasts tot=    0 new=    0 mod=    0

Cheers!

th0ma7 commented 3 years ago

I'd be interested to know what your BaseDir ended up being so I can add a workaround for this use case. As for why it ended up doing nothing, have a look at the detailed log file located under $BaseDir/log.

Allowing anonymous access to TVH may be what's missing for you. I had to allow 0.0.0.0/0,::/0 for * user. Have a look at https://github.com/edit4ever/script.module.zap2epg/pull/37#issuecomment-863163246

SawKyrom commented 3 years ago

@th0ma7 I changed the BaseDir to map the file location in the Debian LXC container.

BaseDir="/home/script.module.zap2epg-P3/epggrab"
CacheDir="$BaseDir/cache"
ConfDir="$BaseDir/conf"
ConfFile="$ConfDir/zap2epg.xml"
LogDir="$BaseDir/log"
LogFile="$LogDir/zap2epg.log"
XMLTV="$CacheDir/xmltv.xml"

I saw no changes in $BaseDir/log when I ran the EPG grabber in TVHeadend, just the output shown below.

Screenshot (134)

However, if I run the program from the CLI of the host device, I do get output, with xmltv data written to xmltv.xml (located in $BaseDir/cache).
I did allow anonymous access to TVH with base user and also included admin login credentials in the zap2epg.xml file. I would guess it's an issue with TVHeadend settings since the program works with CLI.

Curious, is the Resource folder only for Kodi application in the master folder? It has a settings.xml. Would that file be applicable in my case? Thanks!

SawKyrom commented 3 years ago

@edit4ever Thanks for creating this program. I've been working with the @th0ma7 fork for his Synology application. I'm trying to run a simplified version, with TVHeadend running in a Proxmox Debian LXC container. I've been unable to locate any articles/messages regarding this type of setup, but based on your previous comments, it sounds feasible.

I'm having some trouble getting the files in the correct place. Maybe you can give me some direction that would benefit anyone else who comes across this link.

What files are absolutely necessary for a Linux (non-Kodi) application? I'm assuming tv_grab_zap2epg and zap2epg.xml. Anything else? Is the default.py script necessary, or only applicable to the Kodi application? I noticed a channel list/grab function in default.py. Will TVHeadend work without this create_cList() function?

Where would you suggest placing the zap2epg.xml file or any other dependencies? My options are /home/hts/.hts/epggrab, /home/hts/.xmltv, and /usr/share/tvheadend/data/conf/epggrab. I've tried all of these, but as I'm unfamiliar with the TVHeadend architecture, you may know specifically.

TVHeadend pulls tv_grab_zap2epg from the /usr/bin location and is recognized in the Internal EPG grabber modules. However, when I run the EPG update in TVHeadend (Re-run Internal Grabber), I get no output (see below).

Screenshot (134)

If I run the zap2epg program from the CLI in the /usr/bin location with arguments (i.e. zap2epg --days 1 --postal 81681), I do get data parsed on STDOUT, but no xmltv.xml file. zap2epg.log shows the following errors at the end, where I assume it should be creating the xmltv.xml file:

2021/06/30 22:52:18 Parsing SH00150449.json
2021/06/30 22:52:18 Exception: xmltv
Traceback (most recent call last):
  File "<string>", line 432, in xmltv
  File "/usr/lib/python3.7/codecs.py", line 898, in open
    file = builtins.open(filename, mode, buffering)
IsADirectoryError: [Errno 21] Is a directory: '/home'
2021/06/30 22:52:18 zap2epg completed in 114.06 seconds.
2021/06/30 22:52:18 Exception: main
Traceback (most recent call last):
  File "<string>", line 835, in mainRun
NameError: name 'stationCount' is not defined
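
Reading the two tracebacks together, one plausible explanation (an assumption, not checked against the zap2epg source) is that the output path resolved to the directory /home instead of a file, the write failed, and the summary step then referenced a counter that had never been assigned. A minimal Python sketch of a defensive variant, using hypothetical names (`write_stations`, `station_count`) that are not taken from zap2epg:

```python
import os


def write_stations(path, stations):
    """Write one line per station and return how many were written.

    Hypothetical stand-in for the xmltv writing step, for illustration only.
    """
    station_count = 0  # assigned before any I/O, so the caller always gets a number
    try:
        with open(path, "w", encoding="utf-8") as fh:
            for name in stations:
                fh.write(name + "\n")
                station_count += 1
    except OSError as exc:
        # IsADirectoryError and PermissionError are both OSError subclasses.
        print("could not write %s: %s" % (path, exc))
    return station_count
```

With this shape, pointing `path` at a directory reports the problem and returns 0 instead of triggering a second exception later on.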

I also can not select any EPG source in TVHeadend GUI. Selection area in edit mode is completely blank. Reboots have no effect.

Screenshot (150)

I know it is a lot of questions, but any guidance or direction would be much appreciated. Thank you all!!!

th0ma7 commented 3 years ago

@SawKyrom

What files are absolutely necessary for a Linux (non-Kodi) application? I'm assuming tv_grab_zap2epg and zap2epg.xml. Anything else?

Only tv_grab_zap2epg and zap2epg.xml are needed.

Is the default.py script necessary or only applicable for Kodi application.

No

Where would you suggest placing the zap2epg.xml file or any other dependencies?

Currently its directory structure is meant to reside under the epggrab folder of TVH. But after testing things out, this won't work for you. Create the following directory tree under /home/hts and adjust permissions:

# mkdir -p /home/hts/zap2epg/conf
# mkdir -p /home/hts/zap2epg/log
# mkdir -p /home/hts/zap2epg/cache
# chown -R hts:hts /home/hts/zap2epg
# chmod -R 0755 /home/hts/zap2epg

Copy the configuration and adjust permissions:

# cp zap2epg.xml /home/hts/zap2epg/conf
# chown hts:hts /home/hts/zap2epg/conf/zap2epg.xml
# chmod 644 /home/hts/zap2epg/conf/zap2epg.xml

Copy the script to /usr/local/bin and adjust permissions:

# cp tv_grab_zap2epg /usr/local/bin
# chmod 755 /usr/local/bin/tv_grab_zap2epg

Add the following ([ "$(id hts)" ] && BaseDir="$HOME/zap2epg") to the script /usr/local/bin/tv_grab_zap2epg such as:

# Default path for RasbPi Kodi+TVH
BaseDir="$HOME/script.module.zap2epg/epggrab"
# If running on synology NAS assume TVH only
[ "$(uname -a | grep -i synology)" ] && BaseDir="$HOME/var/epggrab/zap2epg"
# If running on common linux under hts user
[ "$(id hts)" ] && BaseDir="$HOME/zap2epg"
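
Because each test simply reassigns BaseDir, the last matching line wins. Note that `[ "$(id hts)" ]` is true whenever an hts account exists on the system, not only when the script runs as hts. The same decision order, written as a small Python sketch for readability (the function `choose_base_dir` and its parameters are hypothetical; tv_grab_zap2epg itself is a shell script):

```python
import posixpath


def choose_base_dir(home, uname_text, hts_user_exists):
    """Mirror the shell snippet: later matches override earlier ones.

    home            -- value of $HOME
    uname_text      -- output of `uname -a`
    hts_user_exists -- True if `id hts` prints anything
    """
    # Default path for a Kodi + TVH install.
    base = posixpath.join(home, "script.module.zap2epg", "epggrab")
    # Synology NAS: assume TVH only.
    if "synology" in uname_text.lower():
        base = posixpath.join(home, "var", "epggrab", "zap2epg")
    # Common Linux install with an hts user.
    if hts_user_exists:
        base = posixpath.join(home, "zap2epg")
    return base


print(choose_base_dir("/home/hts", "Linux debian", True))  # /home/hts/zap2epg
```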

Lastly, ensure you have an anonymous user * with proper access. It minimally requires:

Also, you may want to play with the EPG setting in TVH for this grabber. Personally I use:

You should now be able to test:

$ sudo su -s /bin/bash hts -c '/usr/local/bin/tv_grab_zap2epg --days 1 --zip 81681'

First run is really really long, I use 14 days and first time building the cache is just awful.

Eventually in TVH, after hitting "Re-run Internal EPG Grabbers", you should end up getting something similar to:

2021-07-01 07:01:50.221 spawn: Executing "/usr/local/bin/tv_grab_zap2epg"
2021-07-01 07:02:28.924 xmltv: /usr/local/bin/tv_grab_zap2epg: grab took 39 seconds
2021-07-01 07:02:29.204 xmltv: /usr/local/bin/tv_grab_zap2epg: parse took 0 seconds
2021-07-01 07:02:29.205 xmltv: /usr/local/bin/tv_grab_zap2epg:  channels   tot=   40 new=    0 mod=   40
2021-07-01 07:02:29.205 xmltv: /usr/local/bin/tv_grab_zap2epg:  brands     tot=    0 new=    0 mod=    0
2021-07-01 07:02:29.205 xmltv: /usr/local/bin/tv_grab_zap2epg:  seasons    tot= 1114 new=   37 mod= 1077
2021-07-01 07:02:29.205 xmltv: /usr/local/bin/tv_grab_zap2epg:  episodes   tot=  942 new=   27 mod=  915
2021-07-01 07:02:29.205 xmltv: /usr/local/bin/tv_grab_zap2epg:  broadcasts tot= 1171 new=   38 mod= 1127

I know it is a lot of questions, but any guidance or direction would be much appreciated. Thank you all!!!

Hope this helps.

Note: I tested it on a Debian VM. Let me know if this works out so I can adjust the code accordingly.

edit4ever commented 3 years ago

While I don't have a lot of time to support this package - let me try to clarify some things.

zap2epg.py can run as a standalone file - but it looks for settings.xml for its configuration information. That is the file that you adjust for your lineup information and different detail layout. The xmltv.xml file will be output in the same location as the zap2epg.py file if run manually.

The tv_grab_zap2epg file goes into the location of the other tvheadend grabbers - or in simple Linux installs /usr/bin or /usr/local/bin. That file should be edited to point to whatever 'home' directory you installed zap2epg.py to, so that tvheadend can find it and run it. In this case ADDON_HOME and ADDON_DIR can be set to the same location.

That is all that should be necessary for a basic install and run.

I am hoping to review the recent pull requests - but also happy for someone else to take this project and run with it. :-)

SawKyrom commented 3 years ago

Thank you both for your input! @edit4ever I really appreciate your time and explanation on the file structure. The program has such potential with my application. Previously I was using zap2xml.pl, but there is an issue pulling data from zap2it and your script will no doubt be the solution. Kudos!

@th0ma7 Wow! Thank you for the very detailed walk-through and contributions to this project! I know you've spent some time and I can't thank you enough. I followed your steps with great care and this is what I discovered.

In my case, tv_grab_zap2epg needed to be placed in the /usr/bin directory. TVHeadend did not recognize it when placed in the /usr/local/bin directory, thus supporting edit4ever's statement:

The tv_grab_zap2epg file goes into the location of the other tvheadend grabbers - or in simple Linux installs /usr/bin or /usr/local/bin

I adjusted the BaseDir as suggested and commented out the non-applicable paths:

# Default path for RasbPi Kodi+TVH
# BaseDir="$HOME/script.module.zap2epg/epggrab"
# If running on synology NAS assume TVH only
# [ "$(uname -a | grep -i synology)" ] && BaseDir="$HOME/var/epggrab"
# If running on common linux under hts user
[ "$(id hts)" ] && BaseDir="$HOME/zap2epg"

CacheDir="$BaseDir/cache"
ConfDir="$BaseDir/conf"
ConfFile="$ConfDir/zap2epg.xml"
LogDir="$BaseDir/log"
LogFile="$LogDir/zap2epg.log"
XMLTV="$CacheDir/xmltv.xml"

All other settings were changed and verified, such as the EPG setting in TVH for this grabber and the anonymous user * with proper ALL access. I changed the settings file (zap2epg.xml) to 2 days. After a reboot, I selected the applicable grabber from the TVHeadend EPG Modules and initiated a run from within TVH rather than the CLI. I was able to verify a large increase in network activity (presumably the data fetch) and 15 min later I obtained the following TVHeadend log.

2021-07-01 17:56:56.000 xmltv: /usr/bin/tv_grab_zap2epg20210701: grab /usr/bin/tv_grab_zap2epg20210701
2021-07-01 17:56:56.000 spawn: Executing "/usr/bin/tv_grab_zap2epg20210701"
2021-07-01 18:12:15.686 xmltv: /usr/bin/tv_grab_zap2epg20210701: htsmsg_xml_deserialize error Unknown label referense: "& Icons Network
2021-07-01 18:12:15.686 xmltv: /usr/bin/tv_grab_zap2epg20210701: grab returned no data

The EPG source is still empty in the "Channels" tab in TVHeadend. However, the /home/hts/zap2epg/cache directory shows many episode.json files and an xmltv.xml file with listings. The .../log is present with the following last few lines:

2021/07/01 18:12:15 Parsing SH00002527.json
2021/07/01 18:12:15 Creating xmltv.xml file...
2021/07/01 18:12:15 Writing Stations to xmltv.xml file...
2021/07/01 18:12:15 Writing Episodes to xmltv.xml file...
2021/07/01 18:12:15 zap2epg completed in 919.48 seconds. 
2021/07/01 18:12:15 75 Stations and 4980 Episodes written to xmltv.xml file.

Since the data is not recognized in TVHeadend the way your output log above shows, maybe the cache file path is not in the correct location? Any thoughts?

I have the following file structure in the Debian Linux container:

root@TVHeadend:/home/hts# ls -a
.  ..  .hts  .xmltv  zap2epg
root@TVHeadend:/home/hts# cd .hts
root@TVHeadend:/home/hts/.hts# ls
tvheadend
root@TVHeadend:/home/hts/.hts# cd tvheadend
root@TVHeadend:/home/hts/.hts/tvheadend# ls
accesscontrol  bouquet  channel  config  dvr  epgdb.v2  epggrab  input  passwd  profile  superuser
root@TVHeadend:/home/hts/.hts/tvheadend# 

Do you think I should move the zap2epg folder containing cache and log to another location (i.e. .hts or one of its subdirectories)?

Once more, thank you all for your time and attention. I know this has helped me tremendously and will also help many others. Cheers!

edit4ever commented 3 years ago

This is actually a simple fix. It appears one of the channels that you are importing has an ampersand in the name ("& Icons Network"), which causes tvh to not import the data.

Assuming you are running the th0ma7 version - change line 347 in the tv_grab_zap2epg file from: fh.write('\t\t<display-name>' + convTitleExcept(xchnam) + '</display-name>\n') to fh.write('\t\t<display-name>' + convHTML(convTitleExcept(xchnam)) + '</display-name>\n')

That should solve the ampersand issue and it should load in.
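
To illustrate why the bare ampersand breaks the import: `&` starts an entity reference in XML, so inside element text it has to be written as `&amp;`. A minimal Python sketch, using the standard library's `xml.sax.saxutils.escape` as a stand-in for `convHTML` (whose implementation is not shown in this thread) and a made-up channel name:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def display_name_line(name):
    """Build a <display-name> line with &, < and > escaped for XML."""
    return "\t\t<display-name>" + escape(name) + "</display-name>\n"


def parses(xml_text):
    """Return True if xml_text is well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


channel = "News & Weather"  # hypothetical channel name, not from the thread
unescaped = "<display-name>" + channel + "</display-name>"
print(parses(unescaped))                   # False
print(parses(display_name_line(channel)))  # True
```

A parser reading the escaped line gets the original text back, ampersand included, so nothing is lost in the guide data.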

SawKyrom commented 3 years ago

Brilliant!!! Definitely solved the xmltv: /usr/bin/tv_grab_zap2epg20210701: htsmsg_xml_deserialize error Unknown label referense: "& Icons Network error.

If anyone gets the following error after making the previous changes:

2021-07-01 19:27:08.454 spawn:   File "<string>", line 308
2021-07-01 19:27:08.454 spawn:     fh.write('\t\t<display-name>' + convHTML(convTitleExcept(xchnam)) + '</display-name>\n')
2021-07-01 19:27:08.454 spawn:     ^
2021-07-01 19:27:08.454 spawn: IndentationError: expected an indented block

It is because I used the dreaded "tab" for aligning indentations. Must use spaces only if editing with nano. Rookie move!
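
Strictly speaking, Python 3 objects to indentation that mixes tabs and spaces inconsistently, so the safe approach is to match whatever the file already uses (apparently spaces here). A small Python sketch, with a hypothetical helper name, for spotting tab-indented lines before running the script:

```python
def lines_with_tab_indent(source):
    """Return the 1-based numbers of lines whose leading whitespace contains a tab."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Everything before the first non-whitespace character is the indent.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            hits.append(number)
    return hits


sample = "if x:\n\tpass\n    y = 1\n"
print(lines_with_tab_indent(sample))  # [2]
```

Running it over the edited file's contents (for example `lines_with_tab_indent(open(path).read())`) lists the lines to re-indent.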

The working output is as follows with still no episode listings:

2021-07-01 19:34:26.376 xmltv: /usr/bin/tv_grab_zap2epg20210701: grab took 58 seconds
2021-07-01 19:34:26.389 xmltv: /usr/bin/tv_grab_zap2epg20210701: parse took 0 seconds
2021-07-01 19:34:26.389 xmltv: /usr/bin/tv_grab_zap2epg20210701:  channels   tot=   75 new=   75 mod=   75
2021-07-01 19:34:26.389 xmltv: /usr/bin/tv_grab_zap2epg20210701:  brands     tot=    0 new=    0 mod=    0
2021-07-01 19:34:26.389 xmltv: /usr/bin/tv_grab_zap2epg20210701:  seasons    tot=    0 new=    0 mod=    0
2021-07-01 19:34:26.389 xmltv: /usr/bin/tv_grab_zap2epg20210701:  episodes   tot=    0 new=    0 mod=    0
2021-07-01 19:34:26.389 xmltv: /usr/bin/tv_grab_zap2epg20210701:  broadcasts tot=    0 new=    0 mod=    0

However, I now have EPG sources listed in the TVH Channels tab!!! I don't know if it was because I copied every zap2epg folder containing previously fetched data into each and every TVH subdirectory, or a result of the ampersand fix above. Both changes occurred concomitantly. Now that I have channels to use for mapping, I'll check if there is a specific directory TVH likes for cache/xmltv files.

You guys rock!

edit4ever commented 3 years ago

If you now have epg sources and they are matched up to their channels - just rerun the internal grabber and it will load in!

th0ma7 commented 3 years ago

Closing the issue. Created a new fork for this at https://github.com/th0ma7/tv_grab_zap2epg