File locations for the new version 3 API

hikavdh commented 8 years ago

I am currently playing around with an install script and I have some questions I'd like to hear opinions about. There are several file groups that have different properties and can go to different locations. Below is what I currently think about where to place them.

The 3 API scripts: they now go into the python tree in the tvgrabpyAPI directory.
The pickled text files: they now go into the python tree in tvgrabpyAPI/texts.
The frontend script tv_grab_nl3.py: I will let the python installer handle that. Under Linux in /usr/bin and under Windows ...
The configuration files: I'll leave them in ~/.xmltv. There is something to be said for putting them in /etc/tvgrabpyAPI, but then where under Windows? And we would need a group to provide write access. See also the next item.
The source files: this is tricky. They will mainly get downloaded on demand by the user, but they also come with the installation. So they should be in a location where the user has rw permissions. ~/.xmltv/sources would then be logical. But the installation will have to run as root, so how do we determine which user? Should we ask during installation? Another logical location could be /var/lib/tvgrabpyAPI. But then we should set up a group for the user to become a member of, or check for common mythtv, tvheadend, ... groups. Another problem is Windows users: there is no /var/lib.
Last, there are some scripts that will be in the package, but not in the installation.

winfried commented 8 years ago

On 05/25/2016 04:04 AM, Hika van den Hoven wrote:

Hi Hika,

You are aware of: https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard ?

> I am currently playing around with an install script and I have some questions I like to hear opinions about.

Are you using setuptools or a home-brew script? See also: http://python-packaging-user-guide.readthedocs.io/en/latest/installing/

I ask this to avoid conflicts between package managers. When you use the package manager of the OS, you would install in the /usr tree. If you use Python's setuptools, it will decide on its own which corner of the tree to install in. If you use something homebrew, then /usr/local would be the logical place.

> The pickled text files, they now go into the python tree in tvgrabpyAPI/texts

/usr/[local]/share would also be an option for these: formally they are not executables but data.

> The frontend script tv_grab_nl3.py. I will let the python installer handle that. Under Linux in /usr/bin and under Windows ...

or /usr/local/bin

> The configuration files, I'll leave them in ~/.xmltv. But there is something to say about putting them in /etc/tvgrabpyAPI. But then where under Windows? And we need a group to create write access. See also the next item.

Two considerations here: does each user run their own instance, or does it run daemonized as a central service? Second consideration: do you want a system-wide (default) configuration or not?

In my setup I would run it as a central service. For me it would even be convenient to run it under its own user or to run it as the mythtv user (doing that right now). In such cases it would make sense to use /etc/. But if you have multiple users each running their own instance, ~/.xmltv would be a more logical option. You can also make a hybrid here: use ~/.xmltv when present, but fall back to /etc when not.
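A minimal sketch of that hybrid lookup, not the grabber's actual code. The function name `find_config` is made up for illustration, and it assumes the per-user file carries the same name as the system-wide one (tv_grab_nl3.conf in /etc/tvgrabpyAPI, as mentioned later in this thread):

```python
import os


def find_config(home_dir, etc_dir='/etc/tvgrabpyAPI', name='tv_grab_nl3.conf'):
    """Return the configuration file to read, or None when there is none.

    Hypothetical helper: the per-user file in <home>/.xmltv wins, the
    system-wide file is only used when the user has no file of their own.
    """
    candidates = [
        os.path.join(home_dir, '.xmltv', name),  # per-user configuration
        os.path.join(etc_dir, name),             # system-wide fallback
    ]
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

Called as `find_config(os.path.expanduser('~'))`, it only reads; nothing is ever written to /etc during normal operation.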

Formally I would say: after installation, don't change anything in /etc/ without explicit consent of the user. Parts of the configuration that can be changed during normal operation of the script should be stored elsewhere.

> The source files, this is tricky. They will mainly get downloaded on need by the user, but they also come with the installation. So they should be in a location where the user has rw permissions. ~/.xmltv/sources would then be logical. But the installation will have to run as root, so how to determine what user. Should we ask during installation? Another logical location could be /var/lib/tvgrabpyAPI. But then we should set up a group for the user to become a member of. Or check on common mythtv, tvheadend,... groups. Another problem is Windows users. There is no /var/lib?

Run as a system service: /var/[local]/lib
Run as a normal user: ~/.xmltv/sources

Winfried

hikavdh commented 8 years ago

Thanks, that is mostly similar to my thoughts. And yes, I'm aware of the Filesystem Hierarchy Standard, although not intimately. But I am also looking at Windows users! ;-(

At present I'm playing with the python distutils, which can create an rpm or a Windows installer. I'm not aware of setuptools, but I can have a look. I go for /usr/bin and not /usr/local/bin or /opt/bin, which are often the same. First, it is the choice of distutils, and it is (at least on Gentoo) where the xmltv tools and grabbers are. I haven't yet checked what distutils does on Windows. (We have at least some Windows users.) The Windows users are also a reason for me to put the text pickles in the python tree. There you basically have only two locations: Program Files and the home directory.

It is an interesting thought, one that has vaguely been nagging at the back of my mind, to choose a two-way path: generic /etc and /var/lib (if the running user is root, for instance) and specific the home directory (except under Windows, for there the home directory is probably the only valid option). Then if you run as a user and a ~/.xmltv directory is not present, check /etc and /var/lib. Unless of course they are running --configure. I have had users with problems who lost their configuration: they had run --configure as root through sudo.

But the main reason to use the home directory is so that users who accidentally run simultaneously won't bite each other through the log or the cache (or the one overwriting the output of the other ;-) ). And I don't like to go down the path of creating a user group (or looking for a mythtv, tvheadend, xmltv, ... group). But I'm open to any good argument either way.

hikavdh commented 8 years ago

Oh Winfried, have you already looked at my new DataTree(Grab) module? I placed it in a separate repository (https://github.com/tvgrabbers/DataTree) with documentation and posted it on PyPI. I was almost in love when I wrote it, it is that elegant! I'll drop the package in the first new alpha, probably somewhere around the coming weekend.

winfried commented 8 years ago

I am fine with tvgrab running (fully) as user and storing configurations and variable data in ~/.xmltv (or whatever). It would also make installation and configuration straightforward...

Winfried

winfried commented 8 years ago

On 05/25/2016 09:36 AM, Hika van den Hoven wrote:

Hi Hika,

> have you already looked at my new DataTree(Grab) module. I placed it in a separate repository I'll drop the package in the first new alpha. Probably somewhere around the coming weekend.

Nice! I just took a brief look and it looks like a piece of elegant code to me. I mostly like the idea of separating the data-extraction language from the core of tvgrabnlpy and documenting it separately: it makes it a lot easier for others to develop definitions for sites.

Winfried

hikavdh commented 8 years ago

I was struggling with JSON, making it way too complicated with constructs like dicted lists. Then I looked at HTML, saw the similarities and went back to basic interpretation. And this is true object orientation, giving the parts some intelligence. I tried to keep all functionality generic, not linked to tv_grab, and to treat JSON and HTML equally. After extraction the lists get linked to keywords into dicts, following a simpler tv_grab oriented data-language and doing some tv_grab related data-validation. This is also in the definition JSON files, in the "values" keyword underneath "data".

I got so infatuated with it that I used a similar approach for the data-merging, organizing it in a channel/program node-tree and letting the channel and program nodes handle the merging, gap-filling etc. Also the dat_def language is somewhat extended to define the actual site fetching. I might add that to DataTree, but then I have to move over some of the code.

hikavdh commented 8 years ago

I created and tried this setup.py:

#!/usr/bin/env python2

'''
This package contains an API for tv_grabbers. All data is defined in JSON files,
including where and how to get the tv data. Multiple sources can be defined
and the resulting data is integrated into one XMLTV output file.
The detailed behaviour is highly configurable. See our site for more details.'''

from distutils.core import setup
from os import environ, name
from tvgrabpyAPI import version, __version__

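# Determine the home directory (HOME on Unix, HOMEPATH as a Windows fallback).
if 'HOME' in environ:
    home_dir = environ['HOME']
elif 'HOMEPATH' in environ:
    home_dir = environ['HOMEPATH']

# Windows has no /var/lib, so there the sources go below the user profile;
# everywhere else they are installed system-wide in /var/lib/tvgrabpyAPI.
if name == 'nt' and 'USERPROFILE' in environ:
    home_dir = environ['USERPROFILE']
    source_dir = u'%s/.xmltv/sources' % home_dir
else:
    source_dir = u'/var/lib/tvgrabpyAPI'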

setup(
    name = version()[0],
    version =  __version__,
    description = 'xmltv API based on JSON datafiles',
    packages = ['tvgrabpyAPI'],
    package_data={'tvgrabpyAPI': ['texts/tv_grab_text.*']},
    scripts=['tv_grab_nl3.py'],
    data_files=[(source_dir, ['sources/tv_grab_API.json',
                            'sources/tv_grab_nl.json',
                            'sources/source-virtual.nl.json',
                            'sources/source-horizon.tv.json',
                            'sources/source-humo.be.json',
                            'sources/source-nieuwsblad.be.json',
                            'sources/source-npo.nl.json',
                            'sources/source-oorboekje.nl.json',
                            'sources/source-primo.eu.json',
                            'sources/source-rtl.nl.json',
                            'sources/source-tvgids.nl.json',
                            'sources/source-tvgids.tv.json',
                            'sources/source-vpro.nl.json',
                            'sources/source-vrt.be.json'])],
    requires = ['pytz', 'requests', 'DataTreeGrab'],
    provides = ['%s (%s.%s)' % (version()[0], version()[1], version()[2])],
    long_description = __doc__,
    maintainer = 'Hika van den Hoven',
    maintainer_email = 'hikavdh at gmail dot com',
    license='GPL',
    url='https://github.com/tvgrabbers/tvgrabnlpy',
)

It works both under Linux and Windows. Weirdly enough, under Windows the sources directory is copied not only to the home directory but also to ...\site-packages\tvgrabpyAPI\sources.

I'll add code under Linux to check for running as root and will let the script quit unless it is running with --configure. In the latter case I will switch from the default ~/.xmltv to /etc/tvgrabpyAPI, giving the directory and files 755/644 rights. Then, when running as a normal user, I will use /etc as fallback for the configuration. This means: ~/.xmltv will be used for everything unless specified differently on the command line. However, if no configuration file is found there, a check is made for the presence of /etc/tvgrabpyAPI/tv_grab_nl3.conf, which is then used as configuration, while all other files are still placed in and used from ~/.xmltv.

For the sources I will check for write rights in /var/lib/tvgrabpyAPI and, if present, use that. Otherwise ~/.xmltv/sources is used, possibly copying over /var/lib/tvgrabpyAPI if ~/.xmltv/sources does not yet exist. I will suggest creating rw access to /var/lib/tvgrabpyAPI.
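The choice of the sources directory could look roughly like this. It is only a sketch under assumptions: `pick_source_dir` is a made-up name, and the files are copied one by one so the user's copy does not inherit the directory permissions of /var/lib/tvgrabpyAPI.

```python
import os
import shutil


def pick_source_dir(home_dir, system_dir='/var/lib/tvgrabpyAPI'):
    """Return the directory to read and store the JSON source files in.

    Hypothetical helper, not part of tvgrabpyAPI.
    """
    # Prefer the shared directory, but only if this user may write to it.
    if os.path.isdir(system_dir) and os.access(system_dir, os.W_OK):
        return system_dir

    user_dir = os.path.join(home_dir, '.xmltv', 'sources')
    if not os.path.isdir(user_dir):
        os.makedirs(user_dir)
        if os.path.isdir(system_dir):
            # First run: seed the user's copy with the installed files.
            for filename in os.listdir(system_dir):
                src = os.path.join(system_dir, filename)
                if os.path.isfile(src):
                    shutil.copy(src, user_dir)
    return user_dir
```

Note that `os.access` reports everything as writable for root, so this check only makes sense after the script has already refused to run as root.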

hikavdh commented 8 years ago

I am still thinking of moving tvgrabpyAPI to its own repository, cutting off the tvgrabnlpy branch. I can then also work there on the English documentation. Ultimately the JSON source files will move to sourcematching in a separate branch, with a copy, together with DataTreeGrab, going into every release.

hikavdh commented 8 years ago

I have been working on the json data download routine and I think I'll leave the source files out of the package. They would only be a fallback, and if I save the downloaded files I can add the data-version to the file name. That way I can check the present version without opening the file and go straight to downloading if a newer one is present.
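To illustrate the idea, a rough sketch. The naming scheme `<name>.<version>.json` (for example source-npo.nl.3.json) is purely an assumption for this example, as are the function names; how the remote version is learned is left out.

```python
import os
import re

# Assumed scheme: "<name>.<version>.json", e.g. "source-npo.nl.3.json".
VERSIONED = re.compile(r'^(?P<name>.+)\.(?P<version>\d+)\.json$')


def local_version(directory, name):
    """Return the highest data-version stored for `name`, or None."""
    versions = []
    for filename in os.listdir(directory):
        match = VERSIONED.match(filename)
        if match and match.group('name') == name:
            versions.append(int(match.group('version')))
    return max(versions) if versions else None


def needs_download(directory, name, remote_version):
    """True when there is no local copy or the remote one is newer."""
    current = local_version(directory, name)
    return current is None or remote_version > current
```

Only the directory listing is read, so no JSON file has to be opened or parsed to decide whether a download is needed.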

tvgrabbers / tvgrabnlpy

File locations for the new version 3 API #64