geopython / pywps

PyWPS is an implementation of the Web Processing Service standard from the Open Geospatial Consortium. PyWPS is written in Python.
https://pywps.org
MIT License

initial watchdog #505

Closed cehbrecht closed 4 years ago

cehbrecht commented 4 years ago

Overview

This PR introduces a new watchdog module. The code includes the changes from PR #497.

Docs are not updated!

Changes:

I have moved the CLI from the birdhouse cookiecutter template to pywps: https://github.com/bird-house/cookiecutter-birdhouse/issues/52

You can start PyWPS with the watchdog like this:

$ pywps start -c /path/to/pywps.cfg --use-watchdog

You can also start both services standalone:

$ pywps start -c /path/to/pywps.cfg
$ pywps watchdog -c /path/to/pywps.cfg

The pywps.cfg needs to point to the processes list via a Python expression; see the Emu example: https://github.com/bird-house/emu/blob/6347264a8fd0b86bf6d0b7b5fc857aaf449003cd/pywps.cfg#L10
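For illustration only, such a config entry might look like the snippet below; the section and key names here are assumptions, see the linked Emu config for the actual entry.

[server]
# points to a Python list of Process instances, resolved via importlib
processes_import = emu.processes.processes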

This is an initial implementation to get the CLI working. See also issue #118.

Related Issue / Discussion

#497

Additional Information

Contribution Agreement

(as per https://github.com/geopython/pywps/blob/master/CONTRIBUTING.rst#contributions-and-licensing)

cehbrecht commented 4 years ago

@jachym @davidcaron @huard could you please have a look at this PR and give your opinion. You can try it with the Emu on the watchdog branch: https://github.com/bird-house/emu/tree/watchdog

cehbrecht commented 4 years ago

Should the watchdog be started as its own command?

$ watchdog start -c pywps.cfg
coveralls commented 4 years ago

Coverage Status

Coverage decreased (-2.008%) to 73.153% when pulling c8ac7dd28c2aa9424eb92d2fafe7c685979c4779 on cehbrecht:watchdog into 9709866ad6ade37895efbf831eb8579e0b9a36d2 on geopython:master.

cehbrecht commented 4 years ago

We also need a clean-up task to remove stalled jobs from the running queue.

cehbrecht commented 4 years ago

The scheduler joblauncher should be refactored to also load the job status from the database.

huard commented 4 years ago

Started Emu with pywps start --bind-host 0.0.0.0 -c pywps.cfg --use-watchdog then

from birdy import WPSClient
wps_a = WPSClient("http://localhost:5000/wps", progress=True)                                                                                                                                       
resp_a = wps_a.inout() 

fails with

HTTPError: 500 Server Error: INTERNAL SERVER ERROR for url: http://localhost:5000/wps

No such problem in sync mode.

huard commented 4 years ago

Also, would it be useful to have two logs, one for the daemon, and one for the server? Since the daemon generates a lot of noise in the logs (with DEBUG), it's hard to find the info relevant to the server.

cehbrecht commented 4 years ago

The daemon is sometimes called daemon, sometimes watchdog. I suggest a more descriptive name such as "RequestQueueDaemon".

class name: JobQueueHandler? module name: jobqueue or queue?

There seems to be a mechanism in your PR to have a list of processes in the configuration file. Is this documented?

Not yet. It is a quick solution ... I needed this to get the CLI into pywps. See code: https://github.com/cehbrecht/pywps/blob/7821562ef13c755b3b8ffab737a7c8241301308d/pywps/app/Service.py#L59

It uses a Python expression for the process list and loads it via importlib.
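As a rough sketch of that mechanism (illustrative names only, not the actual Service.py code), resolving such a Python expression boils down to:

import importlib

def load_processes(dotted_path):
    # e.g. "emu.processes.processes": module path plus the attribute
    # that holds the list of Process instances
    module_name, _, attr_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)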

huard commented 4 years ago

+1


cehbrecht commented 4 years ago

It is still working on my side ... I couldn't reproduce the exception.

Started Emu with pywps start --bind-host 0.0.0.0 -c pywps.cfg --use-watchdog then

from birdy import WPSClient
wps_a = WPSClient("http://localhost:5000/wps", progress=True)                                                                                                                                       
resp_a = wps_a.inout() 

fails with

HTTPError: 500 Server Error: INTERNAL SERVER ERROR for url: http://localhost:5000/wps

No such problem in sync mode.

cehbrecht commented 4 years ago

Also, would it be useful to have two logs, one for the daemon, and one for the server? Since the daemon generates a lot of noise in the logs (with DEBUG), it's hard to find the info relevant to the server.

I'm not sure I want to have two log files ... but I understand the trouble. As an alternative, we could have two different types of logging handlers ... skip the db logs in debug mode ... unless they are wanted. Also, the job queue can be triggered less frequently ... the default was 30 seconds.
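To illustrate the "two kinds of handlers" idea, here is a sketch of plain Python logging, not code from this PR; the logger name pywps.queue is an assumption based on the module name discussed above:

import logging

class DropJobQueueRecords(logging.Filter):
    """Keep chatty job-queue records out of the main server log."""
    def filter(self, record):
        return not record.name.startswith("pywps.queue")

server_handler = logging.FileHandler("pywps.log")
server_handler.addFilter(DropJobQueueRecords())
logging.getLogger("pywps").addHandler(server_handler)

# the daemon noise goes to its own file instead
queue_handler = logging.FileHandler("pywps-jobqueue.log")
logging.getLogger("pywps.queue").addHandler(queue_handler)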

cehbrecht commented 4 years ago

@huard Do you have an opinion on the CLI interface?

huard commented 4 years ago

Not really. Apart from the watchdog name change, it looks good to me.

cehbrecht commented 4 years ago

@huard @davidcaron renamed watchdog to:

class: pywps.queue.JobQueueService
cli: pywps jobqueue -c pywps.cfg
cehbrecht commented 4 years ago

@huard I have introduced a new pywps.log module to remove code duplication, and a config parameter db_echo=false to control database logging independently. The queue only logs at info level when there is a change in the queue (running, stored).

Probably logging can be improved in a later PR.
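For context, db_echo can be pictured as toggling SQLAlchemy's engine echo separately from the PyWPS log level; a minimal sketch, assuming the option lives next to the existing logging.database setting:

from sqlalchemy import create_engine

def make_engine(config):
    # SQL echo is controlled independently of the pywps log level
    db_url = config.get("logging", "database")
    db_echo = config.getboolean("logging", "db_echo", fallback=False)
    return create_engine(db_url, echo=db_echo)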

tlvu commented 4 years ago

FYI I've built an emu docker image with the tag watchdog here https://hub.docker.com/r/tlvu/emu/tags for those who want to test using a real docker image. I will try to bring this emu image into our PAVICS stack to see how it integrates. Will report back any findings.


tlvu commented 4 years ago

@cehbrecht do you know how to provide Emu with extra configs?

I did this in my docker-compose.yml where I set the env var PYWPS_CFG=/wps.cfg:

services:
  emu:
    image: ${EMU_IMAGE}
    container_name: emu
    environment:
      - PYWPS_CFG=/wps.cfg
    ports:
      - "8888:5000"
    volumes:
      - ./optional-components/emu/wps.cfg:/wps.cfg
      - wps_outputs:/data/wpsoutputs
      - /tmp
    restart: always

But it seems to ignore my /wps.cfg and only load the default one (/opt/wps/pywps.cfg):

$ docker exec emu cat /opt/wps/pywps.log
2019-12-18 22:56:11,703] [INFO] line=64 module=configuration loading configuration
2019-12-18 22:56:11,704] [INFO] line=151 module=configuration Configuration file(s) ['/opt/wps/pywps.cfg'] loaded
2019-12-18 22:56:12,008] [INFO] line=64 module=configuration loading configuration
2019-12-18 22:56:12,010] [INFO] line=151 module=configuration Configuration file(s) ['/opt/wps/pywps.cfg'] loaded
tlvu commented 4 years ago

I also have another bird (Raven) on my system, and its logs confirm that it loads multiple config files; one of them is /wps.cfg, which is added via the env var PYWPS_CFG=/wps.cfg the same way as for Emu:

$ docker logs raven
INFO: configuration.load_configuration(): loading configuration
INFO: configuration.load_configuration(): Configuration file(s) ['/opt/wps/raven/default.cfg', '/opt/wps/.custom.cfg', '/opt/wps/etc/demo.cfg', '/wps.cfg'] loaded
starting WPS service on http://localhost:9099/wps
INFO: _internal._log():  * Running on http://0.0.0.0:9099/ (Press CTRL+C to quit)
cehbrecht commented 4 years ago

@tlvu do you start the emu with:

pywps -c /wps.cfg

It will set the env PYWPS_CFG=/wps.cfg and it should be the only config used. Also the config needs to point to the processes list: https://github.com/bird-house/emu/blob/2a6bac16433f22bad96241e9d1847b7572a8e776/pywps.cfg#L10

This is just an intermediate step to get things working.

tlvu commented 4 years ago

@cehbrecht

@tlvu do you start the emu with:

pywps -c /wps.cfg

I built the Emu docker image so the start command is the one from the Dockerfile https://github.com/bird-house/emu/blob/2a6bac16433f22bad96241e9d1847b7572a8e776/Dockerfile#L28

It will set the env PYWPS_CFG=/wps.cfg and it should be the only config used. Also the config needs to point to the processes list: https://github.com/bird-house/emu/blob/2a6bac16433f22bad96241e9d1847b7572a8e776/pywps.cfg#L10

This is just an intermediate step to get things working.

Again, the default config is from the docker image, so it should be the one you referred to.

You can try my docker image here: tlvu/emu:watchdog.

This probably does not block me from testing the watchdog feature, but I was simply very surprised that Emu no longer seems to load multiple configs, like all the other birds do.

Note I am taking a sick day today, so I won't be monitoring my messages.

cehbrecht commented 4 years ago

@tlvu currently you can only provide a single config. It will be set by the cli module to PYWPS_CFG: https://github.com/cehbrecht/pywps/blob/e6d4a2873a32ee0b0ee44b1703ff251452468a89/pywps/cli.py#L62 https://github.com/cehbrecht/pywps/blob/e6d4a2873a32ee0b0ee44b1703ff251452468a89/pywps/configuration.py#L146

This should be changed when fixing the configuration.

I tried the docker-compose setup in Emu and it was at least starting and responding. When building, don't forget the clean step ... otherwise it will copy the existing database:

make clean
docker-compose build
docker-compose up
jachym commented 4 years ago

Should the watchdog be started as its own command?

$ watchdog start -c pywps.cfg

I think this does not matter.

Something like pywps-daemon would maybe be better than the neutral and not very informative watchdog.

jachym commented 4 years ago

I have created a pull request for this pull request :-) https://github.com/cehbrecht/pywps/pull/1 @cehbrecht I hope it looks good to you.

cehbrecht commented 4 years ago

@jachym @tlvu @davidcaron I have merged the alembic PR by Jachym and updated the command line to initialize the database. You need to run migrate before the first start and on updates:

$ pywps migrate -c pywps.cfg
cehbrecht commented 4 years ago

... ohh ... tests are failing ... no test DB.

cehbrecht commented 4 years ago

@tlvu @jachym I have changed the command line ... the --config option is now part of the pywps command (not the sub-command):

$ pywps -c pywps.cfg migrate
$ pywps -c pywps.cfg start
$ pywps -c pywps.cfg jobqueue
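For readers following the CLI changes, moving --config to the top-level command looks roughly like the sketch below (assuming a click-based CLI; names are illustrative, see pywps/cli.py on the branch for the real implementation):

import os
import click

@click.group()
@click.option("-c", "--config", metavar="PATH", help="path to pywps.cfg")
def cli(config):
    # handled once at the group level, before any sub-command runs
    if config:
        os.environ["PYWPS_CFG"] = config

@cli.command()
def start():
    """Start the WPS service."""

@cli.command()
def jobqueue():
    """Run the job queue service."""

@cli.command()
def migrate():
    """Upgrade the database schema."""

if __name__ == "__main__":
    cli()
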
cehbrecht commented 4 years ago

@jachym @davidcaron Should we change the config for the database settings in this PR or later? See issue #495

cehbrecht commented 4 years ago

@jachym @davidcaron We also need a "cancel" function for stalled jobs to clean the process queue. I already did some prototyping on it ... but I would prefer to get it in with a new PR. See also issues #448 and #417.

jachym commented 4 years ago

So, are we good?

cehbrecht commented 4 years ago

I would drop Python 2.7 before the 4.4 release. But that should be a quick change. Maybe we also want to change the database config.

I'm ok with merging. Waiting for feedback from: @tlvu @davidcaron @elemoine (?)

So, are we good?

* [ ]  Shall we merge?

* [ ]  Release 4.4?
davidcaron commented 4 years ago

Sorry, I was still on holiday break.

@cehbrecht For the database config, you mean changing the default from sqlite in memory to a file-based sqlite? If so, I think this release is a good time to do it, whether in this PR or in another.

@jachym good catch for the database migration.

I have a suggestion also: considering that the command to start the service changes (a new watchdog command to run jobs, and a migration is necessary on the first start and on updates), maybe it should be version 5 and not 4.4? There may not be "enough" features to make a major version, but as a user of the library I wouldn't expect breaking changes from 4.2 to 4.4 if we stick with semantic versioning.

Apart from that, I haven't tested extensively, but the PR looks good to me.

cehbrecht commented 4 years ago

@cehbrecht For the database config, you mean changing the default from sqlite in memory to a file-based sqlite? If so, I think this release is a good time to do it, whether in this PR or in another.

I had to check myself ... I meant moving the database config to a new config section, as mentioned in #495.

As a side note, IMO logging.database feels a bit misleading since it is not really specific to logging; it should probably be server.database. My 2c anyway.

For the version number (4.4 or 5.0) I would leave it to @jachym. But this will be done after the merge.

tlvu commented 4 years ago

I've deployed the Emu test service on the watchdog branch and it seems like async mode is not working for me; see the following JupyterLab screenshot. No problem in sync mode.

[screenshot: 2020-01-08-162212_801x546_scrot]

Matching logs:

$ docker exec emu tail /opt/wps/pywps.log
2020-01-08 21:24:21,498] [INFO] line=316 module=Service Request: getcapabilities
2020-01-08 21:24:22,025] [INFO] line=316 module=Service Request: describeprocess
2020-01-08 21:24:26,939] [INFO] line=316 module=Service Request: execute
2020-01-08 21:24:51,016] [INFO] line=37 module=queue PyWPS job queue: 0 running processes 1 stored requests
2020-01-08 21:24:51,031] [INFO] line=29 module=__init__ Processing mode: default
2020-01-08 21:24:51,123] [INFO] line=43 module=wps_say_hello say hello
2020-01-08 21:24:51,156] [INFO] line=266 module=Process Removing temporary working directory: /tmp/pywps_process_cfy81nlz
2020-01-08 21:25:21,086] [INFO] line=37 module=queue PyWPS job queue: 0 running processes 0 stored requests

You can access this notebook and reproduce this error here https://pavics.ouranos.ca/jupyter/user/public/lab/tree/users/lvu/test-pywps-watchdog/emu.ipynb.

The docker image is tlvu/emu:watchdog-2257d76-20200108.

Just for completeness, here is the working run in sync mode:

[screenshot: 2020-01-08-163202_656x326_scrot]

Matching logs:

$ docker exec emu tail -f /opt/wps/pywps.log
2020-01-08 21:31:30,589] [INFO] line=316 module=Service Request: getcapabilities
2020-01-08 21:31:31,006] [INFO] line=316 module=Service Request: describeprocess
2020-01-08 21:31:33,565] [INFO] line=316 module=Service Request: execute
2020-01-08 21:31:33,658] [INFO] line=43 module=wps_say_hello say hello
2020-01-08 21:31:33,685] [INFO] line=266 module=Process Removing temporary working directory: /tmp/pywps_process_ywvtght3
cehbrecht commented 4 years ago

I've deployed the Emu test service on the watchdog branch and it seems like async mode is not working for me; see the following JupyterLab screenshot. No problem in sync mode.

@tlvu I can confirm. Looks like a Python version issue. It seems to work in Python 3.6. In Python 3.8 I had the following exception in the log file:

line=45 module=queue PyWPS job queue failed: Can't pickle <function do_first at 0x1061484c0>: it's not the same object as jinja2.filters.do_first
cehbrecht commented 4 years ago

@tlvu The Python 3.8 failure is a different subject (see issue #508). For Python 3.6/3.7 it works ... but sometimes it does not. I have not yet figured out why.

cehbrecht commented 4 years ago

@jachym I have tried to update an existing pywps database (master) but this fails because the pywps_stored_requests table was already there:

sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) table pywps_stored_requests already exists

Can we fix this in the upgrade script?

Also, pywps has the option of configuring a table prefix: https://github.com/cehbrecht/pywps/blob/935197d2750a768788d9450a112ffa2bc5653a39/pywps/dblog.py#L53

Do we want to keep this? It will not be part of the upgrade script: https://github.com/cehbrecht/pywps/blob/935197d2750a768788d9450a112ffa2bc5653a39/pywps/alembic/versions/eead25e2ad3a_create_tables.py#L34

cehbrecht commented 4 years ago

@tlvu ... whenever I try to reproduce the failure, everything works :( I thought something with the database migration went wrong. What happens if you start with a clean database?

tlvu commented 4 years ago

@tlvu ... whenever I try to reproduce the failure, everything works :( I thought something with the database migration went wrong. What happens if you start with a clean database?

@cehbrecht Emu on my instance (a docker container) was starting from a clean database because the default DB is sqlite, which lives inside the container and is not persisted outside of it.

I see you pinned Python 3.6 in Emu. I'll get another docker build of Emu and test that out later.

Note my broken Emu image is tlvu/emu:watchdog-2257d76-20200108 if you want to try reproducing inside a clean docker container without any DB migration.

cehbrecht commented 4 years ago

@jachym I have the feeling we can move the pywps configuration loading from the alembic env.py: https://github.com/cehbrecht/pywps/blob/1d0d3d86e193b848cb77465557ea40831f27740d/pywps/alembic/env.py#L13

... to the migrate command: https://github.com/cehbrecht/pywps/blob/1d0d3d86e193b848cb77465557ea40831f27740d/pywps/cli.py#L123

Using alembic.config.Config: https://alembic.sqlalchemy.org/en/latest/api/config.html

I need to pass the pywps.cfg files. Or I can use the alembic config just to pass the pywps.cfg files and load them in the alembic env.py.
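A sketch of what calling alembic from the migrate command could look like, using the public alembic API; passing the pywps.cfg paths through cfg.attributes is an assumption, not the current code:

from alembic import command
from alembic.config import Config

def run_migrations(pywps_cfg_files, script_location="pywps/alembic"):
    cfg = Config()
    cfg.set_main_option("script_location", script_location)
    # env.py could read the pywps config files from config.attributes
    # instead of loading them itself
    cfg.attributes["pywps_cfg"] = pywps_cfg_files
    command.upgrade(cfg, "head")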

cehbrecht commented 4 years ago

Note my broken Emu image is tlvu/emu:watchdog-2257d76-20200108 if you want to try reproducing inside a clean docker container without any DB migration.

@tlvu I tried the image locally and it worked for me.

tlvu commented 4 years ago

@tlvu I tried the image locally and it worked for me.

@cehbrecht you mean the async worked? That's odd.

Can you provide me with steps to reproduce on my side?

cehbrecht commented 4 years ago

@tlvu I tried the image locally and it worked for me.

@cehbrecht you mean the async worked? That's odd.

Can you provide me with steps to reproduce on my side?

I didn't do anything special:

docker run -p 5000:5000 tlvu/emu:watchdog-2257d76-20200108 

Then I started several hello processes in async mode: http://localhost:5000/wps?service=wps&version=1.0.0&request=execute&identifier=hello&datainputs=name=stranger&status=true&storeExecuteResponse=true

I checked the output of the last started process and it finished successfully.
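For anyone reproducing this by hand, the async status can be polled from the statusLocation URL returned in the execute response; a minimal sketch using requests and the WPS 1.0.0 status element names (polling interval and timeout are arbitrary):

import time
import requests

def wait_for_result(status_location, timeout=60):
    """Poll a WPS 1.0.0 status document until the job reaches a final state."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        xml = requests.get(status_location).text
        if "ProcessSucceeded" in xml:
            return xml
        if "ProcessFailed" in xml:
            raise RuntimeError("process failed:\n" + xml)
        time.sleep(2)
    raise TimeoutError("no final status after %s seconds" % timeout)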

tlvu commented 4 years ago

@cehbrecht

I found it: the statusLocation was badly configured in my Emu instance, since I was not able to override it using PYWPS_CFG.

[screenshot: 2020-01-13-114851_686x573_scrot]

So I went into the container, manually changed the config file, and async is working:

[screenshot: 2020-01-13-122440_678x381_scrot]

So I think this PR is good to go. I haven't tested your new commit that allows overriding using PYWPS_CFG yet, but it will surely be immensely useful.

tlvu commented 4 years ago

Oh by the way, how do I see the list of queued jobs? And is there a way to delete a queued job?

tlvu commented 4 years ago

I've built tlvu/emu:watchdog-0b36f1b-20200113 with Python 3.6 pinned (not needed anymore) and pywps containing the new commit that allows overriding using PYWPS_CFG.

So my Emu instance now behaves correctly for async requests because the statusLocation can be properly set.

However, in the logs it does not seem to read multiple config files as expected (the default one, then the one specified by PYWPS_CFG), but only the single file specified by PYWPS_CFG. Is this just a matter of missing log output?

$ docker exec emu head  /opt/wps/pywps.log
2020-01-13 17:40:31,569] [INFO] line=64 module=configuration loading configuration
2020-01-13 17:40:31,570] [DEBUG] line=67 module=configuration setting default values
2020-01-13 17:40:31,570] [DEBUG] line=186 module=configuration trying to estimate the default location
2020-01-13 17:40:31,571] [INFO] line=151 module=configuration Configuration file(s) ['/wps.cfg'] loaded
2020-01-13 17:40:31,814] [DEBUG] line=27 module=literalvalidator validating literal value.
2020-01-13 17:40:31,814] [DEBUG] line=32 module=literalvalidator validation result: True
cehbrecht commented 4 years ago

@jachym I have hard-coded the default database prefix: https://github.com/geopython/pywps/pull/505/commits/2099b6e4494923eb9b4c5284b079519f295a4e47

Is this ok?

cehbrecht commented 4 years ago

@tlvu I tried the configuration override locally with Emu. It worked for me, and all the configs are written to the log file:

2020-01-14 14:09:39,986] [INFO] line=64 module=configuration loading configuration
2020-01-14 14:09:39,987] [INFO] line=151 module=configuration Configuration file(s) ['/Users/pingu/Documents/GitHub/birdhouse/emu/test.cfg'] loaded
2020-01-14 14:09:40,096] [INFO] line=64 module=configuration loading configuration
2020-01-14 14:09:40,097] [INFO] line=151 module=configuration Configuration file(s) ['pywps.cfg', '/Users/pingu/Documents/GitHub/birdhouse/emu/test.cfg'] loaded
2020-01-14 14:09:40,098] [INFO] line=64 module=configuration loading configuration
2020-01-14 14:09:40,104] [INFO] line=151 module=configuration Configuration file(s) ['pywps.cfg', '/Users/pingu/Documents/GitHub/birdhouse/emu/test.cfg'] loaded

There is still one config init step in the Emu application: https://github.com/bird-house/emu/blob/0b36f1baabaa33c5e16a60438e333d2a44c410b1/emu/__init__.py#L9

The app gets initialized without the configs from the command line.

There were also config init steps using the command-line config files in pywps; I have disabled them: https://github.com/geopython/pywps/commit/2099b6e4494923eb9b4c5284b079519f295a4e47

cehbrecht commented 4 years ago

@tlvu

Oh by the way, how do I see the list of queued jobs? And is there a way to delete a queued job?

I want to add the cancel function with a follow-up PR.

tlvu commented 4 years ago

@tlvu I tried the configuration override locally with Emu. It worked for me, and all the configs are written to the log file:

@cehbrecht I think I understood why the emu docker image did not find any default config files:

https://github.com/cehbrecht/pywps/blob/2099b6e4494923eb9b4c5284b079519f295a4e47/pywps/configuration.py#L178-L183

  1. PYWPS_INSTALLATION/etc/pywps.cfg
  2. /etc/pywps.cfg
  3. $HOME/.pywps.cfg

https://github.com/bird-house/emu/blob/0b36f1baabaa33c5e16a60438e333d2a44c410b1/Dockerfile#L31 Default config file in emu docker image: /opt/wps/pywps.cfg

As we see, the default config file inside the Emu docker image (/opt/wps/pywps.cfg) is not in any of the three locations above. Should we modify the Emu docker image to put the default config file in one of those three locations, or add more default locations?
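For illustration, the multi-location behaviour the other birds rely on comes down to handing every candidate path to configparser, which silently skips files that do not exist; a sketch of that idea, with the extra application path shown as the suggested (not current) addition:

import os
import configparser

def load_configuration(app_cfg=None):
    cfg = configparser.ConfigParser()
    candidates = [
        "/etc/pywps.cfg",
        os.path.expanduser("~/.pywps.cfg"),
    ]
    if app_cfg:                       # e.g. /opt/wps/pywps.cfg inside the image
        candidates.append(app_cfg)
    if "PYWPS_CFG" in os.environ:     # overrides come last
        candidates.append(os.environ["PYWPS_CFG"])
    loaded = cfg.read(candidates)     # read() ignores missing files
    return cfg, loaded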

cehbrecht commented 4 years ago

As we see, the default config file inside the Emu docker image (/opt/wps/pywps.cfg) is not in any of the three locations above. Should we modify the Emu docker image to put the default config file in one of those three locations, or add more default locations?

@tlvu we may want to add an additional default location for searching in the app path. Maybe we can do this in another PR? I think we have to change a bit more for this. Would it be OK to use /etc/pywps.cfg as an intermediate step? You only have one pywps in your container.