@jachym @davidcaron @huard could you please have a look at this PR and give your opinion? You can try it with Emu on the watchdog branch: https://github.com/bird-house/emu/tree/watchdog
Should the watchdog be started as its own command?
$ watchdog start -c pywps.cfg
We also need a cleanup task to remove stalled jobs from the running queue (a sketch of such a task follows below). The scheduler joblauncher should be refactored to also load the job status from the database.
Started Emu with pywps start --bind-host 0.0.0.0 -c pywps.cfg --use-watchdog, then

from birdy import WPSClient
wps_a = WPSClient("http://localhost:5000/wps", progress=True)
resp_a = wps_a.inout()

fails with

HTTPError: 500 Server Error: INTERNAL SERVER ERROR for url: http://localhost:5000/wps

No such problem in sync mode.
Also, would it be useful to have two logs, one for the daemon, and one for the server? Since the daemon generates a lot of noise in the logs (with DEBUG), it's hard to find the info relevant to the server.
The daemon is sometimes called daemon, sometimes watchdog. I suggest a more descriptive name such as "RequestQueueDaemon".
class name: JobQueueHandler?
module name: jobqueue or queue?
There seems to be a mechanism in your PR to have a list of processes in the configuration file. Is this documented?
Not yet. It is a quick solution ... I needed this to get the CLI into pywps. See code: https://github.com/cehbrecht/pywps/blob/7821562ef13c755b3b8ffab737a7c8241301308d/pywps/app/Service.py#L59
It uses the Python expression of the process list and loads it via importlib.
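For illustration, a minimal sketch of such a loader; the section/option names are assumptions, not the exact PR code:

```python
import importlib

from pywps import configuration


def load_processes():
    """Resolve a dotted path from pywps.cfg to a list of Process objects.

    Assumes a config entry like
        [server]
        processes = emu.processes.processes
    (section/option names are illustrative).
    """
    dotted = configuration.get_config_value('server', 'processes')
    module_name, _, attr = dotted.rpartition('.')
    module = importlib.import_module(module_name)
    return getattr(module, attr)
```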
+1
It is still working on my side ... I couldn't reproduce the exception.
I'm not sure if I want to have two log files (one for the daemon, one for the server) ... but I understand the trouble. As an alternative we could have two different types of logging handlers ... skip the db logs in debug mode ... unless they are wanted. Also the job queue can be triggered less frequently ... the default was 30 secs.
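A rough sketch of the handler-filter idea, assuming the database noise comes from SQLAlchemy's engine logger (an assumption, not verified against the PR):

```python
import logging


class SkipDbEcho(logging.Filter):
    """Drop SQLAlchemy engine echo records from the server log."""

    def filter(self, record):
        return not record.name.startswith('sqlalchemy.engine')


handler = logging.FileHandler('pywps.log')
handler.addFilter(SkipDbEcho())   # server log stays readable in DEBUG mode
logging.getLogger().addHandler(handler)
```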
@huard Do you have an opinion on the CLI interface?
Not really. Apart from watchdog name change, it looks good to me.
@huard @davidcaron renamed watchdog to:
class: pywps.queue.JobQueueService
cli: pywps jobqueue -c pywps.cfg
@huard I have introduced a new pywps.log module to remove code duplication, and a config parameter db_echo=false to set database logging independently. The queue only logs in info mode when there is a change in the queue (running, stored). Logging can probably be improved in a later PR.
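For illustration, a pywps.cfg fragment with the new parameter; I'm assuming it sits in the existing [logging] section:

```ini
[logging]
level = INFO
database = sqlite:///pywps-logs.sqlite3
# new: control SQLAlchemy echo independently of the log level
db_echo = false
```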
FYI I've built an emu docker image with tag watchdog here: https://hub.docker.com/r/tlvu/emu/tags for those who want to test using a real docker image. I will try to bring this emu image into our PAVICS stack to see how it integrates. Will report back any findings.
Edit:
@cehbrecht do you know how to provide Emu with extra configs?
I did this in my docker-compose.yml where I set the env var PYWPS_CFG=/wps.cfg:
services:
  emu:
    image: ${EMU_IMAGE}
    container_name: emu
    environment:
      - PYWPS_CFG=/wps.cfg
    ports:
      - "8888:5000"
    volumes:
      - ./optional-components/emu/wps.cfg:/wps.cfg
      - wps_outputs:/data/wpsoutputs
      - /tmp
    restart: always
But it seems to ignore my /wps.cfg and only load the default one (/opt/wps/pywps.cfg):
$ docker exec emu cat /opt/wps/pywps.log
2019-12-18 22:56:11,703] [INFO] line=64 module=configuration loading configuration
2019-12-18 22:56:11,704] [INFO] line=151 module=configuration Configuration file(s) ['/opt/wps/pywps.cfg'] loaded
2019-12-18 22:56:12,008] [INFO] line=64 module=configuration loading configuration
2019-12-18 22:56:12,010] [INFO] line=151 module=configuration Configuration file(s) ['/opt/wps/pywps.cfg'] loaded
I also have another bird (Raven) on my system and the logs confirm it loads multiple config files, one of them /wps.cfg, which is added by the env var PYWPS_CFG=/wps.cfg the same way as Emu:
$ docker logs raven
INFO: configuration.load_configuration(): loading configuration
INFO: configuration.load_configuration(): Configuration file(s) ['/opt/wps/raven/default.cfg', '/opt/wps/.custom.cfg', '/opt/wps/etc/demo.cfg', '/wps.cfg'] loaded
starting WPS service on http://localhost:9099/wps
INFO: _internal._log(): * Running on http://0.0.0.0:9099/ (Press CTRL+C to quit)
@tlvu do you start the emu with:
pywps -c /wps.cfg
It will set the env PYWPS_CFG=/wps.cfg and it should be the only config used. Also the config needs to point to the processes list:
https://github.com/bird-house/emu/blob/2a6bac16433f22bad96241e9d1847b7572a8e776/pywps.cfg#L10
This is just an intermediate step to get things working.
@cehbrecht Regarding starting the emu with pywps -c /wps.cfg: I built the Emu docker image, so the start command is the one from the Dockerfile: https://github.com/bird-house/emu/blob/2a6bac16433f22bad96241e9d1847b7572a8e776/Dockerfile#L28
Regarding PYWPS_CFG=/wps.cfg being the only config used: again, the default config is from the docker image, so it should be the one you referred to.
You can try my docker image here: tlvu/emu:watchdog.
This probably does not block me from testing the watchdog feature, but I was simply very surprised that Emu seems to not load multiple configs anymore, like all the other birds.
Note I am taking a sick day today so won't be monitoring my messages.
@tlvu currently you can only provide a single config. It will be set by the cli module to PYWPS_CFG:
https://github.com/cehbrecht/pywps/blob/e6d4a2873a32ee0b0ee44b1703ff251452468a89/pywps/cli.py#L62
https://github.com/cehbrecht/pywps/blob/e6d4a2873a32ee0b0ee44b1703ff251452468a89/pywps/configuration.py#L146
This should be changed when fixing the configuration.
I tried the docker-compose in the emu and it was at least starting and responding. When building, don't forget the clean ... otherwise it will copy the existing database.
make clean
docker-compose build
docker-compose up
Should the watchdog be started as its own command? ($ watchdog start -c pywps.cfg)
I think this does not matter. Something like pywps-daemon would maybe be better than the neutral and not very informative watchdog.
I have created a pull request for this pull request :-) https://github.com/cehbrecht/pywps/pull/1 @cehbrecht I hope it looks good to you.
@jachym @tlvu @davidcaron I have merged the alembic PR by Jachym and updated the command line to initialize the database. You need to run migrate before first start and on updates:
$ pywps migrate -c pywps.cfg
... ohh ... tests are failing ... no test DB.
@tlvu @jachym I have changed the command line ... the --config option is now part of the pywps command (not the sub-command):
$ pywps -c pywps.cfg migrate
$ pywps -c pywps.cfg start
$ pywps -c pywps.cfg jobqueue
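For illustration, this is the usual click pattern for a group-level option (a sketch, assuming the cli module is built on click; not the exact PR code):

```python
import os

import click


@click.group()
@click.option('-c', '--config', metavar='PATH', help='Path to pywps.cfg.')
def cli(config):
    # The option sits on the group, so every sub-command below
    # (start, jobqueue, migrate) shares the same configuration.
    if config:
        os.environ['PYWPS_CFG'] = config


@cli.command()
def start():
    """Start the WPS service."""


@cli.command()
def jobqueue():
    """Run the job queue service."""


@cli.command()
def migrate():
    """Initialize or upgrade the database."""
```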
@jachym @davidcaron Should we change the config for the database settings in this PR or later? See issue #495
@jachym @davidcaron We also need a "cancel" function for stalled jobs to clean the process queue. I already did some prototyping on it ... but I would prefer to get it in with a new PR. See also issues #448 and #417.
So, are we good?
I would drop Python 2.7 before release 4.4. But that should be a quick change. Maybe we'd also like to change the database config.
I'm ok with merging. Waiting for feedback from: @tlvu @davidcaron @elemoine (?)
So, are we good?
* [ ] Shall we merge?
* [ ] Release 4.4?
Sorry, I was still in holiday break.
@cehbrecht For the database config, you mean changing the default from sqlite in memory to a file-based sqlite? If so, I think this release is a good time to do it, whether in this PR or in another.
@jachym good catch for the database migration.
I also have a suggestion: considering that the command to start the service changes (new watchdog command to run jobs, migration is necessary on first start and on update), maybe it should be version 5 and not 4.4? There may not be 'enough' features to make a major version, but as a user of the library I wouldn't expect breaking changes from 4.2 to 4.4 if we stick with semantic versioning.
Apart from that, I haven't tested extensively, but the PR looks good to me.
@cehbrecht For the database config, you mean changing the default from sqlite in memory to a file-based sqlite? If so, I think this release is a good time to do it, whether in this PR or in another.
I had to check myself ... I meant to move the database config to a new config section, like mentioned in #495
As a side note, IMO logging.database is a bit misleading when this is not really specific to logging; it should probably be server.database. My 2c anyway.
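For illustration, the suggested move (section and option names are only the suggestion from this thread, nothing final):

```ini
# current
[logging]
database = sqlite:///pywps-logs.sqlite3

# suggested in this thread
[server]
database = sqlite:///pywps-logs.sqlite3
```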
For the version number (4.4 or 5.0) I would leave it to @jachym. But this will be done after the merge.
I've deployed the emu test service on branch watchdog and it seems like the async mode is not working for me, see the following jupyterlab screenshot. No problem in sync mode.
Matching logs:
$ docker exec emu tail /opt/wps/pywps.log
2020-01-08 21:24:21,498] [INFO] line=316 module=Service Request: getcapabilities
2020-01-08 21:24:22,025] [INFO] line=316 module=Service Request: describeprocess
2020-01-08 21:24:26,939] [INFO] line=316 module=Service Request: execute
2020-01-08 21:24:51,016] [INFO] line=37 module=queue PyWPS job queue: 0 running processes 1 stored requests
2020-01-08 21:24:51,031] [INFO] line=29 module=__init__ Processing mode: default
2020-01-08 21:24:51,123] [INFO] line=43 module=wps_say_hello say hello
2020-01-08 21:24:51,156] [INFO] line=266 module=Process Removing temporary working directory: /tmp/pywps_process_cfy81nlz
2020-01-08 21:25:21,086] [INFO] line=37 module=queue PyWPS job queue: 0 running processes 0 stored requests
You can access this notebook and reproduce this error here: https://pavics.ouranos.ca/jupyter/user/public/lab/tree/users/lvu/test-pywps-watchdog/emu.ipynb. The docker image is tlvu/emu:watchdog-2257d76-20200108.
Just for completeness, here is the working run in sync mode:
Matching logs:
$ docker exec emu tail -f /opt/wps/pywps.log
2020-01-08 21:31:30,589] [INFO] line=316 module=Service Request: getcapabilities
2020-01-08 21:31:31,006] [INFO] line=316 module=Service Request: describeprocess
2020-01-08 21:31:33,565] [INFO] line=316 module=Service Request: execute
2020-01-08 21:31:33,658] [INFO] line=43 module=wps_say_hello say hello
2020-01-08 21:31:33,685] [INFO] line=266 module=Process Removing temporary working directory: /tmp/pywps_process_ywvtght3
@tlvu I can confirm. Looks like a python version issue. Seems to work in Python 3.6. In Python 3.8 I had the following exception in the log file:
line=45 module=queue PyWPS job queue failed: Can't pickle <function do_first at 0x1061484c0>: it's not the same object as jinja2.filters.do_first
@tlvu The Python 3.8 failure is a different subject (see issue #508). For Python 3.6/3.7 it works ... but sometimes it does not. I have not yet figured out why.
@jachym I have tried to update an existing pywps database (master) but this fails because the pywps_stored_requests table was already there:
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) table pywps_stored_requests already exists
Can we fix this in the upgrade script?
Also pywps has the possibility to configure a table prefix: https://github.com/cehbrecht/pywps/blob/935197d2750a768788d9450a112ffa2bc5653a39/pywps/dblog.py#L53
Do we want to keep this? It will not be part of the upgrade script: https://github.com/cehbrecht/pywps/blob/935197d2750a768788d9450a112ffa2bc5653a39/pywps/alembic/versions/eead25e2ad3a_create_tables.py#L34
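One possible fix in the upgrade script is to skip table creation when a pre-alembic database already has the table; a sketch (not the PR code, columns are illustrative):

```python
import sqlalchemy as sa
from alembic import op


def upgrade():
    bind = op.get_bind()
    inspector = sa.inspect(bind)
    # Only create the table if an older, pre-alembic pywps database
    # does not already contain it.
    if 'pywps_stored_requests' not in inspector.get_table_names():
        op.create_table(
            'pywps_stored_requests',
            sa.Column('uuid', sa.VARCHAR(255), primary_key=True),
            sa.Column('request', sa.LargeBinary),
        )
```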
@tlvu ... whenever I try to reproduce the failure, everything works :( I thought something with the database migration went wrong. What happens if you start with a clean database?
@cehbrecht emu on my instance (docker container) was starting from a clean database because the default DB is sqlite which is inside the container and not persisted outside the container.
I see you pin python 3.6 in emu. I'll get another docker build of emu and test that out later.
Note my broken emu image is tlvu/emu:watchdog-2257d76-20200108 if you want to try reproducing inside a clean docker container without any DB migration.
@jachym I have the feeling we can move the pywps configuration in the alembic env.py:
https://github.com/cehbrecht/pywps/blob/1d0d3d86e193b848cb77465557ea40831f27740d/pywps/alembic/env.py#L13
... to the migrate command:
https://github.com/cehbrecht/pywps/blob/1d0d3d86e193b848cb77465557ea40831f27740d/pywps/cli.py#L123
Using alembic.config.Config:
https://alembic.sqlalchemy.org/en/latest/api/config.html
I need to pass the pywps.cfg files. Or I can use the alembic config just to pass the pywps.cfg files and load them in alembic's env.py.
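A sketch of what the migrate command could do with alembic.config.Config; the wiring is assumed, and env.py would read the pywps config paths back from config.attributes:

```python
from alembic import command
from alembic.config import Config


def migrate(pywps_cfg_files):
    alembic_cfg = Config()
    # pywps ships its migration scripts as a package resource
    alembic_cfg.set_main_option('script_location', 'pywps:alembic')
    # hand the pywps config files over to alembic/env.py, which can
    # load them before building the database connection
    alembic_cfg.attributes['pywps_cfg'] = pywps_cfg_files
    command.upgrade(alembic_cfg, 'head')
```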
@tlvu I tried the image locally and it worked for me.
@cehbrecht you mean the async worked? That's odd.
Can you provide me with steps to reproduce on my side?
I didn't do anything special:
docker run -p 5000:5000 tlvu/emu:watchdog-2257d76-20200108
Then I was starting several hello processes in async mode:
http://localhost:5000/wps?service=wps&version=1.0.0&request=execute&identifier=hello&datainputs=name=stranger&status=true&storeExecuteResponse=true
I checked the output of the last started process and it ended successfully.
@cehbrecht I found it: the statusLocation was badly configured in my emu instance since I was not able to override it using PYWPS_CFG. So I went into the container, manually changed the config file, and async is working.
So I think this PR is good to go. I haven't tested your new commit that allows overriding using PYWPS_CFG yet, but it will surely be immensely useful.
Oh by the way, how do I see the list of queued jobs? And is there a way to delete a queued job?
I've built tlvu/emu:watchdog-0b36f1b-20200113 with python 3.6 pinned (not needed anymore) and pywps containing the new commit that allows overriding using PYWPS_CFG.
So my emu instance is behaving correctly for async requests because the statusLocation can be properly set. However, in the logs it does not seem to be reading multiple config files as expected (the default one, then the one specified by PYWPS_CFG) but only the one file specified by PYWPS_CFG. Is this just a matter of missing proper logs?
$ docker exec emu head /opt/wps/pywps.log
2020-01-13 17:40:31,569] [INFO] line=64 module=configuration loading configuration
2020-01-13 17:40:31,570] [DEBUG] line=67 module=configuration setting default values
2020-01-13 17:40:31,570] [DEBUG] line=186 module=configuration trying to estimate the default location
2020-01-13 17:40:31,571] [INFO] line=151 module=configuration Configuration file(s) ['/wps.cfg'] loaded
2020-01-13 17:40:31,814] [DEBUG] line=27 module=literalvalidator validating literal value.
2020-01-13 17:40:31,814] [DEBUG] line=32 module=literalvalidator validation result: True
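For reference, pywps merges configs by handing ConfigParser a list of paths; read() skips missing files, reports only the ones actually loaded, and later files override earlier ones. So if the cli passes only the PYWPS_CFG file, only that one shows up in the log. A minimal illustration:

```python
from configparser import ConfigParser

parser = ConfigParser()
# read() silently skips missing files and returns the ones it loaded;
# values from later files override values from earlier ones
loaded = parser.read(['/opt/wps/pywps.cfg', '/wps.cfg'])
print("Configuration file(s) %s loaded" % loaded)
```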
@jachym I have hard-coded the default database prefix: https://github.com/geopython/pywps/pull/505/commits/2099b6e4494923eb9b4c5284b079519f295a4e47
Is this ok?
@tlvu I tried the configuration overwrite locally with emu. It worked for me and all the configs are written to the log file:
2020-01-14 14:09:39,986] [INFO] line=64 module=configuration loading configuration
2020-01-14 14:09:39,987] [INFO] line=151 module=configuration Configuration file(s) ['/Users/pingu/Documents/GitHub/birdhouse/emu/test.cfg'] loaded
2020-01-14 14:09:40,096] [INFO] line=64 module=configuration loading configuration
2020-01-14 14:09:40,097] [INFO] line=151 module=configuration Configuration file(s) ['pywps.cfg', '/Users/pingu/Documents/GitHub/birdhouse/emu/test.cfg'] loaded
2020-01-14 14:09:40,098] [INFO] line=64 module=configuration loading configuration
2020-01-14 14:09:40,104] [INFO] line=151 module=configuration Configuration file(s) ['pywps.cfg', '/Users/pingu/Documents/GitHub/birdhouse/emu/test.cfg'] loaded
There is still one config init step in the emu application: https://github.com/bird-house/emu/blob/0b36f1baabaa33c5e16a60438e333d2a44c410b1/emu/__init__.py#L9
The app gets initialized without having the configs from the command line.
There were also config init steps using the command-line config files in pywps. I have disabled them: https://github.com/geopython/pywps/commit/2099b6e4494923eb9b4c5284b079519f295a4e47
@tlvu Regarding how to see the list of queued jobs and how to delete a queued job: I want to add the cancel function with a follow-up PR.
@cehbrecht I think I understood why the emu docker image did not find any default config files. The default locations are:
PYWPS_INSTALLATION/etc/pywps.cfg
/etc/pywps.cfg
$HOME/.pywps.cfg
https://github.com/bird-house/emu/blob/0b36f1baabaa33c5e16a60438e333d2a44c410b1/Dockerfile#L31
The default config file in the emu docker image is /opt/wps/pywps.cfg.
As we see, the default config file inside the emu docker image (/opt/wps/pywps.cfg) is not in any of the 3 locations above. Should we modify the emu docker image to put the default config file in one of those 3 locations, or add more default locations? A sketch of the search order follows below.
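A small sketch of that search order, with PYWPS_CFG appended last so it overrides the defaults (the function is illustrative, not the pywps implementation):

```python
import os

import pywps

DEFAULT_LOCATIONS = [
    # PYWPS_INSTALLATION/etc/pywps.cfg
    os.path.join(os.path.dirname(pywps.__file__), 'etc', 'pywps.cfg'),
    '/etc/pywps.cfg',
    # $HOME/.pywps.cfg
    os.path.expanduser('~/.pywps.cfg'),
]


def candidate_config_files():
    """Default locations first, PYWPS_CFG last so it overrides them."""
    files = [f for f in DEFAULT_LOCATIONS if os.path.isfile(f)]
    env_cfg = os.environ.get('PYWPS_CFG')
    if env_cfg:
        files.append(env_cfg)
    return files
```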
@tlvu we may want to add an additional default location for searching in the app path. Maybe we can do this in another PR? I think we would have to change a bit more for this. Would it be ok to use /etc/pywps.cfg as an intermediate step? You only have one pywps in your container.
Overview
This PR introduces a new watchdog module. The code includes PR #497. Docs are not updated!
Changes:
Configuration can be set via os.environ PYWPS_CFG.
I have moved the CLI command line from the birdhouse template to pywps: https://github.com/bird-house/cookiecutter-birdhouse/issues/52
You can start a pywps service with the watchdog like this:
$ pywps start --bind-host 0.0.0.0 -c pywps.cfg --use-watchdog
You can also start both services standalone:
$ pywps -c pywps.cfg start
$ pywps -c pywps.cfg jobqueue
The pywps.cfg needs to point to the processes list with a python list, see the Emu example: https://github.com/bird-house/emu/blob/6347264a8fd0b86bf6d0b7b5fc857aaf449003cd/pywps.cfg#L10
This is an initial implementation to get the cli working. See also issue #118.
Related Issue / Discussion
#497
Additional Information
Contribution Agreement
(as per https://github.com/geopython/pywps/blob/master/CONTRIBUTING.rst#contributions-and-licensing)