geopython / pywps

PyWPS is an implementation of the Web Processing Service standard from the Open Geospatial Consortium. PyWPS is written in Python.
https://pywps.org
MIT License
175 stars 117 forks

Improve database management, files storage management and sub-process management #662

Open gschwind opened 2 years ago

gschwind commented 2 years ago

Overview

This patch series includes many changes to improve the handling of sub-processes. The patch series includes:

Missing piece:

More details on the patch series

Storage back-end rewrite: The rewrite simplifies the storage back-end API by allowing only read and write, with no API specific to one storage type or another (for example, I removed the copy/move optimisation). Files in the back-end storage may now be exported. Exported files can be downloaded via the /files endpoint using their identifier. I did not implement the S3 back-end because I can't test it, but I think nothing blocks its implementation on top of the new API.
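To illustrate the shape of such a read/write-only back-end, here is a minimal sketch. It is not the API from the patch: the class name `FileStorageSketch`, the `export` flag, the `url` helper and the base URL are all assumptions made for illustration.

```python
import os
import tempfile
import uuid


class FileStorageSketch:
    """Hypothetical read/write-only storage back-end (not the PyWPS API)."""

    def __init__(self, root):
        self.root = root
        self.exported = set()  # identifiers that may be served via /files

    def write(self, data, export=False):
        """Store bytes under a new identifier and return that identifier."""
        identifier = uuid.uuid4().hex
        with open(os.path.join(self.root, identifier), "wb") as handle:
            handle.write(data)
        if export:
            self.exported.add(identifier)
        return identifier

    def read(self, identifier):
        """Return the bytes stored under the identifier."""
        with open(os.path.join(self.root, identifier), "rb") as handle:
            return handle.read()

    def url(self, identifier, base_url="http://localhost:5000"):
        """Return the download URL of an exported file, or None otherwise."""
        if identifier not in self.exported:
            return None
        return f"{base_url}/files/{identifier}"


root = tempfile.mkdtemp()  # assumed storage directory for the demo
storage = FileStorageSketch(root)
public_id = storage.write(b"result", export=True)
private_id = storage.write(b"scratch")
```

With only `read` and `write` to implement, another back-end would need to provide the same two operations and nothing storage-specific.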

Multiple endpoints: This allows implementing dynamic status and managing file outputs via the storage back-end. There is no more direct file access; all file downloads go through the /files URL.
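As a rough sketch of how requests could be split between endpoints by path, assuming paths of the form `/wps`, `/files/<identifier>` and `/status/<identifier>` (the `route` function and the exact path layout are hypothetical, not taken from the patch):

```python
def route(path):
    """Map a request path to (endpoint, identifier). Hypothetical helper."""
    parts = [part for part in path.strip("/").split("/") if part]
    # WPS requests are recognised by a path ending in /wps.
    if parts and parts[-1] == "wps":
        return ("wps", None)
    # File downloads and status requests carry an identifier as last segment.
    if len(parts) >= 2 and parts[-2] in ("files", "status"):
        return (parts[-2], parts[-1])
    return ("unknown", None)
```

For example, `route("/files/abc123")` yields the `files` endpoint with identifier `abc123`, which a server could then look up in the storage back-end.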

New lock scheme for databases: The current lock scheme does not work in general. The filelock scheme is 100% safe only in some setups, in particular when a single WPS server accesses the database. The filelock will not work if a process that is not aware of the file lock tries to access the database, or if the database is shared between several servers. I did not find a 'standard' lock method shared across several database back-ends, so I implemented this one.
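The lock scheme used by the patch is not detailed in this description. Purely as an illustration of a lock that lives in the database itself, and therefore is visible to every server sharing that database, the sketch below uses a table with a primary key: inserting a row takes the lock, and a second insert fails while the row exists. The table name `locks`, the functions `acquire` and `release`, and the use of SQLite are all assumptions, not the patch's implementation.

```python
import sqlite3


def acquire(conn, name, owner):
    """Try to take the named lock; return True on success, False if held."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO locks (name, owner) VALUES (?, ?)", (name, owner)
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key already exists, so someone else holds the lock.
        return False


def release(conn, name, owner):
    """Release the named lock if it is held by this owner."""
    with conn:
        conn.execute(
            "DELETE FROM locks WHERE name = ? AND owner = ?", (name, owner)
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE locks (name TEXT PRIMARY KEY, owner TEXT NOT NULL)")
first = acquire(conn, "job-queue", "server-a")   # lock is free
second = acquire(conn, "job-queue", "server-b")  # lock is already held
release(conn, "job-queue", "server-a")
third = acquire(conn, "job-queue", "server-b")   # lock is free again
```

Because it relies only on a primary-key constraint, this kind of lock does not depend on a feature specific to one database back-end.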

I currently use this implementation on our server.

Best regards

Related Issue / Discussion

Additional Information

This contribution is supported by MINES ParisTech.

Contribution Agreement

(as per https://github.com/geopython/pywps/blob/master/CONTRIBUTING.rst#contributions-and-licensing)

coveralls commented 2 years ago

Coverage Status

Coverage decreased (-0.07%) to 80.978% when pulling f6fa83f2afe5167a8758486a941ede2fb6adf254 on gschwind:fix-status-failed-pull-request into 85ca8191f1440e5906210408d2fa48c41f0ca679 on geopython:main.

huard commented 1 year ago

I tried testing with https://github.com/bird-house/emu (a test and demo server) and all processes failed.

gschwind commented 1 year ago

I will have a look.

Thanks for the feedback :)

gschwind commented 1 year ago

Hello, after a first look I found two issues.

The first one is fixed by gschwind/pywps-emu@6e4e3aa394cfec59a61db1505d696affeff2c383 and is due to a change that makes the request URL end with /wps when a WPS request is made, to separate it from /files or /status requests. I may change that, for example by using the service query parameter instead.

The second issue is related to the new storage handling. I do not store the file in the storage immediately when code such as response.outputs["output"].file = "/some/path" is used, and it seems that during the tests the temporary files vanish before I try to put them into the store. If I put them into a directory that does not vanish, it works. In the tests, output files are stored in process.workdir; if I put them directly in "/tmp", everything goes well. I will investigate further.
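The failure mode can be reproduced outside PyWPS. In the sketch below, a `tempfile.TemporaryDirectory` stands in for process.workdir and a plain directory stands in for the storage back-end; `store_now` and `store_dir` are hypothetical names. Remembering only the path loses the file once the working directory is cleaned up, while copying it right away keeps the content.

```python
import os
import shutil
import tempfile

store_dir = tempfile.mkdtemp()  # stands in for the storage back-end


def store_now(path):
    """Copy the file into the store immediately and return the stored path."""
    target = os.path.join(store_dir, os.path.basename(path))
    shutil.copyfile(path, target)
    return target


with tempfile.TemporaryDirectory() as workdir:  # stands in for process.workdir
    output_path = os.path.join(workdir, "output.txt")
    with open(output_path, "w") as handle:
        handle.write("result")
    deferred = output_path           # only the path is remembered
    stored = store_now(output_path)  # content is copied while it still exists

# The working directory has now been removed.
deferred_exists = os.path.exists(deferred)
stored_exists = os.path.exists(stored)
```

After the `with` block, the deferred path no longer points to a file, whereas the stored copy is still readable.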

gschwind commented 1 year ago

Hello,

I fixed more tests related to the previous comment here: https://github.com/gschwind/pywps-emu/commits/update-test

I also have some updates to fix output storage, but I still need to fix an issue with inputs. I will try to fix it and update the branch once I have something I'm confident in.

I still have the test test_wps_ncml.py that I cannot fix and that is, in my opinion, an invalid test: it tries to retrieve data from a non-existent server. The test should therefore be rethought to use client.get(...) instead of d3.retrieveData().

Best Regards.

gschwind commented 1 year ago

Hello,

I am working on a replacement for this patch series; the first part is in #667.

Best regards