kartoza / geodata-mart

Geoprocessing as a Service
https://data.kartoza.com

Resolve WPS processing issues #49

Open zacharlie opened 2 years ago

zacharlie commented 2 years ago

Ideally we should be in a position to leverage the py-qgis-wps framework for scalable and powerful backend processing. It provides a status API and many of the benefits of a generic WPS, along with direct support for QGIS providers, plugins, scripts, and models.

A simple stack can be set up with the following docker-compose.yaml:

version: "3"

services:
  wps:
    image: 3liz/qgis-wps:ltr-rc
    platform: linux/x86_64
    environment:
      QGSWPS_SERVER_PARALLELPROCESSES: "2"
      QGSWPS_SERVER_LOGSTORAGE: REDIS
      QGSWPS_REDIS_HOST: wpsredis
      QGSWPS_PROCESSING_PROVIDERS_MODULE_PATH: /processing
      QGSWPS_CACHE_ROOTDIR: /projects
      QGSWPS_SERVER_WORKDIR: /srv/data
      # QGSWPS_USER: 1000:1000
      QGSRV_SERVER_RESPONSE_TIMEOUT: 1800
      QGSRV_SERVER_CROSS_ORIGIN: "yes"
      QGSWPS_LOGLEVEL: DEBUG
    volumes:
      - ./projects:/projects
      - ./processing:/processing
      - ./output:/srv/data
    ports:
      - "9999:8080"

  wpsredis:
    image: redis:5-alpine

The initial script developed for testing is on my development branch under commit c42ca95, which can be downloaded directly from GitHub:

The "test" directory contains a small sample data set that can be used for evaluation:

Note that QGIS Processing will require the script to be added to the default profile, as the processing framework does not currently support profiles. The script can be run locally with QGIS Processing, e.g.

C:\OSGeo4W\bin\qgis_process-qgis-ltr.bat run script:gdmclip --distance_units=meters --area_units=m2 --ellipsoid=EPSG:7030 --LAYERS='world, dem' --CLIP_GEOM='Polygon ((28.5 -28.0, 28.5 -29.0, 29.5 -29.0, 29.5 -28.0, 28.5 -28.0))' --OUTPUT_CRS='EPSG:4326' --BUFFER_DIST_KM=50 --PROJECT_PATH='c:/test/projects/sample.qgs' --OUTPUT='c:/test/output/geodatamart'

This produces a zip file containing the gpkg and the associated rasters.
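Since the same command will eventually need to be issued from a worker rather than a shell, here is a minimal sketch of building the argument list in Python. The function names are hypothetical, and the executable name (`qgis_process`) is an assumption that varies by platform (the example above uses the OSGeo4W batch file). Passing a list to `subprocess.run` avoids the shell quoting needed for the layer list and the WKT geometry.

```python
import subprocess


def build_qgis_process_args(executable, algorithm, params):
    """Return the argument list for `<executable> run <algorithm> --KEY=value ...`."""
    args = [executable, "run", algorithm]
    for key, value in params.items():
        args.append(f"--{key}={value}")
    return args


def run_algorithm(executable, algorithm, params, timeout=1800):
    """Run the algorithm; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(
        build_qgis_process_args(executable, algorithm, params),
        check=True,
        capture_output=True,
        text=True,
        timeout=timeout,
    )


# Parameters copied from the command above (paths are the local test paths)
params = {
    "distance_units": "meters",
    "area_units": "m2",
    "ellipsoid": "EPSG:7030",
    "LAYERS": "world, dem",
    "CLIP_GEOM": "Polygon ((28.5 -28.0, 28.5 -29.0, 29.5 -29.0, 29.5 -28.0, 28.5 -28.0))",
    "OUTPUT_CRS": "EPSG:4326",
    "BUFFER_DIST_KM": 50,
    "PROJECT_PATH": "c:/test/projects/sample.qgs",
    "OUTPUT": "c:/test/output/geodatamart",
}

args = build_qgis_process_args("qgis_process", "script:gdmclip", params)
# result = run_algorithm("qgis_process", "script:gdmclip", params)  # needs QGIS installed
```

Keeping the argument construction separate from the `subprocess` call means the command can be logged or tested without QGIS being available.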

Resetting the raster paths in the output project seems to be a bit buggy; I will fix this.

Storing the raster to geopackage does not work within the processing framework. When using the CLI, it throws an error. When using the same processing script from the QGIS GUI (3.26), it throws an error but still writes the output raster to the geopackage. When using the processing tools directly in QGIS, it works. The RASTER_TABLE parameter does not work properly though, unless used as TABLE.

Instead of a single clip and transform with GDAL, a multi-step process has been used as a workaround to try to identify or address the issues. The geopackage raster output step has been commented out and replaced with a flat-file raster output that is included in the zip.
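The packaging step of that workaround could look roughly like the sketch below. This is not the code from the script: the function name, the set of raster suffixes, and the flat directory layout are all assumptions made for illustration.

```python
import zipfile
from pathlib import Path

# Assumption: rasters are written as GeoTIFF files next to the geopackage
RASTER_SUFFIXES = {".tif", ".tiff"}


def bundle_outputs(output_dir, zip_path):
    """Zip the geopackage and adjacent rasters; return the archived file names."""
    output_dir = Path(output_dir)
    wanted = {".gpkg"} | RASTER_SUFFIXES
    # Collect members before creating the archive, sorted for a stable order
    members = sorted(
        path
        for path in output_dir.iterdir()
        if path.is_file() and path.suffix.lower() in wanted
    )
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in members:
            # Store files at the archive root so relative paths stay simple
            archive.write(path, arcname=path.name)
    return [path.name for path in members]
```

Storing everything at the archive root keeps the rasters adjacent to the geopackage after extraction, which matters if the output project references them by relative path.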

Assessing the WPS capabilities, operations, and input and output requirements can be done with OWSLib or, with a little more work, with vanilla requests as outlined by the py-qgis-wps tests.
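For the vanilla route, a rough sketch using only the standard library is shown here. It assumes a WPS 1.0.0 capabilities document with the standard OGC namespaces; the helper names are hypothetical and this is not taken from the py-qgis-wps tests.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Standard namespaces for WPS 1.0.0 documents
NS = {
    "wps": "http://www.opengis.net/wps/1.0.0",
    "ows": "http://www.opengis.net/ows/1.1",
}


def build_wps_url(base_url, request, **params):
    """Build a key-value-pair WPS request URL such as GetCapabilities."""
    query = {"service": "WPS", "request": request, **params}
    return f"{base_url}?{urlencode(query)}"


def list_process_identifiers(capabilities_xml):
    """Return the process identifiers advertised in a capabilities document."""
    root = ET.fromstring(capabilities_xml)
    path = "wps:ProcessOfferings/wps:Process/ows:Identifier"
    return [element.text for element in root.findall(path, NS)]


url = build_wps_url("http://127.0.0.1:9999/ows/", "GetCapabilities", MAP="sample")
# With the stack running, the document could be fetched and parsed with, e.g.:
#   import requests
#   print(list_process_identifiers(requests.get(url).text))
```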

OWSLib is easily installed with conda and is utilised as demonstrated below:

from owslib.wps import WebProcessingService, printInputOutput

# https://geopython.github.io/OWSLib/usage.html#wps

# skip_caps=True defers the GetCapabilities request until it is called explicitly
wps = WebProcessingService(
    "http://127.0.0.1:9999/ows/?service=WPS&MAP=sample", verbose=False, skip_caps=True
)
wps.getcapabilities()

# Service identification
print(wps.identification.type)
print(wps.identification.title)
print(wps.identification.abstract)

# Operations supported by the server
for operation in wps.operations:
    print(operation.name)

# Describe each process and list its inputs and outputs
for process in wps.processes:
    print(process.identifier, process.title)
    description = wps.describeprocess(process.identifier)
    for data_input in description.dataInputs:
        printInputOutput(data_input)
    for output in description.processOutputs:
        printInputOutput(output)
    print("----------")

Execute process:

from owslib.wps import WebProcessingService, monitorExecution
import re
import uuid

output_name = uuid.uuid4().hex
wps = WebProcessingService(
    "http://127.0.0.1:9999/ows/?service=WPS&MAP=sample", verbose=False, skip_caps=True
)
processid = "script:gdmclip"
inputs = [
    ("LAYERS", "world, dem"),
    ("CLIP_GEOM", "POLYGON((29.0 29.0,29.0 30.0,30.0 30.0,30.0 29.0,29.0 29.0))"),
    ("OUTPUT_CRS", "4326"),
    ("BUFFER_DIST_KM", "50"),
    ("OUTPUT", "OUTPUT"),
]

execution = wps.execute(processid, inputs, output=output_name)

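# The job UUID is embedded in the status URL of the raw response
response = str(execution.response)

match = re.search(r"uuid=(.*?)\"", response)
job_id = match.group(1) if match else None  # avoid shadowing the built-in id()

print(job_id)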

monitorExecution(execution)

The result is an "internal error" which has not been resolved despite various attempted workarounds and experiments.
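When debugging a failed job it helps to have the job UUID on hand. Here is a small sketch of extracting it by parsing the URL instead of using a regular expression. It assumes the server reports the job as a `uuid` query parameter in the status URL, as the regular expression above implies. In OWSLib that URL should be available as `execution.statusLocation`. The example URL is made up.

```python
from urllib.parse import parse_qs, urlparse


def job_uuid_from_status_url(status_url):
    """Return the `uuid` query parameter of a status URL, or None if absent."""
    query = parse_qs(urlparse(status_url).query)
    values = query.get("uuid")
    return values[0] if values else None


# Hypothetical status URL, for illustration only
example = "http://127.0.0.1:9999/ows/?service=WPS&request=GetResults&uuid=0123abcd"
job_uuid = job_uuid_from_status_url(example)
```

Returning `None` instead of raising keeps the caller in control when the server responds with an exception report that carries no status URL.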

Using the template processing script from QGIS (a simple vector buffer) works with the WPS and the compose stack provided.

In the meantime, the plan is to use QGIS Processing directly in the django/celery containers and revisit the WPS utilization later.

NyakudyaA commented 2 years ago

I tried to run the clip outputting to raster and it works on my machine running QGIS 3.26:

qgis_process run gdal:cliprasterbyextent --distance_units=meters --area_units=m2 --ellipsoid=EPSG:7030 --INPUT=/gis/src/kartoza_work/geodata-mart/geodata/test/projects/dem.tif --PROJWIN='23.750467063,25.932131664,-29.223057891,-27.450455402 [EPSG:4326]' --OVERCRS=false --OPTIONS= --DATA_TYPE=0 --EXTRA= --OUTPUT=/tmp/demo1.gpkg

Screenshot from 2022-07-17 17-22-23

zacharlie commented 2 years ago

I'll need to do further tests on that, but will focus efforts on 3.22 (maybe switching to 3.28 in October) so that we can make sure whatever processes we decide on are compatible with an LTR. It's a bit of a struggle to work against inconsistent results. For now I am parking the WPS support as it's rather difficult to debug, and running django against a QGIS container to execute custom scripts via a celery worker. The updated clipping script stores rasters on the file system adjacent to the geopackage, and then zips everything up as an output. I think pushing that output to a project-defined S3 bucket might be the next step forward, and once that is running smoothly I'll revisit using WPS and storing rasters in the gpkg directly...

NyakudyaA commented 2 years ago

Ok, cool. So if I want to test the whole process, do I just use your develop branch and start debugging from there?

zacharlie commented 2 years ago

Yeah, I think that is the best approach for now... Although if you want to run more tests against WPS directly (especially with OWSLib) for running custom models and scripts, I think that has the highest value, because it can be used in this stack and for other projects.