geopython / pygeoapi

pygeoapi is a Python server implementation of the OGC API suite of standards. The project emerged as part of the next generation OGC API efforts in 2018 and provides the capability for organizations to deploy a RESTful OGC API endpoint using OpenAPI, GeoJSON, and HTML. pygeoapi is open source and released under an MIT license.
https://pygeoapi.io

Add job before returning the response #1672

Closed aulemahal closed 3 weeks ago

aulemahal commented 1 month ago

Hi!

Overview

This makes the base process manager add the job to its database before returning any response to the client.

Related Issue / discussion

In my testing, I send many async job requests at the same time. Once the response headers come back, I ping the server every second to see whether the job has succeeded.

On my slow VM, with more than 10 simultaneous requests, the first ping fails because the job doesn't exist yet. The server responds with: {'code': 'InvalidParameterValue', 'type': 'InvalidParameterValue', 'description': '<job id>'}.
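For illustration, the client-side polling described above could look roughly like the sketch below. This is not my actual test script: `poll_job` and the `fetch_status` callable are hypothetical names, and the sketch assumes a job status response carries a `status` field with values such as `successful` or `failed`, while a missing job yields the `InvalidParameterValue` body shown above.

```python
import time


def poll_job(fetch_status, job_id, interval=1.0, max_attempts=30):
    """Poll a job until it finishes.

    fetch_status is a hypothetical callable that takes a job id and
    returns the decoded JSON body of the job status request as a dict.
    """
    for attempt in range(1, max_attempts + 1):
        body = fetch_status(job_id)

        # Error body seen when the job is not (yet) in the manager's database.
        if body.get("code") == "InvalidParameterValue":
            raise LookupError(f"job {job_id} not found on attempt {attempt}")

        # Assumed terminal states of a job.
        if body.get("status") in ("successful", "failed"):
            return body["status"]

        time.sleep(interval)

    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} attempts")
```

With the current ordering, a loop like this can raise `LookupError` on its very first attempt even though the server already accepted the job.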

Digging into the code, I see that the manager's first add_job call, which registers the job in the database, happens inside the execution function. In the async case, this means that the Process is created and started between the initial response and the add_job call. Under (relatively) heavy load, as in my example, this takes more than 1 second.

In any case, it seems better to send the response with the accepted status only after the job has been registered in the database.
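The difference between the two orderings can be sketched with a toy manager. This is a simplified model, not pygeoapi's actual code: `SketchManager`, its methods, and the `register_first` flag are hypothetical, a thread stands in for the Process, a dict stands in for the job database, and `startup_delay` simulates slow worker start-up on a loaded machine.

```python
import threading
import time
import uuid


class SketchManager:
    """Toy stand-in for a process manager (hypothetical, for illustration only)."""

    def __init__(self, register_first):
        self.register_first = register_first
        self.jobs = {}  # job_id -> status; stands in for the job database
        self._lock = threading.Lock()

    def add_job(self, job_id):
        with self._lock:
            self.jobs[job_id] = "accepted"

    def get_job(self, job_id):
        with self._lock:
            return self.jobs.get(job_id)  # None means "job not found"

    def _run(self, job_id, startup_delay):
        time.sleep(startup_delay)  # simulated worker start-up latency
        if not self.register_first:
            # Previous ordering: the job is only registered inside the worker.
            self.add_job(job_id)
        with self._lock:
            self.jobs[job_id] = "successful"

    def execute_async(self, startup_delay=0.3):
        job_id = str(uuid.uuid4())
        if self.register_first:
            # Proposed ordering: register before anything is returned.
            self.add_job(job_id)
        worker = threading.Thread(target=self._run, args=(job_id, startup_delay))
        worker.start()
        # Returning here corresponds to sending the "accepted" response.
        return job_id, worker
```

With `register_first=False`, a `get_job` call made right after `execute_async` returns can come back empty, which matches the failing first ping. With `register_first=True`, the job is always visible by the time the caller gets its id.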

Additional information

I just started playing with pygeoapi, so I might have missed something that justifies the previous behaviour!

Dependency policy (RFC2)

Updates to public demo

Contributions and licensing

(as per https://github.com/geopython/pygeoapi/blob/master/CONTRIBUTING.md#contributions-and-licensing)