Open-EO / openeo-geopyspark-integrationtests

Integration tests for the GeoPySpark backend
Apache License 2.0

include tests for logs #3

Open bossie opened 1 year ago

bossie commented 1 year ago

Quite a bit of effort has gone into putting centralized logging in place across all components that make up our openEO back-ends: in Python and Java, in Spark drivers and executors, and in web app and batch job contexts, with IDs such as user ID, request ID and job ID to correlate log entries.

Recently, however, the logging infrastructure itself (Filebeat, Logstash, etc.) has been somewhat unreliable for reasons that are not entirely understood, and logs were not visible in Kibana.

Maybe the current integration tests can be extended to make sure that our logging still behaves correctly, both at the application level and at the infrastructure level.

For batch jobs, the /jobs/{job_id}/logs endpoint can be used.
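A test along those lines could assert on the shape and content of the endpoint's response. A minimal sketch, assuming the openEO API response shape `{"logs": [{"id": ..., "level": ..., "message": ...}], "links": [...]}`; the sample data below is illustrative, not taken from a real back-end:

```python
from typing import List


def extract_messages(logs_response: dict, min_level: str = "error") -> List[str]:
    """Return the messages of log entries at or above the given severity level."""
    levels = ["debug", "info", "warning", "error"]
    threshold = levels.index(min_level)
    return [
        entry["message"]
        for entry in logs_response.get("logs", [])
        if levels.index(entry.get("level", "debug")) >= threshold
    ]


# Example response as a /jobs/{job_id}/logs endpoint might return it:
sample = {
    "logs": [
        {"id": "1", "level": "info", "message": "job started"},
        {"id": "2", "level": "error", "message": "band not found"},
    ],
    "links": [],
}
assert extract_messages(sample) == ["band not found"]
assert extract_messages(sample, min_level="info") == ["job started", "band not found"]
```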

For synchronous requests this is not so simple: the request ID is not propagated to the client unless the request fails, in which case it is returned in the error message. It is therefore not possible to look up the logs that pertain to a particular successful request. Even if we always returned the request ID in a response header, regardless of the outcome, one question remains: how would the openeo-python-client expose this request ID so we can fetch the corresponding logs?
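One direction would be to let the client generate the ID itself and send it along, so it never depends on the response at all. A hypothetical sketch; the header name `X-Request-ID` is an assumption, not something confirmed in this thread:

```python
import uuid
from typing import Optional, Tuple


def with_request_id(headers: Optional[dict] = None) -> Tuple[dict, str]:
    """Return a copy of `headers` with a freshly generated request ID attached,
    plus the ID itself, so the caller can later search the logs for it."""
    request_id = str(uuid.uuid4())
    headers = dict(headers or {})
    headers["X-Request-ID"] = request_id  # hypothetical header name
    return headers, request_id


headers, rid = with_request_id({"Authorization": "Bearer ..."})
assert headers["X-Request-ID"] == rid
assert headers["Authorization"] == "Bearer ..."
```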

bossie commented 1 year ago

We might consider a pragmatic approach and say: this synchronous request that we started a minute ago succeeded, so the logs of the last minute should contain these log entries.
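The pragmatic approach amounts to searching the centralized logs within a time window around the request, filtered by user. A sketch of building such a query; the Elasticsearch-style field names (`@timestamp`, `user_id`) are assumptions about the logging setup, not confirmed here:

```python
import datetime as dt


def log_window_query(user_id: str, start: dt.datetime, end: dt.datetime) -> dict:
    """Build an Elasticsearch-style query for a user's log entries in [start, end]."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"user_id": user_id}},
                    {"range": {"@timestamp": {
                        "gte": start.isoformat(),
                        "lte": end.isoformat(),
                    }}},
                ]
            }
        }
    }


# Window covering the minute in which the synchronous request ran:
start = dt.datetime(2023, 1, 1, 12, 0, 0)
query = log_window_query("alice", start, start + dt.timedelta(minutes=1))
assert query["query"]["bool"]["filter"][0]["term"]["user_id"] == "alice"
```

The obvious trade-off is that the window also catches unrelated entries from the same user, so the test can only assert that the expected entries are present, not that they are the only ones.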

soxofaan commented 1 year ago

some other quick ideas:

bossie commented 1 year ago

How would you allow the client (programmer) to set the request header? Keep it in a global or per-connection variable, or are there better ways?

soxofaan commented 1 year ago

you can do both:
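The rest of this reply appears to be truncated in the capture. A sketch of what combining both could look like: a per-connection default plus a global override via a context variable. The `Connection` class and header name are illustrative, not the actual openeo-python-client API:

```python
import contextvars
from typing import Optional

# Global (per-context) request ID, settable independently of any connection.
request_id_var = contextvars.ContextVar("request_id", default=None)


class Connection:
    """Minimal stand-in for a client connection with default headers."""

    def __init__(self):
        self.default_headers: dict = {}  # per-connection variable

    def request_headers(self, extra: Optional[dict] = None) -> dict:
        """Merge per-connection defaults, the global context variable,
        and per-request extras, in increasing order of precedence."""
        headers = dict(self.default_headers)
        rid = request_id_var.get()
        if rid:
            headers["X-Request-ID"] = rid  # hypothetical header name
        headers.update(extra or {})
        return headers


conn = Connection()
conn.default_headers["X-Request-ID"] = "per-connection-id"
assert conn.request_headers()["X-Request-ID"] == "per-connection-id"

request_id_var.set("global-id")  # the global setting takes precedence
assert conn.request_headers()["X-Request-ID"] == "global-id"
```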