Closed chaoran-chen closed 3 weeks ago
Thanks - it would be great if this health endpoint returned status 200 while LAPIS is waiting for its first data, which other endpoints do not do at the moment (as far as I am aware).
As soon as https://github.com/GenSpectrum/LAPIS-SILO/issues/244 is implemented, waiting for the first data shouldn't be a problem anymore, though.
Yes, great! I wasn't sure of the timing there. In the post-implementation-of-244 case, it may be possible to just use an existing endpoint for this.
In case curl and Bash aren't working well enough for making the check program, I could extend my api-query program[1] for the purpose. I've verified that I can produce a statically linked binary.
[1] https://github.com/pflanze/api-query/blob/master/src/main.rs
We already have some actuator endpoints enabled. That's probably the way we want to go.
For Kubernetes we already set it up for the Loculus backend:
From application.properties:
springdoc.show-actuator=true
management.endpoints.enabled-by-default=false
management.endpoint.health.enabled=true
management.endpoints.web.exposure.include=health
management.health.livenessState.enabled=true
management.health.readinessState.enabled=true
From the Kubernetes deployment:
livenessProbe:
  httpGet:
    path: "/actuator/health/liveness"
    port: 8079
  initialDelaySeconds: 30
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: "/actuator/health/readiness"
    port: 8079
And then call one of these endpoints in the healthcheck.
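For illustration, here is a minimal sketch of what such a check program could look like in Python. It is not taken from the LAPIS or Loculus repos. The names `is_up`, `check` and `main` are hypothetical, port 8079 is simply copied from the Loculus backend config above and may differ for LAPIS, and the sketch assumes the endpoint answers with Spring Boot's usual JSON body containing a top-level `status` field.

```python
import http.client
import json
import urllib.error
import urllib.request

# Assumption: port and path copied from the Loculus backend probes above;
# the LAPIS container may expose the actuator on a different port.
DEFAULT_URL = "http://localhost:8079/actuator/health/readiness"


def is_up(status_code: int, body: bytes) -> bool:
    """Return True only for HTTP 200 with a JSON body whose status is "UP"."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers invalid JSON as well as undecodable bytes.
        return False
    return isinstance(payload, dict) and payload.get("status") == "UP"


def check(url: str, timeout: float = 5.0) -> bool:
    """Fetch the health URL; network and HTTP errors count as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return is_up(response.status, response.read())
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # urlopen raises HTTPError (a URLError) for non-2xx answers such as 503.
        return False


def main(argv: list) -> int:
    """Return a process exit code: 0 if healthy, 1 otherwise."""
    url = argv[1] if len(argv) > 1 else DEFAULT_URL
    return 0 if check(url) else 1
```

A script wrapping this would end with `sys.exit(main(sys.argv))`. Docker's HEALTHCHECK treats exit code 0 as healthy and 1 as unhealthy, so the result maps directly onto the container health state. Whether Python is available in the LAPIS image is a separate question; the same logic would carry over to curl or a static binary.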
For sending Slack messages at a later point, check with me: I already made a separately runnable script for that with extracts from the servers repo, and I'm already due to integrate that back into a GenSpectrum repo in some way.
Also see #941 for some more information.
Related and complementary to #812, we should have a HEALTHCHECK in Docker to ensure that Docker detects when LAPIS is non-responsive. (Thanks to @theosanderson for suggesting this.)