hapifhir / hapi-fhir-jpaserver-starter

Apache License 2.0

Unable to disable liveness/startup/readiness probes when installing with helm #548

Closed gmej closed 1 year ago

gmej commented 1 year ago

There is currently no way to disable the Kubernetes probes when installing the server with the Helm chart. It would be nice to have an option to disable them.

In values.yaml, I have tried deleting the probe sections, and even leaving the probes undefined, but neither approach works:

#readinessProbe:
#  failureThreshold: 10
#  initialDelaySeconds: 300
#  periodSeconds: 20
#  successThreshold: 1
#  timeoutSeconds: 200

startupProbe:

chgl commented 1 year ago

You're right, the probes currently can't be disabled entirely, though that would be nice to have. It could be implemented by moving https://github.com/hapifhir/hapi-fhir-jpaserver-starter/blob/master/charts/hapi-fhir-jpaserver/templates/deployment.yaml#L70 above the probe definition, which would allow disabling a probe entirely by setting livenessProbe: {} etc. PR welcome!
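A rough sketch of what that change could look like, assuming the template wraps each probe in a Helm `with` block. The excerpt below is hypothetical and is not copied from the chart; the container name, path, and port name are taken from the `kubectl describe` output later in this thread. Because `with` skips its body when the value is empty, an empty probe value would drop the whole probe definition:

```yaml
# Hypothetical excerpt of templates/deployment.yaml (assumed structure,
# not the chart's actual contents).
containers:
  - name: hapi-fhir-jpaserver
    {{- with .Values.livenessProbe }}
    # Rendered only when .Values.livenessProbe is non-empty.
    livenessProbe:
      httpGet:
        path: /livez
        port: http
      {{- toYaml . | nindent 6 }}
    {{- end }}
```

Whether `livenessProbe: {}` is enough to override the chart's defaults, or whether `livenessProbe: null` is needed, depends on how Helm merges user values with the chart's values.yaml; that has not been verified here.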

XcrigX commented 1 year ago

Curious, what's the use case for disabling the probes? If it's for troubleshooting without the pod being restarted, couldn't you just set the probe thresholds very high?

gmej commented 1 year ago

The use case is ongoing development, not a production-ready system. I would rather configure these probes than disable them, but I am not able to configure them.

This is the output of kubectl describe deployment fhir (the output is trimmed to the info that is relevant):

Pod Template:
  Containers:
   hapi-fhir-jpaserver:
    Image:       docker.io/hapiproject/hapi:v6.6.0
    Ports:       8080/TCP, 8081/TCP
    Host Ports:  0/TCP, 0/TCP
    Liveness:    http-get http://:http/livez delay=300s timeout=300s period=20s #success=1 #failure=10
    Readiness:   http-get http://:http/readyz delay=300s timeout=200s period=20s #success=1 #failure=10
    Startup:     http-get http://:http/readyz delay=600s timeout=300s period=30s #success=1 #failure=10

The URLs look odd, and I don't know whether this is a bug or whether I have to configure the cluster so that it knows about these endpoints.

XcrigX commented 1 year ago

That looks unfamiliar to me. I thought the /readyz and /livez endpoints were for the K8s API server? In my (custom) K8s deployment I set the probes to point to /actuator/health, which hits the Spring Boot Actuator health endpoint.

When I describe the deployment it looks like:

    Liveness:   http-get https://:8443/actuator/health delay=30s timeout=5s period=60s #success=1 #failure=3
    Readiness:  http-get https://:8443/actuator/health delay=10s timeout=1s period=3s #success=1 #failure=3
    Startup:    http-get https://:8443/actuator/health delay=30s timeout=1s period=5s #success=1 #failure=1000

chgl commented 1 year ago

A more recent version of the Helm chart sets the MANAGEMENT_ENDPOINT_HEALTH_PROBES_ADD_ADDITIONAL_PATHS env var, with MANAGEMENT_SERVER_PORT set to 8081. This moves the actuator endpoints to a different port but keeps the health checks on 8080 under the readyz/livez paths. The idea is that this is more secure, since e.g. an external Prometheus only needs access to the metrics port it scrapes, while the probes stay on the main HTTP endpoint.
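As an illustration, the container spec this describes might look roughly like the fragment below. The two env var names and port 8081 come from the comment above; the value "true" and the exact probe layout are assumptions, chosen to match the `kubectl describe` output earlier in the thread:

```yaml
# Sketch of the relevant container settings (assumed layout, not the
# chart's rendered output).
env:
  # Spring Boot relaxed binding for
  # management.endpoint.health.probes.add-additional-paths
  - name: MANAGEMENT_ENDPOINT_HEALTH_PROBES_ADD_ADDITIONAL_PATHS
    value: "true"   # assumed value
  # Moves the actuator endpoints (including metrics) to a separate port
  - name: MANAGEMENT_SERVER_PORT
    value: "8081"
livenessProbe:
  httpGet:
    path: /livez    # served on the main port, named "http" here
    port: http
readinessProbe:
  httpGet:
    path: /readyz
    port: http
```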

Currently, you can't disable the probes entirely unless you use a chart post-renderer, but you could set the timeouts very high, or possibly create a PR to fix it. I might address it eventually.
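For the "set them very high" workaround, a values.yaml override along these lines may be enough. The top-level key names follow the snippet in the original post; the numbers are illustrative assumptions, not recommendations:

```yaml
# values.yaml override (illustrative numbers only)
startupProbe:
  periodSeconds: 30
  failureThreshold: 1000   # 1000 x 30s: roughly 8 hours of failed checks tolerated
livenessProbe:
  periodSeconds: 60
  failureThreshold: 1000   # 1000 x 60s: roughly 16 hours before a restart
readinessProbe:
  periodSeconds: 20
  failureThreshold: 1000
```

A failing readiness probe does not restart the container; it only removes the pod from the Service endpoints, so the liveness and startup thresholds are the ones that matter for avoiding restarts while debugging.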

XcrigX commented 1 year ago

Ahhh - I wasn't aware of the K8s-specific actuator properties. That's cool.

joshuabaird commented 10 months ago

@chgl Sorry to dig up an old thread -- but where are these probe endpoints actually created/defined? They are 401'ing for me -- and I'm just trying to figure out why.

XcrigX commented 9 months ago

@joshuabaird - These probe endpoints are created by Spring Boot. See https://docs.spring.io/spring-boot/docs/current/reference/html/actuator.html#actuator.endpoints.kubernetes-probes

chgl commented 9 months ago

> @chgl Sorry to dig up an old thread -- but where are these probe endpoints actually created/defined? They are 401'ing for me -- and I'm just trying to figure out why.

No worries, that does seem strange and maybe something I can fix in the chart. Any chance you have enabled authentication somewhere? If it's based on basic-auth, we can add it to the probes.
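If basic auth does turn out to be the cause, one possible shape for such a fix is to send an Authorization header with the probe request, using the `httpHeaders` field of a Kubernetes `httpGet` probe. This is only a sketch: the chart does not necessarily expose this setting, and the credentials below are placeholders:

```yaml
# Sketch of a probe that sends basic-auth credentials (placeholder values).
livenessProbe:
  httpGet:
    path: /livez
    port: http
    httpHeaders:
      - name: Authorization
        # Base64 of the placeholder "user:password"; replace with real credentials.
        value: Basic dXNlcjpwYXNzd29yZA==
```

Credentials embedded this way are visible to anyone who can read the Deployment, so exempting the probe paths from authentication may be the better option where the application allows it.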

joshuabaird commented 9 months ago

> > @chgl Sorry to dig up an old thread -- but where are these probe endpoints actually created/defined? They are 401'ing for me -- and I'm just trying to figure out why.
>
> No worries, that does seem strange and maybe something I can fix in the chart. Any chance you have enabled authentication somewhere? If it's based on basic-auth, we can add it to the probes.

Yeah - the main problem was that auth was enabled on those endpoints and I didn't know it!