Harbor core and jobservice health checks timeouts

jowko commented 3 years ago

We are using Harbor Helm Chart 1.6.2 (which contains Harbor v2.2.2) on Kubernetes 1.19. The core and jobservice pods restart from time to time because their readiness and liveness probes time out.

By default, the timeout for the health check is 1 second. When I run the command below inside the containers, it sometimes takes a few seconds to respond (in most cases it responds quickly):

    curl localhost:8080/api/v2.0/ping

I don't know what causes the occasional long response times for the health checks. But because of the Harbor probe settings (failureThreshold of 2 for core and the default of 3 for jobservice), these pods are restarted frequently.

In my opinion, Harbor should either configure a larger timeout for these services or expose the probe configuration via values.yaml. In the Helm chart this is easy to do by adding a section like this to values.yaml:

core:
  readinessProbe:
    failureThreshold: 2
    periodSeconds: 10

Then the deployment definition can reference it (example copied from my own Helm chart; not tested for YAML validity):

        readinessProbe:
          httpGet:
            path: /api/v2.0/ping
            scheme: {{ include "harbor.component.scheme" . | upper }}
            port: {{ template "harbor.core.containerPort" . }}
          {{- .Values.core.readinessProbe | toYaml | nindent 10 }}

Then any chart user can specify whatever probe options they want.
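As a rough illustration of the proposal, a user override might look like the sketch below. The `core.readinessProbe` key is the hypothetical one suggested above, not an option confirmed to exist in the chart; the field names are the standard Kubernetes probe fields, and the values are only examples.

```yaml
# my-values.yaml (hypothetical: assumes the chart exposes core.readinessProbe
# as proposed in this issue)
core:
  readinessProbe:
    timeoutSeconds: 5      # Kubernetes defaults this to 1 second when unset
    failureThreshold: 3    # consecutive failures before the probe counts as failed
    periodSeconds: 10      # seconds between probe attempts
```

Because the whole map is passed through `toYaml`, any other probe field a user adds here would be rendered as well, without further template changes.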

ninjadq commented 3 years ago

Hi, in most environments local network requests will not take longer than 1 second, so I don't think we should change this config item.

jowko commented 3 years ago

Most requests in our environment also do not exceed 1 second, but from time to time they do. When that happens twice in a row, our service is restarted, and this can occur multiple times per day. We don't need to change the default timeout; we could make it configurable instead, as in the example above.

github-actions[bot] commented 7 months ago

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

jowko commented 7 months ago

Issue still exists

github-actions[bot] commented 5 months ago

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

jowko commented 5 months ago

Issue still exists

rossigee commented 5 days ago

We need to be able to manage 'timeoutSeconds' on the probes, otherwise Harbor just crashes unnecessarily whenever our server comes under higher-than-usual load.
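For reference, this is roughly what a rendered probe with an explicit timeout would look like in the core container spec. It is a sketch only: the path and port are taken from the curl command earlier in this thread, the `HTTP` scheme assumes internal TLS is not enabled, and the numeric values are illustrative rather than recommended.

```yaml
# Sketch of a rendered container probe (values are illustrative assumptions)
livenessProbe:
  httpGet:
    path: /api/v2.0/ping
    scheme: HTTP           # assumption: internal TLS disabled
    port: 8080             # port used by the curl check above
  timeoutSeconds: 5        # the setting that currently cannot be changed via values.yaml
  periodSeconds: 10
  failureThreshold: 2
```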

goharbor / harbor-helm

Harbor core and jobservice health checks timeouts #986