viters closed this issue 2 years ago
The readiness probe takes one second to reply, according to the logs. This can have many causes, but a slow DB is the most likely one. Your probe probably cancels the request after a second (the Kubernetes default for timeoutSeconds is 1), causing the pod to be killed!
@aeneasr Yeah, you are right, silly me. I actually had a similar problem with another service, but that service did report the problem in its logs:
storage in WARN state, the observed value 1116.7 is above the threshold of 750ms
I am closing the issue because it is definitely not a bug. Maybe an additional warning in the logs would be helpful, though.
Just wanted to add a comment here in case it helps anyone, as we had this exact same issue. Although I saw this thread in my search, I didn't believe it was the fix to our problem (the problem being that a new pod would take up to an hour to start).
The reason I didn't pay more attention at first was mainly my inexperience with Kubernetes readiness probes, and partly that we had been running Kratos for several months across 5 different installations, most on different database clusters and one at a completely different location, yet this started happening to all of them at the same time.
The fix was the same as mentioned above: add
timeoutSeconds: 2
to the readiness probe. Now the pods start in about 50 seconds.
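For anyone unsure where that line goes, here is a minimal sketch of a readiness probe on the Kratos container. Only `timeoutSeconds: 2` and the `/health/ready` path come from this thread; the port and the other values are assumptions for illustration (the period and failure threshold shown are simply the Kubernetes defaults), so adjust them to your own deployment.

```yaml
# Fragment of the Kratos container spec in a Deployment (sketch only).
readinessProbe:
  httpGet:
    path: /health/ready
    port: 4434            # assumption: Kratos admin port; use whichever port you expose
  periodSeconds: 10       # Kubernetes default, shown for clarity
  failureThreshold: 3     # Kubernetes default, shown for clarity
  timeoutSeconds: 2       # the fix from this thread; the default of 1 was too short
```

If the readiness endpoint regularly needs more than a second to answer, it is probably also worth checking why the database is slow rather than only raising the timeout.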
Thanks @aeneasr and @viters
Preflight checklist
Describe the bug
Extension to: https://github.com/ory/kratos/issues/1851.
Whenever I add a readinessProbe to the Kratos deployment, it never becomes ready.
The deployment will not become ready: the pod returns 503 (with no further explanation) no matter how much time passes, and it gets restarted after being marked as failing. Without readinessProbe, I can call /health/ready right after Kratos starts.
Reproducing the bug
I use minikube on a MacBook Pro M1 2020; the problem occurred on Big Sur and still occurs on Monterey (12.0.1).
/health/alive requests work fine, but /health/ready returns 503 and prevents the deployment from being marked as ready.
If I remove readinessProbe from the deployment configuration, then right after the pod starts I can forward traffic and call /health/ready myself with success: {"status":"ok"}
Relevant configuration
Everything is in Kustomization.
Version
oryd/kratos:v0.8.0-alpha.3-sqlite
On which operating system are you observing this issue?
macOS
In which environment are you deploying?
Kubernetes