c-datculescu opened this issue 6 months ago
I'm having the same issue. It's a catch-22.
I solved it with a custom readiness check script that always returns positive on the first check, and afterwards only reports positive if the cached site is available on 127.0.0.1:8080. But it's a dirty hack, of course. What is the exact functionality of the frontend watch? What happens when I turn it off? Is it related to distributing signals, e.g. PURGE?
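For illustration, the workaround described above could look roughly like the following readiness probe. This is only a sketch of the idea, not an official configuration: the marker-file path, the port, and the use of `wget` are all assumptions; adjust them to your deployment.

```yaml
# Hypothetical sketch of the "always pass the first check" workaround.
# The marker-file path and probe URL are assumptions.
readinessProbe:
  exec:
    command:
      - /bin/sh
      - -c
      - |
        # Pass the very first check unconditionally to break the catch-22,
        # then require the cached site to answer on 127.0.0.1:8080.
        if [ ! -f /tmp/first-readiness-check-done ]; then
          touch /tmp/first-readiness-check-done
          exit 0
        fi
        wget -q -O /dev/null http://127.0.0.1:8080/ || exit 1
  periodSeconds: 5
```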
I have the same issue. It helps to have a Service with `.spec.publishNotReadyAddresses=true`, but then another problem appears.

When Pods are added (by scaling a Deployment or StatefulSet up) there is a race condition in `pkg/watcher/endpoints_watch.go:89`. Pods are added to the Service, but they are not necessarily ready for all conditions yet, and the check on that line discards such a Pod's address from the list. After the next event arrives (after scaling up again, for example) the skipped Pod will be included (assuming it is ready by then), but the newest Pod runs into the same race condition and will probably be missed as well.
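For reference, the partial workaround mentioned above is a single Service field. A minimal sketch, with the name, selector, and port as placeholders:

```yaml
# Minimal sketch: publish endpoint addresses even while Pods are not Ready.
# Service name, selector, and ports are placeholders.
apiVersion: v1
kind: Service
metadata:
  name: cache-cluster
spec:
  publishNotReadyAddresses: true
  selector:
    app: cache
  ports:
    - port: 8080
      targetPort: 8080
```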
I would suggest adding command line options to disable this check and always include all frontend/backend endpoints (depending on the CLI option):

- `--no-frontend-condition-check`
- `--no-backend-condition-check`
I could prepare a pull request with those CLI options if this solution is acceptable.
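To illustrate, the proposed flags would amount to making the readiness-condition check in the endpoint watcher optional. The following is a hypothetical, self-contained sketch; the `Address` type, `collectAddresses` function, and `skipConditionCheck` parameter are illustrative stand-ins, not the actual code from `pkg/watcher/endpoints_watch.go`:

```go
package main

import "fmt"

// Address is a simplified stand-in for an endpoint address plus its
// readiness condition, as reported by the Kubernetes Endpoints API.
type Address struct {
	IP    string
	Ready bool
}

// collectAddresses mirrors the filtering described above: by default it
// drops addresses whose Pod is not yet Ready (which is where the race
// condition bites); with skipConditionCheck set, every address is kept.
func collectAddresses(addrs []Address, skipConditionCheck bool) []string {
	var out []string
	for _, a := range addrs {
		if !skipConditionCheck && !a.Ready {
			continue // Pod not Ready yet: discarded until the next event
		}
		out = append(out, a.IP)
	}
	return out
}

func main() {
	addrs := []Address{
		{IP: "10.0.0.1", Ready: true},
		{IP: "10.0.0.2", Ready: false}, // freshly scaled-up Pod
	}
	fmt.Println(collectAddresses(addrs, false)) // only the ready Pod
	fmt.Println(collectAddresses(addrs, true))  // both Pods
}
```

With the check disabled, a freshly added Pod is published immediately instead of waiting for the next watch event, at the cost of possibly routing to a not-yet-ready instance.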
**Describe the bug**
When using clustering in combination with readiness gates (AWS ALB readiness gates), it is impossible to start the pods: no endpoint becomes available until the endpoints have been populated, but the endpoints are never populated until the readiness gate passes. This results in a loop that never allows a pod to start fully.
**To Reproduce**
Steps to reproduce the behavior:
**Expected behavior**
I would expect to be able to cluster the pods.
**Environment:**

**Configuration**

**Additional context**