Open jichen-amplify opened 3 years ago
Hi, we have a load balancer sitting in front of our Keycloak cluster. As a new Keycloak instance is just starting and trying to join the cluster, we would like the load balancer to start forwarding requests to this new Keycloak instance only when the new Keycloak instance is healthy and has successfully joined the cluster. Can these health check endpoints be used by the load balancer to detect whether the new Keycloak instance is ready to receive any live requests?
My understanding is "absolutely" and it's exactly what I'm about to test in DEV for my team. You could set your LB to target the infinispan healthcheck endpoint specifically on your configured port and expect a 200. If an instance fails w/ a 503, it means that instance isn't yet recognized as part of the cluster. You could manually confirm in testing by curling the endpoint while it's failing a healthcheck and seeing if the "numberOfNodes" = what it should be and that "nodeNames" lists all the necessary hosts.
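For the manual confirmation step, here is a rough sketch of how the curled response could be checked. It assumes the body is a flat JSON object with the "numberOfNodes" and "nodeNames" fields mentioned above. The function name `cluster_view_ok` and the host names are made up for illustration, and the real payload from your health check extension may be nested differently.

```python
import json


def cluster_view_ok(body: str, expected_nodes: set[str]) -> bool:
    """Return True if the health payload reports the full expected cluster.

    Assumption: `body` is a flat JSON object with "numberOfNodes" (int)
    and "nodeNames" (list of strings). Adjust the lookups if your
    endpoint nests these fields.
    """
    payload = json.loads(body)
    reported = set(payload.get("nodeNames", []))
    count_matches = payload.get("numberOfNodes") == len(expected_nodes)
    # Every expected host must appear in the reported node list.
    all_present = expected_nodes <= reported
    return count_matches and all_present


# Hypothetical payloads: a new node that has not joined yet, then the full view.
joining = '{"numberOfNodes": 1, "nodeNames": ["kc-new"]}'
joined = '{"numberOfNodes": 3, "nodeNames": ["kc-a", "kc-b", "kc-new"]}'
expected = {"kc-a", "kc-b", "kc-new"}

print(cluster_view_ok(joining, expected))
print(cluster_view_ok(joined, expected))
```

You'd feed it the body you get back from curl while the instance is failing its healthcheck, plus the set of hosts you expect in the cluster.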
Can't imagine it'll be today or maybe not even tomorrow, but I'll set a reminder to come back and update you once I've tested on our end. Cheers!
Thanks for responding to me.
We have actually tested this by spinning up a new instance while running a load test on our test Keycloak cluster. What seemed to happen was that the new instance caused the whole cluster to get into an inconsistent state, and all requests started failing. This could very well be due to how we set up our cluster. I posted the question so that we could rule out the possibility that this health check extension doesn't support this use case.
Please let us know your test result. Thanks again!
No worries! I've been down a monitoring hellpath for the past 24hr so it just felt nice to see someone in a similar situation haha.
How is your cluster setup? Is your LB's healthcheck trying to hit a shared cluster endpoint, or do you have it running a health check on all instances/containers in the cluster?
Ours is set up so that the LB is running the healthcheck on each instance in the cluster, and we've got some failure intervals in place so that it's not considered an unhealthy instance until 3 checks in a row have failed.
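To make the "3 checks in a row" rule concrete, here's a minimal sketch of the logic, not how any particular load balancer implements it. The class name `ConsecutiveFailureTracker` is hypothetical, and it simplifies things by letting a single 200 reset the counter (real LBs usually have a separate healthy threshold too).

```python
class ConsecutiveFailureTracker:
    """Track health check results for one instance.

    The instance is only treated as unhealthy once `threshold`
    checks in a row have failed. Assumption: any status other
    than 200 counts as a failure, and one 200 resets the count.
    """

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.failures = 0

    def record(self, status_code: int) -> bool:
        """Record one check result; return True while still considered healthy."""
        if status_code == 200:
            self.failures = 0
        else:
            self.failures += 1
        return self.failures < self.threshold


tracker = ConsecutiveFailureTracker(threshold=3)
# Two 503s while the node is still joining, a 200, then three 503s in a row.
for status in [503, 503, 200, 503, 503, 503]:
    healthy = tracker.record(status)
    print(status, "healthy" if healthy else "unhealthy")
```

With a threshold of 3, only the final 503 in that sequence flips the instance to unhealthy, since the earlier 200 reset the streak.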
Our hope is that we can just reconfigure our healthchecks to point to the infinispan cluster status endpoint for each instance. So a new instance coming into the cluster wouldn't trigger any scaling or replacement events unless it simply had big issues joining the cluster to begin with. Otherwise the expected behaviour would be to reach healthy state, at which point we know the cluster status endpoint for that instance is healthy and we can expect the instance is handling cluster traffic.
I'll be sure to get back to you once we've got our test cluster in place or we test on our DEV cluster. =]
We have the same approach as yours. We are currently running our cluster in AWS EC2 as an auto scaling group (ASG). We have a load balancer sitting in front of the ASG and it is configured to monitor the health of each instance in the group using a health check endpoint. The load balancer would only forward the live traffic to an instance if its health check endpoint returns a 200 status code (we also have multiple checks in a row for detecting failures).
@jichen-amplify apologies for the delay in getting back to you.
We've tested this in DEV for a couple weeks, everything was great, and I've rolled it into PROD.
Basically we did two things:
There are a couple other things you could do here, too:
Hope this helps a bit & good luck!