Change worker pool instance type from m5.4xlarge to m5.2xlarge

keycloak / keycloak-benchmark

Keycloak Benchmark

https://www.keycloak.org/keycloak-benchmark/

Apache License 2.0

132 stars 74 forks source link

Change worker pool instance type from m5.4xlarge to m5.2xlarge #1002

Closed kami619 closed 3 weeks ago

kami619 commented 1 month ago

### Tasks
- [x] update the workerpool type in all places both IaC and Docs
- [x] run the tests to make sure deployment goes through as expected
- [x] analyze the run results for any issues
- [x] Update Keycloak docs - https://github.com/keycloak/keycloak/blob/316e00bb98fd33bac0af5fdbaf7caac4481bee82/docs/guides/high-availability/concepts-memory-and-cpu-sizing.adoc?plain=1#L130

kami619 commented 1 month ago

https://github.com/kami619/keycloak-benchmark/actions/runs/11217372907 - deployment tests are underway.

kami619 commented 1 month ago

Based on the run there are a few observations,

We can set the worker pool to m5.2xlarge ec2 type but we need to increase the --min-replicas to 4 as the deployment failed to wait to get enough CPU cores for our current deployment with -min-replicas set to 3.

The pricing would still be 50% less for 4 x m5.2xl than the 3 x m5.4xl nodes, while satisfying the vCPU core count requirement for our deployment.

For 4 m5.2xlarge instances: 4 × $0.384/hour=$1.536/hour

For 3 m5.4xlarge instances: 3 x $0.768/hour=$2.304/hour

There was another worker pool which has no auto-scaling enabled with the m5.xlarge ec2 type node and that seems like something we could do away with, not sure why we need a second type of worker pool to begin with.

@ahus1 let me know and I will make the necessary changes needed.

ahus1 commented 1 month ago

Hi Kamesh, thank you for the analysis. Doing the math again, I think it would save us 33% on the worker node costs.

I would have hoped the cluster would scale fast enough for the deployment to complete without changing the min-nodes. Maybe there is a timeout you can tweak?

Still, let's proceed with the change, and change the min number of nodes to 4 as you suggested, as it will speed up the overall deployment. You might want to change the max nodes from 10 to 15 in case someone does another deployment on that cluster to be on the safe side.

mhajas commented 1 month ago

@kami619 @ahus1 Should we also update this: https://github.com/keycloak/keycloak/blob/316e00bb98fd33bac0af5fdbaf7caac4481bee82/docs/guides/high-availability/concepts-memory-and-cpu-sizing.adoc?plain=1#L130 as part of this issue

kami619 commented 1 month ago

Good point @mhajas, we could do that once we review the results from nightly, and let's propagate that change.