Closed ZeidH closed 1 day ago
Hi @ZeidH. Thanks for your PR.
I'm waiting for a infinispan member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test
on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.
Once the patch is verified, the new status will be reflected by the ok-to-test
label.
I understand the commands that are listed here.
Thanks for the clear instructions @ryanemerson!, I'm rather new to this so appreciate the help. Encountered this issue with controller-gen on my version of go (1.22.5) so I also updated it to latest (1.15.0).
Thanks for updating @ZeidH. It seems like we need to increase our baseline go version in order to use that version of controller-gen. I have created a PR to baseline on 1.21 (1.20 is now EOL) and update the controller-gen version https://github.com/infinispan/infinispan-operator/pull/2134. I'll let you know once that PR has been merged so that you can rebase this PR.
I've tested the autoscaling with this setup:
./scripts/ci/kind-with-olm.sh
make deploy-cert-manager
helm install keda kedacore/keda --namespace keda --create-namespace
skaffold dev
and a ScaledObject
that targets the Infinispan CR
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: infinispan-scaler
namespace: infinispan-operator-system
spec:
minReplicaCount: 0
scaleTargetRef:
apiVersion: infinispan.org/v1
kind: Infinispan
name: test-cache
triggers:
- metadata:
desiredReplicas: '2'
start: 52 8 * * *
end: 59 8 * * *
timezone: UTC
type: cron
---
apiVersion: infinispan.org/v1
kind: Infinispan
metadata:
annotations:
infinispan.org/monitoring: 'true'
infinispan.org/operatorPodTargetLabels: 'rht.comp,rht.comp_ver,rht.prod_name,rht.prod_ver,rht.subcomp_t'
name: test-cache
namespace: infinispan-operator-system
spec:
service:
container:
livenessProbe:
failureThreshold: 5
initialDelaySeconds: 0
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
readinessProbe:
failureThreshold: 5
initialDelaySeconds: 0
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
startupProbe:
failureThreshold: 600
initialDelaySeconds: 3
periodSeconds: 1
successThreshold: 1
timeoutSeconds: 1
type: DataGrid
jmx: {}
configListener:
enabled: true
logging:
level: info
upgrades:
type: Shutdown
replicas: 1
The Statefulset was able to scale from zero to n and back but I can imagine that for some configurations scaling to zero is not possible
Apologies for the delay @ZeidH, but main is now baselined on go 1.21. Can you please rebase on the latest code?
but I can imagine that for some configurations scaling to zero is not possible
Can you explain which configurations you think would cause issues?
but I can imagine that for some configurations scaling to zero is not possible
Can you explain which configurations you think would cause issues?
I can't think of any specific to Infinispan, the amount of available configurations is so immense it would take a long time to figure out what config can be autoscaled, and what cannot. I think after learning a bit on what Infinispan does the past few months, the configs where autoscaling doesn't work well also most likely wouldn't make much sense to use autoscaling/scale to zero, like clustered caches where you (almost)always want to have a hot instance.
But from experience there can be issues with autoscaling when using specific types of storage configurations, or as @rigazilla mentioned in the GH Issue: about the operator sometimes needs to explicitly set the .spec.Replicas
, i.e. in upgrade
What we attempted to do is to dynamically scale a replicated cache with the amount of nodes (1 cache per node), essentially making Infinispan behave like a DaemonSet
. Although we're not done with it just yet, so far it's been working out well.
Thanks @ZeidH and sorry for the delay. I'll create a follow up PR with some documentation. Let me know if you need this feature soon via OLM and I can expedite the next release.
Enhanced the
spec.subresources
field for theInfinispan
CRD to map scalingThis enables APIs like the HorizontalPodAutoscaler and KEDA to modify the replica count of the Cache.