kedacore / keda

KEDA is a Kubernetes-based Event Driven Autoscaling component. It provides event driven scale for any container running in Kubernetes
https://keda.sh
Apache License 2.0

Cron Scaler: desiredReplicas is 0 but it sets to 1 anyway #5586

Open ecerulm opened 7 months ago

ecerulm commented 7 months ago

Report

I have a new namespace, deployment and scaledobject.

Although the example is contrived, it's the simplest setup I could find that reproduces the problem. The actual problem is that I wanted to use the cron scaler to scale a certain deployment down to 0 outside working hours, since that deployment runs on expensive hardware that is provisioned/deprovisioned dynamically.

I can see that initially the HPA says

  AbleToScale    True    SucceededGetScale  the HPA controller was able to get the target's current scale
  ScalingActive  False   ScalingDisabled    scaling is disabled since the replica count of the target is zero

Then, after the cron scaler does its thing, it changes to:

Name:                                                                 keda-hpa-nginx-deployment
Namespace:                                                            rubentest
Labels:                                                               app.kubernetes.io/managed-by=keda-operator
                                                                      app.kubernetes.io/name=keda-hpa-nginx-deployment
                                                                      app.kubernetes.io/part-of=nginx-deployment
                                                                      app.kubernetes.io/version=2.13.1
                                                                      scaledobject.keda.sh/name=nginx-deployment
Annotations:                                                          <none>
CreationTimestamp:                                                    Fri, 08 Mar 2024 15:53:29 +0100
Reference:                                                            Deployment/nginx-deployment
Metrics:                                                              ( current / target )
  "s0-cron-Europe-Stockholm-5615xxx-5616xxx" (target average value):  0 / 1
Min replicas:                                                         1
Max replicas:                                                         7
Behavior:
  Scale Up:
    Stabilization Window: 0 seconds
    Select Policy: Max
    Policies:
      - Type: Pods     Value: 4    Period: 15 seconds
      - Type: Percent  Value: 100  Period: 15 seconds
  Scale Down:
    Stabilization Window: 30 seconds
    Select Policy: Max
    Policies:
      - Type: Percent  Value: 100  Period: 15 seconds
Deployment pods:       1 current / 1 desired
Conditions:
  Type            Status  Reason            Message
  ----            ------  ------            -------
  AbleToScale     True    ReadyForNewScale  recommended size matches current size
  ScalingActive   True    ValidMetricFound  the HPA was able to successfully calculate a replica count from external metric s0-cron-Europe-Stockholm-5615xxx-5616xxx(&LabelSelector{MatchLabels:map[string]string{scaledobject.keda.sh/name: nginx-deployment,},MatchExpressions:[]LabelSelectorRequirement{},})
  ScalingLimited  True    TooFewReplicas    the desired replica count is less than the minimum replica count
Events:           <none>

It seems that the metric s0-cron-Europe-Stockholm-5615xxx-5616xxx says to scale to 1. In any case, the net result is that after the trigger fires a pod is created, even though the desired count was 0 (and the current count was also 0).
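As a rough illustration of why the HPA output above reports TooFewReplicas, here is a simplified sketch of how a replica count is derived from an average-value external metric and then clamped to the HPA bounds. The function name is hypothetical, and the real HPA controller also applies a tolerance and the scaling behavior policies, which this sketch ignores.

```python
import math


def hpa_recommendation(metric_total, target_average, min_replicas, max_replicas):
    """Simplified sketch: return (replicas, limited) for an average-value metric.

    This is an illustration, not the HPA controller's actual code.
    """
    # Raw recommendation: how many pods are needed so that each one
    # handles at most `target_average` of the metric.
    raw = math.ceil(metric_total / target_average)
    # The HPA never recommends fewer than its own minimum (1 in the output above).
    clamped = max(min_replicas, min(max_replicas, raw))
    return clamped, clamped != raw


# Values taken from the describe output: metric 0 / target 1, min 1, max 7.
print(hpa_recommendation(0, 1, 1, 7))  # (1, True)
```

With a metric value of 0 the raw recommendation is 0, but the HPA's minimum of 1 wins, which matches "the desired replica count is less than the minimum replica count".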

Expected Behavior

I expected that if the cron scaler says desiredReplicas: "0", no pod would be created.

Actual Behavior

KEDA seems to create a pod, increasing the desired replicas to 1.

Steps to Reproduce the Problem

Steps to reproduce:

manifest.jsonnet:

local timeIncrement = 1;

local labels = {
  app: 'nginx',
};

local namespace = {
  apiVersion: 'v1',
  kind: 'Namespace',
  metadata: {
    name: std.extVar('namespace')
  }
};

local deployment = {
  apiVersion: 'apps/v1',
  kind: 'Deployment',
  metadata: {
    name: 'nginx-deployment',
    namespace: namespace.metadata.name,
  },
  spec: {
    selector: {
      matchLabels: labels,
    },
    replicas: 0,
    template: {
      metadata: {
        labels: labels,
      },
      spec: {
        containers: [
        {
          name: 'nginx',
          image: 'nginx:latest',
          ports: [
          {
            containerPort: 80,
          },
          ],
        }
        ]
      },
    },
  }
};

local scaledobject = {
  apiVersion: "keda.sh/v1alpha1",
  kind: "ScaledObject",
  metadata: {
    name: deployment.metadata.name,
    namespace: namespace.metadata.name,
    annotations: {
      // "autoscaling.keda.sh/paused-replicas": "0",                # Optional. Use to pause autoscaling of objects
      // "autoscaling.keda.sh/paused": "true",                # Optional. Use to pause autoscaling of objects
    },
  },
  spec: {
    scaleTargetRef: {
      name: deployment.metadata.name,
    },
    idleReplicaCount: 0,
    minReplicaCount: 0,
    maxReplicaCount: 7,
    advanced: {
      horizontalPodAutoscalerConfig: {
        behavior: {
          scaleDown: {
            stabilizationWindowSeconds: 30,
          },
        },

      },
    },
    triggers: [
    {
      type: "cron",
      metadata: {
        timezone: "Europe/Stockholm",
        start: std.extVar("currentmin") + " * * * *",
        end: std.extVar("currentmin") + timeIncrement +  " * * * *",
        desiredReplicas: "0",
      },
    },
    // {
    //   type: "cron",
    //   metadata: {
    //     timezone: "Europe/Stockholm",
    //     start: std.extVar("currentmin") + timeIncrement +  " * * * *",
    //     end: std.extVar("currentmin") + (timeIncrement*2) + " * * * *",
    //     desiredReplicas: "4",
    //   },
    // },
    // {
    //   type: "cron",
    //   metadata: {
    //     timezone: "Europe/Stockholm",
    //     start: std.extVar("currentmin") + (timeIncrement*2) +  " * * * *",
    //     end: std.extVar("currentmin") + (timeIncrement*3) + " * * * *",
    //     desiredReplicas: "0",
    //   },
    // },
    ],
  },
};

[namespace,deployment,scaledobject]

Running this deletes the namespace "rubentest" and all the resources under it, then recreates the namespace, deployment and scaledobject with a cron scaler set to trigger 3 minutes after the current time.

kubectl delete ns rubentest; jsonnet -y -J vendor --ext-str namespace=rubentest --ext-code currentmin=$(date -v +3M +%M)  manifest.jsonnet|tee /dev/tty|kubectl apply -f -
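Note that `date -v +3M` is BSD/macOS syntax, and that the manifest computes the end minute as currentmin + 1, which yields 60 when the start minute is 59. A minimal sketch of computing both cron expressions with the minute wrapping correctly is shown below. The helper name and its parameters are hypothetical and not part of the original reproduction.

```python
from datetime import datetime, timedelta


def cron_window(now, delay_min=3, length_min=1):
    """Return (start, end) cron expressions for a short test window.

    Hypothetical helper: the window opens `delay_min` minutes after `now`
    and closes `length_min` minutes later. Using datetime arithmetic keeps
    the minute field within 0-59 even when the window crosses the hour.
    """
    start = now + timedelta(minutes=delay_min)
    end = start + timedelta(minutes=length_min)
    return f"{start.minute} * * * *", f"{end.minute} * * * *"


print(cron_window(datetime(2024, 3, 8, 15, 50)))  # ('53 * * * *', '54 * * * *')
print(cron_window(datetime(2024, 3, 8, 15, 56)))  # ('59 * * * *', '0 * * * *')
```

The two strings could then be passed to jsonnet as separate external variables instead of doing the addition inside the manifest.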

Logs from KEDA operator

Note the "Original Replicas Count": 0, "New Replicas Count": 1 entry below.


2024-03-08T14:53:29Z INFO Reconciling ScaledObject {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"nginx-deployment","namespace":"rubentest"}, "namespace": "rubentest", "name": "nginx-deployment", "reconcileID": "818f738b-2ea8-4143-8086-5c82cf92c035"}
2024-03-08T14:53:29Z INFO Adding Finalizer for the ScaledObject {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"nginx-deployment","namespace":"rubentest"}, "namespace": "rubentest", "name": "nginx-deployment", "reconcileID": "818f738b-2ea8-4143-8086-5c82cf92c035"}
2024-03-08T14:53:29Z INFO Detected resource targeted for scaling {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"nginx-deployment","namespace":"rubentest"}, "namespace": "rubentest", "name": "nginx-deployment", "reconcileID": "818f738b-2ea8-4143-8086-5c82cf92c035", "resource": "apps/v1.Deployment", "name": "nginx-deployment"}
2024-03-08T14:53:29Z INFO Creating a new HPA {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"nginx-deployment","namespace":"rubentest"}, "namespace": "rubentest", "name": "nginx-deployment", "reconcileID": "818f738b-2ea8-4143-8086-5c82cf92c035", "HPA.Namespace": "rubentest", "HPA.Name": "keda-hpa-nginx-deployment"}
2024-03-08T14:53:29Z INFO Initializing Scaling logic according to ScaledObject Specification {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"nginx-deployment","namespace":"rubentest"}, "namespace": "rubentest", "name": "nginx-deployment", "reconcileID": "818f738b-2ea8-4143-8086-5c82cf92c035"}
2024-03-08T14:53:29Z INFO Reconciling ScaledObject {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"nginx-deployment","namespace":"rubentest"}, "namespace": "rubentest", "name": "nginx-deployment", "reconcileID": "ef56fe27-a455-4623-9120-57d44c277671"}
2024-03-08T14:53:29Z INFO Detected resource targeted for scaling {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"nginx-deployment","namespace":"rubentest"}, "namespace": "rubentest", "name": "nginx-deployment", "reconcileID": "ef56fe27-a455-4623-9120-57d44c277671", "resource": "apps/v1.Deployment", "name": "nginx-deployment"}
2024-03-08T14:53:29Z INFO Reconciling ScaledObject {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"nginx-deployment","namespace":"rubentest"}, "namespace": "rubentest", "name": "nginx-deployment", "reconcileID": "77c794c1-2eb8-4942-89b2-0c7377652c29"}
2024-03-08T14:53:29Z INFO Detected resource targeted for scaling {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"nginx-deployment","namespace":"rubentest"}, "namespace": "rubentest", "name": "nginx-deployment", "reconcileID": "77c794c1-2eb8-4942-89b2-0c7377652c29", "resource": "apps/v1.Deployment", "name": "nginx-deployment"}
2024-03-08T14:56:29Z INFO scaleexecutor Successfully updated ScaleTarget {"scaledobject.Name": "nginx-deployment", "scaledObject.Namespace": "rubentest", "scaleTarget.Name": "nginx-deployment", "Original Replicas Count": 0, "New Replicas Count": 1}
2024-03-08T14:56:59Z INFO Reconciling ScaledObject {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"nginx-deployment","namespace":"rubentest"}, "namespace": "rubentest", "name": "nginx-deployment", "reconcileID": "3b3ee23b-bdc5-43bc-85f0-ab252fd5032d"}
2024-03-08T14:56:59Z INFO Detected resource targeted for scaling {"controller": "scaledobject", "controllerGroup": "keda.sh", "controllerKind": "ScaledObject", "ScaledObject": {"name":"nginx-deployment","namespace":"rubentest"}, "namespace": "rubentest", "name": "nginx-deployment", "reconcileID": "3b3ee23b-bdc5-43bc-85f0-ab252fd5032d", "resource": "apps/v1.Deployment", "name": "nginx-deployment"}

KEDA Version

2.13.1

Kubernetes Version

1.27

Platform

Amazon Web Services

Scaler Details

Cron Scaler

Anything else?

https://github.com/kedacore/keda/discussions/5585#discussioncomment-8719726

Fixes https://github.com/kedacore/keda-docs/pull/1332

ecerulm commented 7 months ago

If I understood correctly from https://github.com/kedacore/keda/discussions/5585#discussioncomment-8719171, https://github.com/kedacore/keda/issues/3609, https://github.com/kedacore/keda/issues/1759, https://github.com/kedacore/keda/issues/2153, https://github.com/kedacore/keda/issues/4474 and https://github.com/kedacore/keda/issues/3956, setting desiredReplicas: 0 with the cron scaler is not the right approach.

Would it be possible to validate this on the KEDA side? It seems (if I understood correctly) that desiredReplicas: 0 is always an error.

JorTurFer commented 7 months ago

The problem is that the scaler is just a metric source. Cron scaler isn't 100% clear about the behaviour, but desiredReplicas is not idempotent; it's just the value that will be exposed to the HPA controller, so if there is any other metric that requires more than 0, you won't scale to 0.

JorTurFer commented 7 months ago

In this case, instead of setting the cron with desiredReplicas: 0, you have to set the opposite cron, when you want to have AT LEAST desiredReplicas

ecerulm commented 7 months ago

"Cron scaler isn't 100% clear about the behaviour"

Yes, I already proposed some documentation changes at https://github.com/kedacore/keda-docs/pull/1332

But maybe you could propose clearer wording in that PR, since you seem to know the details and terminology better.

"if there is any other metric that requires more than 0, you won't scale to 0"

But there are no other metrics at all (that I know of). The only one that shows up is:

Metrics:                                                              ( current / target )
  "s0-cron-Europe-Stockholm-5615xxx-5616xxx" (target average value):  0 / 1

I guess that's why it's so hard for me to understand that this is expected: the only explicit instruction is to set it to 0, so where is the conflicting instruction to set it to 1 coming from?

"In this case, instead of setting the cron with desiredReplicas: 0, you have to set the opposite cron, when you want to have AT LEAST desiredReplicas"

Yes, and this is what I try to be explicit about in https://github.com/kedacore/keda-docs/pull/1332. For reference, I'll include it here too. The following ScaledObject uses the cron scaler to scale to 10 between 6 AM and 8 PM, and it scales down to 0 outside that period:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: cron-scaledobject
  namespace: default
spec:
  scaleTargetRef:
    name: my-deployment
  minReplicaCount: 0
  triggers:
  - type: cron
    metadata:
      timezone: Asia/Kolkata
      start: 0 6 * * *
      end: 0 20 * * *
      desiredReplicas: "10"

But desiredReplicas: 0 in the cron scaler will never have the intended result (again, if I understood correctly), so would it be better to have some validation that refuses it? Or is there any valid use of desiredReplicas: 0 for the cron scaler?

In any case, maybe explicit documentation will suffice. I wanted to set desiredReplicas: 0 explicitly because I didn't understand from the current cron scaler documentation that it would scale to 0 automatically, since that is the default minReplicaCount.

stale[bot] commented 5 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

joebowbeer commented 3 months ago

@ecerulm Should this be closed as WAI (working as intended)?

As already mentioned, specifying 0 desiredReplicas in a trigger is almost certainly a mistake.

The reason KEDA is scaling to 1 in this case is because of how the activation phase works.

The cron scaler is active whenever the current time is between the start and end times. If any trigger is active, then KEDA scales from 0 to 1 (activation phase), and then HPA scales from 1 .. desiredReplicas.
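The two phases described above can be sketched as a small model. This is only an illustration of the explanation in this comment, not KEDA's actual code: the class and function names are hypothetical, the cron schedule is reduced to a single daily time window, and maxReplicaCount is assumed to be 100.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class CronWindow:
    """Hypothetical stand-in for one cron trigger with a daily window."""
    start: time
    end: time
    desired_replicas: int

    def is_active(self, now):
        # The trigger is active whenever the current time is between start and end.
        return self.start <= now < self.end


def expected_replicas(triggers, now, min_replica_count=0, max_replica_count=100):
    active = [t for t in triggers if t.is_active(now)]
    if not active:
        # No active trigger: the workload falls back to minReplicaCount.
        return min_replica_count
    # Activation phase: any active trigger moves the workload from 0 to 1.
    floor = max(1, min_replica_count)
    # Scaling phase: the HPA then works between that floor and maxReplicaCount.
    wanted = max(t.desired_replicas for t in active)
    return max(floor, min(max_replica_count, wanted))


off_hours = CronWindow(time(20, 0), time(23, 59), desired_replicas=0)
print(expected_replicas([off_hours], time(21, 0)))  # 1, not 0
print(expected_replicas([off_hours], time(12, 0)))  # 0
```

In this model a trigger with desiredReplicas 0 still ends up at 1 replica during its window, because being active is what takes the workload off zero. Outside the window nothing is active, so the workload sits at minReplicaCount.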

joebowbeer commented 2 months ago

There is one case where desiredReplicas=0 is the simplest approach: when using a cron trigger in a formula.

In a formula, there's no way to differentiate an active cron from an inactive cron except by the difference in desiredReplicas. The inactive cron will always yield the value 1 (see defaultDesiredReplicas in cron_scaler.go), so the simplest way (arguably) to differentiate the active cron is to set its desiredReplicas=0.
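To make the formula case concrete, here is a small model of it. It assumes, as stated above, that an inactive cron yields 1. The formula itself and the queue_length metric are hypothetical examples, written as a Python function rather than in KEDA's formula syntax.

```python
# Value an inactive cron yields, per the comment above (defaultDesiredReplicas).
DEFAULT_DESIRED_REPLICAS = 1


def cron_metric(active, desired_replicas):
    """Model of the value a cron trigger contributes to a formula."""
    return desired_replicas if active else DEFAULT_DESIRED_REPLICAS


def formula(cron, queue_length):
    # Hypothetical formula: ignore the queue metric while the cron window is active.
    return 0 if cron == 0 else queue_length


print(formula(cron_metric(True, 0), 25))   # 0: window active, queue suppressed
print(formula(cron_metric(False, 0), 25))  # 25: window inactive, queue passes through

# With desiredReplicas=1 the active and inactive states look identical.
print(cron_metric(True, 1) == cron_metric(False, 1))  # True
```

The last line shows why 0 is a convenient choice here: it is a value the inactive cron does not produce under this assumption.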

NaruSaku commented 4 weeks ago

Hi, I'm not using the cron scaler, just a basic ScaledObject. I set the min replica count to 0 and the HPA shows 1 as min replicas, which is expected. But similarly, I get the conditions below, even though I'm pretty sure KEDA is not in the active state.

Conditions:
  Type            Status  Reason            Message
  ----            ------  ------            -------
  AbleToScale     True    ReadyForNewScale  recommended size matches current size
  ScalingActive   True    ValidMetricFound  the HPA was able to successfully calculate a replica count from external metric s2-prometheus(&LabelSelector{MatchLabels:map[string]string{scaledobject.keda.sh/name: ccs-integration-gaia-triton-server-scaledobject,},MatchExpressions:[]LabelSelectorRequirement{},})
  ScalingLimited  True    TooFewReplicas    the desired replica count is less than the minimum replica count

This makes the replica count oscillate between 0 and 1 forever. Strangely, with the exact same code/setup in another k8s environment, KEDA works as expected:

Conditions:
  Type            Status  Reason             Message
  ----            ------  ------             -------
  AbleToScale     True    SucceededGetScale  the HPA controller was able to get the target's current scale
  ScalingActive   False   ScalingDisabled    scaling is disabled since the replica count of the target is zero
  ScalingLimited  True    TooFewReplicas     the desired replica count is less than the minimum replica count