apecloud / kubeblocks

KubeBlocks is an open-source control plane software that runs and manages databases, message queues and other stateful applications on K8s.
https://kubeblocks.io
GNU Affero General Public License v3.0

[BUG] Pods are deleted when vscale CPU/memory limits exceed the namespace quota #5366

Open ahjing99 opened 11 months ago

ahjing99 commented 11 months ago

    ➜ ~ kbcli version
    Kubernetes: v1.27.3-gke.100
    KubeBlocks: 0.6.3-beta.3
    kbcli: 0.6.3-beta.3

When a vertical scaling (vscale) request sets CPU/memory limits that exceed the namespace quota, the pods are deleted and cannot be recreated. The operation should be blocked at the start when it would exceed the quota.
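
One way to picture such an up-front check is to compare the projected usage with the quota before any pod is touched. The following is a minimal sketch in Python, not KubeBlocks code: `parse_quantity` and `quota_violations` are hypothetical helpers, only a subset of Kubernetes quantity suffixes is handled, and the `used` values are assumed, since the issue only shows usage after the pod had already been deleted.

```python
from decimal import Decimal

# Subset of the suffixes used by Kubernetes resource quantities.
_SUFFIXES = {
    "Ki": Decimal(2) ** 10, "Mi": Decimal(2) ** 20, "Gi": Decimal(2) ** 30,
    "m": Decimal("0.001"), "k": Decimal(10) ** 3, "M": Decimal(10) ** 6,
    "G": Decimal(10) ** 9,
}


def parse_quantity(text):
    """Convert a quantity such as '100m', '2' or '4Gi' to a plain number."""
    # Try two-letter suffixes first so that 'Mi' is not mistaken for 'M'.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(text)


def quota_violations(hard, used, current, requested, replicas=1):
    """Return {resource: (projected, hard)} for every resource over quota.

    hard/used come from the ResourceQuota; current/requested are the
    per-pod limits before and after the vertical scaling request.
    """
    violations = {}
    for name, limit in hard.items():
        if name not in requested:
            continue
        # Usage once the old pods are replaced by pods with the new limits.
        projected = (
            parse_quantity(used.get(name, "0"))
            - replicas * parse_quantity(current.get(name, "0"))
            + replicas * parse_quantity(requested[name])
        )
        if projected > parse_quantity(limit):
            violations[name] = (projected, parse_quantity(limit))
    return violations


hard = {"limits.cpu": "2", "limits.memory": "2Gi"}          # quota from step 1
used = {"limits.cpu": "1", "limits.memory": "1Gi"}          # assumed usage
current = {"limits.cpu": "1000m", "limits.memory": "1024Mi"}  # cluster from step 4
requested = {"limits.cpu": "4", "limits.memory": "4Gi"}     # values in the error

for name, (projected, limit) in quota_violations(hard, used, current, requested).items():
    print(f"{name}: projected {projected} exceeds quota {limit}")
```

With the values from this report both `limits.cpu` and `limits.memory` are flagged, so a check of this kind would reject the request while the original pod is still running.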

  1. Create ns with quota
    kubectl apply -f -<<EOF
    apiVersion: v1
    items:
    - apiVersion: v1
      kind: ResourceQuota
      metadata:
        name: quota-ns-ukltji
        namespace: ns-ukltji
      spec:
        hard:
          limits.cpu: "2"
          limits.ephemeral-storage: 10Gi
          limits.memory: 2Gi
          requests.storage: 10Gi
      status:
        hard:
          limits.cpu: "2"
          limits.ephemeral-storage: 10Gi
          limits.memory: 2Gi
          requests.storage: 10Gi
        used:
          limits.cpu: "0"
          limits.ephemeral-storage: "0"
          limits.memory: "0"
          requests.storage: "0"
    kind: List
    metadata:
      resourceVersion: ""
    ---
    apiVersion: v1
    kind: LimitRange
    metadata:
      name: range-ns-ukltji
      namespace: ns-ukltji
    spec:
      limits:
      - default:
          cpu: 100m
          memory: 100Mi
        type: Container
    EOF
  2. Create role
    kubectl apply -f -<<EOF
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      labels:
        app.kubernetes.io/instance: dbname
        app.kubernetes.io/managed-by: kbcli
      name: dbname
      namespace: ns-ukltji
    rules:
      - apiGroups:
          - ''
        resources:
          - events
        verbs:
          - create
      - apiGroups:
          - ''
        resources:
          - configmaps
        verbs:
          - create
          - get
          - list
          - patch
          - update
          - watch
          - delete
      - apiGroups:
          - ''
        resources:
          - endpoints
        verbs:
          - create
          - get
          - list
          - patch
          - update
          - watch
          - delete
      - apiGroups:
          - ''
        resources:
          - pods
        verbs:
          - get
          - list
          - patch
          - update
          - watch
    EOF
  3. Create ServiceAccount and RoleBinding
    kubectl apply -f -<<EOF
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      labels:
        app.kubernetes.io/instance: dbname
        app.kubernetes.io/managed-by: kbcli
      name: dbname
      namespace: ns-ukltji
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      labels:
        app.kubernetes.io/instance: dbname
        app.kubernetes.io/managed-by: kbcli
      name: dbname
      namespace: ns-ukltji
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: dbname
    subjects:
    - kind: ServiceAccount
      name: dbname
      namespace: ns-ukltji
    EOF
  4. Create cluster
    
    kubectl create -f -<<EOF
    apiVersion: apps.kubeblocks.io/v1alpha1
    kind: Cluster
    metadata:
      labels:
        clusterdefinition.kubeblocks.io/name: mongodb
        clusterversion.kubeblocks.io/name: mongodb-5.0
      generateName: mongo-
      namespace: ns-ukltji
    spec:
      affinity:
        nodeLabels: {}
        podAntiAffinity: Preferred
        tenancy: SharedNode
        topologyKeys: []
      clusterDefinitionRef: mongodb
      clusterVersionRef: mongodb-5.0
      componentSpecs:
      - componentDefRef: mongodb
        monitor: true
        name: mongodb
        replicas: 1
        resources:
          limits:
            cpu: 1000m
            memory: 1024Mi
          requests:
            cpu: 100m
            memory: 102Mi
        serviceAccountName: dbname
        volumeClaimTemplates:
        - name: data
          spec:
            storageClassName: standard-rwo
            accessModes:
            - ReadWriteOnce
            resources:
              requests:
                storage: 5Gi
      terminationPolicy: WipeOut
      tolerations: []
    EOF

    ➜ ~ kbcli cluster describe -n ns-ukltji mongo-ggbgx
    Name: mongo-ggbgx    Created Time: Oct 10,2023 11:25 UTC+0800
    NAMESPACE   CLUSTER-DEFINITION   VERSION       STATUS    TERMINATION-POLICY
    ns-ukltji   mongodb              mongodb-5.0   Running   WipeOut

    Endpoints:
    COMPONENT   MODE        INTERNAL                                                EXTERNAL
    mongodb     ReadWrite   mongo-ggbgx-mongodb.ns-ukltji.svc.cluster.local:27017

    Topology:
    COMPONENT   INSTANCE                ROLE      STATUS    AZ              NODE                                                CREATED-TIME
    mongodb     mongo-ggbgx-mongodb-0   primary   Running   us-central1-c   gke-yjtest-default-pool-c51609d3-ss98/10.128.0.46   Oct 10,2023 11:25 UTC+0800

    Resources Allocation:
    COMPONENT   DEDICATED   CPU(REQUEST/LIMIT)   MEMORY(REQUEST/LIMIT)   STORAGE-SIZE   STORAGE-CLASS
    mongodb     false       100m / 1             102Mi / 1Gi             data:5Gi       standard-rwo

    Images:
    COMPONENT   TYPE      IMAGE
    mongodb     mongodb   registry.cn-hangzhou.aliyuncs.com/apecloud/mongo:5.0.14

    Data Protection:
    AUTO-BACKUP   BACKUP-SCHEDULE   TYPE   BACKUP-TTL   LAST-SCHEDULE   RECOVERABLE-TIME
    Disabled                               7d

    Show cluster events: kbcli cluster list-events -n ns-ukltji mongo-ggbgx

  5. Vscale
    kubectl create -f -<<EOF
    apiVersion: apps.kubeblocks.io/v1alpha1
    kind: OpsRequest
    metadata:
      generateName: ops-verticalscaling-2c4g-
      namespace: ns-ukltji
    spec:
      clusterRef: mongo-ggbgx
      type: VerticalScaling
      verticalScaling:
  6. Pods are deleted and cannot recover
    ➜  ~ k describe cluster mongo-ggbgx  -n ns-ukltji
    Name:         mongo-ggbgx
    Namespace:    ns-ukltji
    Labels:       clusterdefinition.kubeblocks.io/name=mongodb
                  clusterversion.kubeblocks.io/name=mongodb-5.0
    Annotations:  kubeblocks.io/ops-request: [{"name":"ops-verticalscaling-2c4g-v77p7","type":"VerticalScaling"}]
                  kubeblocks.io/reconcile: 2023-10-10T03:43:56.541431103Z
    API Version:  apps.kubeblocks.io/v1alpha1
    Kind:         Cluster
    Metadata:
      Creation Timestamp:  2023-10-10T03:25:43Z
      Finalizers:
        cluster.kubeblocks.io/finalizer
      Generate Name:  mongo-
      Generation:     3
      Managed Fields:
        API Version:  apps.kubeblocks.io/v1alpha1
        Fields Type:  FieldsV1
        fieldsV1:
          f:metadata:
            f:generateName:
            f:labels:
              .:
              f:clusterdefinition.kubeblocks.io/name:
              f:clusterversion.kubeblocks.io/name:
          f:spec:
            .:
            f:affinity:
              .:
              f:podAntiAffinity:
              f:tenancy:
            f:clusterDefinitionRef:
            f:clusterVersionRef:
            f:componentSpecs:
              .:
              k:{"name":"mongodb"}:
                .:
                f:componentDefRef:
                f:monitor:
                f:name:
                f:noCreatePDB:
                f:replicas:
                f:resources:
                  .:
                  f:limits:
                  f:requests:
                f:serviceAccountName:
                f:volumeClaimTemplates:
            f:terminationPolicy:
        Manager:      kubectl-create
        Operation:    Update
        Time:         2023-10-10T03:25:43Z
        API Version:  apps.kubeblocks.io/v1alpha1
        Fields Type:  FieldsV1
        fieldsV1:
          f:status:
            .:
            f:clusterDefGeneration:
            f:components:
              .:
              f:mongodb:
                .:
                f:consensusSetStatus:
                  .:
                  f:leader:
                    .:
                    f:accessMode:
                    f:name:
                    f:pod:
                f:phase:
                f:podsReady:
            f:conditions:
            f:observedGeneration:
            f:phase:
        Manager:      manager
        Operation:    Update
        Subresource:  status
        Time:         2023-10-10T03:41:11Z
        API Version:  apps.kubeblocks.io/v1alpha1
        Fields Type:  FieldsV1
        fieldsV1:
          f:metadata:
            f:annotations:
              .:
              f:kubeblocks.io/ops-request:
              f:kubeblocks.io/reconcile:
            f:finalizers:
              .:
              v:"cluster.kubeblocks.io/finalizer":
          f:spec:
            f:componentSpecs:
              k:{"name":"mongodb"}:
                f:classDefRef:
                  .:
                  f:class:
                f:resources:
                  f:limits:
                    f:cpu:
                    f:memory:
                  f:requests:
                    f:cpu:
                    f:memory:
            f:monitor:
            f:resources:
              .:
              f:cpu:
              f:memory:
            f:storage:
              .:
              f:size:
        Manager:         manager
        Operation:       Update
        Time:            2023-10-10T03:43:56Z
      Resource Version:  1151548
      UID:               c9b3441e-793f-4848-8568-4619bc6620a9
    Spec:
      Affinity:
        Pod Anti Affinity:     Preferred
        Tenancy:               SharedNode
      Cluster Definition Ref:  mongodb
      Cluster Version Ref:     mongodb-5.0
      Component Specs:
        Class Def Ref:
          Class:
        Component Def Ref:  mongodb
        Monitor:            true
        Name:               mongodb
        No Create PDB:      false
        Replicas:           1
        Resources:
          Limits:
            Cpu:     4
            Memory:  4Gi
          Requests:
            Cpu:               4
            Memory:            4Gi
        Service Account Name:  dbname
        Volume Claim Templates:
          Name:  data
          Spec:
            Access Modes:
              ReadWriteOnce
            Resources:
              Requests:
                Storage:         5Gi
            Storage Class Name:  standard-rwo
      Monitor:
      Resources:
        Cpu:     0
        Memory:  0
      Storage:
        Size:              0
      Termination Policy:  WipeOut
    Status:
      Cluster Def Generation:  2
      Components:
        Mongodb:
          Consensus Set Status:
            Leader:
              Access Mode:  None
              Name:
              Pod:          Unknown
          Phase:            Updating
          Pods Ready:       false
      Conditions:
        Last Transition Time:  2023-10-10T03:41:06Z
        Message:               VerticalScaling opsRequest: ops-verticalscaling-2c4g-v77p7 is processing
        Reason:                VerticalScaling
        Status:                False
        Type:                  LatestOpsRequestProcessed
        Last Transition Time:  2023-10-10T03:25:43Z
        Message:               The operator has started the provisioning of Cluster: mongo-ggbgx
        Observed Generation:   3
        Reason:                PreCheckSucceed
        Status:                True
        Type:                  ProvisioningStarted
        Last Transition Time:  2023-10-10T03:25:44Z
        Message:               Successfully applied for resources
        Observed Generation:   3
        Reason:                ApplyResourcesSucceed
        Status:                True
        Type:                  ApplyResources
        Last Transition Time:  2023-10-10T03:41:11Z
        Message:               pods are not ready in Components: [mongodb], refer to related component message in Cluster.status.components
        Reason:                ReplicasNotReady
        Status:                False
        Type:                  ReplicasReady
        Last Transition Time:  2023-10-10T03:41:11Z
        Message:               pods are unavailable in Components: [mongodb], refer to related component message in Cluster.status.components
        Reason:                ComponentsNotReady
        Status:                False
        Type:                  Ready
      Observed Generation:     3
      Phase:                   Updating
    Events:
      Type     Reason                    Age                    From                    Message
      ----     ------                    ----                   ----                    -------
      Normal   ComponentPhaseTransition  18m                    cluster-controller      Create a new component
      Normal   AllReplicasReady          18m                    cluster-controller      all pods of components are ready, waiting for the probe detection successful
      Normal   ClusterReady              18m                    cluster-controller      Cluster: mongo-ggbgx is ready, current phase is Running
      Normal   ComponentPhaseTransition  18m                    cluster-controller      Running: true, PodsReady: true, PodsTimedout: false
      Normal   Running                   18m                    cluster-controller      Cluster: mongo-ggbgx is ready, current phase is Running
      Normal   ApplyResourcesSucceed     3m23s (x2 over 18m)    cluster-controller      Successfully applied for resources
      Normal   PreCheckSucceed           3m23s (x2 over 18m)    cluster-controller      The operator has started the provisioning of Cluster: mongo-ggbgx
      Normal   VerticalScaling           3m23s                  ops-request-controller  Start to process the VerticalScaling opsRequest "ops-verticalscaling-2c4g-v77p7" in Cluster: mongo-ggbgx
      Normal   ComponentPhaseTransition  3m23s                  cluster-controller      Component workload updated
      Normal   WaitingForProbeSuccess    3m23s (x3 over 3m23s)  cluster-controller      Waiting for probe success
      Warning  ReplicasNotReady          3m18s                  cluster-controller      pods are not ready in Components: [mongodb], refer to related component message in Cluster.status.components
      Warning  ComponentsNotReady        3m18s                  cluster-controller      pods are unavailable in Components: [mongodb], refer to related component message in Cluster.status.components
      Warning  FailedCreate              33s (x3 over 2m36s)    event-controller        create Pod mongo-ggbgx-mongodb-0 in StatefulSet mongo-ggbgx-mongodb failed error: pods "mongo-ggbgx-mongodb-0" is forbidden: exceeded quota: quota-ns-ukltji, requested: limits.cpu=4,limits.memory=4Gi, used: limits.cpu=0,limits.memory=0, limited: limits.cpu=2,limits.memory=2Gi
    ➜  ~ k get pod -n ns-ukltji
    No resources found in ns-ukltji namespace.
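
The `FailedCreate` event in the output carries the full quota comparison in its message. As a rough illustration of how that message could be turned into structured data, for example to surface the reason on the OpsRequest, here is a small Python sketch. `parse_quota_error` is a hypothetical helper, and the pattern is derived only from the single message shown in this report, so other message shapes may not match.

```python
import re

# Pattern derived from the one FailedCreate message quoted in this issue.
_QUOTA_ERROR = re.compile(
    r"exceeded quota: (?P<quota>[^,]+), "
    r"requested: (?P<requested>.+?), "
    r"used: (?P<used>.+?), "
    r"limited: (?P<limited>.+)$"
)


def _to_map(field):
    """Turn 'limits.cpu=4,limits.memory=4Gi' into a dict."""
    return dict(pair.split("=", 1) for pair in field.split(","))


def parse_quota_error(message):
    """Return the quota name and the requested/used/limited maps, or None."""
    match = _QUOTA_ERROR.search(message)
    if match is None:
        return None
    return {
        "quota": match["quota"],
        "requested": _to_map(match["requested"]),
        "used": _to_map(match["used"]),
        "limited": _to_map(match["limited"]),
    }


MESSAGE = (
    'pods "mongo-ggbgx-mongodb-0" is forbidden: exceeded quota: quota-ns-ukltji, '
    "requested: limits.cpu=4,limits.memory=4Gi, "
    "used: limits.cpu=0,limits.memory=0, "
    "limited: limits.cpu=2,limits.memory=2Gi"
)

print(parse_quota_error(MESSAGE))
```

Note that `used` is already 0 in this message: by the time the StatefulSet tried to create the new pod, the old one had been removed, which is why the cluster ends up with no pods at all.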
leon-inf commented 11 months ago

Duplicate of #5375.

leon-inf commented 11 months ago

Reopening this issue to track the namespace resource quota problem.
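
Any quota accounting for this namespace would also have to consider the LimitRange from step 1, because Kubernetes applies its `default` limits to containers that set none, and those defaults count against `limits.cpu` and `limits.memory` as well. The Python sketch below illustrates the arithmetic only. `pod_limits` is a hypothetical helper, values are in millicores and Mi for simplicity, and the second container is an assumption made for illustration, since the issue does not list the containers in the pod.

```python
def pod_limits(containers, default_limits):
    """Sum container limits, filling in LimitRange defaults where none are set."""
    total = {"cpu_m": 0, "memory_mi": 0}
    for container in containers:
        # Explicit limits win; missing resources fall back to the default.
        limits = {**default_limits, **container.get("limits", {})}
        for key in total:
            total[key] += limits[key]
    return total


# LimitRange default from step 1: cpu 100m, memory 100Mi.
DEFAULTS = {"cpu_m": 100, "memory_mi": 100}

containers = [
    # Main container with the limits from step 4 (1000m / 1024Mi).
    {"name": "mongodb", "limits": {"cpu_m": 1000, "memory_mi": 1024}},
    # Hypothetical extra container without limits (an assumption).
    {"name": "sidecar"},
]

print(pod_limits(containers, DEFAULTS))  # {'cpu_m': 1100, 'memory_mi': 1124}
```

Under that assumption the pod would count for slightly more than the component's own limits, so a pre-check that only looks at `componentSpecs[].resources` could underestimate the real usage.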