IBM / ibm-spectrum-scale-csi

The IBM Spectrum Scale Container Storage Interface (CSI) project enables container orchestrators, such as Kubernetes and OpenShift, to manage the life-cycle of persistent storage.
Apache License 2.0

Similar type of events not getting generated when configmap is changed #1001

Open saurabhwani5 opened 1 year ago

saurabhwani5 commented 1 year ago

Describe the bug

Create and apply a configmap with wrong values; this generates a ValidationWarning event in the cso. Then apply the correct configmap, which does not generate any ValidationWarning event because everything is correct. When the wrong configmap is applied again, no ValidationWarning event is generated for the configmap, which is not expected.

How to Reproduce?

For this issue, I used the images from PR1000 ( #1000 ).

  1. Install CSI 2.10.0 with the following Images

    [root@saurabh6-master pr1000]# oc get pods
    NAME                                                  READY   STATUS    RESTARTS       AGE
    ibm-spectrum-scale-csi-79pp5                          3/3     Running   0              2m59s
    ibm-spectrum-scale-csi-attacher-b6b6d4948-l8kmw       1/1     Running   2 (3m2s ago)   3m42s
    ibm-spectrum-scale-csi-attacher-b6b6d4948-prgwj       1/1     Running   2 (3m2s ago)   3m42s
    ibm-spectrum-scale-csi-operator-6877d5465c-szr95      1/1     Running   0              3m46s
    ibm-spectrum-scale-csi-provisioner-b456fbb49-xxxkt    1/1     Running   2 (3m1s ago)   3m42s
    ibm-spectrum-scale-csi-resizer-84d84bfdf6-8zlm2       1/1     Running   2 (3m1s ago)   3m42s
    ibm-spectrum-scale-csi-snapshotter-656d4bd64f-s9zzq   1/1     Running   2 (3m1s ago)   3m42s
    ibm-spectrum-scale-csi-vhdmf                          3/3     Running   0              2m59s
    [root@saurabh6-master pr1000]# oc get cso
    NAME                     VERSION   SUCCESS
    ibm-spectrum-scale-csi   2.10.0    True
    [root@saurabh6-master pr1000]# oc describe pod | grep quay
    Image:         quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver@sha256:ee9bd3e431cf0d3fb1e407a6e2ed51d6be957dd1445c2cd88f329cbb5b1ea494
    Image ID:      quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver@sha256:31172dc13f5cc514cf2474cb440697c1da20d035f3fd12ec761f12b06cc2e0a7
    Normal   Pulled     3m9s  kubelet            Container image "quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver@sha256:ee9bd3e431cf0d3fb1e407a6e2ed51d6be957dd1445c2cd88f329cbb5b1ea494" already present on machine
    Image:         quay.io/badri_pathak/ibm-spectrum-scale-csi-operator:events_gen_v11
    Image ID:      quay.io/badri_pathak/ibm-spectrum-scale-csi-operator@sha256:cb12d4adec4321bc9f4f6091e698d91b70df3d043e31592b39c3686084a1a836
      CSI_DRIVER_IMAGE:      quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver@sha256:ee9bd3e431cf0d3fb1e407a6e2ed51d6be957dd1445c2cd88f329cbb5b1ea494
    Normal  Pulled     3m55s  kubelet            Container image "quay.io/badri_pathak/ibm-spectrum-scale-csi-operator:events_gen_v11" already present on machine
    Image:         quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver@sha256:ee9bd3e431cf0d3fb1e407a6e2ed51d6be957dd1445c2cd88f329cbb5b1ea494
    Image ID:      quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver@sha256:31172dc13f5cc514cf2474cb440697c1da20d035f3fd12ec761f12b06cc2e0a7
    Normal   Pulled     3m9s  kubelet            Container image "quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver@sha256:ee9bd3e431cf0d3fb1e407a6e2ed51d6be957dd1445c2cd88f329cbb5b1ea494" already present on machine
    [root@saurabh6-master pr1000]#
  2. Apply the wrong configmap as shown below

    [root@saurabh6-master pr1000]# cat wrong_cm.yaml
    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: ibm-spectrum-scale-csi-config
      namespace: ibm-spectrum-scale-csi-driver
    data:
      VAR_DRIVER_LOGLEVEL: debug0s
      VAR_DRIVER_PERSISTENT_LO_G: ENABLED
      VAR_DRIVER_VOLUME_STATS_C_APABILITY: DISABLED
      VAR_DRIVER_NODEPUBLISH_METHOD: symlink
      DRIVER_UPGRADE_MAXUNAVAILABLE: 90%
    [root@saurabh6-master pr1000]# oc apply -f wrong_cm.yaml
    configmap/ibm-spectrum-scale-csi-config created
  3. Check the cso event

    Events:
    Type     Reason             Age                  From              Message
    ----     ------             ----                 ----              -------
    Warning  CreateDirFailed    5m13s (x11 over 8m)  CSIScaleOperator  Failed to create a symlink directory with relative path spectrum-scale-csi-volume-store/.volumes on filesystem fs1
    Warning  ValidationWarning  20s                  CSIScaleOperator  There are few entries [VAR_DRIVER_PERSISTENT_LO_G VAR_DRIVER_VOLUME_STATS_C_APABILITY] with wrong key which will not be processed and few entries having wrong values map[VAR_DRIVER_LOGLEVEL:debug0s] in the configmap ibm-spectrum-scale-csi-config, default values will be used
    Warning  UpdateFailed       16s (x2 over 17s)    CSIScaleOperator  Failed to set defaults on the instance ibm-spectrum-scale-csi. Please check Operator logs
    Normal   CSIConfigured      14s (x12 over 4m3s)  CSIScaleOperator  The CSI driver resources have been created/updated successfully

    The ValidationWarning event above is generated, which is expected.

  4. Apply the correct configmap

    [root@saurabh6-master pr1000]# cat correct_cm.yaml
    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: ibm-spectrum-scale-csi-config
      namespace: ibm-spectrum-scale-csi-driver
    data:
      VAR_DRIVER_LOGLEVEL: DEBUG
      VAR_DRIVER_PERSISTENT_LOG: ENABLED
      VAR_DRIVER_VOLUME_STATS_CAPABILITY: DISABLED
      VAR_DRIVER_NODEPUBLISH_METHOD: SYMLINK
      DRIVER_UPGRADE_MAXUNAVAILABLE: 90%
    [root@saurabh6-master pr1000]# oc apply -f correct_cm.yaml
    configmap/ibm-spectrum-scale-csi-config configured
  5. Check the CSO events (no new ValidationWarning event is expected, as everything is correct)

    Events:
    Type     Reason             Age                     From              Message
    ----     ------             ----                    ----              -------
    Warning  CreateDirFailed    6m41s (x11 over 9m28s)  CSIScaleOperator  Failed to create a symlink directory with relative path spectrum-scale-csi-volume-store/.volumes on filesystem fs1
    Warning  ValidationWarning  108s                    CSIScaleOperator  There are few entries [VAR_DRIVER_PERSISTENT_LO_G VAR_DRIVER_VOLUME_STATS_C_APABILITY] with wrong key which will not be processed and few entries having wrong values map[VAR_DRIVER_LOGLEVEL:debug0s] in the configmap ibm-spectrum-scale-csi-config, default values will be used
    Warning  UpdateFailed       21s                     CSIScaleOperator  Synchronization of node/driver ibm-spectrum-scale-csi DaemonSet failed for the CSISCaleOperator instance ibm-spectrum-scale-csi
    Warning  UpdateFailed       19s (x4 over 105s)      CSIScaleOperator  Failed to set defaults on the instance ibm-spectrum-scale-csi. Please check Operator logs
    Normal   CSIConfigured      15s (x20 over 5m31s)    CSIScaleOperator  The CSI driver resources have been created/updated successfully

    As the cm was correct, no ValidationWarning event is generated after applying it, which is expected.

  6. Reapply the wrong configmap and check whether a cso event is shown.

    [root@saurabh6-master pr1000]# cat wrong_cm.yaml
    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: ibm-spectrum-scale-csi-config
      namespace: ibm-spectrum-scale-csi-driver
    data:
      VAR_DRIVER_LOGLEVEL: debug0s
      VAR_DRIVER_PERSISTENT_LO_G: ENABLED
      VAR_DRIVER_VOLUME_STATS_C_APABILITY: DISABLED
      VAR_DRIVER_NODEPUBLISH_METHOD: symlink
      DRIVER_UPGRADE_MAXUNAVAILABLE: 90%
    [root@saurabh6-master pr1000]# oc apply -f wrong_cm.yaml
    configmap/ibm-spectrum-scale-csi-config configured
  7. Check the CSO status:

    Events:
    Type     Reason             Age                   From              Message
    ----     ------             ----                  ----              -------
    Warning  CreateDirFailed    8m18s (x11 over 11m)  CSIScaleOperator  Failed to create a symlink directory with relative path spectrum-scale-csi-volume-store/.volumes on filesystem fs1
    Warning  ValidationWarning  3m25s                 CSIScaleOperator  There are few entries [VAR_DRIVER_PERSISTENT_LO_G VAR_DRIVER_VOLUME_STATS_C_APABILITY] with wrong key which will not be processed and few entries having wrong values map[VAR_DRIVER_LOGLEVEL:debug0s] in the configmap ibm-spectrum-scale-csi-config, default values will be used
    Warning  UpdateFailed       118s                  CSIScaleOperator  Synchronization of node/driver ibm-spectrum-scale-csi DaemonSet failed for the CSISCaleOperator instance ibm-spectrum-scale-csi
    Warning  UpdateFailed       116s (x4 over 3m22s)  CSIScaleOperator  Failed to set defaults on the instance ibm-spectrum-scale-csi. Please check Operator logs
    Normal   CSIConfigured      112s (x20 over 7m8s)  CSIScaleOperator  The CSI driver resources have been created/updated successfully

    No ValidationWarning event is shown for the wrong cm. The last ValidationWarning event is 3m25s old, and nothing new appears even though the wrong cm was just reapplied.

Note: This issue is seen when the wrong cm is deleted and applied again.

Expected behavior

A ValidationWarning event should be generated whenever a wrong cm is applied, and the same should hold for other failure cases where events are currently not generated.

saurabhwani5 commented 1 year ago

Another scenario is seen as follows:

  1. Apply a cm with a wrong key; it generates a cso event
  2. Apply a cm with a wrong value; it generates a cso event
  3. Now apply a cm with a wrong key in one parameter and a wrong value in another parameter whose key is correct; it does not generate a cso event, which is not expected

badri-pathak commented 1 year ago

@saurabhwani5 I have validated the events in various scenarios, including the ones mentioned above. My observation is that Kubernetes suppresses similar events when the cso tries to add events too frequently. The event details stay visible for a certain period, and new events show up again after some time. The failure count also includes the hidden occurrences, e.g.

    Warning  ValidationWarning  9s (x12 over 4h55m)  CSIScaleOperator  There are few entries having wrong key which will not be processed or few entries having wrong values in the configmap ibm-spectrum-scale-csi-config, check operator logs for details

In the above example, the events stopped appearing after 7-8 occurrences. When I tried again after some time, a new failure event appeared with a total count of x12, which is the total number of failures since the first occurrence.
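
The suppression described above resembles a token-bucket rate limit on event recording. The following is a minimal simulation of that idea, not the operator's or Kubernetes' actual code. The burst of 25 events and the refill of one token every 5 minutes are assumptions (they are what the client-go event correlator is commonly understood to default to, per event source and object) and should be checked against the official documentation. The names `TokenBucket` and `simulate` are hypothetical.

```python
class TokenBucket:
    """Hypothetical token bucket: starts full, refills at a fixed rate."""

    def __init__(self, burst=25, refill_per_sec=1 / 300):
        # Assumed values: burst of 25, one token refilled every 300 seconds.
        self.burst = burst
        self.refill_per_sec = refill_per_sec
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now):
        """Return True if an event at time `now` (seconds) would be recorded."""
        elapsed = now - self.last
        # Refill for the elapsed time, never exceeding the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def simulate(event_times, burst=25, refill_per_sec=1 / 300):
    """Return one boolean per event time: was that event recorded?"""
    bucket = TokenBucket(burst, refill_per_sec)
    return [bucket.allow(t) for t in event_times]


if __name__ == "__main__":
    # 30 events, one every 10 seconds, all for the same source and object.
    times = [i * 10 for i in range(30)]
    recorded = simulate(times)
    print("recorded:", sum(recorded), "dropped:", recorded.count(False))
    # One more event 10 minutes after the last one is recorded again.
    print("late event recorded:", simulate(times + [890])[-1])
```

Under these assumptions, an operator that emits CSIConfigured, CreateDirFailed and UpdateFailed events in quick succession can use up the budget, so a later ValidationWarning is dropped until enough time has passed.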

badri-pathak commented 1 year ago

@amdabhad The same behaviour is seen with the generic message as well as with the unique message when events are generated too frequently. I don't think any changes are required as of now.

amdabhad commented 1 year ago

@badri-pathak, please check 2 things:

  1. This event is generated repeatedly; see if it can be skipped when it is already the last event:
    Normal   CSIConfigured      112s (x20 over 7m8s)  CSIScaleOperator  The CSI driver resources have been created/updated successfully
  2. See if there is any official k8s doc that mentions suppression of frequent events; we may have to document that
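
One possible reading of item 1 is to skip an event when it repeats the last event that was emitted. The sketch below illustrates that idea only; `LastEventFilter` and its `emit` callback are hypothetical names, and the real operator would have to apply the same check around its own event recorder.

```python
class LastEventFilter:
    """Hypothetical wrapper: drop an event that repeats the last one emitted."""

    def __init__(self, emit):
        self._emit = emit   # callback taking (event_type, reason, message)
        self._last = None   # key of the most recently emitted event

    def event(self, event_type, reason, message):
        """Emit the event unless it is identical to the previous one."""
        key = (event_type, reason, message)
        if key == self._last:
            return False    # same as the last event, skip it
        self._last = key
        self._emit(event_type, reason, message)
        return True


if __name__ == "__main__":
    emitted = []
    recorder = LastEventFilter(lambda t, r, m: emitted.append((t, r)))
    ok = "The CSI driver resources have been created/updated successfully"
    recorder.event("Normal", "CSIConfigured", ok)
    recorder.event("Normal", "CSIConfigured", ok)   # skipped
    recorder.event("Warning", "ValidationWarning", "wrong key in configmap")
    recorder.event("Normal", "CSIConfigured", ok)   # emitted again
    print(emitted)
```

With this approach a repeated CSIConfigured event no longer consumes event budget, while a change of state (for example a ValidationWarning in between) is still reported.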