RamenDR / ramen

RamenDR Pod is in a CrashLoopBackOff state when the VRG does not have a protectedNamespaces field #1439

Open BhargaviEnuturla opened 4 months ago

BhargaviEnuturla commented 4 months ago

After applying the latest RamenDR image, creating a VRG in the admin namespace without the 'protectedNamespaces' field puts the RamenDR Pod into a CrashLoopBackOff state.

Once we add the 'protectedNamespaces' field to the VRG created in the admin namespace, the RamenDR Pod transitions to a running state.

This issue needs to be addressed in the RamenDR code to handle the absence of the 'protectedNamespaces' field gracefully. Instead of crashing, it should produce an error message.

Attaching the error messages:

E0603 11:19:56.927257 1 runtime.go:79] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 171 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x1b89d60?, 0x32f6ce0})
	/go/pkg/mod/k8s.io/apimachinery@v0.29.0/pkg/util/runtime/runtime.go:75 +0x85
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0x0?})

nirs commented 4 months ago

@raghavendra-talur Is missing protectedNamespaces a valid configuration?

nirs commented 4 months ago

@BhargaviEnuturla can you share the VRG that caused this crash?

BhargaviEnuturla commented 4 months ago

@nirs ramenOpsNamespace is the admin namespace field configured in the ramen-dr-config map.

nirs commented 4 months ago

@BhargaviEnuturla we really need the VRG that triggers this crash. We need to understand whether this is a valid or invalid configuration. We also need to add a test with a similar VRG to ensure it works correctly in the future.

asn1809 commented 4 months ago

@nirs The VRG that was used is similar to the one given below:

apiVersion: ramendr.openshift.io/v1alpha1
kind: VolumeReplicationGroup
metadata:
  finalizers:
    - volumereplicationgroups.ramendr.openshift.io/vrg-protection
  name: test-ae1-filebrowser-project1
  namespace: ibm-spectrum-fusion-ns
spec:
  pvcSelector:
    matchExpressions:
      - key: icpdsupport/empty-on-nd-backup
        operator: NotIn
        values:
          - 'true'
      - key: icpdsupport/ignore-on-nd-backup
        operator: NotIn
        values:
          - 'true'
  replicationState: primary
  s3Profiles:
    - site1
    - site2
  sync: {}
  volSync:
    disabled: true

PS: ibm-spectrum-fusion-ns is the admin namespace, and protectedNamespaces is not set in the VRG spec.

nirs commented 4 months ago

spec: pvcSelector:

@asn1809 there is no kube object protection?

This looks like a standard VRG that we use in upstream testing. I wonder why we don't see this issue.

Did you try to reproduce this with drenv?

asn1809 commented 2 months ago

@nirs, sorry, I didn't look at this for a long time. The reason this might not be seen upstream is the checks for the DRPC done in https://github.com/RamenDR/ramen/blob/main/internal/controller/drplacementcontrol_controller.go#L2859C3-L2861C4