asn1809 opened 1 year ago
Here is the ramen.log that @asn1809 shared with me on Saturday night.
Ramen manager container starts
2023-10-20T11:52:29.058Z INFO setup controllers/ramenconfig.go:62 loading Ramen configuration from {"file": "/config/ramen_manager_config.yaml"}
2023-10-20T11:52:29.059Z INFO setup controllers/ramenconfig.go:70 s3 profile {"key": 0, "value": {"s3ProfileName":"site1","s3Bucket":"isf-minio-site1","s3CompatibleEndpoint":"https://isf-minio-ibm-spectrum-fusion-ns.apps.rackae1.mydomain.com","s3Region":"site1","s3SecretRef":{"name":"isf-minio-site2","namespace":"ibm-spectrum-fusion-ns"}}}
2023-10-20T11:52:29.059Z INFO setup controllers/ramenconfig.go:70 s3 profile {"key": 1, "value": {"s3ProfileName":"site2","s3Bucket":"isf-minio-site2","s3CompatibleEndpoint":"https://isf-minio-ibm-spectrum-fusion-ns.apps.rackae2.mydomain.com","s3Region":"site2","s3SecretRef":{"name":"isf-minio-site2","namespace":"ibm-spectrum-fusion-ns"}}}
I1020 11:52:30.505189 1 request.go:690] Waited for 1.049368431s due to client-side throttling, not priority and fairness, request: GET:https://172.31.0.1:443/apis/tuned.openshift.io/v1?timeout=32s
2023-10-20T11:52:33.561Z INFO controller-runtime.metrics metrics/listener.go:44 Metrics server is starting to listen {"addr": "127.0.0.1:9289"}
2023-10-20T11:52:33.561Z INFO controllers.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:62 Adding VolumeReplicationGroup controller
2023-10-20T11:52:33.561Z INFO controllers.VolumeReplicationGroup controllers/ramenconfig.go:86 loading Ramen config file {"name": "/config/ramen_manager_config.yaml"}
2023-10-20T11:52:33.562Z INFO controllers.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:101 VolSync disabled; don't own volsync resources
2023-10-20T11:52:33.562Z INFO controllers.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:110 Kube object protection disabled; don't watch kube objects requests
2023-10-20T11:52:33.562Z INFO setup workspace/main.go:213 starting manager
2023-10-20T11:52:33.562Z INFO manager/internal.go:369 Starting server {"path": "/metrics", "kind": "metrics", "addr": "127.0.0.1:9289"}
2023-10-20T11:52:33.562Z INFO manager/internal.go:369 Starting server {"kind": "health probe", "addr": "[::]:8081"}
I1020 11:52:34.663115 1 leaderelection.go:248] attempting to acquire leader lease ibm-spectrum-fusion-ns/dr-cluster.ramendr.openshift.io...
I1020 11:52:56.000645 1 leaderelection.go:258] successfully acquired lease ibm-spectrum-fusion-ns/dr-cluster.ramendr.openshift.io
2023-10-20T11:52:56.000Z DEBUG events recorder/recorder.go:103 ramen-dr-cluster-operator-797c68655f-mcb4l_8456dc4e-a7c0-41cc-8b90-857af0398e42 became leader {"type": "Normal", "object": {"kind":"Lease","namespace":"ibm-spectrum-fusion-ns","name":"dr-cluster.ramendr.openshift.io","uid":"0c698d55-d2ee-4642-965b-4acbda332155","apiVersion":"coordination.k8s.io/v1","resourceVersion":"14033688"}, "reason": "LeaderElection"}
2023-10-20T11:52:56.000Z INFO controller/controller.go:186 Starting EventSource {"controller": "protectedvolumereplicationgrouplist", "controllerGroup": "ramendr.openshift.io", "controllerKind": "ProtectedVolumeReplicationGroupList", "source": "kind source: *v1alpha1.ProtectedVolumeReplicationGroupList"}
2023-10-20T11:52:56.001Z INFO controller/controller.go:194 Starting Controller {"controller": "protectedvolumereplicationgrouplist", "controllerGroup": "ramendr.openshift.io", "controllerKind": "ProtectedVolumeReplicationGroupList"}
2023-10-20T11:52:56.000Z INFO controller/controller.go:186 Starting EventSource {"controller": "volumereplicationgroup", "controllerGroup": "ramendr.openshift.io", "controllerKind": "VolumeReplicationGroup", "source": "kind source: *v1alpha1.VolumeReplicationGroup"}
2023-10-20T11:52:56.001Z INFO controller/controller.go:186 Starting EventSource {"controller": "volumereplicationgroup", "controllerGroup": "ramendr.openshift.io", "controllerKind": "VolumeReplicationGroup", "source": "kind source: *v1.PersistentVolumeClaim"}
2023-10-20T11:52:56.001Z INFO controller/controller.go:186 Starting EventSource {"controller": "volumereplicationgroup", "controllerGroup": "ramendr.openshift.io", "controllerKind": "VolumeReplicationGroup", "source": "kind source: *v1.PersistentVolumeClaim"}
2023-10-20T11:52:56.001Z INFO controller/controller.go:186 Starting EventSource {"controller": "volumereplicationgroup", "controllerGroup": "ramendr.openshift.io", "controllerKind": "VolumeReplicationGroup", "source": "kind source: *v1.ConfigMap"}
2023-10-20T11:52:56.001Z INFO controller/controller.go:186 Starting EventSource {"controller": "volumereplicationgroup", "controllerGroup": "ramendr.openshift.io", "controllerKind": "VolumeReplicationGroup", "source": "kind source: *v1alpha1.VolumeReplication"}
2023-10-20T11:52:56.001Z INFO controller/controller.go:194 Starting Controller {"controller": "volumereplicationgroup", "controllerGroup": "ramendr.openshift.io", "controllerKind": "VolumeReplicationGroup"}
8 PVCs are created
2023-10-20T11:52:56.103Z INFO pvcmap.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:172 Create event for PersistentVolumeClaim
2023-10-20T11:52:56.103Z INFO pvcmap.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:172 Create event for PersistentVolumeClaim
2023-10-20T11:52:56.103Z INFO pvcmap.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:172 Create event for PersistentVolumeClaim
2023-10-20T11:52:56.103Z INFO pvcmap.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:172 Create event for PersistentVolumeClaim
2023-10-20T11:52:56.103Z INFO pvcmap.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:172 Create event for PersistentVolumeClaim
2023-10-20T11:52:56.103Z INFO pvcmap.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:172 Create event for PersistentVolumeClaim
2023-10-20T11:52:56.103Z INFO pvcmap.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:172 Create event for PersistentVolumeClaim
2023-10-20T11:52:56.103Z INFO pvcmap.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:172 Create event for PersistentVolumeClaim
More Ramen controllers start and the Ramen config is updated
2023-10-20T11:52:58.854Z INFO controller/controller.go:228 Starting workers {"controller": "protectedvolumereplicationgrouplist", "controllerGroup": "ramendr.openshift.io", "controllerKind": "ProtectedVolumeReplicationGroupList", "worker count": 1}
2023-10-20T11:52:58.855Z INFO controller/controller.go:228 Starting workers {"controller": "volumereplicationgroup", "controllerGroup": "ramendr.openshift.io", "controllerKind": "VolumeReplicationGroup", "worker count": 1}
2023-10-20T11:52:58.855Z INFO configmap.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:137 Update in ramen-dr-cluster-operator-config configuration map
VRG blr-maj/blr-maj reconcile starts
2023-10-20T17:31:45.015Z INFO controllers.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:405 Entering reconcile loop {"VolumeReplicationGroup": "blr-maj/blr-maj", "rid": "5dfe0f02-5483-43f8-a6d8-6c519f883d83"}
2023-10-20T17:31:45.021Z INFO controllers.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:537 Recipe {"VolumeReplicationGroup": "blr-maj/blr-maj", "rid": "5dfe0f02-5483-43f8-a6d8-6c519f883d83", "elements": {"PvcSelector":{"LabelSelector":{},"NamespaceNames":["blr-maj"]},"CaptureWorkflow":null,"RecoverWorkflow":null}}
2023-10-20T17:31:45.021Z INFO controllers.VolumeReplicationGroup util/pvcs_util.go:61 Fetching PersistentVolumeClaims {"VolumeReplicationGroup": "blr-maj/blr-maj", "rid": "5dfe0f02-5483-43f8-a6d8-6c519f883d83", "pvcSelector": ""}
2023-10-20T17:31:45.021Z INFO controllers.VolumeReplicationGroup util/pvcs_util.go:76 Found 8 PVCs using label selector {"VolumeReplicationGroup": "blr-maj/blr-maj", "rid": "5dfe0f02-5483-43f8-a6d8-6c519f883d83"}
2023-10-20T17:31:45.021Z INFO controllers.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:666 Found PersistentVolumeClaims {"VolumeReplicationGroup": "blr-maj/blr-maj", "rid": "5dfe0f02-5483-43f8-a6d8-6c519f883d83", "count": 0}
2023-10-20T17:31:45.025Z INFO controllers.VolumeReplicationGroup.vrginstance controllers/volumereplicationgroup_controller.go:870 Entering processing VolumeReplicationGroup as Primary {"VolumeReplicationGroup": "blr-maj/blr-maj", "rid": "5dfe0f02-5483-43f8-a6d8-6c519f883d83", "State": "primary"}
ClusterDataReady is false, so PVs and PVCs are restored from S3. There is one of each; the PVC is named blr-maj/filebrowser-pvc.
2023-10-20T17:31:45.025Z INFO controllers.VolumeReplicationGroup.vrginstance controllers/volumereplicationgroup_controller.go:610 ClusterDataReady condition {"VolumeReplicationGroup": "blr-maj/blr-maj", "rid": "5dfe0f02-5483-43f8-a6d8-6c519f883d83", "State": "primary", "status": "Unknown", "reason": "Initializing", "message": "Initializing VolumeReplicationGroup", "observedGeneration": 1, "generation": 1}
2023-10-20T17:31:45.025Z INFO controllers.VolumeReplicationGroup.vrginstance controllers/vrg_volsync.go:18 VolSync: Restoring VolSync PVs {"VolumeReplicationGroup": "blr-maj/blr-maj", "rid": "5dfe0f02-5483-43f8-a6d8-6c519f883d83", "State": "primary"}
2023-10-20T17:31:45.025Z INFO controllers.VolumeReplicationGroup.vrginstance controllers/vrg_volsync.go:21 No RDSpec entries. There are no PVCs to restore {"VolumeReplicationGroup": "blr-maj/blr-maj", "rid": "5dfe0f02-5483-43f8-a6d8-6c519f883d83", "State": "primary"}
2023-10-20T17:31:45.025Z INFO controllers.VolumeReplicationGroup.vrginstance controllers/vrg_volrep.go:1796 Restoring VolRep PVs and PVCs {"VolumeReplicationGroup": "blr-maj/blr-maj", "rid": "5dfe0f02-5483-43f8-a6d8-6c519f883d83", "State": "primary"}
2023-10-20T17:31:45.025Z INFO controllers.VolumeReplicationGroup.vrginstance controllers/vrg_volrep.go:1806 Restoring PVs and PVCs to this managed cluster. ProfileList: [site1 site2] {"VolumeReplicationGroup": "blr-maj/blr-maj", "rid": "5dfe0f02-5483-43f8-a6d8-6c519f883d83", "State": "primary"}
2023-10-20T17:31:45.164Z INFO controllers.VolumeReplicationGroup.vrginstance controllers/vrg_volrep.go:1889 Found 1 PVs in s3 store using profile site1 {"VolumeReplicationGroup": "blr-maj/blr-maj", "rid": "5dfe0f02-5483-43f8-a6d8-6c519f883d83", "State": "primary"}
2023-10-20T17:31:45.168Z INFO controllers.VolumeReplicationGroup.vrginstance controllers/vrg_volrep.go:2006 Restored 1 PV for VolRep {"VolumeReplicationGroup": "blr-maj/blr-maj", "rid": "5dfe0f02-5483-43f8-a6d8-6c519f883d83", "State": "primary"}
2023-10-20T17:31:45.172Z INFO controllers.VolumeReplicationGroup.vrginstance controllers/vrg_volrep.go:1910 Found 1 PVCs in s3 store using profile site1 {"VolumeReplicationGroup": "blr-maj/blr-maj", "rid": "5dfe0f02-5483-43f8-a6d8-6c519f883d83", "State": "primary"}
2023-10-20T17:31:45.179Z INFO pvcmap.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:172 Create event for PersistentVolumeClaim
2023-10-20T17:31:45.179Z INFO pvcmap.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:297 Found VolumeReplicationGroup with matching labels {"pvc": "blr-maj/filebrowser-pvc", "vrg": "blr-maj", "labeled": ""}
2023-10-20T17:31:45.179Z INFO controllers.VolumeReplicationGroup.vrginstance controllers/vrg_volrep.go:2006 Restored 1 PVC for VolRep {"VolumeReplicationGroup": "blr-maj/blr-maj", "rid": "5dfe0f02-5483-43f8-a6d8-6c519f883d83", "State": "primary"}
2023-10-20T17:31:45.179Z INFO controllers.VolumeReplicationGroup.vrginstance controllers/vrg_volrep.go:1867 Restored 1 PVs and 1 PVCs using profile site1 {"VolumeReplicationGroup": "blr-maj/blr-maj", "rid": "5dfe0f02-5483-43f8-a6d8-6c519f883d83", "State": "primary"}
KubeObjectProtection is disabled in the config map and the VRG. Some PVC update events enqueue the VRG to be reconciled again.
2023-10-20T17:31:45.179Z INFO controllers.VolumeReplicationGroup.vrginstance controllers/vrg_kubeobjects.go:657 Kube object protection {"VolumeReplicationGroup": "blr-maj/blr-maj", "rid": "5dfe0f02-5483-43f8-a6d8-6c519f883d83", "State": "primary", "disabled": true, "VRG": true, "configMap": true, "for": "recovery"}
2023-10-20T17:31:45.254Z INFO pvcmap.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:191 Update event for PersistentVolumeClaim
2023-10-20T17:31:45.254Z INFO RDPredicate.RD controllers/volumereplicationgroup_controller.go:323 Failed to deep copy older MCV
2023-10-20T17:31:45.255Z INFO pvcmap.VolumeReplicationGroup controllers/vrg_volrep.go:368 Skipping handling of VR as PersistentVolumeClaim is not bound {"pvc": "blr-maj/filebrowser-pvc", "pvcPhase": "Pending"}
2023-10-20T17:31:45.255Z INFO pvcmap.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:254 Not Requeuing {"pvc": "blr-maj/filebrowser-pvc", "oldPVC Phase": "Pending", "newPVC phase": "Pending"}
2023-10-20T17:31:45.255Z INFO RDPredicate.RD controllers/volumereplicationgroup_controller.go:323 Failed to deep copy older MCV
2023-10-20T17:31:45.255Z INFO pvcmap.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:191 Update event for PersistentVolumeClaim
2023-10-20T17:31:45.255Z INFO pvcmap.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:226 Reconciling due to phase change {"pvc": "blr-maj/filebrowser-pvc", "oldPhase": "Pending", "newPhase": "Bound"}
2023-10-20T17:31:45.255Z INFO pvcmap.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:297 Found VolumeReplicationGroup with matching labels {"pvc": "blr-maj/filebrowser-pvc", "vrg": "blr-maj", "labeled": ""}
2023-10-20T17:31:45.255Z INFO pvcmap.VolumeReplicationGroup controllers/volumereplicationgroup_controller.go:297 Found VolumeReplicationGroup with matching labels {"pvc": "blr-maj/filebrowser-pvc", "vrg": "blr-maj", "labeled": ""}
The VRG controller tries to annotate the just-restored PVC blr-maj/filebrowser-pvc with volumereplicationgroups.ramendr.openshift.io/vr-archived: archiveV1-<PVC Generation Number>, but the update fails because the PVC UID provided in the request does not match the UID of the live object.
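For context on the failure mode (the actual error and stack trace follow below): a non-empty metadata.uid on an update request is treated by the API server as a precondition, so submitting the PVC object that was deserialized from the S3 store, which still carries the other cluster's UID, is rejected once the PVC exists locally with a new UID. A minimal sketch of the failing pattern, assuming a controller-runtime client and a hypothetical helper name (not Ramen's actual code):

// Sketch only: why updating an object that still carries a stale
// ObjectMeta.UID fails with "Precondition failed: UID in precondition ...".
package main

import (
    "context"
    "fmt"

    corev1 "k8s.io/api/core/v1"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

// annotateRestoredPVC is hypothetical: pvcFromS3 was deserialized from the
// S3 backup and still has the UID assigned by the *other* cluster.
func annotateRestoredPVC(ctx context.Context, c client.Client, pvcFromS3 *corev1.PersistentVolumeClaim) error {
    if pvcFromS3.Annotations == nil {
        pvcFromS3.Annotations = map[string]string{}
    }
    pvcFromS3.Annotations["volumereplicationgroups.ramendr.openshift.io/vr-archived"] =
        fmt.Sprintf("archiveV1-%d", pvcFromS3.GetGeneration())

    // The API server uses the non-empty UID in object meta as an update
    // precondition. The restored PVC on this cluster has a brand-new UID,
    // so this call fails with the StorageError seen in the log below.
    return c.Update(ctx, pvcFromS3)
}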
2023-10-20T17:31:45.783Z ERROR controllers.VolumeReplicationGroup.vrginstance controllers/vrg_volrep.go:1746 Failed to update PersistentVolumeClaim annotation {"VolumeReplicationGroup": "blr-maj/blr-maj", "rid": "5dfe0f02-5483-43f8-a6d8-6c519f883d83", "State": "primary", "pvc": "blr-maj/filebrowser-pvc", "error": "Operation cannot be fulfilled on persistentvolumeclaims \"filebrowser-pvc\": StorageError: invalid object, Code: 4, Key: /kubernetes.io/persistentvolumeclaims/blr-maj/filebrowser-pvc, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: 7d5ea473-aeb8-4155-848e-9243ff01534c, UID in object meta: bbfd7c8d-7e4d-4004-9e1c-2697c86ce167"}
github.com/ramendr/ramen/controllers.(*VRGInstance).addArchivedAnnotationForPVC
/workspace/controllers/vrg_volrep.go:1746
github.com/ramendr/ramen/controllers.(*VRGInstance).uploadPVandPVCtoS3Stores
/workspace/controllers/vrg_volrep.go:570
github.com/ramendr/ramen/controllers.(*VRGInstance).reconcileVolRepsAsPrimary
/workspace/controllers/vrg_volrep.go:74
github.com/ramendr/ramen/controllers.(*VRGInstance).reconcileAsPrimary
/workspace/controllers/volumereplicationgroup_controller.go:902
github.com/ramendr/ramen/controllers.(*VRGInstance).processAsPrimary
/workspace/controllers/volumereplicationgroup_controller.go:879
github.com/ramendr/ramen/controllers.(*VRGInstance).processVRG
/workspace/controllers/volumereplicationgroup_controller.go:558
github.com/ramendr/ramen/controllers.(*VolumeReplicationGroupReconciler).Reconcile
/workspace/controllers/volumereplicationgroup_controller.go:455
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:122
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:323
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:274
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:235
When a PVC is created it gets a new UID. The VRG controller reads the PVC from S3, which still carries the UID from the other cluster, and stores it in VRGInstance.volRepPVCs. That same cached PVC is then updated with the annotation and submitted to the API server:
https://github.com/RamenDR/ramen/blob/f6afc6fed2cfa62c93778f5296a4cf1bb96a1325/controllers/vrg_volrep.go#L32-L34 https://github.com/RamenDR/ramen/blob/f6afc6fed2cfa62c93778f5296a4cf1bb96a1325/controllers/vrg_volrep.go#L74 https://github.com/RamenDR/ramen/blob/f6afc6fed2cfa62c93778f5296a4cf1bb96a1325/controllers/vrg_volrep.go#L532 https://github.com/RamenDR/ramen/blob/f6afc6fed2cfa62c93778f5296a4cf1bb96a1325/controllers/vrg_volrep.go#L569
It seems the PVC needs to be read back from the API server so the update carries its new UID, or perhaps the stale UID could be cleared before the update. It is not obvious to me why Fusion is encountering this issue when it has not been discovered previously.
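Either direction might look roughly like the sketch below (controller-runtime style, hypothetical helper names; not the actual Ramen code): re-read the live PVC so the update carries the cluster-local UID, or clear the stale identity fields on the restored copy so no UID precondition is sent at all.

// Sketch of the two possible remedies discussed above; names are illustrative.
package main

import (
    "context"
    "fmt"

    corev1 "k8s.io/api/core/v1"
    "k8s.io/apimachinery/pkg/types"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

const vrArchivedAnnotation = "volumereplicationgroups.ramendr.openshift.io/vr-archived"

// Option A: fetch the live object from the API server and annotate that copy,
// so metadata.uid (and resourceVersion) match what the cluster actually has.
func annotateLivePVC(ctx context.Context, c client.Client, namespace, name string) error {
    live := &corev1.PersistentVolumeClaim{}
    if err := c.Get(ctx, types.NamespacedName{Namespace: namespace, Name: name}, live); err != nil {
        return err
    }

    if live.Annotations == nil {
        live.Annotations = map[string]string{}
    }
    live.Annotations[vrArchivedAnnotation] = fmt.Sprintf("archiveV1-%d", live.GetGeneration())

    return c.Update(ctx, live)
}

// Option B: keep working on the restored copy, but drop the stale UID first so
// the API server does not treat the old cluster's UID as a precondition.
// (This assumes the restore path has already cleared the stale resourceVersion.)
func annotateRestoredCopy(ctx context.Context, c client.Client, restored *corev1.PersistentVolumeClaim) error {
    restored.SetUID("")

    if restored.Annotations == nil {
        restored.Annotations = map[string]string{}
    }
    restored.Annotations[vrArchivedAnnotation] = fmt.Sprintf("archiveV1-%d", restored.GetGeneration())

    return c.Update(ctx, restored)
}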
@asn1809 presuming you can reproduce this issue, will you please try to do so with the main branch to determine whether this is an issue with PR #1090?
This commit seems to be the issue. It changes the code to work on a copy rather than a reference, so that copy does not pick up the changes made by cleanupForRestore (which removes the UID precisely so the stale one from the S3 store is not used).
This could also be due to the change that Ramen now restores the PVCs, which was not the case before, and hence causes the UID mismatch. Checking further with @raghavendra-talur on how this works with Ceph backends and why the issue does not crop up there.
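If that is what happened, the pitfall would look roughly like the Go pattern below (simplified, with a stand-in cleanupForRestore; not the exact Ramen code): cleaning up a range copy leaves the slice element, and hence the object later submitted to the API server, with the stale UID.

// Illustration of the reference-vs-copy pitfall described above.
package main

import (
    "fmt"

    corev1 "k8s.io/api/core/v1"
)

// cleanupForRestore stands in for the real helper: it strips identity fields
// so the object from the S3 store can be applied on this cluster.
func cleanupForRestore(pvc *corev1.PersistentVolumeClaim) {
    pvc.SetUID("")
    pvc.SetResourceVersion("")
}

func main() {
    pvcs := []corev1.PersistentVolumeClaim{{}}
    pvcs[0].SetUID("7d5ea473-aeb8-4155-848e-9243ff01534c") // stale UID from the S3 copy

    // Buggy pattern: range yields a copy of the element, so only the copy is cleaned.
    for _, pvc := range pvcs {
        cleanupForRestore(&pvc) // mutates the loop variable, not pvcs[0]
    }
    fmt.Println("after cleaning a copy:", pvcs[0].GetUID()) // still the stale UID

    // Correct pattern: take the address of the slice element itself.
    for i := range pvcs {
        cleanupForRestore(&pvcs[i])
    }
    fmt.Println("after cleaning in place:", pvcs[0].GetUID()) // now empty
}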
@asn1809 Thank you for discovering and reporting this issue. It should be fixed. However, I wonder what the impact was. From the log it seems to have recovered on the next reconcile. Did the application recover successfully?
In the VRG, for the condition ClusterDataProtected, below is the error seen:
failed to add archived annotation for PVC (blr-trinity/filebrowser-pvc) with error (failed to update PersistentVolumeClaim (blr-trinity/filebrowser-pvc) annotation (volumereplicationgroups.ramendr.openshift.io/vr-archived) belonging toVolumeReplicationGroup (blr-trinity/blr-trinity), Operation cannot be fulfilled on persistentvolumeclaims "filebrowser-pvc": StorageError: invalid object, Code: 4, Key: /kubernetes.io/persistentvolumeclaims/blr-trinity/filebrowser-pvc, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: 0276d756-4bac-4743-bd80-725eb8114cc8, UID in object meta: 3e978ee0-3058-46dc-a080-ac25d72a5ec2)
@asn1809 Thank you for discovering and reporting this issue. It should be fixed. However, I wonder what the impact was. From the log it seems to have recovered on the next reconcile. Did the application recover successfully?
The impact to Fusion is that even though there might be recovery as you mentioned, it is not properly reflected in the VRG, and thereby in the Application CR and the UI that reads it.
From @pdumbre this morning: 4 protected PVCs with the same name, none with a namespace, and the generation is 1. One theory is that the unconditional append on restore is doing it. Perhaps restore fails 3 times, each time after adding the PVC entry to the status (a sketch of this idea follows the YAML below).
apiVersion: ramendr.openshift.io/v1alpha1
kind: VolumeReplicationGroup
metadata:
creationTimestamp: '2023-11-01T10:56:17Z'
finalizers:
- volumereplicationgroups.ramendr.openshift.io/vrg-protection
generation: 1
managedFields:
- apiVersion: ramendr.openshift.io/v1alpha1
fieldsType: FieldsV1
fieldsV1:
'f:metadata':
'f:finalizers':
.: {}
'v:"volumereplicationgroups.ramendr.openshift.io/vrg-protection"': {}
'f:spec':
.: {}
'f:pvcSelector': {}
'f:replicationState': {}
'f:s3Profiles': {}
'f:sync': {}
'f:volSync':
.: {}
'f:disabled': {}
manager: Mozilla
operation: Update
time: '2023-11-01T10:56:17Z'
- apiVersion: ramendr.openshift.io/v1alpha1
fieldsType: FieldsV1
fieldsV1:
'f:status':
.: {}
'f:conditions': {}
'f:kubeObjectProtection': {}
'f:lastUpdateTime': {}
'f:observedGeneration': {}
'f:protectedPVCs': {}
'f:state': {}
manager: manager
operation: Update
subresource: status
time: '2023-11-02T12:21:43Z'
name: shio
namespace: shio
resourceVersion: '8460539'
uid: 1e6e2579-80c5-4154-bb6f-5fd4777aecbe
spec:
pvcSelector: {}
replicationState: primary
s3Profiles:
- site2
- site1
sync: {}
volSync:
disabled: true
status:
conditions:
- lastTransitionTime: '2023-11-01T10:56:18Z'
message: PVCs in the VolumeReplicationGroup are ready for use
observedGeneration: 1
reason: Ready
status: 'True'
type: DataReady
- lastTransitionTime: '2023-11-01T10:56:18Z'
message: VolumeReplicationGroup is replicating
observedGeneration: 1
reason: Replicating
status: 'False'
type: DataProtected
- lastTransitionTime: '2023-11-01T10:56:17Z'
message: Restored cluster data
observedGeneration: 1
reason: Restored
status: 'True'
type: ClusterDataReady
- lastTransitionTime: '2023-11-01T10:56:18Z'
message: Cluster data of one or more PVs are in the process of being protected
observedGeneration: 1
reason: Uploading
status: 'False'
type: ClusterDataProtected
kubeObjectProtection: {}
lastUpdateTime: '2023-11-02T12:21:43Z'
observedGeneration: 1
protectedPVCs:
- conditions:
- lastTransitionTime: '2023-11-01T10:56:17Z'
message: PVC in the VolumeReplicationGroup is ready for use
observedGeneration: 1
reason: Ready
status: 'True'
type: DataReady
- lastTransitionTime: '2023-11-01T10:56:17Z'
message: PVC in the VolumeReplicationGroup is ready for use
observedGeneration: 1
reason: Replicating
status: 'False'
type: DataProtected
name: br-pvc
resources: {}
- conditions:
- lastTransitionTime: '2023-11-02T03:05:54Z'
message: PVC in the VolumeReplicationGroup is ready for use
observedGeneration: 1
reason: Ready
status: 'True'
type: DataReady
- lastTransitionTime: '2023-11-02T03:05:54Z'
message: PVC in the VolumeReplicationGroup is ready for use
observedGeneration: 1
reason: Replicating
status: 'False'
type: DataProtected
name: br-pvc
resources: {}
- conditions:
- lastTransitionTime: '2023-11-02T05:05:35Z'
message: PVC in the VolumeReplicationGroup is ready for use
observedGeneration: 1
reason: Ready
status: 'True'
type: DataReady
- lastTransitionTime: '2023-11-02T05:05:35Z'
message: PVC in the VolumeReplicationGroup is ready for use
observedGeneration: 1
reason: Replicating
status: 'False'
type: DataProtected
name: br-pvc
resources: {}
- conditions:
- lastTransitionTime: '2023-11-02T12:21:43Z'
message: PVC in the VolumeReplicationGroup is ready for use
observedGeneration: 1
reason: Ready
status: 'True'
type: DataReady
- lastTransitionTime: '2023-11-02T12:21:43Z'
message: PVC in the VolumeReplicationGroup is ready for use
observedGeneration: 1
reason: Replicating
status: 'False'
type: DataProtected
name: br-pvc
resources: {}
state: Primary
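A minimal, self-contained sketch of the append guard implied by that theory (simplified stand-in types, not Ramen's actual API structs): look the PVC up in status.protectedPVCs before appending, so repeated restore attempts record it only once.

package main

import "fmt"

// Simplified stand-ins for the VRG status types; the real ones live in the
// Ramen API package and carry conditions, resources, and more.
type protectedPVC struct {
    Namespace string
    Name      string
}

type vrgStatus struct {
    ProtectedPVCs []protectedPVC
}

// findOrAddProtectedPVC appends a status entry only if one with the same
// namespace/name does not already exist, instead of appending unconditionally
// on every restore attempt.
func findOrAddProtectedPVC(status *vrgStatus, namespace, name string) *protectedPVC {
    for i := range status.ProtectedPVCs {
        p := &status.ProtectedPVCs[i]
        if p.Namespace == namespace && p.Name == name {
            return p
        }
    }
    status.ProtectedPVCs = append(status.ProtectedPVCs, protectedPVC{Namespace: namespace, Name: name})
    return &status.ProtectedPVCs[len(status.ProtectedPVCs)-1]
}

func main() {
    status := &vrgStatus{}
    // Simulate several restore attempts for the same PVC: with the guard,
    // "br-pvc" is recorded exactly once rather than four times.
    for i := 0; i < 4; i++ {
        findOrAddProtectedPVC(status, "shio", "br-pvc")
    }
    fmt.Println(len(status.ProtectedPVCs)) // 1
}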
@asn1809 Please add these two list type and map marker lines to the volumereplicationgroups_types.go file, directly above the ProtectedPVCs slice definition:
//+listType=map
//+listMapKey=name
// All the protected pvcs
ProtectedPVCs []ProtectedPVC `json:"protectedPVCs,omitempty"`
Then run make manifests to generate a new CRD yaml file. This leverages API server admission control to effectively treat the slice as a map and reject more than one entry with the same name, which should point us at the reconcile in which the first duplicate entry is attempted to be added. This patch is not meant to be promoted, since the multi-namespace feature allows duplicate PVC names as long as they are in different namespaces; the namespace and name fields would need to be joined into the map key.
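For reference, after make manifests the protectedPVCs schema in the generated CRD should gain roughly the following fragment (abbreviated; exact output depends on the controller-gen version), which is what lets the API server reject a second entry with the same name:

protectedPVCs:
  description: All the protected pvcs
  items:
    # ProtectedPVC object schema elided
    type: object
  type: array
  x-kubernetes-list-map-keys:
    - name
  x-kubernetes-list-type: map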
With the multi-namespace-1053 branch, issues are seen in the failover operation with the error message below:
2023-10-20T17:31:45.783Z ERROR controllers.VolumeReplicationGroup.vrginstance controllers/vrg_volrep.go:1746 Failed to update PersistentVolumeClaim annotation {"VolumeReplicationGroup": "blr-maj/blr-maj", "rid": "5dfe0f02-5483-43f8-a6d8-6c519f883d83", "State": "primary", "pvc": "blr-maj/filebrowser-pvc", "error": "Operation cannot be fulfilled on persistentvolumeclaims \"filebrowser-pvc\": StorageError: invalid object, Code: 4, Key: /kubernetes.io/persistentvolumeclaims/blr-maj/filebrowser-pvc, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: 7d5ea473-aeb8-4155-848e-9243ff01534c, UID in object meta: bbfd7c8d-7e4d-4004-9e1c-2697c86ce167"}
github.com/ramendr/ramen/controllers.(*VRGInstance).addArchivedAnnotationForPVC
/workspace/controllers/vrg_volrep.go:1746
github.com/ramendr/ramen/controllers.(*VRGInstance).uploadPVandPVCtoS3Stores
/workspace/controllers/vrg_volrep.go:570
github.com/ramendr/ramen/controllers.(*VRGInstance).reconcileVolRepsAsPrimary
/workspace/controllers/vrg_volrep.go:74
github.com/ramendr/ramen/controllers.(*VRGInstance).reconcileAsPrimary
/workspace/controllers/volumereplicationgroup_controller.go:902
github.com/ramendr/ramen/controllers.(*VRGInstance).processAsPrimary
/workspace/controllers/volumereplicationgroup_controller.go:879
github.com/ramendr/ramen/controllers.(*VRGInstance).processVRG
/workspace/controllers/volumereplicationgroup_controller.go:558
github.com/ramendr/ramen/controllers.(*VolumeReplicationGroupReconciler).Reconcile
/workspace/controllers/volumereplicationgroup_controller.go:455
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:122
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:323
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:274
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.6/pkg/internal/controller/controller.go:235
@hatfieldbrian As discussed, can you please check and help us resolve the issue?
The Ramen docker image was built using the details below:
UPSTREAM_RAMEN_REPO=https://github.com/hatfieldbrian/ramen.git
GIT_TAG=multi-namespace-1053
COMMMIT_ID=ab681c935abbbc09297f7dc2423d85fb2328d635