Closed anandhg02 closed 1 month ago
Hi @anandhg02, can you share more details of the storage class used for replication, along with the full output of the command oc describe rg rg-5b85f807-2fb2-46a2-af2f-7e3c4541f81d for the RG?
Thanks.
Hello Rajshree, just for testing, I deleted the previous RG and recreated it with the new replication, and I am still getting the same error.
The RG on which I am currently testing the replication operation is rg-47d46288-c551-4e73-b8a9-b41113248b3f.
[corood@csahn01 repctl]$ ./repctl get rg
[2024-09-09 22:25:56] INFO listing replication groups
[2024-09-09 22:25:56] INFO Cluster: ocps1
+-----+
| RG |
+-----+
Name State rClusterID Driver RemoteRG IsSource LinkState
rg-47d46288-c551-4e73-b8a9-b41113248b3f Error ocps2 csi-powermax.dellemc.com rg-47d46288-c551-4e73-b8a9-b41113248b3f true SYNCHRONIZED
rg-790a9d36-b593-4936-869d-317eb56018b0 Error ocps2 csi-isilon.dellemc.com rg-790a9d36-b593-4936-869d-317eb56018b0 false FAILEDOVER
rg-ca6cc10f-1ac4-43a5-a673-cdeff8a45a17 Ready ocps2 csi-powermax.dellemc.com rg-ca6cc10f-1ac4-43a5-a673-cdeff8a45a17 true SYNCHRONIZED
[2024-09-09 22:25:56] INFO
[2024-09-09 22:25:56] INFO Cluster: ocps2
+-----+
| RG |
+-----+
Name State rClusterID Driver RemoteRG IsSource LinkState
rg-47d46288-c551-4e73-b8a9-b41113248b3f Error ocps1 csi-powermax.dellemc.com rg-47d46288-c551-4e73-b8a9-b41113248b3f false SYNCHRONIZED
rg-790a9d36-b593-4936-869d-317eb56018b0 Error ocps1 csi-isilon.dellemc.com rg-790a9d36-b593-4936-869d-317eb56018b0 true FAILEDOVER
rg-ca6cc10f-1ac4-43a5-a673-cdeff8a45a17 Ready ocps1 csi-powermax.dellemc.com rg-ca6cc10f-1ac4-43a5-a673-cdeff8a45a17 false SYNCHRONIZED
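To pick out the failing groups from a listing like the one above, the rows can be split on whitespace. This is only a sketch: it assumes each data row starts with "rg-" and has exactly the seven columns shown in the header, and `RGRow` and `parse_rg_rows` are hypothetical helper names, not part of repctl.

```python
from typing import NamedTuple


class RGRow(NamedTuple):
    name: str
    state: str
    remote_cluster_id: str
    driver: str
    remote_rg: str
    is_source: bool
    link_state: str


def parse_rg_rows(text: str) -> list[RGRow]:
    """Parse data rows from pasted `repctl get rg` output (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Assumption: data rows begin with "rg-" and have seven columns;
        # log lines, table borders and the header row are skipped.
        if len(fields) == 7 and fields[0].startswith("rg-"):
            name, state, rcluster, driver, remote_rg, is_source, link = fields
            rows.append(RGRow(name, state, rcluster, driver, remote_rg,
                              is_source == "true", link))
    return rows


SAMPLE = """\
[2024-09-09 22:25:56] INFO Cluster: ocps1
Name State rClusterID Driver RemoteRG IsSource LinkState
rg-47d46288-c551-4e73-b8a9-b41113248b3f Error ocps2 csi-powermax.dellemc.com rg-47d46288-c551-4e73-b8a9-b41113248b3f true SYNCHRONIZED
rg-790a9d36-b593-4936-869d-317eb56018b0 Error ocps2 csi-isilon.dellemc.com rg-790a9d36-b593-4936-869d-317eb56018b0 false FAILEDOVER
rg-ca6cc10f-1ac4-43a5-a673-cdeff8a45a17 Ready ocps2 csi-powermax.dellemc.com rg-ca6cc10f-1ac4-43a5-a673-cdeff8a45a17 true SYNCHRONIZED
"""

rows = parse_rg_rows(SAMPLE)
errored = [r for r in rows if r.state == "Error"]
for r in errored:
    print(r.name, r.driver, r.link_state)
```

In this sample the PowerMax RG is in the Error state even though its link state is SYNCHRONIZED, which matches the observation that the SRDF replication itself is working.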
Requested output for the command oc describe rg
[corood@csahn01 repctl]$ oc describe rg rg-47d46288-c551-4e73-b8a9-b41113248b3f
Name: rg-47d46288-c551-4e73-b8a9-b41113248b3f
Namespace:
Labels: replication.storage.dell.com/RdfGroup=12
replication.storage.dell.com/RdfMode=SYNC
replication.storage.dell.com/RemoteRDFGroup=12
replication.storage.dell.com/RemoteSYMID=000220002171
replication.storage.dell.com/SYMID=000220002131
replication.storage.dell.com/driverName=csi-powermax.dellemc.com
replication.storage.dell.com/remoteClusterID=ocps2
Annotations: Action:
{"name":"REPROTECT_LOCAL","completed":true,"finalError":"rpc error: code = InvalidArgument desc = missing globalID in protection group att...
replication.storage.dell.com/actionProcessedTime:
replication.storage.dell.com/contextPrefix: powermax
replication.storage.dell.com/remoteClusterID: ocps2
replication.storage.dell.com/remoteRGRetentionPolicy: delete
replication.storage.dell.com/remoteReplicationGroupName: rg-47d46288-c551-4e73-b8a9-b41113248b3f
replication.storage.dell.com/rg_sync_complete: yes
API Version: replication.storage.dell.com/v1
Kind: DellCSIReplicationGroup
Metadata:
Creation Timestamp: 2024-09-01T00:38:47Z
Finalizers:
replication.storage.dell.com/replicationProtection
replication.storage.dell.com/replicationSyncProtection
Generation: 5
Managed Fields:
API Version: replication.storage.dell.com/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
f:replication.storage.dell.com/remoteReplicationGroupName:
f:replication.storage.dell.com/rg_sync_complete:
f:finalizers:
v:"replication.storage.dell.com/replicationSyncProtection":
Manager: dell-replication-controller
Operation: Update
Time: 2024-09-01T00:38:47Z
API Version: replication.storage.dell.com/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.:
f:Action:
f:replication.storage.dell.com/actionProcessedTime:
f:replication.storage.dell.com/contextPrefix:
f:replication.storage.dell.com/remoteClusterID:
f:replication.storage.dell.com/remoteRGRetentionPolicy:
f:finalizers:
.:
v:"replication.storage.dell.com/replicationProtection":
f:labels:
.:
f:replication.storage.dell.com/RdfGroup:
f:replication.storage.dell.com/RdfMode:
f:replication.storage.dell.com/RemoteRDFGroup:
f:replication.storage.dell.com/RemoteSYMID:
f:replication.storage.dell.com/SYMID:
f:replication.storage.dell.com/driverName:
f:replication.storage.dell.com/remoteClusterID:
f:spec:
.:
f:action:
f:driverName:
f:protectionGroupAttributes:
.:
f:powermax/RdfGroup:
f:powermax/RdfMode:
f:powermax/RemoteRDFGroup:
f:powermax/RemoteSYMID:
f:powermax/SYMID:
f:protectionGroupId:
f:remoteClusterId:
f:remoteProtectionGroupAttributes:
.:
f:powermax/RdfGroup:
f:powermax/RdfMode:
f:powermax/RemoteRDFGroup:
f:powermax/RemoteSYMID:
f:powermax/SYMID:
f:remoteProtectionGroupId:
Manager: dell-csi-replicator
Operation: Update
Time: 2024-09-09T14:24:41Z
API Version: replication.storage.dell.com/v1
Fields Type: FieldsV1
fieldsV1:
f:status:
.:
f:conditions:
f:lastAction:
.:
f:condition:
f:errorMessage:
f:firstFailure:
f:time:
f:replicationLinkState:
.:
f:isSource:
f:lastSuccessfulUpdate:
f:state:
f:state:
Manager: dell-csi-replicator
Operation: Update
Subresource: status
Time: 2024-09-09T14:28:02Z
Resource Version: 23465070
UID: 09559cd0-5976-4954-947b-946821d25921
Spec:
Action:
Driver Name: csi-powermax.dellemc.com
Protection Group Attributes:
powermax/RdfGroup: 12
powermax/RdfMode: SYNC
powermax/RemoteRDFGroup: 12
powermax/RemoteSYMID: 000220002171
powermax/SYMID: 000220002131
Protection Group Id: csi-rep-sg-postgres-sts-12-SYNC
Remote Cluster Id: ocps2
Remote Protection Group Attributes:
powermax/RdfGroup: 12
powermax/RdfMode: SYNC
powermax/RemoteRDFGroup: 12
powermax/RemoteSYMID: 000220002131
powermax/SYMID: 000220002171
Remote Protection Group Id: csi-rep-sg-postgres-sts-12-SYNC
Status:
Conditions:
Condition: Replication Link State:IsSource changed from (false) to (true)
Time: 2024-09-09T14:25:02Z
Condition: Action REPROTECT_LOCAL failed with error rpc error: code = InvalidArgument desc = missing globalID in protection group attributes
Time: 2024-09-09T14:24:41Z
Condition: Replication Link State:IsSource changed from (true) to (false)
Time: 2024-09-09T14:20:02Z
Condition: Action FAILOVER_REMOTE failed with error rpc error: code = InvalidArgument desc = can't find `systemName` parameter in replication group
Time: 2024-09-09T14:19:33Z
Condition: Replication Link State:IsSource changed from (false) to (true)
Time: 2024-09-01T00:39:01Z
Last Action:
Condition: Action REPROTECT_LOCAL failed with error rpc error: code = InvalidArgument desc = missing globalID in protection group attributes
Error Message: rpc error: code = InvalidArgument desc = missing globalID in protection group attributes
First Failure: 2024-09-09T14:24:41Z
Time: 2024-09-09T14:24:41Z
Replication Link State:
Is Source: true
Last Successful Update: 2024-09-09T14:28:02Z
State: SYNCHRONIZED
State: Error
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Error 9m1s dell-csi-replicator Action [FAILOVER_REMOTE] on DellCSIReplicationGroup [rg-47d46288-c551-4e73-b8a9-b41113248b3f] failed with error [rpc error: code = InvalidArgument desc = can't find `systemName` parameter in replication group]
Warning Error 9m1s dell-csi-replicator Action [FAILOVER_REMOTE] on DellCSIReplicationGroup [rg-47d46288-c551-4e73-b8a9-b41113248b3f] failed with error [rpc error: code = InvalidArgument desc = missing globalID in protection group attributes]
Warning Updated 3m53s (x8 over 9m1s) dell-replication-controller failed to process the last action Action FAILOVER_REMOTE failed with error rpc error: code = InvalidArgument desc = can't find `systemName` parameter in replication group
Warning Error 3m53s dell-csi-replicator Action [REPROTECT_LOCAL] on DellCSIReplicationGroup [rg-47d46288-c551-4e73-b8a9-b41113248b3f] failed with error [rpc error: code = InvalidArgument desc = missing globalID in protection group attributes]
Warning Error 3m53s dell-csi-replicator Action [REPROTECT_LOCAL] on DellCSIReplicationGroup [rg-47d46288-c551-4e73-b8a9-b41113248b3f] failed with error [rpc error: code = InvalidArgument desc = can't find `systemName` parameter in replication group]
Warning Updated 32s (x5 over 3m53s) dell-replication-controller failed to process the last action Action REPROTECT_LOCAL failed with error rpc error: code = InvalidArgument desc = missing globalID in protection group attributes
[corood@csahn01 repctl]$
Hi @anandhg02: Do you also have the powerstore driver installed with replication enabled?
The errors that you see are actually from different drivers. For instance, 'can't find systemName parameter' is from isilon and 'missing globalID in protection group' is from powerstore.
We have not tested the replication module with multiple drivers installed. I would suggest installing only the powermax driver and testing again. Thanks!
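As a rough illustration of the point above, the condition messages on an RG can be checked against the two error signatures named in this comment to flag messages that appear to come from a different driver than the RG's own. The signature-to-driver mapping is taken only from this thread, and `foreign_errors` is a hypothetical helper, not part of CSM.

```python
# Error substrings and the driver they are attributed to in this thread.
ERROR_SIGNATURES = {
    "can't find systemName parameter": "isilon",
    "missing globalID in protection group": "powerstore",
}


def foreign_errors(rg_driver: str, messages: list[str]) -> list[tuple[str, str]]:
    """Return (suspected driver, message) pairs that do not match rg_driver."""
    found = []
    for message in messages:
        # The events quote systemName in backticks; strip them before matching.
        normalized = message.replace("`", "")
        for signature, driver in ERROR_SIGNATURES.items():
            if signature in normalized and driver not in rg_driver:
                found.append((driver, message))
    return found


conditions = [
    "Action REPROTECT_LOCAL failed with error rpc error: code = InvalidArgument "
    "desc = missing globalID in protection group attributes",
    "Action FAILOVER_REMOTE failed with error rpc error: code = InvalidArgument "
    "desc = can't find `systemName` parameter in replication group",
]

suspects = foreign_errors("csi-powermax.dellemc.com", conditions)
for driver, message in suspects:
    print(f"{driver}: {message}")
```

For the PowerMax RG in this issue, both failure conditions are flagged, one attributed to powerstore and one to isilon.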
Hi @santhoshatdell ,
Yes, we do have the PowerMAX/PowerStore/PowerScale drivers installed in this OCP cluster. The business requirement for this OCP cluster is to provision PVs from multiple storage tiers: Tier1 from PowerMAX, Tier2 from PowerStore, Tier3 from PowerScale.
But I am curious: why would PowerStore and PowerScale errors be reported under an RG that uses the csi-powermax.dellemc.com driver? These errors appear when I perform Failover/Reprotect on the PowerMAX RG. While setting up a new RG, no errors are reported.
Our initial investigation suggests that the replicator sidecar in each of the installed driver pods might process the same RG, which leads to this. In other words, RGs belonging to other drivers are not ignored.
I think I have no choice but to use all three drivers (PowerMax/PowerStore/PowerScale) in the same OCP cluster to provision across multiple storage tiers.
>>> For instance, 'can't find systemName parameter' is from isilon and 'missing globalID in protection group' is from powerstore.
Can we determine what is causing the above PowerStore and PowerScale errors?
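The suspected behaviour can be modelled with a small sketch. It is not the actual dell-csi-replicator logic; it only assumes that each sidecar could compare the RG's spec.driverName (a field visible in the describe output) with its own driver name. The function names are hypothetical, and the PowerStore driver name is an assumption that does not appear in this thread.

```python
def should_process(rg: dict, own_driver: str) -> bool:
    """Hypothetical filter: a sidecar handles only RGs created for its driver."""
    return rg.get("spec", {}).get("driverName") == own_driver


def dispatch(rgs: list[dict], sidecars: list[str], filtered: bool) -> dict[str, list[str]]:
    """Map each RG name to the sidecar drivers that would act on it."""
    handled = {}
    for rg in rgs:
        name = rg["metadata"]["name"]
        handled[name] = [
            driver for driver in sidecars
            if not filtered or should_process(rg, driver)
        ]
    return handled


SIDECARS = [
    "csi-powermax.dellemc.com",
    "csi-isilon.dellemc.com",
    "csi-powerstore.dellemc.com",  # assumed name, not shown in this thread
]

rgs = [{
    "metadata": {"name": "rg-47d46288-c551-4e73-b8a9-b41113248b3f"},
    "spec": {"driverName": "csi-powermax.dellemc.com", "action": "REPROTECT_LOCAL"},
}]

without_filter = dispatch(rgs, SIDECARS, filtered=False)
with_filter = dispatch(rgs, SIDECARS, filtered=True)
print(without_filter)
print(with_filter)
```

Without the filter, all three sidecars attempt the action on the PowerMax RG, which would be consistent with seeing PowerStore and PowerScale errors on it. With the filter, only the PowerMax sidecar acts.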
/sync
link: 28173
Hi Team, any update on this issue?
Hi @anandhg02, this will be taken as a feature for implementation in our roadmap. Closing this issue for now as we have updated the respective documentation. Thanks.
Hi @khareRajshree, noted on the roadmap. Can you confirm whether installing multiple CSI drivers (PMAX/PSTR/PSCALE) in the same OCP cluster is supported? I don't see any document that says only one CSI driver with replication may be installed per OCP cluster.
https://github.com/dell/csm/issues/1511 has been added to address the issue stated in our roadmap
I have implemented CSM Replication for PowerMAX between 2 OCP clusters. I am using the repctl utility for the replication Failover/Reprotect operations. The replication operations are all working as expected, and this can be verified using SRDF. But the ReplicationGroup shows the error below in the RG's State field.
The describe output of the RG shows the below. I couldn't find the globalID parameter in the protection group attributes. What is causing this error?
Version Details