Open andbos opened 5 months ago
Is it possible somehow to fail over to the standby? In this case the primary instance is beyond rescue and I can't switch over because the primary is unreachable.
If I had a clone, I suppose I could have pointed the application to the SID of the clone and then created a new standby and configured DG Broker again. In a disaster recovery situation, I mean. Another option could be to have multiple replicas of the primary.
Best regards, Andreas
@andbos Right now we only support manual switchover to the standby when all the databases are healthy, since the DataguardBroker controller is still in preview release.
To implement failover we need a database observer, which is a roadmap item for the next release and will be implemented in v1.2.0.
Hi,
Thanks for the update, appreciated. Roughly when could we expect v1.2.0?
Best regards, Andreas
We are still discussing the timeline for 1.2.0.
P.S. If you want to switch over when the primary is down with the current implementation of the DataguardController, you can exec into the standby database and manually run the DGMGRL command for the switchover:
DGMGRL sys@<pwd>
SWITCHOVER TO <standby_database_sid>
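The manual step above can be scripted. The following is a minimal sketch, assuming `kubectl` is configured for the cluster and that `dgmgrl` is on the PATH inside the database container. The helper name `dgmgrl_exec_argv`, the pod name, and the connect string are hypothetical placeholders and are not taken from the operator. The sketch only builds the command line and does not run it.

```python
import shlex


def dgmgrl_exec_argv(namespace, pod, connect, command, tty=False):
    """Build a `kubectl exec` argv that runs one DGMGRL command in a pod."""
    argv = ["kubectl", "exec"]
    if tty:
        # -i/-t keep stdin and a terminal attached, e.g. for a password prompt.
        argv += ["-i", "-t"]
    argv += ["-n", namespace, pod, "--"]
    # dgmgrl accepts a connect string followed by a single command to run.
    argv += ["dgmgrl", "-silent", connect, command]
    return argv


# Placeholder values: substitute the real standby pod name and connect string.
argv = dgmgrl_exec_argv(
    "oracle-database", "<standby-pod>", "<connect_string>", "show configuration"
)
print(shlex.join(argv))
```

Passing the resulting list to `subprocess.run` would execute it. Running `show configuration` first is a cautious way to confirm the roles before issuing a role change.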
Hi,
Thanks for the tip. To start with, I tried executing SWITCHOVER TO while the primary was up, and it worked:
DGMGRL for Linux: Release 21.0.0.0.0 - Production on Mon Jun 24 09:59:33 2024
Version 21.13.0.0.0
Copyright (c) 1982, 2021, Oracle and/or its affiliates. All rights reserved.
Welcome to DGMGRL, type "help" for information.
Connected to "DB11"
Connected as SYSDBA.
DGMGRL> show configuration
Configuration - dg_config
Protection Mode: MaxAvailability
Members:
db11 - Primary database
db12 - Physical standby database
Fast-Start Failover: Disabled
Configuration Status:
SUCCESS (status updated 27 seconds ago)
DGMGRL> switchover to db12
2024-06-24T09:59:45.339+00:00
Performing switchover NOW, please wait...
2024-06-24T09:59:45.490+00:00
Operation requires a connection to database "db12"
Connecting ...
Connected to "DB12"
Connected as SYSDBA.
2024-06-24T09:59:45.534+00:00
Continuing with the switchover...
2024-06-24T09:59:52.273+00:00
New primary database "db12" is opening...
2024-06-24T09:59:52.273+00:00
Operation requires start up of instance "DB11" on database "db11"
Starting instance "DB11"...
Connected to an idle instance.
ORACLE instance started.
Connected to "DB11"
Database mounted.
Database opened.
Connected to "DB11"
2024-06-24T10:00:10.370+00:00
Switchover succeeded, new primary is "db12"
2024-06-24T10:00:10.373+00:00
Switchover processing complete, broker ready.
DGMGRL> show configuration
Configuration - dg_config
Protection Mode: MaxAvailability
Members:
db12 - Primary database
db11 - Physical standby database
Fast-Start Failover: Disabled
Configuration Status:
SUCCESS (status updated 18 seconds ago)
However, the DataGuardBroker didn't notice that a switchover was done.
$ date
Mon Jun 24 12:02:09 CEST 2024
$ kubectl --kubeconfig ~/.kube/config-sinch-op-smsf-1-andbos -n oracle-database get dataguardbroker
NAME            PRIMARY   STANDBYS   PROTECTION MODE   CONNECT STR                  STATUS
sidb-dgbroker   DB11      DB12       MaxAvailability   10.1.1.161:32495/DATAGUARD   Healthy
Best regards, Andreas
But when I restarted the new standby (db11), DataGuardBroker reported status Healthy the whole time, despite active ORA errors:
DGMGRL> show configuration
Configuration - dg_config
Protection Mode: MaxAvailability
Members:
db12 - Primary database
Error: ORA-16810: multiple errors or warnings detected for the member
db11 - Physical standby database
Error: ORA-16599: Oracle Data Guard broker detected a stale configuration
Fast-Start Failover: Disabled
Configuration Status:
ERROR (status updated 51 seconds ago)
Not even any events:
$ kubectl -n oracle-database describe dataguardbroker
Name: sidb-dgbroker
Namespace: oracle-database
Labels: <none>
Annotations: <none>
API Version: database.oracle.com/v1alpha1
Kind: DataguardBroker
Metadata:
Creation Timestamp: 2024-06-24T09:47:05Z
Finalizers:
database.oracle.com/dataguardbrokerfinalizer
Generation: 1
Resource Version: 133480469
UID: de3110c0-78e2-401a-9f91-c245b8519273
Spec:
Fast Start Fail Over:
Primary Database Ref: sidb11
Protection Mode: MaxAvailability
Set As Primary Database: DB11
Standby Database Refs:
sidb12
Status:
Cluster Connect String: sidb-dgbroker.oracle-database:1521/DATAGUARD
External Connect String: 10.1.1.161:32495/DATAGUARD
Primary Database: DB11
Primary Database Ref: sidb11
Protection Mode: MaxAvailability
Standby Databases: DB12
Status: Healthy
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal DG Configuration up to date 20m DataguardBroker
@andbos That is true: the dgbroker does not detect the switchover when it is done manually, because the DGBroker controller reconcile has not been triggered.
To detect manual configuration changes as well, we would depend on the database observer, which is planned for the next release.
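Until then, the mismatch can be checked from outside the operator. The following is a minimal sketch, assuming the text of `show configuration` has already been captured; it compares the primary that DGMGRL reports with the primary recorded in the DataguardBroker resource status. The function names `parse_show_configuration` and `detect_drift` are hypothetical and are not part of the operator.

```python
import re

# Matches member lines such as "db12 - Primary database".
MEMBER_RE = re.compile(
    r"^(\S+)\s+-\s+(Primary|Physical standby|Logical standby|Snapshot standby)"
    r"\s+database",
    re.IGNORECASE,
)


def parse_show_configuration(text):
    """Extract member roles and overall status from `show configuration` text."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    roles, status = {}, None
    for i, line in enumerate(lines):
        match = MEMBER_RE.match(line)
        if match:
            roles[match.group(1).upper()] = match.group(2).lower()
        elif line.startswith("Configuration Status:") and i + 1 < len(lines):
            # The status word (SUCCESS, WARNING, ERROR) is on the next line.
            status = lines[i + 1].split()[0]
    primary = next((n for n, r in roles.items() if r == "primary"), None)
    return {"primary": primary, "roles": roles, "status": status}


def detect_drift(cr_primary, dgmgrl_text):
    """Return human-readable differences between the CR status and DGMGRL."""
    cfg = parse_show_configuration(dgmgrl_text)
    problems = []
    if cfg["primary"] and cfg["primary"] != cr_primary.upper():
        problems.append(
            f"CR reports primary {cr_primary.upper()}, DGMGRL reports {cfg['primary']}"
        )
    if cfg["status"] != "SUCCESS":
        problems.append(f"configuration status is {cfg['status']}")
    return problems


# Output captured after the manual switchover earlier in this thread.
AFTER_SWITCHOVER = """
Configuration - dg_config
Protection Mode: MaxAvailability
Members:
db12 - Primary database
db11 - Physical standby database
Fast-Start Failover: Disabled
Configuration Status:
SUCCESS (status updated 18 seconds ago)
"""

for problem in detect_drift("DB11", AFTER_SWITCHOVER):
    print(problem)
```

The value `DB11` is the `Primary Database` field from the `kubectl describe` output above. A check like this could run on a schedule and raise an alert when it returns anything.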
Ok, I see. What is worse, if I take down the primary and then attempt a switchover or failover to the standby, the operation does not succeed:
DGMGRL for Linux: Release 21.0.0.0.0 - Production on Mon Jun 24 10:11:09 2024
Version 21.13.0.0.0
Copyright (c) 1982, 2021, Oracle and/or its affiliates. All rights reserved.
Welcome to DGMGRL, type "help" for information.
Connected to "DB12"
Connected as SYSDBA.
DGMGRL> show configuration
Configuration - dg_config
Protection Mode: MaxAvailability
Members:
db11 - Primary database
db12 - Physical standby database
Fast-Start Failover: Disabled
Configuration Status:
SUCCESS (status updated 60 seconds ago)
DGMGRL> show configuration
Configuration - dg_config
Protection Mode: MaxAvailability
Members:
db11 - Primary database
Error: ORA-12541: TNS:no listener
db12 - Physical standby database
Fast-Start Failover: Disabled
Configuration Status:
ERROR (status updated 0 seconds ago)
DGMGRL> failover to db12
ORA-16600: not connected to target standby database
DGMGRL> switchover to db12
2024-06-24T10:14:57.686+00:00
Performing switchover NOW, please wait...
Error: ORA-12541: TNS:no listener
Error: ORA-16625: cannot reach member "db11"
Failed.
2024-06-24T10:14:59.729+00:00
Unable to switchover, primary database is still "db11"
DGMGRL>
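The transcript above suggests a simple pre-check before attempting a switchover. The following is a minimal sketch, assuming the `show configuration` text has been captured; it collects the ORA codes listed under each member and reports whether the primary has any. Treating every error on the primary as a reason to stop is a conservative assumption, and the function names `member_errors` and `primary_errors` are hypothetical.

```python
import re

ORA_RE = re.compile(r"ORA-\d{5}")
# Matches member lines such as "db11 - Primary database".
MEMBER_RE = re.compile(r"^(\S+)\s+-\s+(.+?)\s+database\b")


def member_errors(text):
    """Map each member to its role and to the ORA codes listed under it."""
    roles, errors, current = {}, {}, None
    for raw in text.splitlines():
        line = raw.strip()
        match = MEMBER_RE.match(line)
        if match:
            current = match.group(1)
            roles[current] = match.group(2)
            errors[current] = []
        elif line.startswith("Fast-Start Failover"):
            # The member list ends here; later lines belong to no member.
            current = None
        elif current and line.startswith(("Error:", "Warning:")):
            errors[current].extend(ORA_RE.findall(line))
    return roles, errors


def primary_errors(text):
    """Return (name, codes) if the primary shows errors, else None."""
    roles, errors = member_errors(text)
    for name, role in roles.items():
        if role.lower() == "primary" and errors[name]:
            return name, errors[name]
    return None


# Output captured in this thread after the primary was taken down.
PRIMARY_DOWN = """
Configuration - dg_config
Protection Mode: MaxAvailability
Members:
db11 - Primary database
Error: ORA-12541: TNS:no listener
db12 - Physical standby database
Fast-Start Failover: Disabled
Configuration Status:
ERROR (status updated 0 seconds ago)
"""

blocked = primary_errors(PRIMARY_DOWN)
if blocked:
    print(f"primary {blocked[0]} reports {', '.join(blocked[1])}")
```

In the transcript, the switchover attempted in this state failed with ORA-16625, so a check like this would have flagged the problem before the command was issued.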
Hi,
It seems DG Broker is not able to detect that the primary database is down; the status is Healthy all the time.
Standby detected that the primary is down:
But DG Broker did not, even though the status of the primary is Pending:
Setup: one primary singleinstancedatabase and one standby singleinstancedatabase, both using image enterprise:21.3.0.0. OraOperator version: 1.1.0.
Best regards, Andreas