openark / orchestrator

MySQL replication topology management and HA
Apache License 2.0
5.64k stars 933 forks source link

errant GTID issue on orchestrator GUI after setting up replication from GCP to semi-sync M-S replication #1193

Closed larry7xia80 closed 4 years ago

larry7xia80 commented 4 years ago

Hi @shlomi-noach,

We had an errant GTID issue on orchestrator GUI after setting up replication from GCP to semi-sync M-S replication.

The replication chain is

 (Top Master) GCP --native replication-----> mysql01( master) ---semi-sync--> mysql02(slave)
                                             |--------------------semi-sync--> mysql03(slave).

mysql02/03 are the 2 slaves of mysql01.

We use orchestrator to manage the cluster failover with hook scripts and orchestrator version 3.1.4, mysql version Percona server 5.7.20.

After we mysqlbinlog dump the binary log on mysql01/02/03, It shows the The errant GTID on mysql02/03 are from GCP , mysql01 also has that transaction in binary log.

The errant transx id080557e6-80d0-11ea-a96d-42010af007c2:5000520

#200611 18:08:29 server id 4278602243  end_log_pos 544906396 CRC32 0xfba74e24   GTID    last_committed=885692   sequence_number=885693  rbr_only=no
SET @@SESSION.GTID_NEXT= '080557e6-80d0-11ea-a96d-42010af007c2:5000520'/*!*/; 
\# at 544906396
#200611 18:08:29 server id 4278602243  end_log_pos 544906459 CRC32 0x1ee663b8   Query   thread_id=419180        exec_time=0     error_code=0
SET TIMESTAMP=1591898909/*!*/;
SET @@session.sql_mode=1436549152/*!*/;
/*!\C utf8mb4 *//*!*/;
SET @@session.character_set_client=45,@@session.collation_connection=45,@@session.collation_server=33/*!*/;
BEGIN
/*!*/;
\# at 544906459
#200611 18:08:29 server id 4278602243  end_log_pos 544906523 CRC32 0x611f95ee   Query   thread_id=419180        exec_time=0     error_code=0
SET TIMESTAMP=1591898909/*!*/;
COMMIT
/*!*/;
\# at 544906523
\#200611 18:08:30 server id 130  end_log_pos 544906588 CRC32 0x7b1c851f  GTID    last_committed=885693   sequence_number=885694  rbr_only=yes
======================================================
mysql> SELECT GTID_SUBSET('080557e6-80d0-11ea-a96d-42010af007c2:4923564-5001485, 3a121e8e-93b9-11ea-adec-42010a500075:1-624288, 8ccd116a-93b9-11ea-9e71-42010a500043:1-16006, f0f29823-8be3-11ea-bcb6-42010a50001d:1-7439738',
    -> 
    -> '080557e6-80d0-11ea-a96d-42010af007c2:4923564-5001498, 3a121e8e-93b9-11ea-adec-42010a500075:1-624492, 8ccd116a-93b9-11ea-9e71-42010a500043:1-16006, f0f29823-8be3-11ea-bcb6-42010a50001d:1-7439738')
    -> 
    ->  AS is_subset; 
+-----------+
| is_subset |
+-----------+
|         1 |
+-----------+
1 row in set (0.00 sec)

` I checked with GTID_SUBSET function by comparing the GTID_executed of mysql02, mysql01, as above. it should not be errant GTID.

I have no idea why the errant GTID shows in Orchestrator?

Best regards,

Thank Larry

shlomi-noach commented 4 years ago
  1. which server this as @@server_uuid: 080557e6-80d0-11ea-a96d-42010af007c2?

  2. Can you please clarify, when you ran GTID_SUBSET, where did you take those values from?

larry7xia80 commented 4 years ago

thank you for your reply.

1. @@server_uuid : 080557e6-80d0-11ea-a96d-42010af007c2 : Google Cloud MySQL  (Top Master)
2.   ----------mysql02-----replica------------
mysql> show global variables like '%GTID%';
+----------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Variable_name                    | Value                                                                                                                                                                                             |
+----------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| binlog_gtid_simple_recovery      | ON                                                                                                                                                                                                |
| enforce_gtid_consistency         | ON                                                                                                                                                                                                |
| gtid_executed                    | 080557e6-80d0-11ea-a96d-42010af007c2:4923564-5001498,
3a121e8e-93b9-11ea-adec-42010a500075:1-624492,
8ccd116a-93b9-11ea-9e71-42010a500043:1-16006,
f0f29823-8be3-11ea-bcb6-42010a50001d:1-7439738 |
---------------------------------------------
----------mysql01-----master------------
mysql> show global variables like '%GTID%';
+----------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Variable_name                    | Value                                                                                                                                                                                             |
+----------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| binlog_gtid_simple_recovery      | ON                                                                                                                                                                                                |
| enforce_gtid_consistency         | ON                                                                                                                                                                                                |
| gtid_executed                    | 080557e6-80d0-11ea-a96d-42010af007c2:4923564-5001485,
3a121e8e-93b9-11ea-adec-42010a500075:1-624288,
8ccd116a-93b9-11ea-9e71-42010a500043:1-16006,
f0f29823-8be3-11ea-bcb6-42010a50001d:1-7439738 |
----------------------------
sorry, i double checked , i should  run SELECT GTID_SUBSET('<gtid_executed_replica>', '<gtid_executed_primary>');   , the result should be 0. [errant]

mysql> SELECT GTID_SUBSET('080557e6-80d0-11ea-a96d-42010af007c2:4923564-5001498, 3a121e8e-93b9-11ea-adec-42010a500075:1-624492, 8ccd116a-93b9-11ea-9e71-42010a500043:1-16006, f0f29823-8be3-11ea-bcb6-42010a50001d:1-7439738' ,
    ->      '080557e6-80d0-11ea-a96d-42010af007c2:4923564-5001485, 3a121e8e-93b9-11ea-adec-42010a500075:1-624288, 8ccd116a-93b9-11ea-9e71-42010a500043:1-16006, f0f29823-8be3-11ea-bcb6-42010a50001d:1-7439738')
    ->    AS is_subset;
+-----------+
| is_subset |
+-----------+
|         0 |
+-----------+
1 row in set (0.00 sec)
----------------------------------

i was curious the errant GTID came from Google Could Mysql does exsit on both master [mysql01], and replica [mysql02].

thank you Larry

shlomi-noach commented 4 years ago

so, if I understand correctly, orchestrator reports an errant GTID where the source of the transaction is the top master of the replication tree. Of course that should not happen.

Is orchestrator at all able to access that master? Can you print out orchestrator-client -c topology -alias <yourcluster>?

larry7xia80 commented 4 years ago

@shlomi-noach , much appreciated for your time and effort.

[root@]# /usr/local/orchestrator/resources/bin/orchestrator-client --auth dbadmin    -c topology -alias pcimicro
Enter host password for user 'dbadmin':
prd-pci-##1a-v12-micromysql01:53306   [0s,ok,5.7.20-19-log,rw,ROW,>>,GTID]
+ prd-pci-##1b-v12-micromysql02:53306 [0s,ok,5.7.20-19-log,ro,ROW,>>,GTID]
+ prd-pci-##1f-v12-micromysql03:53306 [0s,ok,5.7.20-19-log,ro,ROW,>>,GTID]
[root@]# 

we added below in /etc/orchestrator.conf.json as client do NOT want to see the Google Cloud MySQL [Top master] 's IP ["1#6.1#8.7#.4"] in orchestrator GUI "DiscoveryIgnoreMasterHostnameFilters": ["1#6.1#8.7#.4"],

is that the reason?

BTW: I replaced some information with # above.

thanks Best regards, Larry

shlomi-noach commented 4 years ago

is that the reason?

Yes. If orchestrator cannot access the actual master then this is expected.

larry7xia80 commented 4 years ago

thanks @shlomi-noach , I got it. orchestrator need to compare the replica's GTID with all the upstream master's GTID, As I "DiscoveryIgnoreMasterHostnameFilters" the Top Master (Google Cloud MySQL). It reports errant GTID.

much appreciated.. thanks again. Larry

shlomi-noach commented 4 years ago

Correct.

We could possibly solve it by checking the master_uuid field in show slave status and suffice with that -- let me think about that a bit.

larry7xia80 commented 4 years ago

thanks @shlomi-noach