shinguz closed this issue 2 months ago
This looks like -1 or an int underflow
wsrep-lib/wsrep-API/v26/wsrep_api.h:#define WSREP_SEQNO_UNDEFINED (-1)
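To illustrate the suspicion: a minimal sketch of how -1 turns into 18446744073709551615 (the value reported in this issue) when the same bits are printed as an unsigned 64-bit integer. The two function names are made up for this example and are not taken from the MariaDB or Galera sources. The sketch also assumes `long long` is 64 bits wide, which holds on the usual platforms.

```c
#include <stdio.h>

/* As quoted from wsrep-lib/wsrep-API/v26/wsrep_api.h. */
#define WSREP_SEQNO_UNDEFINED (-1)

/* Hypothetical helper: format the value as a signed integer.
   For WSREP_SEQNO_UNDEFINED this writes "-1". */
int format_signed(long long v, char *buf, size_t len)
{
    return snprintf(buf, len, "%lld", v);
}

/* Hypothetical helper: format the same value as unsigned.
   Converting -1 to unsigned long long wraps around to the maximum
   value of the type, so with a 64-bit long long this writes 2^64 - 1. */
int format_unsigned(long long v, char *buf, size_t len)
{
    return snprintf(buf, len, "%llu", (unsigned long long)v);
}
```

Calling `format_unsigned(WSREP_SEQNO_UNDEFINED, buf, sizeof buf)` with a 32-byte buffer fills it with the number seen in the status output, while `format_signed` gives `-1`. So this would be a signed/unsigned mix-up when the value is displayed rather than an arithmetic underflow, assuming the variable really still holds WSREP_SEQNO_UNDEFINED at that point.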
Searching the code, the bug must be somewhere in here:
sql/wsrep_sst.cc: wsrep_seqno_t ret_wsrep_seqno = WSREP_SEQNO_UNDEFINED;
sql/wsrep_sst.cc: wsrep_seqno_t ret_local_wsrep_seqno = WSREP_SEQNO_UNDEFINED;
sql/wsrep_sst.cc: wsrep_seqno_t ret_seqno= WSREP_SEQNO_UNDEFINED; // seqno of complete SST
sql/wsrep_mysqld.cc:long long wsrep_cluster_conf_id = WSREP_SEQNO_UNDEFINED;
sql/wsrep_mysqld.cc:wsrep_seqno_t local_seqno = WSREP_SEQNO_UNDEFINED;
wsrep-lib/wsrep-API/v26/wsrep_api.h: undefined GTID: WSREP_UUID_UNDEFINED:WSREP_SEQNO_UNDEFINED.
wsrep-lib/wsrep-API/v26/examples/node/store.c: struct record const record = { WSREP_SEQNO_UNDEFINED, i };
wsrep-lib/wsrep-API/v26/examples/node/store.c: bool const initialization = WSREP_SEQNO_UNDEFINED == store->gtid.seqno &&
wsrep-lib/wsrep-API/v26/examples/node/wsrep.c: .state_id = {{{ 0, }}, WSREP_SEQNO_UNDEFINED },
wsrep-lib/wsrep-API/v26/examples/listener.c: wsrep_gtid_t state_id = { WSREP_UUID_UNDEFINED, WSREP_SEQNO_UNDEFINED };
wsrep-lib/include/wsrep/provider.hpp: or WSREP_SEQNO_UNDEFINED if the victim was not ordered
And it already happens right after the bootstrap, before the first join...
conf_id = 0,
conf_id = 1,
And I think earlier it started with 1 and not 0!
Interestingly, we found out in this week's Galera training that it does NOT always happen, just sometimes. On my Ubuntu 18.04 with MariaDB 10.6 Galera Cluster and on MySQL 8.0 Galera Cluster I cannot see it right now. We also did not see it yesterday on Oracle Linux 8 with MariaDB 10.6 Galera Cluster. But I have already seen it many times in different set-ups. So I have to think about a reproducible test case... Please let me know if you know what is wrong so I do not waste my time.
A fix will be available with the next release
8.0.27-26.9
mysql> show global status like 'wsrep_cluster_conf_id';
+-----------------------+----------------------+
| Variable_name         | Value                |
+-----------------------+----------------------+
| wsrep_cluster_conf_id | 18446744073709551615 |
+-----------------------+----------------------+
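Until a fixed release is installed, a monitoring script could at least recognise this value instead of treating it as a real configuration ID. A rough sketch, assuming the status value arrives as a string; `parse_conf_id` and its return codes are hypothetical and not part of MariaDB, MySQL or Galera:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* As quoted from wsrep-lib/wsrep-API/v26/wsrep_api.h. */
#define WSREP_SEQNO_UNDEFINED (-1)

/* Hypothetical helper: parse a wsrep_cluster_conf_id status string.
   Returns  0 and stores the ID in *out for a normal value,
            1 if the string is the undefined marker, either as "-1"
              or as -1 printed as an unsigned 64-bit number,
           -1 if the string is not a usable number. */
int parse_conf_id(const char *s, long long *out)
{
    char *end = NULL;
    unsigned long long u;

    errno = 0;
    u = strtoull(s, &end, 10);
    if (end == s || *end != '\0' || errno == ERANGE)
        return -1;

    /* strtoull negates "-1" in the unsigned type, so both spellings of
       the marker end up as the maximum unsigned value here. */
    if (u == (unsigned long long)WSREP_SEQNO_UNDEFINED)
        return 1;

    /* Anything else above LLONG_MAX cannot be a valid signed ID. */
    if (u > (unsigned long long)LLONG_MAX)
        return -1;

    *out = (long long)u;
    return 0;
}
```

With the output above, `parse_conf_id("18446744073709551615", &id)` returns 1, while a value such as `"8"` returns 0 and sets `id` to 8.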
From the MySQL Error Log we see the correct values: conf_id = 8,
This is sad/critical because cluster_conf_id is the only reliable source where we can see nodes bouncing here and there...