vitessio / vitess

Vitess is a database clustering system for horizontal scaling of MySQL.
http://vitess.io
Apache License 2.0

Access denied on SHOW ALL SLAVES STATUS using vitess 7.0.1, MariaDB 10.3 on kubernetes #6697

Closed bnu0 closed 1 year ago

bnu0 commented 4 years ago

Hello, I am wondering if anyone has seen something similar:

I am getting occasional errors (roughly every 45 seconds) on replica tablets that cause them to go into a not-serving state.

I am running Vitess on k8s with the PlanetScale operator, using Mariadb103Compatible: "vitess/lite:mariadb103", the default init_db.sql, etc. Everything seems to be working fine, except that replica tablets occasionally mark themselves unhealthy for a couple of seconds with the following error:

not serving: Access denied; you need (at least one of) the SUPER, REPLICATION CLIENT privilege(s) 
  for this operation (errno 1227) (sqlstate 42000) during query: SHOW ALL SLAVES STATUS

But they quickly mark themselves healthy again.

I have confirmed that vt_dba has no issues whatsoever issuing the command:

$ mysql -S /vt/socket/mysql.sock -u vt_dba -e 'SHOW ALL SLAVES STATUS\G' | grep State
               Slave_SQL_State: Slave has read all relay log; waiting for the slave I/O thread to update it
                Slave_IO_State: Waiting for master to send event
       Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it

And we see the following in the vttablet logs:

I0910 20:17:02.694655       1 state_manager.go:593] Replication is healthy
W0910 20:17:42.740716       1 rpc_server.go:84] TabletManager.SlaveStatus()(on ams-4193379671 from ) error: Access denied; you need (at least one of) the SUPER, REPLICATION CLIENT privilege(s) for this operation (errno 1227) (sqlstate 42000) during query: SHOW ALL SLAVES STATUS
W0910 20:20:01.429449       1 rpc_server.go:84] TabletManager.SlaveStatus()(on ams-4193379671 from ) error: Access denied; you need (at least one of) the SUPER, REPLICATION CLIENT privilege(s) for this operation (errno 1227) (sqlstate 42000) during query: SHOW ALL SLAVES STATUS
I0910 20:20:47.719015       1 state_manager.go:582] Going unhealthy due to replication error: Access denied; you need (at least one of) the SUPER, REPLICATION CLIENT privilege(s) for this operation (errno 1227) (sqlstate 42000) during query: SHOW ALL SLAVES STATUS
I0910 20:20:52.719560       1 state_manager.go:593] Replication is healthy

And from the Vitess code, it certainly looks like the dbaPool is used to issue this command, so I cannot figure out why I would get this error.

Binary version

I am using the vitess/lite:7.0.1 images:

$ /vt/bin/vttablet --version
ERROR: logging before flag.Parse: E0910 19:40:12.182770      26 syslogger.go:122] can't connect to syslog
Version: 19c92a5ea (Git branch 'heads/v7.0.1') built on Tue Aug 25 20:38:42 UTC 2020 by vitess@1299f851bd0c using go1.13.9 linux/amd64

MariaDB is mysql Ver 15.1 Distrib 10.3.24-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2

Operating system and Environment details

bnu0 commented 4 years ago

Note: this does NOT appear to happen using Mysql56Compatible: "vitess/lite:v7.0.1", instead of mariadb103

bnu0 commented 4 years ago

I think I finally figured this out: it only happens when the lowest-numbered vt_dba connection issues the query (i.e. when that connection is the one chosen from the pool). KILL-ing that connection to force it to pick up the new privileges seems to fix it, which suggests a race condition. (My init_db.sql ends with FLUSH PRIVILEGES;)
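For anyone who wants to repeat this check by hand, here is a minimal sketch. It assumes the socket path and the vt_dba user shown earlier in this issue. The function names and the MYSQL variable are made up for illustration and are not part of Vitess.

```shell
# Hypothetical helpers, not part of Vitess. MYSQL can be overridden; the
# default uses the socket path and user shown earlier in this issue.
MYSQL="${MYSQL:-mysql -S /vt/socket/mysql.sock -u vt_dba}"

# Print the IDs of all vt_dba sessions, oldest (lowest ID) first.
# The session running this query is excluded via CONNECTION_ID().
list_vt_dba_ids() {
  $MYSQL -N -B -e "SELECT ID FROM information_schema.PROCESSLIST WHERE USER = 'vt_dba' AND ID <> CONNECTION_ID() ORDER BY ID"
}

# Kill the lowest-numbered vt_dba session.
kill_oldest_vt_dba() {
  oldest_id="$(list_vt_dba_ids | head -n 1)"
  [ -n "$oldest_id" ] || { echo "no vt_dba sessions found" >&2; return 1; }
  $MYSQL -e "KILL CONNECTION $oldest_id"
}
```

Running list_vt_dba_ids before and after kill_oldest_vt_dba is one way to see whether the lowest ID has gone away. Whether the errors then stop is something to confirm in the vttablet logs.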

sougou commented 4 years ago

Nice catch! Thanks for chasing this down. One of us will take a look.

bnu0 commented 4 years ago

If anyone hits this issue and needs a workaround, you can add

KILL CONNECTION USER vt_dba;

to the end of your init_db.sql (after FLUSH PRIVILEGES).
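If you maintain your own copy of init_db.sql, the edit can be scripted. The helper below is only a sketch: the function name is hypothetical, and it assumes the file already ends with FLUSH PRIVILEGES as described above. It appends the statement once and does nothing if the line is already there.

```shell
# Hypothetical helper: append the workaround to an init_db.sql file once.
append_kill_workaround() {
  init_file="$1"
  stmt='KILL CONNECTION USER vt_dba;'
  # -F: fixed string, -x: match the whole line, -q: no output
  if grep -Fxq "$stmt" "$init_file"; then
    return 0  # already present, nothing to do
  fi
  # Leading newline guards against a file with no trailing newline.
  printf '\n%s\n' "$stmt" >> "$init_file"
}
```

For example, append_kill_workaround ./init_db.sql changes a local copy in place. Getting the modified file into the cluster depends on your deployment and is not covered here.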

I think what is happening is that vttablet establishes a vt_dba connection before the init script completes. That connection succeeds using the built-in anonymous grants (''@localhost?) but of course cannot do anything meaningful with them, and the FLUSH does not affect already-connected sessions.

mattlord commented 1 year ago

I'm going to close this for now, as it's not clear what issue there is to address here and MariaDB is no longer supported in Vitess 14.0 and later (17.0 is the latest GA as of today). If we ever do add MariaDB back as a fully supported database, there will be a lot of work needed to get there, and fixing the underlying causes here would be part of it, if they still exist at that time.