sysown / proxysql

High-performance MySQL proxy with a GPL license.
http://www.proxysql.com
GNU General Public License v3.0

SET NAMES is sent to master #1422

Closed · 7c closed this issue 5 years ago

7c commented 6 years ago

Hi, I run v1.4.6 on Ubuntu. My setup is classic read-write splitting between 1 master and 5 slaves, and all slaves have the same weight.

I have read the threads here, and as I understood it, ProxySQL handles SET NAMES statements internally and does not send them to the master. But my master's log says they are indeed being sent.

+------------+-------------------+-----------+
| count_star | digest_text       | hostgroup |
+------------+-------------------+-----------+
| 7065       | SET NAMES utf8mb4 | 1         |
| 137724     | SET NAMES utf8mb4 | 1         |
| 405280     | SET NAMES utf8    | 1         |
+------------+-------------------+-----------+
==> mysql.log <==
        6011262 Query   SET NAMES utf8mb4
        5796735 Query   SET NAMES utf8mb4
        6402960 Query   SET NAMES utf8mb4
        6337039 Query   SHOW GLOBAL VARIABLES LIKE 'read_only'
        6416231 Query   SHOW GLOBAL VARIABLES LIKE 'read_only'
        6400764 Query   SET NAMES utf8mb4
        6316724 Query   SHOW GLOBAL VARIABLES LIKE 'read_only'

Hostgroup 1 is the master. I have also tried to set a query rule so that SET NAMES is sent to the slaves, but without success.

Do you have any recommendations to resolve this issue, please?

renecannao commented 6 years ago

proxysql does handle SET NAMES statements internally and does not send to master

Let me explain this better. When a client issues a SET NAMES statement, ProxySQL does not send it to any backend but tracks it internally: up to this point, this is absolutely correct! However, when ProxySQL needs to execute a query, it ensures that the charset/collation of the backend connection is what the client expects. Therefore, if the client expects utf8mb4 and the backend connection has charset utf8, ProxySQL will execute SET NAMES utf8mb4 before running the query. Note that this doesn't mean that ProxySQL will execute SET NAMES on the master, but rather that it will execute it on any connection whose charset is not what the client expects.
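A minimal sketch of the behaviour described above (not ProxySQL's actual code, just an illustration): the proxy records the charset the client asked for, swallows the SET NAMES itself, and only issues its own SET NAMES on a backend connection whose charset differs from the client's expectation.

```python
# Illustrative model: client SET NAMES is tracked, not forwarded;
# the backend is reconciled lazily, right before a real query runs.

class BackendConnection:
    def __init__(self, charset="utf8"):
        self.charset = charset
        self.sent = []  # statements actually sent to this backend

    def execute(self, stmt):
        self.sent.append(stmt)

class ProxySession:
    def __init__(self):
        self.client_charset = "utf8"  # what the client expects

    def handle(self, stmt, backend):
        if stmt.lstrip().upper().startswith("SET NAMES"):
            # track the client's expectation; send nothing to the backend
            self.client_charset = stmt.split()[-1]
            return
        # before a real query, make the backend match the client
        if backend.charset != self.client_charset:
            backend.execute(f"SET NAMES {self.client_charset}")
            backend.charset = self.client_charset
        backend.execute(stmt)

session = ProxySession()
conn = BackendConnection(charset="utf8")
session.handle("SET NAMES utf8mb4", conn)  # tracked only, nothing sent
session.handle("SELECT 1", conn)           # backend reconciled first
print(conn.sent)  # ['SET NAMES utf8mb4', 'SELECT 1']
```

The second query on the same connection triggers no further SET NAMES, because the backend charset already matches.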

In Admin you can run this query to determine how many SET NAMES statements were executed by the clients, and how many by ProxySQL:

SELECT * FROM stats_mysql_global WHERE variable_name LIKE '%names%';

Also note that by default ProxySQL creates connections using the charset defined in the global variable mysql-default_charset. The default is utf8, but if your application mostly uses a different charset (for example utf8mb4), changing the value of mysql-default_charset will reduce the number of SET NAMES statements executed by ProxySQL.
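For example, following ProxySQL's usual admin workflow, the default charset could be changed like this (run against the admin interface, e.g. port 6032):

```sql
UPDATE global_variables SET variable_value='utf8mb4'
 WHERE variable_name='mysql-default_charset';
LOAD MYSQL VARIABLES TO RUNTIME;
SAVE MYSQL VARIABLES TO DISK;
```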

7c commented 6 years ago

Hi Rene,

I have run some more tests and ngrepped the whole connection from a specific IP (proxysql) towards the MASTER; see this screencast:

https://www.screencast.com/t/XvFDh1Op7

As you see, proxysql indeed executes ONLY "SET NAMES" on the backend, without executing any SELECT or UPDATE commands. Is that expected behaviour? I have stopped the PHP scripts which run these SET commands against the proxysql server, and the queries indeed stop. So my PHP scripts are sending these commands to 127.0.0.1:6033 (proxysql), and proxysql seems to forward them.

ProxySQLAdmin >SELECT * FROM stats_mysql_global WHERE variable_name LIKE '%names%';
+------------------------+----------------+
| Variable_Name          | Variable_Value |
+------------------------+----------------+
| Com_backend_set_names  | 0              |
| Com_frontend_set_names | 301            |
+------------------------+----------------+
ProxySQLAdmin >select * from global_variables where variable_name like '%char%';
+-----------------------+----------------+
| variable_name         | variable_value |
+-----------------------+----------------+
| mysql-default_charset | utf8           |
+-----------------------+----------------+
renecannao commented 6 years ago

Re-opening to investigate further. Can you please provide a tcpdump file where all traffic is logged? Client to proxysql, and proxysql to mysql, in both directions. Thanks

7c commented 6 years ago

Rene, this is a production environment, so I am not allowed to take a tcpdump. That's why I have already started to set up an isolated environment, but it might take a while.

In the meantime I have run one more test. I have stopped all traffic except from my IP towards our service/proxysql on localhost, and isolated the query which causes this as

R::find('servers')

which uses RedBeanPHP and reads all servers. My query rules are basic:

ProxySQLAdmin >select * from runtime_mysql_query_rules\G
*************************** 1. row ***************************
              rule_id: 1
               active: 1
             username: NULL
           schemaname: NULL
               flagIN: 0
          client_addr: NULL
           proxy_addr: NULL
           proxy_port: NULL
               digest: NULL
         match_digest: NULL
        match_pattern: ^SELECT .* FOR UPDATE$
 negate_match_pattern: 0
         re_modifiers: CASELESS
              flagOUT: NULL
      replace_pattern: NULL
destination_hostgroup: 1
            cache_ttl: NULL
            reconnect: NULL
              timeout: NULL
              retries: NULL
                delay: NULL
    next_query_flagIN: NULL
       mirror_flagOUT: NULL
     mirror_hostgroup: NULL
            error_msg: NULL
               OK_msg: NULL
          sticky_conn: NULL
            multiplex: NULL
                  log: NULL
                apply: 1
              comment: NULL
*************************** 2. row ***************************
              rule_id: 15
               active: 1
             username: shared
           schemaname: NULL
               flagIN: 0
          client_addr: NULL
           proxy_addr: NULL
           proxy_port: NULL
               digest: NULL
         match_digest: NULL
        match_pattern: ^SELECT.*from.*(shared.)?(ipbans|locale)
 negate_match_pattern: 0
         re_modifiers: CASELESS
              flagOUT: NULL
      replace_pattern: NULL
destination_hostgroup: 2
            cache_ttl: 900000
            reconnect: NULL
              timeout: NULL
              retries: NULL
                delay: NULL
    next_query_flagIN: NULL
       mirror_flagOUT: NULL
     mirror_hostgroup: NULL
            error_msg: NULL
               OK_msg: NULL
          sticky_conn: NULL
            multiplex: NULL
                  log: NULL
                apply: 1
              comment: NULL
*************************** 3. row ***************************
              rule_id: 16
               active: 1
             username: shared
           schemaname: NULL
               flagIN: 0
          client_addr: NULL
           proxy_addr: NULL
           proxy_port: NULL
               digest: 0xA05A02E89D209E9E
         match_digest: NULL
        match_pattern: NULL
 negate_match_pattern: 0
         re_modifiers: CASELESS
              flagOUT: NULL
      replace_pattern: NULL
destination_hostgroup: 2
            cache_ttl: 900000
            reconnect: NULL
              timeout: NULL
              retries: NULL
                delay: NULL
    next_query_flagIN: NULL
       mirror_flagOUT: NULL
     mirror_hostgroup: NULL
            error_msg: NULL
               OK_msg: NULL
          sticky_conn: NULL
            multiplex: NULL
                  log: NULL
                apply: 1
              comment: NULL
*************************** 4. row ***************************
              rule_id: 17
               active: 1
             username: shared
           schemaname: NULL
               flagIN: 0
          client_addr: NULL
           proxy_addr: NULL
           proxy_port: NULL
               digest: 0xA27FC08C4041A412
         match_digest: NULL
        match_pattern: NULL
 negate_match_pattern: 0
         re_modifiers: CASELESS
              flagOUT: NULL
      replace_pattern: NULL
destination_hostgroup: 2
            cache_ttl: 900000
            reconnect: NULL
              timeout: NULL
              retries: NULL
                delay: NULL
    next_query_flagIN: NULL
       mirror_flagOUT: NULL
     mirror_hostgroup: NULL
            error_msg: NULL
               OK_msg: NULL
          sticky_conn: NULL
            multiplex: NULL
                  log: NULL
                apply: 1
              comment: NULL
*************************** 5. row ***************************
              rule_id: 18
               active: 1
             username: NULL
           schemaname: NULL
               flagIN: 0
          client_addr: NULL
           proxy_addr: NULL
           proxy_port: NULL
               digest: 0xFF1EECBE53A3260F
         match_digest: NULL
        match_pattern: NULL
 negate_match_pattern: 0
         re_modifiers: CASELESS
              flagOUT: NULL
      replace_pattern: NULL
destination_hostgroup: 2
            cache_ttl: 60000
            reconnect: NULL
              timeout: NULL
              retries: NULL
                delay: NULL
    next_query_flagIN: NULL
       mirror_flagOUT: NULL
     mirror_hostgroup: NULL
            error_msg: NULL
               OK_msg: NULL
          sticky_conn: NULL
            multiplex: NULL
                  log: NULL
                apply: 1
              comment: NULL
*************************** 6. row ***************************
              rule_id: 19
               active: 1
             username: shared
           schemaname: NULL
               flagIN: 0
          client_addr: NULL
           proxy_addr: NULL
           proxy_port: NULL
               digest: NULL
         match_digest: NULL
        match_pattern: ^SELECT.*from.*(shared.)?bancheck
 negate_match_pattern: 0
         re_modifiers: CASELESS
              flagOUT: NULL
      replace_pattern: NULL
destination_hostgroup: 2
            cache_ttl: 300000
            reconnect: NULL
              timeout: NULL
              retries: NULL
                delay: NULL
    next_query_flagIN: NULL
       mirror_flagOUT: NULL
     mirror_hostgroup: NULL
            error_msg: NULL
               OK_msg: NULL
          sticky_conn: NULL
            multiplex: NULL
                  log: NULL
                apply: 1
              comment: NULL
*************************** 7. row ***************************
              rule_id: 100
               active: 1
             username: NULL
           schemaname: NULL
               flagIN: 0
          client_addr: NULL
           proxy_addr: NULL
           proxy_port: NULL
               digest: NULL
         match_digest: NULL
        match_pattern: ^SELECT
 negate_match_pattern: 0
         re_modifiers: CASELESS
              flagOUT: NULL
      replace_pattern: NULL
destination_hostgroup: 2
            cache_ttl: 10000
            reconnect: NULL
              timeout: NULL
              retries: NULL
                delay: NULL
    next_query_flagIN: NULL
       mirror_flagOUT: NULL
     mirror_hostgroup: NULL
            error_msg: NULL
               OK_msg: NULL
          sticky_conn: NULL
            multiplex: NULL
                  log: NULL
                apply: 1
              comment: NULL
7 rows in set (0.00 sec)

From here this query is expanded by RedBean as "SELECT servers.* FROM servers -- keep-cache" (see https://www.screencast.com/t/VrHLACv35R).

But as seen at https://www.screencast.com/t/VrHLACv35R, this query causes SET NAMES to be sent to the MASTER and the SELECT to be sent to SLAVE2. I can visually see that whenever I execute test.php, it sends SET NAMES to the master and the SELECT to a slave. This is an isolated test on a live system.

Do you have any suggestions on what I might look at to give you more feedback?

7c commented 6 years ago

I have recorded a full session for you:

https://www.screencast.com/t/du3mRMg04f7j

Here you can see how a single SELECT sends SET NAMES to the master and the SELECT to the slaves. By default all SELECTs are cached for 10 seconds.

renecannao commented 6 years ago

Since this is an isolated system, can you please share the output of these commands?

SELECT * FROM stats_mysql_connection_pool;
SELECT * FROM stats_mysql_query_digest;
SELECT * FROM stats_mysql_global;
SELECT * FROM global_variables WHERE variable_name NOT LIKE '%password%' AND variable_name NOT LIKE '%credentials%';

Can you also share test.php without credentials?

Thanks

7c commented 6 years ago

Rene,

At last, I managed to replicate the issue on another set of servers. On the first set, all the SET NAMES statements went to the MASTER because some of the users had default_hostgroup=1. ProxySQL was doing a great job splitting the queries, but in the end the SET NAMES statements went to the default hostgroup. I have changed default_hostgroup=2 (the slaves) and now the SET NAMES statements are split among the slaves. This situation is much better because the master is no longer busy with them, but it is of course still not the expected behaviour.
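For anyone hitting the same symptom: the per-user default hostgroup is set in the mysql_users table on the admin interface. A sketch of the change described above, using the hypothetical username 'shared' from the query rules:

```sql
UPDATE mysql_users SET default_hostgroup=2 WHERE username='shared';
LOAD MYSQL USERS TO RUNTIME;
SAVE MYSQL USERS TO DISK;
```

Any statement that matches no query rule is routed to the user's default_hostgroup, which is why the unrecognized SET NAMES followed that setting.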

I managed to set up a demo infrastructure for you and emailed you the credentials, so you can check it live yourself. I hope this is OK for you; I thought it would save you time.

The demonstration code is:

<?php
require_once __DIR__."/vendor/autoload.php";
class_alias('\RedBeanPHP\R','\R');
R::addDatabase('REMOTE','mysql:host=127.0.0.1;port=6033;dbname=test','sqladmin','xxxxx',true);
R::selectDatabase('REMOTE');
R::debug(TRUE,2);
//R::exec('SET NAMES utf8mb4');
var_dump(R::getDatabaseAdapter()->getDatabase()->getMysqlEncoding());
R::find('servers');

$ cat composer.json
{
    "require": {
        "gabordemooij/redbean": "^4.3"
    }
}

I have a 10-second cache_ttl for generic SELECTs. I run

while true; do sleep 1; php index2.php; done

so the test runs every second. Then I see the following output:

https://www.screencast.com/t/zBtguXoP

As you see here, proxysql does cache the query for 10 seconds and then sends it to one of the slaves. So far so good, but a SET NAMES is sent to one of the slaves every second. THIS is (I assume) not expected behaviour.

Thanks for listening.

renecannao commented 6 years ago

For reference to others: this issue has been investigated further. The problem is that RedBean prefixes SET NAMES with a space, and ProxySQL fails to recognize it as a SET NAMES command.

renecannao commented 5 years ago

This is now fixed in 2.0.x