sysown / proxysql

High-performance MySQL proxy with a GPL license.
http://www.proxysql.com
GNU General Public License v3.0
5.95k stars 969 forks source link

The parametre mysql-threads cause performance degradation. #3615

Open DBA-Ray opened 3 years ago

DBA-Ray commented 3 years ago

General statistics: total time: 180.0912s total number of events: 80742751

Latency (ms): min: 0.15 avg: 1.18 max: 1726.28 99th percentile: 7.04 sum: 94971062.48

The CPU status,you need notice the system usage rate.

%Cpu(s): 8.6 us, 7.3 sy, 0.0 ni, 84.1 id, 0.0 wa, 0.0 hi, 0.1 si, 0.0 st

When mysql-threads=32,the result of pressure test as follows:

SQL statistics: queries performed: read: 66895773 write: 0 other: 0 total: 66895773 transactions: 66895773 (371480.88 per sec.) queries: 66895773 (371480.88 per sec.) ignored errors: 0 (0.00 per sec.) reconnects: 0 (0.00 per sec.)

General statistics: total time: 180.0772s total number of events: 66895773

Latency (ms): min: 0.09 avg: 1.42 max: 237.02 99th percentile: 6.21 sum: 94983105.55

Threads fairness: events (avg/stddev): 126696.5398/28210.97 execution time (avg/stddev): 179.8922/0.03

The CPU status,you need notice the system usage rate. %Cpu(s): 8.8 us, 15.6 sy, 0.0 ni, 74.6 id, 0.0 wa, 0.0 hi, 1.0 si, 0.0 st

In summary,it causes performance degradation and the system usage obviously increase.Why i say the "mysql-threads" to be greater than 16,because i tested a lot with many parameter values. Once it exceeds 16, the performance will decrease.

renecannao commented 3 years ago

Hi Ray,

In order to better understand your workload, what sysbench options are you using?

DBA-Ray commented 3 years ago

Hi sysbench /usr/share/sysbench/tests/include/oltp_legacy/select.lua --oltp-tables-count=50 --oltp-table-size=20000000 --threads=528 --warmup_time=180 --time=180 --report-interval=1 --percentile=99 --histogram=on run

renecannao commented 3 years ago

Thank you Ray.

A couple more questions: a) sysbench/proxysql/mysql are all running on its own dedicated server? (3 servers) b) are you sure that the bottleneck is proxysql? from the output provided so far, proxysql doesn't seem to be fully utilizing the number of cores that you assigned to it.

DBA-Ray commented 3 years ago

You're welcome. a) They are all running on its own dedicated server.Do not share a server. b) Yes.Because every test i only modif the mysql-threads.I analyzed the load of each server and found that only PROXYSQL's CPU SYSTEM usage rate is too high, and it increases with the increase of the mysql-threads.

Initially, I thought that setting the number of CPU cores to maximize performance, but the results surprised me, which is much lower than 16 threads.The number 16 is the default value of the PROXYSQL mysql-threads that I deploy.

renecannao commented 3 years ago

a) ok, perfect.

b) sysbench is creating 528 connections, that means that with mysql-threads=16 , each worker is processing 33 active connections (on average). With mysql-threads=32 , each worker is processing 16.5 active connections (on average).

From the output of sysbench you can see that with mysql-threads=16 you have higher throughput but slightly worse latency. With mysql-threads=32 there is indeed a drop in throughput, but latency is slightly better. I think it all makes sense: with more workers available each worker reacts faster to any network event (thus less latency). But is also performs less work every time a network event happens, thus causing a lot of context switching.

It is even possible that reducing the number of workers can increase throughput further, at the expenses of latency.

Furthermore, I would also test what happens with an increased number of sysbench connections: instead of 528, you can multiply it by 10 and use 5280 connections, and see how mysql-threads=32 compares to mysql-threads=16.

Finally, please make sure you are running benchmark with any power saving / cpu throttling disabled: these combined with kernel scheduling policy can result in very odd benchmark results.

DBA-Ray commented 3 years ago

In the case of multiple concurrency,I found that the CPU user usage of each machine can be relatively high,only PROXYSQL has not improved significantly, but CPU system usage has improved significantly.

In the above stress test, the CPU user utilization rate of MySQL is less than 20%, which has not yet reached the bottleneck.As follows:

%Cpu(s): 16.8 us, 3.9 sy, 0.0 ni, 76.9 id, 0.0 wa, 0.0 hi, 2.3 si, 0.0 st

Memory usage info:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
39542 mysql 20 0 835.4g 329.1g 23532 S 2578 34.1 2455:27 mysqld

There is no pressure on the disk because it is all cached.

avg-cpu: %user %nice %system %iowait %steal %idle 0.07 0.00 0.02 0.00 0.00 99.91

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util vda 0.00 0.43 0.02 0.53 0.34 4.82 18.81 0.00 0.15 0.32 0.14 0.35 0.02 vdb 0.00 0.02 0.01 12.83 1.51 890.89 139.01 0.03 2.70 1.24 2.70 0.15 0.19

avg-cpu: %user %nice %system %iowait %steal %idle 18.00 0.00 6.92 0.00 0.06 75.02

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util vda 0.00 0.00 1.00 0.00 8.00 0.00 16.00 0.00 0.00 0.00 0.00 1.00 0.10 vdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

DBA-Ray commented 3 years ago

Hi I confirmed that the CPU power saving mode is turned off. Using 5280 thread sysbench will report an error.As follows: sysbench 1.0.17 (using system LuaJIT 2.0.4)

Running the test with following options: Number of threads: 5280 Report intermediate results every 2 second(s) Initializing random number generator from current time

Initializing worker threads...

FATAL: /usr/share/sysbench/oltp_point_select.lua:22: module 'oltp_common' not found: no field package.preload['oltp_common'] no file './oltp_common.lua' no file './oltp_common/init.lua' no file './src/lua/oltp_common.lua' no file '/root/.luarocks/share/lua/5.1/oltp_common.lua' no file '/root/.luarocks/share/lua/5.1/oltp_common/init.lua' no file '/root/.luarocks/share/lua/oltp_common.lua' no file '/root/.luarocks/share/lua/oltp_common/init.lua' no file '/usr/local/share/lua/5.1/oltp_common.lua' no file '/usr/share/lua/5.1/oltp_common.lua' no file '/usr/share/sysbench/oltp_common.lua' no file './oltp_common.so' no file '/root/.luarocks/lib/lua/5.1/oltp_common.so' no file '/root/.luarocks/lib/lua/oltp_common.so' no file '/usr/local/lib/lua/5.1/oltp_common.so' no file '/usr/lib/lua/5.1/oltp_common.so' no file '/usr/lib64/sysbench' (last message repeated 1 times)

So I think the core problem is the inability to make full use of multithreading.And the pressure directly on MYSQL has a big gap with the QPS through PROXYSQL. I know that there will be some loss in network forwarding, but what do you think is the loss?

Through PROXYSQL: queries: 81488627 (452512.09 per sec.)

Direct access to MYSQL: queries: 148187205 (822895.79 per sec.)