pingcap / tiproxy

Apache License 2.0

Performance gap between TiProxy and HAProxy grows as the dataset size grows #381

Closed djshow832 closed 11 months ago

djshow832 commented 11 months ago

Problem

When running sysbench, the performance gap between TiProxy and HAProxy grows as TiDB returns more data.

Test Result

  1. Create a TiDB cluster with HAProxy and TiProxy, each of which has 2 CPU cores.
  2. Run sysbench with `--tables=10 --table-size=1000000 --threads=32 oltp_read_only --skip_trx=true --point_selects=0 --sum_ranges=0 --order_ranges=0 --distinct_ranges=0 --simple_ranges=1 --range_size={range_size}`
  3. Check the QPS and CPU of TiProxy and HAProxy.
| Range size | QPS | HAProxy CPU | QPS per 100% CPU |
| --- | --- | --- | --- |
| 10 | 32480 | 120% | 27100 |
| 100 | 27462 | 140% | 19600 |
| 1000 | 7420 | 110% | 6740 |
| 10000 | 757 | 90% | 840 |

| Range size | QPS | TiProxy CPU | QPS per 100% CPU |
| --- | --- | --- | --- |
| 10 | 30955 | 180% | 17200 |
| 100 | 14655 | 190% | 7710 |
| 1000 | 2112 | 200% | 1060 |
| 10000 | 221 | 200% | 110 |

As the tables show, when the range size is 10, HAProxy's QPS per 100% CPU is less than twice TiProxy's. But when the range size is 10000, it is almost 8 times TiProxy's.

Reason

In the MySQL protocol, each row of a result set is wrapped in its own MySQL packet, and TiProxy reads and writes the stream packet by packet, while HAProxy forwards raw TCP bytes without parsing them.

Thus, TiProxy is more impacted by the row count.

Code Analysis

The flame graph when the range size is 1000: (image omitted)

`WritePacket` and `ReadPacket` become the hot path, so this code path should be optimized.

xhebox commented 11 months ago
  1. I believe this could be solved by processing the full MySQL packet (header included), instead of stripping the header and handling only the body.

1,3. Actor models like gnet typically have one global buffer per connection.

Maybe we should also check the results of tracing.

djshow832 commented 11 months ago
> 1. I believe this could be solved by processing the full MySQL packet (header included), instead of stripping the header and handling only the body.

How?

> 1,3. Actor models like gnet typically have one global buffer per connection.
>
> Maybe we should also check the results of tracing.

I checked it before, but it's not easy to fix now.

xhebox commented 11 months ago

> How?

Most packets are just forwarded without processing. The packets that are processed are special cases: for example, they never need more than one MySQL packet to represent, or they belong to the handshake process, etc.

That said, we could just forward the original MySQL packets as-is most of the time, instead of parsing them into higher-level packets.

djshow832 commented 11 months ago

> Most packets are just forwarded without processing. The packets that are processed are special cases: for example, they never need more than one MySQL packet to represent, or they belong to the handshake process, etc.
>
> That said, we could just forward the original MySQL packets as-is most of the time, instead of parsing them into higher-level packets.

But we still need to parse the leading bytes of each packet to know whether there are more packets to come. Once the data is read packet by packet, there is little room to optimize.

xhebox commented 11 months ago

> > That said, we could just forward the original MySQL packets as-is most of the time, instead of parsing them into higher-level packets.
>
> But we still need to parse the leading bytes of each packet to know whether there are more packets to come. Once the data is read packet by packet, there is little room to optimize.

Yes, so buffering is needed. We only peek the header and then write out the whole buffer at once. That is, however, easier to implement in actor models than in the current model.