go-mysql-org / go-mysql

a powerful mysql toolset with Go
MIT License
4.52k stars 965 forks source link

MySQL client: text vs binary protocol #896

Open vitalyisaev2 opened 1 week ago

vitalyisaev2 commented 1 week ago

I'd like to use go-mysql as a database client for my backend service that interacts with a MySQL server. The performance of go-mysql seems good, but I was wondering about the communication protocol between the driver and the MySQL server. I couldn't find any information about this in the documentation.

As we know, there are two communication protocols in MySQL: the MySQL Command Protocol, also known as the Text Protocol, and the MySQL Binary Protocol. The Text Protocol seems to be enabled by default, as shown in #770 (types like Date or Datetime are passed as strings by default). However, the Binary Protocol is usually considered as more efficient and faste, with smaller messages and less parsing overhead. It could be a good choice for big data applications.

Is it possible to use the Binary Protocol when interacting with the MySQL server using go-mysql?

dveeden commented 1 week ago

Both text and binary results are part of the same protocol. Binary is what is being used for prepared statements.

https://dev.mysql.com/doc/dev/mysql-server/latest/page_protocol_binary_resultset.html

vitalyisaev2 commented 1 week ago

@dveeden, thanks for your response! To make sure I understand you correctly, will I experience a performance boost if I use prepared statements instead of conn.Execute("select id, name from table where id = 1") (as in the example)? In my specific use case, there are no INSERT/UPDATE/DELETE requests, only SELECTs. But the amount of selected data is quite big (usually tens of millions of rows).

dveeden commented 1 week ago

https://github.com/go-mysql-org/go-mysql/blob/3b1dd0e7703815062f7999fd7d503df0061807a9/client/conn.go#L276-L289

So conn.Execute() uses prepared statements if there are multiple arguments and COM_QUERY (text protocol) otherwise.

  1. Personally I would verify what is going over the wire with Wireshark if this is important to you.
  2. A binary protocol might be faster, but prepared statements might need more roundtrips.
  3. Some servers (e.g. TiDB) can re-use plans from prepared statements which can improve performance.
  4. Benchmark all solutions, don't assume A or B is faster.
  5. A wireshark capture might show other inefficiencies (e.g. extra roundtrips during connection), which could be in go-mysql, your code or the server implementation.