sysown / proxysql

High-performance MySQL proxy with a GPL license.
http://www.proxysql.com
GNU General Public License v3.0
5.96k stars 972 forks source link

"set sort_buffer_size" is not properly dealt with #1044

Open everpcpc opened 7 years ago

everpcpc commented 7 years ago

If I have a set sort_buffer_size=20000 or some other set statement other than what in lib/mysql_connection.cpp#L1499, user_variable is not set. But however, the is_select_NOT_for_update will be false then, making proxysql changing autocommit to false in the backend connection. Next, this connection will be put back to connection pool because mysql dose not treat this statement as a transaction. Therefore, we have a connection with autocommit=false in the connection pool.

Here the problem comes: If I have enforce_autocommit_on_reads=false, the autocommit will not be changed. The next select statement will start a transaction which is not expected.

Actually, this always happens if I make a connection with an initial statement of set sort_buffer_size=20000;

Maybe related to https://github.com/sysown/proxysql/issues/989 ?

renecannao commented 7 years ago

Thank you for the analysis: it seems pretty correct! I think the whole idea behind enforce_autocommit_on_reads has several limitations, like he one you just hit.

Some possible workaround are:

A generic solution is not that trivial. What solution would work for you? How would you like ProxySQL to handle this situation?

everpcpc commented 7 years ago

cc @MOON-CLJ maybe we could try solution 2 first to see if #989 is resolved ?

MOON-CLJ commented 7 years ago

@everpcpc okay, we can try it。

renecannao commented 7 years ago

Removing set sort_buffer_size is in my opinion the best approach right now, as disabling multiplexing for just that reason is perhaps too much.

MOON-CLJ commented 7 years ago

@renecannao yes, agree. we are just debugging #989

everpcpc commented 7 years ago

We are about to remove set sort_buffer_size once we can draw a conclusion with #989.

Beside, I think perhaps any statements starts with set, not only set sort_buffer_size or what defined in ProcessQueryAndSetStatusFlags, should be treated as user defined variables by default for safety. It's better to disable multiplexing rather than putting a connection with problem into the connection pool.

renecannao commented 7 years ago

We are about to remove set sort_buffer_size once we can draw a conclusion with #989.

Please let me know.

Beside, I think perhaps any statements starts with set, not only set sort_buffer_size or what defined in ProcessQueryAndSetStatusFlags, should be treated as user defined variables by default for safety. It's better to disable multiplexing rather than putting a connection with problem into the connection pool.

Putting a connection with problem in the connection pool is bad: I full agree with this. Disable multiplexing for every SET command is in some case like not having multiplexing at all. In fact, there are a lot of drivers that run set sort_buffer_size or set max_allowed_packet immediately after connecting: if you use such drivers, disabling multiplexing for every SET command will mean that multiplexing is always disabled. If you want to disable multiplexing completely, you could set global variable mysql-multiplexing=false .

In my opinion, what is needed is the ability to be selective choose when multiplexing should be disabled, and how to handle such cases. Issue #1045 can be a solution. Issue #594 can be a solution too , as mysql_query_rules.multiplex can now handle cases when mulltiplex needs to be disabled.

There are a lot of exceptions and corner cases, therefore I really appreciate all the reports so to be able to handle these cases correctly too.

Thanks

MOON-CLJ commented 7 years ago

@renecannao

thx let me explain our problem, we provide a table router like python library for other developer in our company,we can remove "set sort_buffer_size" in our library,but we dont have full control for other developer's code,#1045 can be a solution,but it put such overhead to every query。

so my opinion,safety is more important,disabling multiplexing can be warning by log or discovered by metrics。

everpcpc commented 7 years ago

What directly causing this trouble is that proxysql set autocommit=false for a special statement which backend dose not start a transaction. So I'm thinking about, if handler_again___verify_backend_autocommit can handle, or just skip those statements, then autocommit will be right.

Although there are other problems still existing with those special statements.

This may be a much smaller change.

renecannao commented 7 years ago

@MOON-CLJ : I agree that creating rules for every query can be an overhead. On the other hands, filtering queries can reduce network round trip, so maybe the end result is performance gain and not performance lost.

@everpcpc : I didn't understand what you mean, sorry. Can you please clarify.

MOON-CLJ commented 7 years ago

@renecannao

On the other hands, filtering queries can reduce network round trip, so maybe the end result is performance gain and not performance lost.

but in this situation, most of queries will not be filtered,just very very little "set .." will be filtered。

everpcpc commented 7 years ago

The problem we have here is:

  1. a set sort_buffer_size=20000 comes with enforce_autocommit_on_reads=false;
  2. is_select_NOT_for_update() == false -> proxysql forced change backend's autocommit to false;
  3. (mysql_thread___multiplexing && (myds->myconn->reusable==true) && myds->myconn->IsActiveTransaction()==false && myds->myconn->MultiplexDisabled()==false) == true -> proxysql push the connection with autocommit==false to pool;
  4. a select statement comes, getting that connection with autocommit==false while not changing it. (not expected)

We can either keep the autocommit unchanged (step2), or let the client keep this connection (step3) to handle this case.

renecannao commented 7 years ago

@MOON-CLJ : that's true. but it is also true that the overhead needs to be measured. If implementing #1045 adds an overhead of 10% , is this acceptable? surely no! An overhead of 0.1% : surely acceptable! So it needs to be measured.

@everpcpc : I think I am missing something. A connection in the connection pool with autocommit=false should not be considered a problem if there is no transaction. Does it make sense?

everpcpc commented 7 years ago

But when we set enforce_autocommit_on_reads=false, we are expecting that a select statement should not start a transaction. However, It always dose in this case if a set xxx comes before. Looking like multiplexing is disabled.

renecannao commented 7 years ago

That is right, it always starts a transaction while in reality shouldn't. I think this should be handled by enhancing #955 (not implemented yet), defining that a hostgroup should always have autocommit enabled. Do you still have autocommit=OFF in the database server, as reported in #873 ? If the database servers has autocommit=OFF things become even more complicated.

everpcpc commented 7 years ago

Then maybe it would be better with an option like enforce_autocommit_true_on_reads to ensure autocommit=true for a select not for update ? defining that a hostgroup should always have autocommit enabled requires read/write splitting, which is not that easy to do and has some other problems.

Besides, we do not use autocommit=OFF, #873 is just for testing.

renecannao commented 7 years ago

This sounds a like a reasonable feature request to fix this specific problem!

everpcpc commented 7 years ago

^_^ many thanks~

renecannao commented 7 years ago

enforce_autocommit_true_on_reads: once implemented, this variable should be evaluated only if a transaction is not started yet

renecannao commented 7 years ago

Issue #1045 is solved and available in future release 1.4.1