mirromutth / r2dbc-mysql

R2DBC MySQL Implementation
Apache License 2.0
656 stars 100 forks source link

issue when query canceled #73

Closed ddyycao closed 4 years ago

ddyycao commented 4 years ago

latest version of r2dbc-mysql , spring-data-r2dbc and spring-boot

@GetMapping("/test2")
    fun test2(): Mono<Long> {
        // long running query
        val a=db.execute("select userid from xxx order by id limit 1000000,1 ").asType<Long>().fetch().one()
        // wrong query
        val b=db.execute("select userid from xxx order by id limit1 ").asType<Long>().fetch().one()
        return  a.zipWith(b).map { it.t1+it.t2 }
    }

the failed wrong query will cancel the other running query

error log

2019-10-12 16:22:06.299 ERROR 15446 --- [actor-tcp-nio-8] d.m.r.mysql.client.ReactorNettyClient    : Exchange cancelled while exchange is active. This is likely a bug leading to unpredictable outcome.
2019-10-12 16:22:06.317 ERROR 15446 --- [actor-tcp-nio-8] a.w.r.e.AbstractErrorWebExceptionHandler : [cdf61f87] 500 Server Error for HTTP GET "/test2"

and it stop responding after 5 calls

mirromutth commented 4 years ago

The javadoc of zipWith:

An error or empty completion of any source will cause the other source to be cancelled and the resulting Mono to immediately error or complete, respectively.

So if second query has been failed, then first query will be canceled by MonoZip, and have no consumer to drain Result.

I need try drain/consume message packets in QueryFlow instead of Result.

ddyycao commented 4 years ago

I managed to fix this issue by providing a validation-query may be dev.miku.r2dbc.mysql.MySqlConnection#validate should check if there is a running query

mirromutth commented 4 years ago

Hi there,

It should be solved by #74 , and I have published this change to BUILD-SNAPSHOT.

I have created test case for this and have passed the unit test. Could your try again with latest 0.8.0.BUILD-SNAPSHOT? Thanks for your time.

ddyycao commented 4 years ago

awesome!

mirromutth commented 4 years ago

Thanks a lot, its target version is 0.8.0.RC2. If the problem still exists, please reopen this issue.

ddyycao commented 4 years ago

this fix create another issue, the connection with long running query return to connection pool immediately, subsequent request has to wait until the long running query completed. I want to cancel the long running query to protect mysql server Is there any way to actually cancel the long running query on the mysql server instead of waiting it

mirromutth commented 4 years ago

It seems that there is no way to solve this problem.

MySQL protocol seems like keep-alive HTTP/1.1, connection would be unavailable until last exchange completed.

And MySQL protocol has state changing before/after exchange complete, such as transaction flag, auto_commit flag, etc. So if we allow to send a query before last exchange complete, the statuses would be wrong way.

In general, not consuming Result is an issue on its own. It will also happen at #69 , when we query lots of rows but consume little.

About this problem, we maybe need more suggestions.

Excuse me, /cc @mp911de

mp911de commented 4 years ago

Not exactly sure whether I understood the problem:

  1. A Subscription of the query gets canceled
  2. OP wants to cancel the MySQL server-side query when the Subscription gets canceled
  3. Pooling and transaction cleanup play into this

Please let me know whether my understanding of the second issue is correct.

mirromutth commented 4 years ago

Thanks for the reply.

Yes, it is, and seems like pooling has been used. The sample like following:

// Pooling enabled
// PG: SELECT pg_sleep(60); MSSQL: WAITFOR DELAY '00:01'; MySQL: SELECT SLEEP(60);
Mono<Long> a = db.execute("SELECT SLEEP(60)").as(Long.class).fetch().one();
Mono<Long> b = db.execute("SELECT BAD GRAMMAR").as(Long.class).fetch().one();

consumeOrReturn(a.zipWith(b).map(it -> it.getT1() + it.getT2()));

The first query did not end but the connection will be returned to the pool, and the first result has been ignored by zipWith. In this case, zipWith will try to subscribe both but emit error by second result, so it want to cancel first query but the first statement has been sent, so the first Result will be ignored after a long time.

So this is why @ddyycao said:

Is there any way to actually cancel the long running query on the mysql server instead of waiting it

It is not clear whether this is a problem. In my opinion, this is normal behavior and no need to solve (or no way to solve this).

Thanks for your time.

ddyycao commented 4 years ago

Not exactly sure whether I understood the problem:

  1. A Subscription of the query gets canceled
  2. OP wants to cancel the MySQL server-side query when the Subscription gets canceled
  3. Pooling and transaction cleanup play into this

Please let me know whether my understanding of the second issue is correct.

your understanding is correct, and cancel the server-side query it is not supported in jdbc too. It will be awesome to cancel the server-side long running query to protect mysql server. For application server it have to wait every slow query complete, a single slow query may stuck the whole server.

The example I provide is just to trigger a cancelation in my real application the long running query was cancel by webflux timeout.

At least the connection with long runing query should not return to the pool immediately

mp911de commented 4 years ago

Thanks a lot. So @ddyycao please file a new ticket for the cancellation. Maybe KILL QUERY is good enough to do the job. We have similar requests for other drivers but these aren't the highest priority right now.

The second issue deals with the connection being busy. That is, cancel result consumption, then return the connection to the pool. Or on cancellation, a command is sent to the server to release a cursor, while the transaction manager issues a COMMIT.

We made the Postgres and the SQL server driver aware of the busy state as Reactive Streams imposes with the cancellation signal a limitation about synchronization. As soon as a subscription gets canceled, there's no possibility to await completion of the subscription. Everything that happens after cancelation is happening in the background.

The spec says:

A Connection object is not thread-safe (it cannot be shared across multiple threads that concurrently execute statements or change its state). A connection object can be shared across multiple threads that serially run operations by using appropriate synchronization mechanisms.

Cancelation does not allow for synchronization so this would need to happen within the driver. The Postgres protocol allows for a proper correlation of frames to a query command. SQL Server not so much. You might want to look into command queueing in Postgres, the entry point is exchange(…).

A side effect of command queueing is that the client enables pipelining semantics that helps in improving latency if a single connection is used by multiple processes. Pipelining is not an R2DBC requirement, but still nice to have. I suggest filing another ticket instead of reviving this one.