cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com
Other
30.13k stars 3.81k forks source link

sqlproxy: does not take transaction state into account before attempting a session migration #131380

Open fqazi opened 1 month ago

fqazi commented 1 month ago

Currently the sqlproxyccl will attempt to transfer connections when they are in a safe state, which is determine by this logic: https://github.com/cockroachdb/cockroach/blob/d926d5eb2621672aa5fe4fed46a1457433e9fc27/pkg/ccl/sqlproxyccl/conn_migration.go#L300. This logic only looks for the ReadyForQuery state, which can happen within a portal.

When we are executing within a portal, not exec statements are allowed causing these transfers to fail at: https://github.com/cockroachdb/cockroach/blob/b82225e86794124ff914c6dac58cece32dde64c9/pkg/sql/pgwire/command_result.go#L655

This happens because Sync commands within a portal will also send by a ReadyForQuery state: https://github.com/cockroachdb/cockroach/blob/b82225e86794124ff914c6dac58cece32dde64c9/pkg/sql/pgwire/command_result.go#L634

To address this, we should either support observer commands within portals, at the very least SHOW TRANSFER STATE or we should modify the sqlproxyccl to be more precise and exclude cases where we are in middle of a txn.

Jira issue: CRDB-42525

jaylim-crl commented 1 month ago

Thanks for the issue.

To elaborate, one of the reasons we designated SHOW TRANSFER STATE as an observer statement was to allow it to execute even if a session is in a failed transaction state. In such cases, any subsequent statements or queries would typically be rejected until the transaction is aborted. By using an observer statement, we ensure that this particular command can run regardless of the transaction's status.

exclude cases where we are in middle of a txn.

To exclude scenarios where we're in the midst of a transaction, we might need to introduce a separate statement (like an observer statement) to check if a transaction is currently active on behalf of the proxy (which is what the SHOW TRANSFER STATE statement implicitly does), or we could reassess the criteria that indicate a safe transfer point:

    // Three conditions when evaluating a safe transfer point:
    //   1. The last message sent to the SQL pod was a Sync(S) or SimpleQuery(Q),
    //      and a ReadyForQuery(Z) has been received after.
    //   2. The last message sent to the SQL pod was a CopyDone(c), and a
    //      ReadyForQuery(Z) has been received after.
    //   3. The last message sent to the SQL pod was a CopyFail(f), and a
    //      ReadyForQuery(Z) has been received after.

I don't think we can exclude the S->Z transition, as that could occur even outside of a transaction.

rafiss commented 1 month ago

To exclude scenarios where we're in the midst of a transaction, we might need to introduce a separate statement (like an observer statement) to check if a transaction is currently active

sqlproxy already should be able to tell if a transaction is currently active. In this line: https://github.com/cockroachdb/cockroach/blob/b82225e86794124ff914c6dac58cece32dde64c9/pkg/sql/pgwire/command_result.go#L634

if the transaction is still open, CRDB will send back a InTxnBlock message (which is the byte T). The different transaction states are here: https://github.com/cockroachdb/cockroach/blob/bfbef5fc39be0f1200a82e1e1d14f9b03bf2c6e7/pkg/sql/conn_io.go#L47-L53

jaylim-crl commented 1 month ago

I looked at the RFC again. Initially, when we implemented connection migration in the proxy, we anticipated that transactions would eventually be migrated using the same mechanism, which is also the other rationale behind the observer statement. If that’s still the goal, then supporting observer statements in the database remains the right approach. Once transaction migration support is in place, no further changes will be needed in the proxy.

Additionally, attempting to parse the ReadyForQuery message would add extra overhead to every query. Currently, the forwarder only checks the message type through the header without trying to interpret the message: https://github.com/cockroachdb/cockroach/blob/53be53d8e0fd5771d924d50b800e9220e6f1def2/pkg/ccl/sqlproxyccl/forwarder.go#L504-L506

I don't believe we should make that change solely to tackle the portal issue, especially given its relatively low usage. The benefits don't seem to justify the performance degradation we would likely experience. Finally, my understanding is that portals function similarly to transactions, and I would expect that we would want to support observer statements for them as well.

rafiss commented 1 month ago

I looked at the RFC again. Initially, when we implemented connection migration in the proxy, we anticipated that transactions would eventually be migrated using the same mechanism, which is also the other rationale behind the observer statement.

Could you point out which section you're referring to? All I see relating to transactions is:

Connection migrations are on a best-effort basis as it is known that not every connection can be transferred (e.g. sessions with temporary tables or active transactions).


Additionally, attempting to parse the ReadyForQuery message would add extra overhead to every query. Currently, the forwarder only checks the message type through the header without trying to interpret the message

How much overhead? It would be reading an additional 5 bytes in the message, and comparing one byte. I wouldn't expect that to be a lot of overhead, but you would know better than me.


I don't believe we should make that change solely to tackle the portal issue, especially given its relatively low usage.

This makes sense to me. For what it's worth, I don't believe there ever were plans to support session migration while there are multiple portals open.


For additional context, I'm posting the details from the spec for portals https://www.postgresql.org/docs/14/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY

If successfully created, a named portal object lasts till the end of the current transaction, unless explicitly destroyed. An unnamed portal is destroyed at the end of the transaction, or as soon as the next Bind statement specifying the unnamed portal as destination is issued. (Note that a simple Query message also destroys the unnamed portal.) Named portals must be explicitly closed before they can be redefined by another Bind message, but this is not required for the unnamed portal. Named portals can also be created and accessed at the SQL command level, using DECLARE CURSOR and FETCH.

Once a portal exists, it can be executed using an Execute message. The Execute message specifies the portal name (empty string denotes the unnamed portal) and a maximum result-row count (zero meaning “fetch all rows”). The result-row count is only meaningful for portals containing commands that return row sets; in other cases the command is always executed to completion, and the row count is ignored. The possible responses to Execute are the same as those described above for queries issued via simple query protocol, except that Execute doesn't cause ReadyForQuery or RowDescription to be issued.

If Execute terminates before completing the execution of a portal (due to reaching a nonzero result-row count), it will send a PortalSuspended message; the appearance of this message tells the frontend that another Execute should be issued against the same portal to complete the operation. The CommandComplete message indicating completion of the source SQL command is not sent until the portal's execution is completed. Therefore, an Execute phase is always terminated by the appearance of exactly one of these messages: CommandComplete, EmptyQueryResponse (if the portal was created from an empty query string), ErrorResponse, or PortalSuspended.

At completion of each series of extended-query messages, the frontend should issue a Sync message. This parameterless message causes the backend to close the current transaction if it's not inside a BEGIN/COMMIT transaction block (“close” meaning to commit if no error, or roll back if error). Then a ReadyForQuery response is issued. The purpose of Sync is to provide a resynchronization point for error recovery. When an error is detected while processing any extended-query message, the backend issues ErrorResponse, then reads and discards messages until a Sync is reached, then issues ReadyForQuery and returns to normal message processing. (But note that no skipping occurs if an error is detected while processing Sync — this ensures that there is one and only one ReadyForQuery sent for each Sync.)

So maybe another approach that sqlproxy could use is to count CommandComplete messages also.