elixir-ecto / myxql

MySQL 5.5+ driver for Elixir
Apache License 2.0
273 stars 67 forks source link

Return {status, state} on transaction error #162

Closed 0xAX closed 1 year ago

0xAX commented 1 year ago

so db_connection caller will be able to execute rollback for a failed commit operation or disconnect a given connection for a failed begin or rollback operation.

More information in https://github.com/elixir-ecto/db_connection/issues/272

greg-rychlewski commented 1 year ago

I'm pretty sure this is too general. That error has more specific meaning than what is happening here: https://github.com/elixir-ecto/db_connection/blob/v2.0.6/lib/db_connection.ex#L84.

There's a transaction_error function already. Probably it just needs to be called in another place?

edit: Actually sorry I was looking at postgrex for transaction_error. We should probably do the same that it does, so that the error is returned on certain actions? https://github.com/elixir-ecto/postgrex/blob/master/lib/postgrex/protocol.ex#L2022

0xAX commented 1 year ago

hello @greg-rychlewski. I would agree that the error is pretty generic. @josevalim initially suggested to handle only deadlock error (that I initially met during debugging this) but I am not sure... If transaction is failed with any errors I probably prefer to see rollback, at least a try to rollback.

Another possible approach probably to handle driver errors in db_connection instead of just throwing them https://github.com/elixir-ecto/db_connection/blob/master/lib/db_connection.ex#L1642-L1646

Thanks for transaction_error. I didn't know about this function and not familar with postgrex source code.

greg-rychlewski commented 1 year ago

Sorry what I mean is that it looks like you are using DBConnection.TransactionError outside of its scope. This is its scope:

def exception(:idle),
      do: %__MODULE__{status: :idle, message: "transaction is not started"}
    def exception(:transaction),
      do: %__MODULE__{status: :transaction, message: "transaction is already started"}
    def exception(:error),
      do: %__MODULE__{status: :error, message: "transaction is aborted"}
0xAX commented 1 year ago

Sorry what I mean is that it looks like you are using DBConnection.TransactionError outside of its scope. This is its scope:

@greg-rychlewski that is also true :) but it was suggestion of @josevalim. Let's see what he will say

greg-rychlewski commented 1 year ago

I'm a bit skeptical this is needed for all errors inside transactions because otherwise all the drivers would be having a lot of problems. But I need to dig into this more. Maybe someone else will correct me.

0xAX commented 1 year ago

That is ok, I also can't say that I like solution for all errors. I can add check specifically for deadlock as only it bothers me but this solution isn't better than current I assume so let's wait for Jose and other opinions.

If you have a hint how can I do rollback properly and avoid disconnection for the issue described https://github.com/elixir-ecto/db_connection/issues/272 I'd be pretty happy if there is solution exists without adjusting libraries code.

josevalim commented 1 year ago

We can have regular errors inside a transaction (for example, a query is invalid). We should really emit this only if we get that particular status error when committing.

0xAX commented 1 year ago

updated the PR, @josevalim did you mean this or I misunderstood your last comment?

Also still not sure that I've understood the argument about 'regular' errors. If we have multiple queries where last one is invalid within transaction will not we get into the same situation when the connection will be in the transaction state after it will be finished?

greg-rychlewski commented 1 year ago

Also still not sure that I've understood the argument about 'regular' errors. If we have multiple queries where last one is invalid within transaction will not we get into the same situation when the connection will be in the transaction state after it will be finished?

This part was bothering me too. Because it's almost certainly not the case. It made me think your situation happened because a connection was being shared improperly.

0xAX commented 1 year ago

It made me think your situation happened because a connection was being shared improperly.

It still could be the case, but only the places that I've found that change the status of a connection are:

both are not my case. It is interesting that I really see how rollback is called if I change my example that it triggers another error - duplicated entry. Seems it gets into https://github.com/elixir-ecto/db_connection/blame/master/lib/db_connection.ex#L1614 and rolled back as expected. If I understand correctly the crucial difference between these two cases is in the failed case when connection is in 'wrong' transaction state is when commit failed but if something will fail before commit it will be handled as expected, rollback and no disconnection.

If my understanding is correct so yes maybe it worth somehow to check commit only and especially this error and such case could be considered as TransactionError despite current possible messages of TransactionError are really a bit out of scope, maybe it also worth to extend TransactionError with commit operation failed message or so. Anyway it is not enough just return {:error, %DBConnection.TransactionError{...}, state} to have rollback on failed commit.

Or just return 2-size tupple, something like this:

  defp maybe_disconnect(exception, "COMMIT", state) do
    {:error, state}
  end
  defp maybe_disconnect(exception, state) do
    %MyXQL.Error{mysql: %{code: code}} = exception

    if code in state.disconnect_on_error_codes do
      {:disconnect, exception, state}
    else
      {:error, exception, state}
    end
  end

and in this case seems rollback should be handled in the https://github.com/elixir-ecto/db_connection/blob/master/lib/db_connection.ex#L1735, although not sure is there any sense to check for a specific error code in the first clause.

greg-rychlewski commented 1 year ago

Based the code you pointed out (https://github.com/elixir-ecto/db_connection/blob/master/lib/db_connection.ex#L1735) it seems like there is a disconnect between what db connection is expecting and what the drivers are sending back. Both Postgrex and MyXQL return {:error, expection, state} if handle_commit fails. But dbconnection only acts on {:error, _}.

So probably one of the following needs to happen?

1. dbconnection reacts to the 3-tuple 2. the drivers send back a 2 tuple for commit errors

But this i'm not sure about either because of the db connection docs

From the handle_commit callback docs:

Handle committing a transaction. Return {:ok, result, state} on successfully committing transaction, {status, state} to notify caller that the transaction can not commit due to the transaction status status, {:error, exception, state} (deprecated) to error and no longer be inside transaction

It's saying the 3-tuple is deprecated but not saying anything about the 2-tuple. Not sure if that's an oversight or if there is supposed to be a totally different mechanism.

edit: I didn't realize :error is a status. So the docs make it seem like MyXQL and Postgrex need to return a 2-tuple.

greg-rychlewski commented 1 year ago

I think handle_begin and handle_rollback also have this issue where they will return a 3-tuple when they are expected to return a 2-tuple:

handle_begin docs:

Return {:ok, result, state} to continue, {status, state} to notify caller that the transaction can not begin due to the transaction status status, {:error, exception, state} (deprecated) to error without beginning the transaction

handle_rollback docs:

Handle rolling back a transaction. Return {:ok, result, state} on successfully rolling back transaction, {status, state} to notify caller that the transaction can not rollback due to the transaction status status, {:error, exception, state} (deprecated) to error and no longer be inside transaction

greg-rychlewski commented 1 year ago

I think for both drivers we can simply modify handle_transaction to convert the 3-tuples to 2-tuples, before returning to handle_begin/ handle_commit /handle_rollback. I think that's cleaner than passing the statement type down the error path.

0xAX commented 1 year ago

@greg-rychlewski

3-size tuples are actually also handled in the DBConnection.handle_common_result/3 although it does not do rollback which is probably ok in a case of failed begin or rollback?

If we'll return {:error, state} on failed commit it will cause rollback, I've just tested:

diff --git a/lib/myxql/connection.ex b/lib/myxql/connection.ex
index 86bd590..6e5565d 100644
--- a/lib/myxql/connection.ex
+++ b/lib/myxql/connection.ex
@@ -353,7 +353,7 @@ defmodule MyXQL.Connection do

   defp result({:ok, err_packet() = err_packet}, query, state) do
     exception = error(err_packet, query, state)
-    maybe_disconnect(exception, state)
+    maybe_disconnect(exception, query, state)
   end

   defp result({:error, :multiple_results}, _query, _state) do
@@ -421,7 +421,10 @@ defmodule MyXQL.Connection do
     end
   end

-  defp maybe_disconnect(exception, state) do
+  defp maybe_disconnect(exception, "COMMIT", state) do
+    {:error, state}
+  end
+  defp maybe_disconnect(exception, _query, state) do
     %MyXQL.Error{mysql: %{code: code}} = exception

works as expected, it gets into rollback and disconnect is avoided. Of course maybe it worth to check error code and trigger disconnect if the error code is from disconnect_on_error_codes but in general {status, state} works.

In a case of {status, state} result of handle_begin or handle_rollback it will cause disconnect according to the db_connection code but I am wondering is it expected behaviour all the time from user perspective? Maybe yes... can't judge

greg-rychlewski commented 1 year ago

3-size tuples are actually also handled in the DBConnection.handle_common_result/3 although it does not do rollback which is probably ok in a case of failed begin or rollback?

If we go by the callbacks then handle_common_result is only there to handle {:ok, _, _} and {:disconnect, _, _} for handle_begin, handle_commit and handle_rollback. Take a look at the specs and the mention of the 3 tuple being deprecated: https://hexdocs.pm/db_connection/DBConnection.html#c:handle_begin/2. Same for commit/rollback.

It seems 100% intentional to disconnect since :error is explicitly listed here

case Holder.handle(pool_ref, :handle_begin, [], opts) do
      {status, _conn_state} when status in [:idle, :transaction, :error] ->
        status_disconnect(conn, status, meter)

Probably errors during BEGIN or ROLLBACK are considered unrecoverable and the connection is disconnected to avoid a dirty socket. But @josevalim might be able to convince you better than me.

edit:

Of course maybe it worth to check error code and trigger disconnect if the error code is from disconnect_on_error_codes but in general

You will always handle this case if the change is instead in handle_transaction. All it needs to do is convert the 3-tuple error to 2-tuple after all this other stuff happens.

josevalim commented 1 year ago

Probably errors during BEGIN or ROLLBACK are considered unrecoverable and the connection is disconnected to avoid a dirty socket. But @josevalim might be able to convince you better than me.

This is definitely the safest bet unless we diligently verify that a given operation will indeed abort the transaction. For example, we are 100% that a particular error code on commit means the transaction was aborted. I am not sure if we support keeping the transaction open in DBConnection if COMMIT fails and the DB leaves it open.

0xAX commented 1 year ago

@greg-rychlewski @josevalim yes thank you.

It seems so. No need to convince me I am ok with this, just mostly was wondering why it is done in this way and I don't have good arguments why it should be recoverable like commit. I am pretty to ok to leave disconnect on failed begin and rollback as in a case of failed begin well I can't imagine how it could be but there should be something definitely wrong between an app and db. For the case of failed rollback well we tried, we can't so yes better to disconnect to not leave 'garbage' and data/socket in wrong state. So yes disconnect probably is the most safest way here.

For the failed commit (at least one case is described in the db_connection issue) I suppose it still worth at least to try rollback (and disconnect if it will fail). I have adjusted PR could please take a look is it ok now or I've missed something? I skipped tests for now and did it only in my testing environment as I am not sure how to simulate failed begin or commit and would appreciate if somebody could give me any hints.

sorry but not sure that I can provide adjustment for postgrex as I don't have environment for testing it right now under my hands.

greg-rychlewski commented 1 year ago

sorry but not sure that I can provide adjustment for postgrex as I don't have environment for testing it right now under my hands.

No worries I will do that one. Thank you for making this change.

greg-rychlewski commented 1 year ago

Thanks @0xAX this looks good to me. I'll just give @josevalim a chance to stop me from merging before pressing the button.

0xAX commented 1 year ago

@josevalim @greg-rychlewski thank you both for the help you provided!