epgsql / pgapp

Erlang Postgres application that uses Poolboy and deals with the database being unavailable
MIT License
66 stars 43 forks source link

pgapp:with_transaction() does not work as expected #22

Open lukyanov opened 7 years ago

lukyanov commented 7 years ago

The README file contains this example:

    pgapp:with_transaction(fun() ->
                                 pgapp:squery("update ..."),
                                 pgapp:squery("delete from ..."),
                                 pgapp:equery("select ? from ?", ["*", Table])
                           end).

The problem in this approach is that "BEGIN" and "COMMIT" actually happens in one connection, but all the queries inside happens in another.

Look at the following code of pgapp_worker.erl:

equery(PoolName, Sql, Params, Timeout) ->
    middle_man_transaction(PoolName,
                           fun (W) ->
                                   gen_server:call(W, {equery, Sql, Params},
                                                   Timeout)
                           end, Timeout).
...
with_transaction(PoolName, Fun, Timeout) ->
    middle_man_transaction(PoolName,
                           fun (W) ->
                                   gen_server:call(W, {transaction, Fun},
                                                   Timeout)
                           end, Timeout).

middle_man_transaction() will dedicate a separate poolboy worker for each query, including a transaction itself. So when with_transaction() is called, it gets a poolboy worker and runs "BEGIN" in it (inside epsql.erl). Thus the worker is busy waiting for the queries to be executed. But the queries run as separate calls to poolboy which means they get different workers.

lukyanov commented 7 years ago

Ah, now I see. There is a trick:

squery(Sql) ->
    case get(?STATE_VAR) of
        undefined ->
            squery(epgsql_pool, Sql);
        Conn ->
            epgsql:squery(Conn, Sql)
    end.

But the trick only does its work when you do not specify PoolName explicitly. So this works:

    pgapp:with_transaction(fun() ->
                                 pgapp:squery("update ..."),
                                 pgapp:squery("delete from ..."),
                                 pgapp:equery("select ? from ?", ["*", Table])
                           end).

But this doesn't:

    pgapp:with_transaction(pool1, fun() ->
                                 pgapp:squery(pool1, "update ..."),
                                 pgapp:squery(pool1, "delete from ..."),
                                 pgapp:equery(pool1, "select ? from ?", ["*", Table])
                           end).
davidw commented 7 years ago

I think @edhollandAL wrote some of this code - have time to take a look?

lukyanov commented 7 years ago

Just made a fix for the issue. Please have a look.

davidw commented 7 years ago

Thinking out loud: rather than using the process dictionary, is there a way to, say, pass the Conn into the fun() or something?

danilagamma commented 7 years ago

@lukyanov it's intended behaviour, You shouldn't be able to specify Pool (and Timeout) in squery/equery functions that are running inside with_transaction, so yes, in your second example:

    pgapp:with_transaction(pool1, fun() ->
                                 pgapp:squery(pool1, "update ..."),
                                 pgapp:squery(pool1, "delete from ..."),
                                 pgapp:equery(pool1, "select ? from ?", ["*", Table])
                           end).

Functions inside fun() -> ... end will be executed outside of transaction scope.

I agree that README is not comprehensive enough and I suggest that it should be fixed instead of current behaviour.

lukyanov commented 7 years ago

Ok, I see now. But then I would agree with @davidw that passing Conn would be a better approach here. One of the problems in the current approach may be that it is not possible to have nested transactions.

doing_stuff1/0 below would not work as a transaction:

doing_stuff1() ->
    pgapp:with_transaction(pool1, fun() ->
                                 pgapp:squery("update ..."),
                                 doing_stuff2()
                           end).

doing_stuff2() ->
    pgapp:with_transaction(pool1, fun() ->
                                 pgapp:squery("update ..."),
                           end).

You will need explicit communication about the connection you are using.

lukyanov commented 7 years ago

New variant of the changes: https://github.com/epgsql/pgapp/pull/23

edholland commented 7 years ago

@lukyanov: postgres does not have true subtransactions feature - so the example code you show would not work as intended anyway

davidw commented 7 years ago

I like the new code without the process dictionary - it feels cleaner in any case. Other thoughts?

lukyanov commented 7 years ago

I made another PR which is an alternative for #23: https://github.com/epgsql/pgapp/pull/24

@davidw While I agree with you that operating directly with the connection looks clearer and more comprehensible, using kind of 'dirty' approach with process dictionary has the following advantages:

  1. It does not break backwards compatibility of the library (otherwise all who are using pgapp would need to rewrite there code around with_transaction calls when updated).
  2. The usage of with_transaction is more handful when you are not thinking in terms of connections. It hides unnecessary details. Imagine the following code:

    register_user() ->
      pgapp:with_transaction(fun() ->
            users:create_user(),
            stats:update_stats()
      end).

    users:create_user() and stats:update_stats() may be independent functions which, in a different context, are called separately. Here they need to be called together inside a transaction. If you are forced to work with the connection directly, you would need to pass it to those functions, which may break the abstraction you intended to have when creating the functions.

edholland commented 7 years ago

Thanks for the updated PR @lukyanov, I agree that the benefits you cite above are important and should be maintained. The register_user example makes a compelling case for supporting the pseudo-nested transactions. I'm less clear on the use case for supporting the cross-connection transactions.

If we allow this it is possible (likely) a user will attempt to create some sort of distributed transaction mechanism on top of pgapp; without proper two phase commit (which is only available as postgres extension) any such implementation would bound to be unreliable. I'm not sure that's something the project should be willing to support

lukyanov commented 7 years ago

@edholland Thanks for the response! Let me be more clear on the second point which you mention as cross-connection transactions. My point here is not to support complex multi-connection transactions. It is rather to make pgapp aware of the same connection inside the with_transaction callback. When you call with_transaction on the pool pool1 and use pool1 inside the callback, I want pgapp to assume that your intention is to use the same connection.

By the way, the example with register_user is also very good to support my point if we consider the app which uses multiple connection pools. Let me extend the example:

register_user() ->
   pgapp:with_transaction(pool1, fun() ->
         users:create_user(),
         stats:update_stats()
   end).

% module users.erl
create_user() ->
   pgapp:with_transaction(pool1, fun() ->
      pgapp:squery("insert ..."),
      pgapp:squery("update ...")
   end).

% module stats.erl
update_stats() ->
   pgapp:squery(pool1, "update ...").

So, my app uses different databases so I need a connection pool for each database. This means I must specify a pool I want to use every time I do a query or start a transaction. In the example above all the queries are for pool1. Here is the point. users:create_user() and stats:update_stats() are dedicated functions which may be called separately in a different context. So we cannot omit pool1 as an argument in pgapp:with_transaction and pgapp:squery. But here they are called within a single transaction. The equivalent query sequence would look like this:

register_user() ->
   pgapp:with_transaction(pool1, fun() ->
      pgapp:squery("insert ..."),
      pgapp:squery("update ...")
      pgapp:squery(pool1, "update ...").
   end).

In the current pgapp code this approach won't work because the last query contains pool1 as a first argument. pgapp will use a separate poolboy worker and the transaction will only contain first two queries.

P.S.

If we allow this it is possible (likely) a user will attempt to create some sort of distributed transaction mechanism on top of pgapp

We cannot stop the user doing something like this anyway, even in the current pgapp version. Anyone is and was able to specify a pool name inside the callback.

edholland commented 7 years ago

In your examples you're using pool1 for all queries, if this is behaviour required then you can just omit the poolname. Assuming that the inner pool should be pool2, as below, then it's possible that the update on pool2 succeeded while the sql on pool1 is rollback. We currently don't support this behaviour by serialising all calls within a transaction to a single connection

register_user() ->
   pgapp:with_transaction(pool1, fun() ->
      pgapp:squery("insert ..."),
      pgapp:squery("update ...")
      pgapp:squery(pool2, "update ...").
   end).
lukyanov commented 7 years ago

@edholland For the same reason it is ok to allow to specify timeout inside the callback. It must be ignored by pgapp, that's true, but it should not be prohibited as the same function may work differently depending on whether it was called inside the transaction or as standalone.

lukyanov commented 7 years ago

@edholland My example is correct. All queries are using pool1. That is the intention. And I cannot ommit pool1, because update_stats() is a standalone function in a different module.

davidw commented 7 years ago

I'll have to think about this some, but I do think that, given the small number of users, if we need to change the API to improve the code, it's best to do so.

tsloughter commented 7 years ago

Couldn't this issue be simplified by not running the queries in a worker? Instead use the pool only to checkout a connection, use it and return it? This would also help solve #20 .