lpsmith / postgresql-simple

Mid-level client library for accessing PostgreSQL from Haskell

Change withTransactionModeRetry to allow the user to determine when to retry #44

Closed ocharles closed 11 years ago

ocharles commented 12 years ago

withTransactionModeRetry previously would retry only if a not_serializable exception was thrown, but this can be too restrictive. There are cases when a transaction should also be retried when a unique_violation occurs (e.g., after losing a race with a concurrent transaction). This allows users to tune the retry semantics to the needs of their application.


My specific motivation here can be seen at http://dba.stackexchange.com/questions/26905/how-do-i-implement-insert-if-not-found-for-transactions-at-serializable-isolatio. Basically, I use the serializable isolation level, and do handle the unique condition, but the race condition means that the whole transaction must be retried. Instead of copying and pasting your logic, I've made it a little more general. Would love feedback on whether this makes sense, whether we should break the API or enhance it with a new method, and if my coding style is OK :)
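The generalisation proposed above might look something like this standalone sketch, where the caller supplies the predicate that decides which errors warrant a retry. `SqlState` and `retryWhen` are illustrative stand-ins, not postgresql-simple's actual types or API, and the "transaction" here is simulated with an IORef so the example runs with base alone:

```haskell
-- Hedged sketch of the proposed generalisation: instead of hard-coding the
-- not_serializable check, the caller supplies a predicate deciding which
-- errors warrant a retry.  SqlState and retryWhen are illustrative names,
-- not postgresql-simple's actual API.
import Control.Exception (Exception, throwIO, try)
import Data.IORef

newtype SqlState = SqlState String deriving (Show)
instance Exception SqlState

-- Rerun the action for as long as the caller's predicate accepts the error.
retryWhen :: (SqlState -> Bool) -> IO a -> IO a
retryWhen shouldRetry act = do
  r <- try act
  case r of
    Right a -> return a
    Left e
      | shouldRetry e -> retryWhen shouldRetry act
      | otherwise     -> throwIO e

main :: IO ()
main = do
  attempts <- newIORef (0 :: Int)
  -- Retry on serialization_failure (40001) or unique_violation (23505);
  -- the simulated "transaction" fails twice before succeeding.
  r <- retryWhen (\(SqlState code) -> code `elem` ["40001", "23505"]) $ do
    n <- atomicModifyIORef' attempts (\x -> (x + 1, x + 1))
    if n < 3 then throwIO (SqlState "23505") else return n
  print r  -- prints 3
```

A real version would also roll back before retrying and rethrow anything the predicate rejects, as the loop above does.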

lpsmith commented 12 years ago

I don't see any justification for changing the interface of an existing function like this, especially because you've just added significant complications to standard use cases. Though this is an issue that @joeyadams might be interested in.

ocharles commented 12 years ago

I'm happy to add this as a complementary function that withTransactionModeRetry builds on, but I couldn't think of a name :)

joeyadams commented 12 years ago

Thanks for pinging me on this, @lpsmith.

If you're using the Serializable transaction mode, you don't have to worry about a race condition—that's the whole point. A serializable transaction creates the illusion that your transaction has exclusive access to the database. If another transaction comes along and violates that illusion, one of your queries will fail with a serialization_failure, and you'll have to restart the transaction (which withTransactionRetry does for you).

However, if you're having a performance problem, you could use the ReadCommitted mode and handle the race condition manually, to see if that's any faster. But then you'll need to catch unique_violation instead of serialization_failure.

Since withTransactionModeRetry lets you use a custom transaction mode, it doesn't make much sense for it to only catch serialization_failure. So what I think we should do is:

lpsmith commented 12 years ago

Ok, I'm personally OK with changing the interface if you think it's warranted. I definitely agree we need to add some common helpers.

Although, I wonder if it would make sense to move some of this into a new module, say Simple.Transaction; the helpers should probably be exported from both this module and the new Simple.Errors module in the 0.3 branch.

ocharles commented 12 years ago

@joeyadams you can still get a unique_violation in the serializable isolation level - see my post http://dba.stackexchange.com/questions/26905/how-do-i-implement-insert-if-not-found-for-transactions-at-serializable-isolatio/26925#26925. I use the serializable isolation level, but I still get unique_violation thrown instead of a serialization_failure.

The suggestions for providing these helper functions seem good. Shall I have a look at the giant list of exceptions in PostgreSQL and pick the ones that seem most common?
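As a starting point, the most commonly retried conditions could be wrapped as predicates over SQLSTATE codes. A hedged sketch, operating on raw code strings rather than whatever error type the library would actually expose:

```haskell
-- Hedged sketch: predicate helpers over PostgreSQL SQLSTATE codes for the
-- conditions most commonly worth retrying.  These operate on raw code
-- strings; real helpers would inspect the library's SQL error type instead.
isSerializationError :: String -> Bool
isSerializationError = (== "40001")  -- serialization_failure

isUniqueViolation :: String -> Bool
isUniqueViolation = (== "23505")     -- unique_violation

isDeadlockError :: String -> Bool
isDeadlockError = (== "40P01")       -- deadlock_detected

main :: IO ()
main = print ( isSerializationError "40001"
             , isUniqueViolation "23505"
             , isDeadlockError "23505" )
-- prints (True,True,False)
```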

lpsmith commented 12 years ago

@ocharles, you should take a look at the new Error module @sopvop has written that is in my 0.3 branch.

lpsmith commented 12 years ago

@ocharles, I looked at your stackoverflow link much more carefully, and I have to say I really don't understand what you are getting at. But if Joey thinks the change is good, then I'm down with it.

As I understand your scenario, the number 3 operation should never see the inserted row, regardless of the isolation level of the transaction. And, assuming that your second column is running in its own serializable transaction (I'm not real clear on what you intend here), you don't see any of the side effects of DML statements within that transaction, as you are working at the Repeatable Read level or above... if you read PostgreSQL's documentation again, you'll find that you have to wait until the next transaction before your changes become visible to anyone.

At this point, I believe that @joeyadams is correct in his assessment.

lpsmith commented 12 years ago

Gah, I don't know what I was thinking about the Repeatable Read comment; that's wrong so ignore that. In any case, I still don't understand your issue, ocharles.

ocharles commented 12 years ago

@lpsmith my point is that if session A and session B do the same thing, and session A inserts before session B, then session B will block on that insert. When session A commits, session B will immediately throw a unique_violation exception, it won't just carry on. You don't see the effects of other transactions, but if you're going to touch a unique constraint then you will block (and fail in this manner).
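The failure mode described here can be mimicked without a database. In this toy simulation (a `Data.Set` stands in for the unique index; all names are purely illustrative), the second "session" hits the duplicate key once the first has committed and would have to retry its whole transaction:

```haskell
-- Toy simulation (no database): a Set stands in for a unique index.  Once
-- session A's insert of a key has committed, session B's identical insert
-- raises unique_violation instead of carrying on, so B's whole transaction
-- has to be retried.
import Control.Exception
import qualified Data.Set as Set

newtype UniqueViolation = UniqueViolation String deriving (Show)
instance Exception UniqueViolation

insertKey :: String -> Set.Set String -> IO (Set.Set String)
insertKey k committed
  | k `Set.member` committed = throwIO (UniqueViolation k)  -- duplicate key
  | otherwise                = return (Set.insert k committed)

main :: IO ()
main = do
  afterA <- insertKey "alice" Set.empty   -- session A inserts and commits
  r <- try (insertKey "alice" afterA)     -- session B attempts the same key
  case r of
    Left (UniqueViolation k) -> putStrLn ("unique_violation on key " ++ k)
    Right _                  -> putStrLn "inserted"
-- prints: unique_violation on key alice
```

The blocking step (B waiting on A's uncommitted insert) is elided here; the point is only that B surfaces a unique_violation rather than proceeding, which is why the retry predicate needs to accept it.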

lpsmith commented 12 years ago

I see what you are saying now; this behavior surprises me somewhat as it's clearly not entirely "serializable".

As I said, I'd be happy to accept the change, though I think it would be nice to move some of this stuff into a new Transaction module.

lpsmith commented 11 years ago

Ok, I merged this into the 0.3 branch, 06df1a125561bb084bd8415f623782c2d74dee0f