zopefoundation / transaction

Generic transaction implementation for Python.
Other
70 stars 31 forks source link

need a simple api to detect an active transaction #35

Closed mmerickel closed 7 years ago

mmerickel commented 7 years ago

manager.begin() will automatically abort an active transaction. The only way to determine if one is active is via manager._txn is None. This private api is the only way right now to write a function that starts a new transaction only if one does not already exist. Similarly manager.get() will create a transaction if one does not exist.

manager.is_active() maybe?

jimfulton commented 7 years ago

There really isn't such a thing as an active transaction. _txn provided storage for transaction data once there is something to store.

Why do you care?

mmerickel commented 7 years ago

The common technique for handling transactions is with manager: which will kill any "storage for transaction data" that was previously active. There's currently no supported way to have this with be a no-op allowing the caller's with to handle things.

def render_response(request, opts):
    user = opts['user']
    with transaction.manager: # wish this were either a no-op or at least raised an error
                              # we could then rewrite this to check for an active transaction first
        user.id #-> DetachedInstanceError, its transaction was just closed
        return Response()

def handle_request(request):
    with transaction.manager:
        user = DBSession.query(User).get(1)
        render_response(request, dict(user=user))

There is such a thing as an active transaction for anything actually using the transaction package. I realize the package itself doesn't actually manage a physical transaction but it does make it a little harder to support certain use-cases with nested control.

The issue here is that the transaction package does not support nested transaction scopes like in the example above. BUT it does not yell at you for it, it just happily closes the parent transaction by calling manager.begin(). This kind of implicit behavior is not ideal. It should either be a no-op to start a nested transaction, leaving the first manager in control or it should raise forcing you to deal with it.

jimfulton commented 7 years ago

OK, this is an interesting example.

It's not clear why you want the with in render_response. Given that you are just reading, my guess is that you want to make sure someone has started a transaction recently so you see recent updates. But you don't want to require that the caller has started a transaction. Is that so? And if so, why?

jimfulton commented 7 years ago

BTW, I'm not trying to argue really, I'm trying to understand your use case.

mmerickel commented 7 years ago

So in my history of working with transactional databases I've run into and / or wanted to write lots of code in which certain functions may or may not start transactions, allowing them to be reused by other functions that also may or may not want to start transactions. It seemed somewhat common but I've lived for many years now with the transaction package not supporting it. Obviously it's a code smell to have poorly defined division of responsibilities but code hitting the database tends to spider out when you start passing managed objects around.

A more practical example is coming from something like pyramid_tm where if it's in the pipeline it's going to manage the transaction for you. We've run into countless cases in pyramid where people do database operations before or after pyramid_tm gets control of the transaction. This starts an implicit transaction right now and the user is unaware that they're doing anything wrong. The request then hits pyramid_tm which (silently) closes their transaction and starts a new one invalidating whatever work they had done. We'd like to handle this error-prone case in a better way. At the very least I'd say that raising an exception when calling begin() twice in a row without an abort()/commit() would be good but ideally you could also detect the already-open transaction and start using it, turning the pyramid_tm manager into a no-op.

jimfulton commented 7 years ago

I like the idea of making begin/end explicit, at least optionally. There's lots of history of implicit transactions where begin and abort were identical or nearly identical, for better or worse. You provide some good use cases.

I'd like to think about this a bit and probably start a discussion in the python-transactions google group.

jimfulton commented 7 years ago

I think in your nested example from 3 hours ago, you want some way to assert that a transaction has already been started explicitly. Does that sound right?

mmerickel commented 7 years ago

Yes. There are 2 valid use-cases outside of what transaction currently does (implicit abort if begin is called twice), and there isn't a right answer so I just want them to be possible.

  1. If a transaction is already active then raise an error or warning.
  2. If a transaction is already active then start using it instead of calling .begin() and avoid calling commit/abort yourself - leave it up to the caller that started the transaction.

Neither of these changes any assumptions about how transaction works, it's just a question of who gets to begin and commit. It's currently possible via manager._txn is None but I'd just like to see a public api for it.

I'm unaware of that google group but I will join.

jimfulton commented 7 years ago

manager._txn is very much an internal implementation detail with no intended semantics, so it's well worth avoiding.

jimfulton commented 7 years ago

"So in my history of working with transactional databases I've run into and / or wanted to write lots of code in which certain functions may or may not start transactions, allowing them to be reused by other functions that also may or may not want to start transactions. It seemed somewhat common"

Can you give an example or 2 of other systems transaction systems that support this pattern?

jimfulton commented 7 years ago

So thinking about this some more, I think what you really need is to assure that a commit or abort will eventually be called explicitly. It's not the job of library code to decide transaction boundaries, but library code may rely on changes being committed and not being tossed by accident. Does that sound right?

mmerickel commented 7 years ago

This was added via transaction.get() in explicit mode which solves my problems (#43). In implicit mode it's natural to expect that you're always in a transaction and thus making the checks irrelevant.