wrobell opened this issue 7 years ago
Since asyncpg does not support Python's DB API 2.0 (see, for instance, its placeholders), there is no reason for it to support the DB API's 2PC functions either (they are little more than syntactic sugar).
Psycopg2 internally sends the corresponding statements as plain SQL, and nothing prevents asyncpg users from doing the same.
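To make that concrete, the whole 2PC vocabulary is three plain statements that any driver can send through its ordinary execute path. A minimal sketch (the helper names below are hypothetical, not asyncpg or psycopg2 API):

```python
def quote_gid(gid: str) -> str:
    """Render a transaction id as a SQL string literal, doubling any quotes."""
    return "'" + gid.replace("'", "''") + "'"

def prepare_sql(gid: str) -> str:
    # First phase: persist the transaction on the server without committing.
    return "PREPARE TRANSACTION " + quote_gid(gid)

def commit_prepared_sql(gid: str) -> str:
    # Second phase, positive outcome.
    return "COMMIT PREPARED " + quote_gid(gid)

def rollback_prepared_sql(gid: str) -> str:
    # Second phase, negative outcome.
    return "ROLLBACK PREPARED " + quote_gid(gid)
```

Each of these strings could be passed to `Connection.execute()` like any other statement.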
The goal of 2PC is to allow committing or rolling back transactions in a distributed system even if a connection is reset, or if your application or even a database fails. So there is no reason to write, in a low-level library, a wrapper similar to the one for regular transactions, because there are many high-level nuances in managing 2PC over an instance's life cycle (e.g. attempting COMMIT PREPARED/ROLLBACK PREPARED for 2PC transactions from a local WAL-log at application start -- see the "Notes" part of the documentation [1]).
[1] https://www.postgresql.org/docs/current/static/sql-commit-prepared.html
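The recovery-at-application-start step mentioned above could be sketched as follows (all names here are hypothetical; the code relies only on asyncpg's public fetch()/execute() methods and Postgres's pg_prepared_xacts system view):

```python
async def recover_prepared(conn, decide):
    """At application start, resolve transactions left in the prepared state.

    `conn` is an asyncpg connection; `decide(gid)` consults the app's own
    log (e.g. a local WAL) and returns True if the global transaction
    committed, False otherwise.
    """
    # pg_prepared_xacts lists transactions that were prepared but never
    # committed or rolled back, e.g. because the application crashed.
    rows = await conn.fetch('SELECT gid FROM pg_prepared_xacts')
    for row in rows:
        gid = row['gid']
        stmt = 'COMMIT PREPARED' if decide(gid) else 'ROLLBACK PREPARED'
        await conn.execute("{} '{}'".format(stmt, gid.replace("'", "''")))
```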
IMHO, it is a matter of allowing the transaction context manager and the Transaction class to accept a custom id, plus some if statements in the code of the class (see also the parts of the code related to the savepoint feature, a.k.a. nested transactions).
Could you please provide a snippet using 2PC with the Transaction class?
Just to clarify -- which statements should a context manager send to Postgres in its magic methods? (I bet you know how 2PC works and that you have a transaction manager to interact with.)
Please consider the following:

async with conn.transaction(id='tx1'):  # or prepared='tx1'?
    await conn.execute(...)
# on success: executes PREPARE TRANSACTION 'tx1'
# on error: executes ROLLBACK
# NOTE: COMMIT is not executed
Also allow:

await conn.commit(id='tx1')    # executes: COMMIT PREPARED 'tx1'
await conn.rollback(id='tx1')  # executes: ROLLBACK PREPARED 'tx1'
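Under those semantics the whole proposal can already be prototyped outside asyncpg against any object exposing an async execute() method (all names here -- prepared_transaction, commit_prepared, rollback_prepared -- are hypothetical sketches, not asyncpg API):

```python
import contextlib

@contextlib.asynccontextmanager
async def prepared_transaction(conn, gid):
    """Run a block inside a transaction and PREPARE it on success."""
    quoted = "'" + gid.replace("'", "''") + "'"
    await conn.execute('BEGIN')
    try:
        yield
    except BaseException:
        # On error: roll back, exactly like a regular transaction CM.
        await conn.execute('ROLLBACK')
        raise
    else:
        # On success: first phase only -- COMMIT is deliberately not sent.
        # The transaction now survives disconnects and must be resolved
        # later with COMMIT PREPARED or ROLLBACK PREPARED.
        await conn.execute('PREPARE TRANSACTION ' + quoted)

async def commit_prepared(conn, gid):
    await conn.execute("COMMIT PREPARED '" + gid.replace("'", "''") + "'")

async def rollback_prepared(conn, gid):
    await conn.execute("ROLLBACK PREPARED '" + gid.replace("'", "''") + "'")
```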
Yes, I arrived at the same design (as the nearly preferred way) while writing a long answer several hours ago. It still seems strange to me, because CMs are designed to free (unlock) resources on exit from a "scope", but that is not true with your proposal: exiting the CM does not free the resource if the transaction succeeded.
Moreover, a ConnectionResetError raised before sending PREPARE TRANSACTION and one raised after it mean different things: in the first case the transaction is undoubtedly cancelled; in the second, we have to check its state on the server side. But outside of the CM your code has no way to tell which case occurred.
The CM would still have to be wrapped in a try/except block to catch possible exceptions and check which transactions have (possibly) been prepared for 2PC and which have already been rolled back. IMHO this makes the CM hard to use "as is" in real cases (even with your proposal), because it is neither fully manual nor fully automatic 2PC handling.
I think the best way is to use a different API to deal with it correctly. Something like this (very first thoughts):

async with conn.Transaction2PC(xact_id='tx1', manager=...) as mgr2pc:  # stage (1)
    async with mgr2pc.xact():  # stage (2)
        await conn.execute(...)
    # stage (3)
# stage (4)
Notes:
(1) checks the "ready for query" state and the ability to use 2PC (fetches the max_prepared_transactions setting);
(2) knows it is a 2PC transaction and saves the mgr2pc object;
(3) a. on exception: set "rolling back", send ROLLBACK (ignoring exceptions), set "rolled back" (to prevent replacing the original exception with a ConnectionResetError);
(3) b.1. calls a callback via mgr2pc (like "we are ready for the 1st phase" -- maybe it already knows the distributed transaction has failed);
(3) b.2. sends PREPARE TRANSACTION to the connection;
(3) b.3. if the exception is a ConnectionResetError, set the "prepared unknown" state, else "rolled back";
(4) a. sends the result of the 1st phase (OK / rolled back / unknown) to the transaction manager;
(4) b. waits for the status of the global transaction from it;
(4) c. sends COMMIT PREPARED/ROLLBACK PREPARED to the connection;
(4) d. sends the result of COMMIT/ROLLBACK PREPARED to the transaction manager.
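Step (3) in particular hinges on distinguishing the three possible outcomes of the first phase. A minimal sketch (PhaseOneResult and first_phase are hypothetical names, not part of any library):

```python
import enum

class PhaseOneResult(enum.Enum):
    OK = 'prepared'               # PREPARE TRANSACTION succeeded
    ROLLED_BACK = 'rolled back'   # server reported an error; the tx is gone
    UNKNOWN = 'prepared unknown'  # connection lost: server state unknown

async def first_phase(conn, gid):
    try:
        await conn.execute(
            "PREPARE TRANSACTION '" + gid.replace("'", "''") + "'")
    except ConnectionResetError:
        # (3) b.3: the statement may or may not have reached the server,
        # so the transaction's fate must be checked on the server side.
        return PhaseOneResult.UNKNOWN
    except Exception:
        return PhaseOneResult.ROLLED_BACK
    return PhaseOneResult.OK
```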
Many corner cases are not covered above. In addition, there should be an API for the class whose instance is passed as the manager= parameter.
I would really keep it simple. If max_prepared_transactions is set to zero, the server will raise an error -- no need to check anything.

> I would really keep it simple.

I would too, but it leads to errors at the wrong time.
> no need to check anything.

The check is cheap and can be done once per connection, since changing the setting requires a Postgres restart. Otherwise people can get "prepared transactions are disabled" only after a long query has already run.
Later the value can be passed to a transaction manager to be used for app-level locking at the 3rd stage (to avoid "maximum number of prepared transactions reached").
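The once-per-connection check could look like this (check_2pc_enabled is a hypothetical name; it uses only asyncpg's public fetchval() method):

```python
async def check_2pc_enabled(conn):
    """Fail fast if the server cannot hold prepared transactions.

    Returns the max_prepared_transactions value so it can later be handed
    to a transaction manager for app-level limiting.
    """
    # SHOW returns the setting as text, e.g. '0' or '10'.
    value = int(await conn.fetchval('SHOW max_prepared_transactions'))
    if value == 0:
        raise RuntimeError('prepared transactions are disabled on the server')
    return value
```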
> Do not implement transaction manager functionality
Oops, I did not mean a real transaction manager -- just a class with an API for interacting with an external one. That class can be simple and merely implement a local "WAL" of xact_ids, or it can be arbitrarily complex. But without it a bare CM should not be used, because after exiting the CM the transaction still holds its locks. That breaks the very idea of context managers.
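Such a class could start as little more than a map from xact_id to decision. An in-memory sketch (all names hypothetical; a real implementation must persist every state change durably before acting on it):

```python
class LocalTxLog:
    """Tracks each 2PC transaction's last known state (in memory only)."""

    def __init__(self):
        self._state = {}

    def prepared(self, gid):
        # Recorded *before* PREPARE TRANSACTION is sent, so a crash leaves
        # a trace that the transaction may exist on the server.
        self._state[gid] = 'prepared'

    def decided(self, gid, commit):
        # The global outcome chosen by the transaction manager.
        self._state[gid] = 'commit' if commit else 'rollback'

    def resolved(self, gid):
        # COMMIT/ROLLBACK PREPARED succeeded; nothing left to recover.
        self._state.pop(gid, None)

    def pending(self):
        """Transactions to re-examine at application start."""
        return dict(self._state)
```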
This looks like something that can be implemented outside of asyncpg, no?
It would be nice if asyncpg supported a basic API; see https://github.com/MagicStack/asyncpg/issues/86#issuecomment-286390961. The rest, i.e. transaction id generation, communication between the entities performing transactions, and the logic of committing transactions, is probably out of scope.
@wrobell We try to keep asyncpg as simple as possible, and this feature doesn't seem like something that is going to be used by a significant number of asyncpg users. Moreover, transaction.py is just 200 lines long and is built with only the public API of the Connection object, meaning that it is possible to implement two-phase commit on top of asyncpg.
You're the first one who's requesting this feature, so let's keep this issue open and see if more people are interested in it.
It would be great if asyncpg supported two-phase commit.