Closed Moortiii closed 8 months ago
Hi @Moortiii,
I understand your proposal, and I think it is quite sensible. Updating the same transaction concurrently can cause some uncertainty due to timing issues, but when done carefully, I can imagine some valid use-cases.
As you pointed out, _executor.id is indeed private. While adding a setter would be the easy way out of this, it would potentially allow users to write code like this:
trx = db.begin_transaction()
col1 = trx.collection("col1")
trx._executor._id = another_transaction
col2 = trx.collection("col2")
Not only can the transaction ID easily get lost, preventing one from ever accessing the initial transaction again, but the problem is also easy to overlook, as it is hidden in a single line of code. Frankly, I believe even the x-arango-trx-id setting trick is way better: it may look weird, but it's "loud and clear", and there will be no problem figuring out what you wrote there (and why).
Following up on that, here is what I would consider a reasonable solution:
Change the TransactionAPIExecutor constructor so that it takes a new field, transaction_id, which can be None or str (basically an Optional[str]). If it is a str, the constructor should no longer send a request to /_api/transaction/begin, but set the _id property directly and check the status() of the transaction in order to validate that it really exists.
The same parameter should be added to TransactionDatabase, which would forward it to the executor. This is straightforward.
StandardDatabase should get a fetch_transaction method, which takes the transaction ID and returns a TransactionDatabase. I'm suggesting fetch_transaction because it implies that a transaction may (or may not) be there, rather than continuing one (which is not necessarily "paused").
Testing
Introduce a test case in test_transaction.py, something simple, just to check that we're able to use both the initial transaction and the "continued" object.
def test_transaction_fetch(db, col, docs):
    txn_db = db.begin_transaction(write=col.name)
    txn_col = txn_db.collection(col.name)
    txn_db2 = db.fetch_transaction(txn_db.transaction_id)
    # insert some documents using both txn's
    # ...
Docs
A small edit in transaction.rst would be great to showcase how fetch_transaction is supposed to be used.
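As a rough illustration of the proposal as a whole, here is a minimal, self-contained sketch. It is not the python-arango implementation: FakeServer, ExecutorSketch and the module-level fetch_transaction below are hypothetical stand-ins, and the server is simulated in memory so the control flow can be run without ArangoDB.

```python
from typing import Dict, Optional


class FakeServer:
    """Hypothetical in-memory stand-in for the transaction endpoints."""

    def __init__(self) -> None:
        self._status: Dict[str, str] = {}
        self._next_id = 1

    def begin(self) -> str:
        # Plays the role of a request to /_api/transaction/begin.
        txn_id = str(self._next_id)
        self._next_id += 1
        self._status[txn_id] = "running"
        return txn_id

    def status(self, txn_id: str) -> str:
        # Raises KeyError for unknown IDs, standing in for an HTTP error.
        return self._status[txn_id]


class ExecutorSketch:
    """Either begins a new transaction or adopts an existing ID."""

    def __init__(
        self, server: FakeServer, transaction_id: Optional[str] = None
    ) -> None:
        self._server = server
        if transaction_id is None:
            # No ID given: start a fresh transaction, as before.
            self._id = server.begin()
        else:
            # ID given: skip "begin", set the ID directly, then check
            # the status to validate that the transaction exists.
            self._id = transaction_id
            self.status()

    @property
    def id(self) -> str:
        return self._id

    def status(self) -> str:
        return self._server.status(self._id)


def fetch_transaction(server: FakeServer, transaction_id: str) -> ExecutorSketch:
    """Hypothetical counterpart of the proposed fetch_transaction method."""
    return ExecutorSketch(server, transaction_id=transaction_id)
```

With this shape, the original object and the fetched one refer to the same transaction ID, and fetching an ID the server does not know about fails immediately instead of producing an unusable object.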
I'm ready to implement the above. Or, if you want to give it a go, I'm perfectly fine with that, but don't feel pressured, I'm just mentioning since you offered. Let me know how you want to proceed.
These seem like sensible changes that should be straightforward enough to implement. I'll give it a shot later today and report back. Thanks!
I agree with your comment about continue, which could imply the ability to "pause" a transaction. I hadn't thought about it that way, but you're probably right that it would cause some confusion, especially for new users of ArangoDB.
I've opened a PR @apetenchea.
I did consider something like this as well:
request = Request(
    method="get",
    endpoint=f"/_api/transaction/{transaction_id}",
)
resp = self._conn.send_request(request)
if not resp.is_success:
    raise TransactionInitError(resp, request)
result = resp.body["result"]
if result["status"] != "running":
    raise TransactionInitError(resp, request)
self._id = transaction_id
My intention was to prevent a user from 'continuing' a transaction that is already committed or aborted, which would be mostly pointless. However, since the response from the API when fetching the status is 200 OK, raising a TransactionInitError and feeding it the response produced results that would be confusing to the end user. I also realized that checking the status of a transaction in an external system (one that may already be committed) could be useful in some niche cases.
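To make the trade-off concrete, here is one way a strict "must be running" check could report its result without reusing TransactionInitError. This is only a sketch: TransactionNotRunningError and ensure_running are hypothetical names, not part of python-arango, and the body shape is assumed to match the resp.body["result"]["status"] access in the snippet above.

```python
class TransactionNotRunningError(Exception):
    """Hypothetical error for a transaction that exists but is finished."""

    def __init__(self, transaction_id: str, status: str) -> None:
        super().__init__(
            f"transaction {transaction_id} is {status!r}, expected 'running'"
        )
        self.transaction_id = transaction_id
        self.status = status


def ensure_running(transaction_id: str, body: dict) -> str:
    """Return the ID if the parsed status body reports a running transaction."""
    status = body["result"]["status"]
    if status != "running":
        # The HTTP call itself succeeded, so the message describes the
        # transaction state rather than echoing a 200 OK response.
        raise TransactionNotRunningError(transaction_id, status)
    return transaction_id
```

A check like this would still rule out the niche case of inspecting an already committed transaction, which is the argument for leaving it out.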
As a side note, I noticed that the docs on contributing that are present in the Sphinx documentation appear to be outdated. I had to follow the contribution guidelines directly in the repository to get anywhere.
I've come across a case where a transaction needs to be shared across multiple systems. If we wrap the REST API we can easily achieve this by setting the x-arango-trx-id header. However, we would like to be able to receive transaction IDs on both ends and continue the transaction seamlessly using the python-arango interface, instead of crudely performing raw queries against /_cursor.
I've come up with the following hack, which does work, but given that _executor is private, and _executor.id specifically doesn't have a setter, I'm guessing there may be a reason it's discouraged:
Would it make sense to support something like this directly? It seems to me like a reasonable use-case. If so, I'm happy to take a stab at developing a PR for this myself.
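For context, the raw-header approach mentioned above could be sketched as follows, using only the standard library. The helper name build_cursor_request, the base URL, the database name and the /_api/cursor path are assumptions for illustration. The request is only constructed here, not sent, and authentication is omitted.

```python
import json
import urllib.request


def build_cursor_request(
    query: str,
    transaction_id: str,
    base_url: str = "http://localhost:8529",  # assumed default address
    database: str = "_system",  # assumed database name
) -> urllib.request.Request:
    """Build (but do not send) an AQL cursor request tied to a transaction."""
    request = urllib.request.Request(
        url=f"{base_url}/_db/{database}/_api/cursor",
        data=json.dumps({"query": query}).encode("utf-8"),
        method="POST",
    )
    request.add_header("Content-Type", "application/json")
    # This header is what attaches the request to the existing transaction.
    request.add_header("x-arango-trx-id", transaction_id)
    return request
```

Passing the result to urllib.request.urlopen would perform the call. Every system sharing the transaction would have to repeat this plumbing for each request, which is the inconvenience the issue is about.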