Closed vongruenigen closed 1 year ago
We did some more investigation, and it seems that the issue is within the Transaction
class: When running multiple queries concurrently, there will be new connections created within the connectionFor
method because the access to the connectionRegistry
map is not synchronized via a mutex or similar.
We have a quickfix for this particular case (using the async-mutex library), but only for the connectionFor
method, see below. The problem with unsynchronized concurrent access might be a problem in other parts of the library as well, we haven't checked this yet.
async connectionFor(database) {
const release = await connectionMutex.acquire();
try {
if (!this.connectionRegistry.get(database)) {
const databaseRegistry = this.contextHolder.databaseRegistry;
const connectionProvider = databaseRegistry.connectionProvider(database);
if (!connectionProvider) {
throw new DrivineError_1.DrivineError(`There is no database registered with key: ${database}`);
}
const connection = await connectionProvider.connect();
this.connectionRegistry.set(database, connection);
await connection.startTransaction();
}
return this.connectionRegistry.get(database);
} catch {}
finally {
release();
}
}
Thanks for the report and analysis Dirk. I'm sorry that you experienced this issue.
I'll apply the mutex solution on the connectionFor
method straight away and publish a new release.
Following this, let's also consider if there are other places with unsyncrhonized "get or create" of resources might cause issues.
Thanks for swift reaction! And no worries, happens to the best of us ๐
I'll do some digging over the weekend and next week as well to figure out if there are other places there concurrency might lead to issues.
Another possible solution for this particular case we thought about is that the library could use one connection per query, i.e. using an array of connections rather than a single one in the connectionRegistry
. This might lead to other issues though (e.g. too many connections used at the same time).
@vongruenigen For transactions to behave as expected its necessary for one transaction to have one "connection"
(As you know, a connection is a wrapper around a Neo4j session, in the case of Neo4j. And its the session that provides an interface to begin, commit or rollback transactions and to issue statements in between those events).
I think your mutex locking is ideal and should have negligible performance impact.
--
I believe there might be a similar issue in the non-transactional (without the decorator) version, but not sure where.
However, let's apply fixes separate as they're found.
Sorry that I haven't worked on it yet, but will see if I can work on it tomorrow (GMT+10 : Australia).
Great, thanks a lot for the update, looking forward to the new release! ๐
@jasperblues wanted to quickly follow up and ask when you plan to release the new version? We'd need the fix latest by the end of this week. If you're too busy I could open a PR from my own fork if you'd like.
HI Dirk. I'm sorry sorry that I haven't gotten to this yet.
Yes that would be great if you could open a PR. I will get it merged and a new version published immediately.
Thank you for your very collaborative open-source spirit.
Ok, I'll open a MR later today ๐
Merged the PR, thanks again Dirk.
As discussed we may need to do this for the non-transactional context and check for other places this issue manifests. Will leave the issue open for that.
In the meantime a new version 2.5.0 was published that contains the txn fix.
Great, thanks a lot for publishing the new version! I'll keep it on my list as well to look at the non-transactional and potential other places where this issue might arise.
Stale issue - MR was merged.
Hey there :)
We're using drivine in combination with neo4j for all our backend software at the company I'm working at. We recently started to run into issues with connections not being properly released when using the
@Transactional
decorator and concurrency, i.e. running multiple database queries simultaneously. The problem is that it seems that connections used within a@Transactional
context are not released properly and this leads to a problem after a number of requests, because the connection acquisition times out in the neo4j driver.I've created a minimal example by using the
drivine-inspiration
template and adding some concurrency by running a database query 20 times in theRoutesController
(see line 23-25 in there). The repository can be found here: https://github.com/vongruenigen/drivine-inspiration-connection-release-issueYou can trigger the problem the following way:
/routes/between/:start/:dest
endpoint multiple times one after the other (delay between doesn't really matter, tried it in a quick succession as well with 30s in between the request).It seems that the connections of previous requests are not properly released when using the
@Transactional
decorator and we're not sure what we're doing wrong. Since I used thedrivine-inspiration
starter template for the repository above I assume that this is a general problem in drivine itself and not the way we're using it.Thanks in advance!