I'm trying to find a way to sensibly work with transactions using a bolt client
TLDR: Would it be good practise to use a singleton Driver with a scoped BoltGraphClient?
I originally found that when I ran a parallel foreach loop with approx 10k requests inserting 5 nodes each that a singleton bolt client would produce null reference exceptions when using transactions
Thinking about this I suspect that it's because it's difficult to translate a transaction scope from a single client to many concurrent scoped requests
So I approached the DI a different way after looking at the way you've setup the code and I'm wondering if this makes sense or would be good practise
builder.Services.AddSingleton<IDriver>(config => GraphDatabase.Driver(new Uri("bolt://localhost:7687"), AuthTokens.Basic("<username>", "<password>")));
builder.Services.AddScoped<IGraphClient, BoltGraphClient>(config =>
{
var driver = config.GetRequiredService<IDriver>();
return new BoltGraphClient(driver);
});
I know you'll know all this but I thought I'd explain my thinking
I've approached it like this because as far as I can see every time a client is instantiated through the non-driver based constructors a driver is instantiated from scratch. This follows through to the neo4j-dotnet-client where, when the driver is instantiated, everything including the connection pool is also instantiated every time - therefore if a new driver is created on every new client it will cause inefficiencies
However if we have a singleton driver backing multiple clients then the clients won't create the drivers themselves but the driver will make use of a shared connection pool and sessions. Then we can have a client that manages it's transaction scope on the request which appears to be more reliable
Here is my test unit of work code for saving multiple queries in a single transaction. All the queries were created from the same client in a repository class with the scoped client injected
public async Task<bool> Save(params ICypherFluentQuery[] queries)
{
if (!queries.Any())
return true;
if (!_client.IsConnected)
await _client.ConnectAsync();
using var transaction = _client.Tx.BeginTransaction();
try
{
foreach (var query in queries)
await query.ExecuteWithoutResultsAsync();
if (_random.Next(0, 100) % 3 == 0)
throw new Exception("Fail");
await transaction.CommitAsync();
return true;
}
catch(Exception ex)
{
await transaction.RollbackAsync();
return false;
}
}
I found that with a singleton client with the normal (url, username, password) constructor that this would cause null reference exceptions during concurrent requests. I also found that with the GraphClientFactory with HttpClient I would get random failures when processing 10k requests with 5 queries each in a parallel foreach with a not found exception where the bookmark reference couldn't be found. I suspect due to a timeout from creating the http client on every client which happens due to a default parameter
Transactions worked well with a scoped client but were quite slow as you would expect due to the extra work they'd do creating the driver each time
When I tried this approach of using a singleton driver with scoped clients I found the following approximate improvements over a scoped client (in a fairly unscientific conditions test)
17% (ish) reduction in total request time
17% (ish) reduction in average request time
19% (ish) reduction in max request time
64% (ish) reduction in min request time
30% (ish) reduction in memory usage
So basically I'm asking does this seem like a sensible approach? If so I'd be happy to create a PR to update documentation or create an extension method for setting up clients in this way
Hi,
I'm trying to find a way to sensibly work with transactions using a bolt client
TLDR: Would it be good practise to use a singleton Driver with a scoped BoltGraphClient?
I originally found that when I ran a parallel foreach loop with approx 10k requests inserting 5 nodes each that a singleton bolt client would produce null reference exceptions when using transactions
Thinking about this I suspect that it's because it's difficult to translate a transaction scope from a single client to many concurrent scoped requests
So I approached the DI a different way after looking at the way you've setup the code and I'm wondering if this makes sense or would be good practise
I know you'll know all this but I thought I'd explain my thinking
I've approached it like this because as far as I can see every time a client is instantiated through the non-driver based constructors a driver is instantiated from scratch. This follows through to the neo4j-dotnet-client where, when the driver is instantiated, everything including the connection pool is also instantiated every time - therefore if a new driver is created on every new client it will cause inefficiencies
However if we have a singleton driver backing multiple clients then the clients won't create the drivers themselves but the driver will make use of a shared connection pool and sessions. Then we can have a client that manages it's transaction scope on the request which appears to be more reliable
Here is my test unit of work code for saving multiple queries in a single transaction. All the queries were created from the same client in a repository class with the scoped client injected
I found that with a singleton client with the normal
(url, username, password)
constructor that this would cause null reference exceptions during concurrent requests. I also found that with the GraphClientFactory with HttpClient I would get random failures when processing 10k requests with 5 queries each in a parallel foreach with a not found exception where the bookmark reference couldn't be found. I suspect due to a timeout from creating the http client on every client which happens due to a default parameterTransactions worked well with a scoped client but were quite slow as you would expect due to the extra work they'd do creating the driver each time
When I tried this approach of using a singleton driver with scoped clients I found the following approximate improvements over a scoped client (in a fairly unscientific conditions test)
So basically I'm asking does this seem like a sensible approach? If so I'd be happy to create a PR to update documentation or create an extension method for setting up clients in this way