DotNet4Neo4j / Neo4jClient

.NET client binding for Neo4j
https://www.nuget.org/packages/Neo4jClient
Microsoft Public License
431 stars 146 forks source link

Question for Dependency Injection #471

Open davemanton opened 1 year ago

davemanton commented 1 year ago

Hi,

I'm trying to find a way to sensibly work with transactions using a bolt client

TLDR: Would it be good practise to use a singleton Driver with a scoped BoltGraphClient?

I originally found that when I ran a parallel foreach loop with approx 10k requests inserting 5 nodes each that a singleton bolt client would produce null reference exceptions when using transactions

Thinking about this I suspect that it's because it's difficult to translate a transaction scope from a single client to many concurrent scoped requests

So I approached the DI a different way after looking at the way you've setup the code and I'm wondering if this makes sense or would be good practise

builder.Services.AddSingleton<IDriver>(config => GraphDatabase.Driver(new Uri("bolt://localhost:7687"), AuthTokens.Basic("<username>", "<password>")));
        builder.Services.AddScoped<IGraphClient, BoltGraphClient>(config =>
                                                                  {
                                                                      var driver = config.GetRequiredService<IDriver>();
                                                                      return new BoltGraphClient(driver);
                                                                  });

I know you'll know all this but I thought I'd explain my thinking

I've approached it like this because as far as I can see every time a client is instantiated through the non-driver based constructors a driver is instantiated from scratch. This follows through to the neo4j-dotnet-client where, when the driver is instantiated, everything including the connection pool is also instantiated every time - therefore if a new driver is created on every new client it will cause inefficiencies

However if we have a singleton driver backing multiple clients then the clients won't create the drivers themselves but the driver will make use of a shared connection pool and sessions. Then we can have a client that manages it's transaction scope on the request which appears to be more reliable

Here is my test unit of work code for saving multiple queries in a single transaction. All the queries were created from the same client in a repository class with the scoped client injected

public async Task<bool> Save(params ICypherFluentQuery[] queries)
    {
        if (!queries.Any())
            return true;

        if (!_client.IsConnected)
            await _client.ConnectAsync();

        using var transaction = _client.Tx.BeginTransaction();

        try
        {
            foreach (var query in queries)
                await query.ExecuteWithoutResultsAsync();

            if (_random.Next(0, 100) % 3 == 0)
                throw new Exception("Fail");

            await transaction.CommitAsync();

            return true;
        }
        catch(Exception ex)
        {
            await transaction.RollbackAsync();

            return false;
        }
    }

I found that with a singleton client with the normal (url, username, password) constructor that this would cause null reference exceptions during concurrent requests. I also found that with the GraphClientFactory with HttpClient I would get random failures when processing 10k requests with 5 queries each in a parallel foreach with a not found exception where the bookmark reference couldn't be found. I suspect due to a timeout from creating the http client on every client which happens due to a default parameter

Transactions worked well with a scoped client but were quite slow as you would expect due to the extra work they'd do creating the driver each time

When I tried this approach of using a singleton driver with scoped clients I found the following approximate improvements over a scoped client (in a fairly unscientific conditions test)

So basically I'm asking does this seem like a sensible approach? If so I'd be happy to create a PR to update documentation or create an extension method for setting up clients in this way