DarkWanderer / ClickHouse.Client

.NET client for ClickHouse

`Session is locked by a concurrent client` from bulk copy #152

Open · MaceWindu opened this issue 2 years ago

MaceWindu commented 2 years ago

I'm not sure if this is an upstream issue in ClickHouse or a race in the provider code, so I will file it here for now.

The following code fails for me with a `Session is locked by a concurrent client` error after ~200 iterations, which is quite annoying as it fails the test run at random locations...

CH version: 22.6.3.35. Sessions: enabled.

    // Runs inside an async method; `cs` is the (session-enabled) connection string.
    using var cn = new ClickHouseConnection(cs);
    cn.Open();
    using var cmd = cn.CreateCommand();
    cmd.CommandText = "drop table if exists test_table";
    cmd.ExecuteNonQuery();
    cmd.CommandText = "create table test_table(field Int32) Engine=Memory";
    cmd.ExecuteNonQuery();

    // Insert single-row batches until the server responds with
    // "Session is locked by a concurrent client".
    var cnt = 0;
    var values = new object[1][] { new object[1] };
    while (true)
    {
        using var bc = new ClickHouseBulkCopy(cn);
        bc.MaxDegreeOfParallelism = 1;
        bc.DestinationTableName = "test_table";
        values[0][0] = cnt;
        await bc.WriteToServerAsync(values);
        cnt++;
    }
MaceWindu commented 2 years ago

Also filed an issue for CH: https://github.com/ClickHouse/ClickHouse/issues/39374 as I don't see anything wrong in the provider implementation.

zhicwu commented 2 years ago

Hi @MaceWindu and @DarkWanderer, I ran into the same issue a while ago when running integration tests against containerized ClickHouse. It might be related to the CI environment, as both server and client were running on a single-core VM. The workaround I took was to retry until timed out (along with an option to turn the retries off), which seems to have fixed the random test failures.

ClickHouse only supports one query at a time within a session. My understanding is that the server may not release resources immediately for various reasons (e.g. the client didn't close the response stream in time, or the server is just too slow to fulfill the previous request). Hope this can be of use :)
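For illustration, a minimal sketch of that retry-until-timeout workaround (the exception type and the message check here are assumptions, not the exact code I used):

    // Retry the action until it succeeds or the deadline passes; assumes the
    // server error surfaces as an exception mentioning "Session is locked".
    static async Task WithSessionRetryAsync(Func<Task> action, TimeSpan timeout)
    {
        var deadline = DateTime.UtcNow + timeout;
        while (true)
        {
            try
            {
                await action();
                return;
            }
            catch (Exception ex) when (ex.Message.Contains("Session is locked")
                                       && DateTime.UtcNow < deadline)
            {
                await Task.Delay(100); // brief backoff before retrying
            }
        }
    }

In the repro above, the write would become `await WithSessionRetryAsync(() => bc.WriteToServerAsync(values), TimeSpan.FromSeconds(30));`.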

MaceWindu commented 2 years ago

ClickHouse inserts are synchronous by default. As far as I can see from the client code, the bulk copy implementation doesn't check the operation status (which is done for regular commands).

DarkWanderer commented 1 year ago

Just to give an update. I've figured out the cause of the error: ClickHouse doesn't like it when calls from the same session come from different HTTP connections, even if the queries never run concurrently. The workaround is "simple" - just pass an HttpClient backed by a single-connection HttpClientHandler:

    var handler = new HttpClientHandler
    {
        AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate,
        MaxConnectionsPerServer = 1, // pin all session traffic to one TCP connection
    };
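For completeness, a sketch of wiring that handler in, using the ClickHouseConnection overload that accepts an HttpClient (shown below); `cs` is the connection string from the repro:

    using var httpClient = new HttpClient(handler, disposeHandler: true);
    using var cn = new ClickHouseConnection(cs, httpClient);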

Fixing that within the library is a more complex story, however - I'm thinking about how to do it in a fashion which is both reasonably backward-compatible and actually resolves this issue.

nike61 commented 1 year ago

@DarkWanderer the problem with that approach is that requests will be sent sequentially, with no concurrency at all, since we will use only one connection per server. That means no concurrent requests (so one long-running query will block any other request until it finishes).

It would be great if .NET allowed selecting an HTTP connection from the connection pool by some condition (for example, by a hash of the CH session id), but it seems there is no way to do that.
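As a thought experiment only (a hypothetical sketch, not library API), one could approximate such affinity by keeping several single-connection clients and picking one by session-id hash:

    // Hypothetical: approximate per-session connection affinity by hashing
    // the session id over N single-connection HttpClients.
    sealed class SessionAffinityPool
    {
        private readonly HttpClient[] _clients;

        public SessionAffinityPool(int size)
        {
            _clients = new HttpClient[size];
            for (var i = 0; i < size; i++)
                _clients[i] = new HttpClient(
                    new HttpClientHandler { MaxConnectionsPerServer = 1 });
        }

        // Queries of one session always land on the same client/connection,
        // while different sessions can still run in parallel.
        public HttpClient For(string sessionId) =>
            _clients[(uint)sessionId.GetHashCode() % _clients.Length];
    }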

Shouldn't it be fixed on the CH side? Maybe it's just a bug in CH.

MaceWindu commented 1 year ago

@nike61, note that we are talking about session-aware connections here, which already cannot be parallelized.

nike61 commented 1 year ago

@MaceWindu yes, I just wanted to point out that the suggested "MaxConnectionsPerServer = 1" setting will make all queries made by the same HttpClient sequential. That means queries will not be sent in parallel, but only one by one.

So if you register HttpClient as a singleton in your app (and this is a best-practice approach), you will make queries from different sessions run sequentially across all sessions, which may decrease the throughput of your system a lot. The ClickHouse client will use the same HttpClient for all sessions if it's registered as a singleton.

DarkWanderer commented 1 year ago

Having a singleton HttpClient stopped being a recommendation after .NET Core 2.1. The currently recommended approach - IHttpClientFactory - in fact creates a new HttpClient for each request. The pooling is done in HttpClientHandler - and this is actually how ClickHouse.Client works by default: there is a static instance of HttpClientHandler which is shared between instances of ClickHouseConnection.
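Schematically, that default looks like this (names are illustrative, not the library's actual internals):

    // One static handler - and therefore one shared connection pool - reused
    // by any number of short-lived HttpClients.
    static readonly HttpClientHandler SharedHandler = new HttpClientHandler();

    // disposeHandler: false keeps the shared handler (and its pool) alive.
    static HttpClient CreateClient() => new HttpClient(SharedHandler, disposeHandler: false);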

If you create a separate HttpClient for each connection, with its own instance of HttpClientHandler:

    var handler = new HttpClientHandler { MaxConnectionsPerServer = 1 };
    var client = new HttpClient(handler, /*disposeHandler*/ true);
    var connection = new ClickHouseConnection("UseSessions=true", client);

you will be able to execute parallel queries in different connections, as expected.

This is the approach I'm taking in #222 - if sessions are enabled, use an "owned"/dedicated HttpClientHandler and HttpClient per connection.
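A usage sketch of that approach (`RunQueriesAsync` is a hypothetical placeholder for your own query code):

    // Each connection owns a single-connection client: every session stays
    // pinned to one TCP connection, but different sessions run in parallel.
    ClickHouseConnection NewSessionConnection() => new ClickHouseConnection(
        "UseSessions=true",
        new HttpClient(new HttpClientHandler { MaxConnectionsPerServer = 1 },
                       disposeHandler: true));

    using var first = NewSessionConnection();
    using var second = NewSessionConnection();
    await Task.WhenAll(RunQueriesAsync(first), RunQueriesAsync(second));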

nike61 commented 1 year ago

Thank you for your answer!

The official docs state that a singleton HttpClient is one of the best-practice approaches - https://learn.microsoft.com/en-us/dotnet/fundamentals/networking/http/httpclient-guidelines So I can imagine that someone will register HttpClient as a singleton and pass it to the library. Anyway, even if it's registered using the factory, all HttpClients will share the same connection pool, right? So if I set MaxConnectionsPerServer = 1 using the standard ASP.NET approach with the factory, it will make all queries across all sessions sequential.

(The .NET team fixed all DNS-related issues in HttpClient, which is why a static HttpClient is now one of the best-practice approaches. The only reason to use the factory approach is to easily use features such as named clients or Polly.)

Will your solution with a separate HttpClient reuse TCP connections for different sessions? I can imagine a system that generates many queries. In that case it will create a new handler and a new connection pool for each query. That means the system will not use a connection pool at all and will open a new connection for each request. Or maybe I'm missing something.
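(For context: the usual factory-free way to survive DNS changes is a bounded connection lifetime on the handler - a general .NET pattern, not a library API:)

    // SocketsHttpHandler recycles pooled connections after the configured
    // lifetime, which picks up DNS changes without IHttpClientFactory. Note a
    // recycled connection could re-trigger the issue for very long-lived sessions.
    var handler = new SocketsHttpHandler
    {
        PooledConnectionLifetime = TimeSpan.FromMinutes(2),
        MaxConnectionsPerServer = 1,
    };
    var client = new HttpClient(handler, disposeHandler: true);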

nike61 commented 1 year ago

@DarkWanderer Sorry, I have one more question. How did you know that ClickHouse doesn't work well with the same session coming from different TCP connections? Did you find an issue in the ClickHouse repository? We are working with Altinity, so maybe we can ask them to help us, but I need more details.

DarkWanderer commented 1 year ago

> Thank you for your answer!
>
> The official docs state that a singleton HttpClient is one of the best-practice approaches - https://learn.microsoft.com/en-us/dotnet/fundamentals/networking/http/httpclient-guidelines

Fair enough. "One of" seems to be the critical part. The documents seem a bit self-contradictory here, given how IHttpClientFactory works. Personally I don't think it is best practice - the discussed issue being one of the reasons.

> So I can imagine that someone will register HttpClient as a singleton and pass it to the library. Anyway, even if it's registered using the factory, all HttpClients will share the same connection pool, right? So if I set MaxConnectionsPerServer = 1 using the standard ASP.NET approach with the factory, it will make all queries across all sessions sequential.

That was not my advice, however. Of course someone can do that - but I cannot reasonably prevent any and all misuses of advanced capabilities.

> (The .NET team fixed all DNS-related issues in HttpClient, which is why a static HttpClient is now one of the best-practice approaches. The only reason to use the factory approach is to easily use features such as named clients or Polly.)
>
> Will your solution with a separate HttpClient reuse TCP connections for different sessions? I can imagine a system that generates many queries. In that case it will create a new handler and a new connection pool for each query. That means the system will not use a connection pool at all and will open a new connection for each request. Or maybe I'm missing something.

To be honest, what you're describing sounds like just another example of misuse. If one needs multiple parallel transient connections, why would sessions be needed? Sessions are the exact opposite of transient - a server-side "single-thread" limitation for a long-lived connection. Can you provide a valid use case for a transient connection which utilizes the sessions feature?

> @DarkWanderer Sorry, I have one more question. How did you know that ClickHouse doesn't work well with the same session coming from different TCP connections? Did you find an issue in the ClickHouse repository? We are working with Altinity, so maybe we can ask them to help us, but I need more details.

It's a very simple fact: limiting the number of connections to 1 per server fixes the issue. I'm sure it can be reproduced using curl or another tool.

nike61 commented 1 year ago

@DarkWanderer thanks, I should have explained our case more.

We have an ASP.NET application that handles many concurrent users. Each user sends requests to our API. A single API request typically generates 2-4 sequential queries to CH. These 2-4 queries share the same session. Some of them create temporary tables that should be deleted automatically when the session ends.
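Roughly like this (table and query names are made up for illustration; the connection string is assumed to be session-enabled):

    // All commands on this connection share one session, so the temporary
    // table survives between queries and is dropped when the session ends.
    using var cn = new ClickHouseConnection(sessionEnabledConnectionString, httpClient);
    await cn.OpenAsync();
    using var cmd = cn.CreateCommand();
    cmd.CommandText = "CREATE TEMPORARY TABLE filtered_ids (id Int32)";
    await cmd.ExecuteNonQueryAsync();
    cmd.CommandText = "INSERT INTO filtered_ids SELECT id FROM events WHERE ts > now() - 3600";
    await cmd.ExecuteNonQueryAsync();
    cmd.CommandText = "SELECT count() FROM filtered_ids";
    var count = await cmd.ExecuteScalarAsync();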

Our HttpClient is registered using the factory.

The main question is: how do we properly configure the library so that our HttpClient doesn't suffer from socket exhaustion and DNS-change problems?

As another use case for sessions, I can also imagine using a transaction inside a session; in that case all my 2-4 queries would share the same consistent view of the data. Transactions for SELECT are not ready yet, but as I understand it, they will use sessions.

I've found this closed issue on GitHub, but it seems it wasn't fixed: https://github.com/ClickHouse/ClickHouse/issues/4003

DarkWanderer commented 1 year ago

In 6.2.0, session mode will no longer use the shared connection pool and will instead use a separate HttpClientHandler and HttpClient per connection.

@nike61 this is a valid use case, and I understand your concern about socket exhaustion. The only proper solution I can see at the moment is to introduce library-level connection pooling, which is a significant undertaking I wanted to avoid. If there are any alternative ideas, they are very much welcome.
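To make the idea concrete, a deliberately naive, hypothetical sketch of such pooling (not anything implemented in the library): rent and return owned single-connection clients so transient session connections reuse warm sockets.

    using System.Collections.Concurrent;
    using System.Net.Http;

    static class SessionClientPool
    {
        private static readonly ConcurrentBag<HttpClient> Pool =
            new ConcurrentBag<HttpClient>();

        // Rent an existing client if available; otherwise create a fresh one.
        public static HttpClient Rent() =>
            Pool.TryTake(out var client)
                ? client
                : new HttpClient(new HttpClientHandler { MaxConnectionsPerServer = 1 });

        // Return the client so its warm TCP connection can be reused
        // by the next session instead of being torn down.
        public static void Return(HttpClient client) => Pool.Add(client);
    }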

joshbartley commented 1 year ago

Recently I looked into the HttpClient and handler docs and recommendations. The one slightly mentioned, which the Azure client uses, is SocketsHttpHandler - the handler that HttpClientHandler uses internally. You wouldn't rely on the consumer to pass in an HttpClient and/or handler; instead you'd keep an internal one that manages sockets for you based on the endpoint name. You would need a list of all the connections to pool them, and SocketsHttpHandler can also drop a connection after a set time to handle DNS changes. .NET 5 has more extensibility for the sockets handler than 3.1 does, with otherwise similar functionality.

I'll look around in the connection code, though I don't know of an easy way to test any of those changes without a complicated container setup. The SqlConnection pool classes have a lot of history in them, which makes me think there are edge cases that only time will reveal in any connection-pool class.
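A hypothetical sketch of that per-endpoint idea (nothing here is existing ClickHouse.Client API): cache one SocketsHttpHandler per endpoint and bound its connection lifetime so DNS changes are eventually picked up.

    using System;
    using System.Collections.Concurrent;
    using System.Net.Http;

    static class EndpointHandlerCache
    {
        // One handler - and one connection pool - per endpoint authority.
        private static readonly ConcurrentDictionary<string, SocketsHttpHandler> Handlers =
            new ConcurrentDictionary<string, SocketsHttpHandler>();

        public static HttpClient ClientFor(Uri endpoint) =>
            new HttpClient(
                Handlers.GetOrAdd(endpoint.Authority, _ => new SocketsHttpHandler
                {
                    // Recycle pooled connections periodically to absorb DNS changes.
                    PooledConnectionLifetime = TimeSpan.FromMinutes(2),
                }),
                disposeHandler: false);
    }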