snowflakedb / snowflake-connector-net

Snowflake Connector for .NET
Apache License 2.0
176 stars 135 forks source link

SNOW-1663445: How to reduce number of login/session requests? #1025

Open myshon opened 6 days ago

myshon commented 6 days ago

It's question about General Usage, not Feature request.

What is the current behavior?

I use snowflake connector in Asp.Net WebAPI service in low-traffic environment (12k GET queries within a week. Peak 200 requests per hour).

See metrics: image

I see that apart from query-request provider executes additional heavy requests login-request and session under the hood.

Is it possible to reduce number of these requests somehow? Is there any variable in connection string that I can manipulate number of sessions/login (now I use default connection string settings and login via password)

The connection class is used more-less this way - is it correct?

        await using var conn = await _connectionProvider.GetConnection();
        await conn.OpenAsync();
        await using var cmd = CreateCommand(conn, databaseName);
        await using var reader = await cmd.ExecuteReaderAsync();
        while (await reader.ReadAsync())
        {
            yield return ReadData(reader);
        }

        await conn.CloseAsync();

What is the desired behavior?

I would like 1) reduce number of additional heavy requests login-request and session 2) optimise and adjust warehouse configuration to have lower latency on query-request

Configuration

  1. Default connection string settings (login password, min connection pool = 2, max connection pool = 10).
  2. Warehouse "x-small" with multi-cluster disabled
  3. Standard Tables are used
  4. Traffic 12k GET queries within a week. Peak 200 requests per hour.
sfc-gh-dszmolka commented 1 day ago

hi and thanks for raising this question here. On the second point (warehouse optimisation) we might not be able to help here, as it is entirely independent from this particular driver library. But we do have some material on the topic, such as

regarding the first question which is more related to the library here, the login-request calls to Snowflake are necessary to, well, log you in :) and session you seeing is probably calls to session?delete which is when the remote Snowflake session needs to be deleted (e.g. on connection closure and user logout)

I believe what you seeing is closely correlated with the usage pattern mentioned here, which if I'm correct, is:

Maybe you need to discard the session and close the connection after each query but if not, you could consider calling Close() / CloseAsync() only when the session is expected to be closed. You could perhaps also experiment with increasing MinPoolSize to see if keeping more connections in the pool helps in your use-case.

Speaking about which, we have a parameter CLIENT_SESSION_KEEP_ALIVE which when set to true, will issue 'heartbeats' (call to /session/heartbeat) automatically to keep the current connection alive (if you do not explicitly need to dispose of it), instead of logging out of it. An associated other parameter is CLIENT_SESSION_KEEP_ALIVE_HEARTBEAT_FREQUENCY , ranges between 900 and 3600 seconds (latter is the default, every hour) and governs how often those heartbeats should be sent in a session. Unfortunately i do not see CLIENT_SESSION_KEEP_ALIVE_HEARTBEAT_FREQUENCY exposed as a driver setting, but you can probably set it on the user with ALT#R USER <username> SET CLIENT_SESSION_KEEP_ALIVE_HEARTBEAT_FREQUENCY = 900 if you wish to set the default 1h heartbeats to 15m heartbeats.

Again, this only makes sense if you do not need to close the session and can keep them around and reused.

Raising another question for me, why does a single login take 5.6s on the average. That might be expected (e.g. if your client sits 'far' away from the Snowflake account, e.g. source is in Europe and SF account is in AWS Tokyo for example), but might be unexpected too. A quick test you could do to instantly eliminate one of the factors, is to set insecuremode=true without changing anything else. If the login-time improves drastically just with this change, then it would indicate the client has some kind of issues getting the responses from the CRL endpoints used for certificate verification. If this is the case, you could take a look at Validating certificates and execute some tests to see if the endpoints are reachable from the source/network you're running the .NET driver from.

Please note insecuremode=true is not meant for everyday usage, especially in production, but as a quick test to rule out a possible factor it can be used. If insecuremode=true doesn't change anything and login-request on the average still takes multiple seconds.

Hope this helps somewhat moving forward.

myshon commented 1 day ago

@sfc-gh-dszmolka Thanks for very comprehensive answer! I know that login-request is required :) I accept that it takes a few seconds. The point was to reduce number of login requests (which is almost 10% of all API calls) by reusing sessions in order reduce total latency.

I will be investigating CLIENT_SESSION_KEEP_ALIVE and insecuremode=true