nats-io / nats.net.v1

The official C# Client for NATS
Apache License 2.0
646 stars 154 forks source link

Change connection task error handling. #873

Closed scottf closed 6 months ago

scottf commented 6 months ago

Analysis

When a TLS connection fails, in this case the server being connected to is not even on-line, the next connection attempt sometimes freezes. It was intermittent, but reproducible. See code at end.

With some debugging, I could see where it was freezing, but it looked like it was inside the .net library code. What I noticed is that the connect code seemed overly complicated to me, and there was this GC.KeepAlive(t.Exception);. At the end of the day, we don't care why it failed, we just want to move on. So I simplified the code.

There was a corresponding TLS Handshake error in the server log.

[114772] 2024/03/06 15:16:37.763950 [DBG] [::1]:64363 - cid:20 - Starting TLS client connection handshake
[114772] 2024/03/06 15:16:37.763950 [ERR] [::1]:64363 - cid:20 - TLS handshake error: read tcp [::1]:4222->[::1]:64363: wsarecv: An established connection was aborted by the software in your host machine.
[114772] 2024/03/06 15:16:37.764607 [DBG] [::1]:64363 - cid:20 - Client connection closed: TLS Handshake Failure

Since this change, I cannot reproduce the problem and the error is not appearing in the server log.

Code To Reproduce

static void Main(string[] args)
{
    Options opts = ConnectionFactory.GetDefaultOptions();
    opts.Servers = new [] {"tls://localhost:9222", "tls://localhost:4222"};
    opts.NoRandomize = true;
    opts.TlsFirst = true;
    opts.Secure = true;
    opts.TLSRemoteCertificationValidationCallback = (sndr, cert, chain, errs) => true;

    int round = 0;
    while (true)
    {
        Console.WriteLine($"Start Round {++round}");
        try
        {
            using (IConnection c = new ConnectionFactory().CreateConnection(opts, true))
            {
                Console.WriteLine("ServerInfo:" + c.ServerInfo);
            }
        }
        catch (Exception e)
        {
            Console.WriteLine(e.Message);
        }
        Console.WriteLine();
        Thread.Sleep(100);
    }
}
sixlettervariables commented 6 months ago

That code was to mark the exception as observed. Without reading task.Exception you'll end up with an unhandled exception. ~We likely don't want that.~

Edit: I see where the move to Wait will raise this as an aggregate exception instead.