jstedfast / MailKit

A cross-platform .NET library for IMAP, POP3, and SMTP.
http://www.mimekit.net
MIT License

Office365 - Cannot connect #1051

Closed DanielDority closed 4 years ago

DanielDority commented 4 years ago

I've read every piece of documentation out there - we cannot seem to connect consistently to smtp.office365.com over port 587. Sometimes it works, other times it fails at the connect stage. Using the same code leads to inconsistent results and I don't know what else to try.

The Connect method throws an SslHandshakeException, which would mean one of the following:

  1. We used SSL when TLS was expected. Not this.
  2. We didn't specify the correct TLS settings. Not this either.
  3. The certificate chain was invalid. Nope, the code doesn't fire at this point.
  4. The revocation is screwed up. Still didn't work.

Overall it seems like a timeout exception where after the CLIENT sends a "STARTTLS" command, the server responds with a 220 but a secondary "EHLO" was never sent. It's waiting for something but I cannot tell what it is.

My workaround was to wrap the Connect call in a while loop with a very low timeout and retry until one attempt sticks. Unfortunately, I fear this will get us banned for too many connections.

ServiceConfiguration.Instance.Smtp.Host = "smtp.office365.com";
ServiceConfiguration.Instance.Smtp.Port = 587;
using (var client = new SmtpClient())
{
    var defaultTimeout = client.Timeout;
    client.Timeout = 2000;

    while (true)
    {
        try
        {
            await client.ConnectAsync(ServiceConfiguration.Instance.Smtp.Host, ServiceConfiguration.Instance.Smtp.Port, SecureSocketOptions.StartTls);
            break;
        }
        catch
        {
            Console.Write($"Failed at: {DateTime.Now.ToString()}");
        }
    }

    client.Timeout = defaultTimeout;

    await client.AuthenticateAsync(ServiceConfiguration.Instance.Smtp.Account, ServiceConfiguration.Instance.Smtp.Password);

    foreach (var email in pending)
    {
        var message = MessageParser.Parse(email);
        await SendEmail(client, message);
    }

    await client.DisconnectAsync(true);
}

Windows Server 2012 R2, .NET Framework v4.5.1, MailKit v2.8.0.0

Like I said, sometimes it works, sometimes it doesn't.

I added a Protocol Logger to the client and captured this failed result: failure.log

However, other times it works and we get this result: success.log

In the failed cases the ServerCertificateValidationCallback does not fire. The connection times out before it reaches that callback method. In addition, when the connection is successful this method does fire and the SslPolicyErrors value is set to None.

We've also tried to set the CheckCertificateRevocation to false but it had no impact.
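For readers who want to reproduce this diagnostic setup, here is a minimal sketch of a client with a protocol log, revocation checking turned off, and a validation callback that records when it fires. The class name `DiagnosticSmtpClient`, the method `Create`, and the log path are hypothetical; the MailKit members used (`ProtocolLogger`, `CheckCertificateRevocation`, `ServerCertificateValidationCallback`) are the ones mentioned above. It is not the poster's actual code.

```csharp
using System;
using System.Net.Security;
using MailKit;
using MailKit.Net.Smtp;

static class DiagnosticSmtpClient
{
    // Hypothetical helper: builds an SmtpClient that logs the SMTP dialogue
    // to logPath and reports when certificate validation is reached.
    public static SmtpClient Create(string logPath)
    {
        var client = new SmtpClient(new ProtocolLogger(logPath));

        // Skip CRL lookups so a slow revocation server cannot stall the handshake.
        client.CheckCertificateRevocation = false;

        // If this line never prints for a failed attempt, the handshake stalled
        // before the server's certificate was evaluated.
        client.ServerCertificateValidationCallback = (sender, certificate, chain, errors) =>
        {
            Console.WriteLine($"{DateTime.Now:o} certificate callback fired, SslPolicyErrors = {errors}");
            return errors == SslPolicyErrors.None;
        };

        return client;
    }
}
```

Comparing the timestamp printed by the callback with the last line in the protocol log shows roughly how long the TLS handshake took on the attempts that succeed.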

Additional context: Until now, our Windows service used the System.Net.Mail.SmtpClient provided by the .NET Framework. When we started getting dropped connections, we looked for an alternative and found that even MSDN recommends MailKit. So I rewrote the service to use MailKit, which gives a better indication of what the potential problems are.

Our IT Networking team doesn't see any issues on their end regarding dropped emails. We've sifted through logs and see no correlation. I thought it was a firewall rule, but that wouldn't explain why some connections work and some don't. Then I thought of McAfee scans, but we don't use their firewall services.

I can telnet into the server but after I issue a "starttls" I can no longer interact with the server through command prompt.

One important thing to note is that this only happens on one server. Our DEV/QA/UAT environments share the same NON-PROD credentials. And only our QA server is having this problem. WTF right? I'm not certain what could be causing this but I've hit a wall and need some expert help.

What else can we try?

jstedfast commented 4 years ago

When we started getting dropped connections

MailKit can't solve that type of issue - that's an issue caused by a high-latency internet connection (or at least a high-latency connection to the mail server).

Take a look at the SslHandshakeException's InnerException because that may provide more helpful information. It sounds like, from your description, the SslStream's AuthenticateAsClient() method is timing out due to high latency with the server.

If that's the case, the only thing you can do is try increasing the client.Timeout value and hope for the best.
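As a rough illustration of that advice, the sketch below raises `client.Timeout`, logs the `InnerException` of each `SslHandshakeException`, and retries a bounded number of times with a growing delay instead of looping forever. The names `SmtpConnectSketch`, `DelayFor`, and `ConnectWithRetryAsync`, the 5-minute timeout, the attempt limit, and the backoff schedule are all assumptions chosen for illustration, not values recommended by MailKit.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using MailKit.Net.Smtp;
using MailKit.Security;

static class SmtpConnectSketch
{
    // Hypothetical backoff schedule: 2s, 4s, 8s, ... capped at 60 seconds.
    public static TimeSpan DelayFor(int attempt)
    {
        return TimeSpan.FromSeconds(Math.Min(60, Math.Pow(2, attempt)));
    }

    public static async Task ConnectWithRetryAsync(
        SmtpClient client, string host, int port,
        int maxAttempts = 4, CancellationToken ct = default(CancellationToken))
    {
        // Assumed value: give a slow TLS handshake up to 5 minutes.
        client.Timeout = 300000;

        for (var attempt = 1; ; attempt++)
        {
            try
            {
                await client.ConnectAsync(host, port, SecureSocketOptions.StartTls, ct);
                return;
            }
            catch (SslHandshakeException ex) when (attempt < maxAttempts)
            {
                // The inner exception shows whether this was a timeout or a certificate problem.
                Console.WriteLine($"{DateTime.Now:o} attempt {attempt} failed: " +
                    $"{ex.InnerException?.GetType().Name}: {ex.InnerException?.Message}");
                await Task.Delay(DelayFor(attempt), ct);
            }
        }
    }
}
```

Once `maxAttempts` is reached, the exception filter no longer matches and the last `SslHandshakeException` propagates to the caller, so the service can report the failure rather than keep opening connections.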

Nicky452 commented 4 years ago

ok thanks

DanielDority commented 4 years ago

Here's the actual stack trace we receive:

MailKit.Security.SslHandshakeException: An error occurred while attempting to establish an SSL or TLS connection.

This usually means that the SSL certificate presented by the server is not trusted by the system for one or more of the following reasons:

  1. The server is using a self-signed certificate which cannot be verified.
  2. The local system is missing a Root or Intermediate certificate needed to verify the server's certificate.
  3. A Certificate Authority CRL server for one or more of the certificates in the chain is temporarily unavailable.
  4. The certificate presented by the server is expired or invalid.

It is also possible that the set of SSL/TLS protocols supported by the client and server do not match.

See https://github.com/jstedfast/MailKit/blob/master/FAQ.md#SslHandshakeException for possible solutions.
 ---> System.IO.IOException: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
 ---> System.Net.Sockets.SocketException: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
   at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
   at MailKit.Net.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 count)
   --- End of inner exception stack trace ---
   at System.Net.Security.SslState.InternalEndProcessAuthentication(LazyAsyncResult lazyResult)
   at System.Net.Security.SslState.EndProcessAuthentication(IAsyncResult result)
   at System.Net.Security.SslStream.EndAuthenticateAsClient(IAsyncResult asyncResult)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
   --- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.ValidateEnd(Task task)
   at MailKit.Net.Smtp.SmtpClient.d74.MoveNext()
   --- End of inner exception stack trace ---
   at MailKit.Net.Smtp.SmtpClient.d74.MoveNext()
   --- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
   at Ced.CustomerPortal.Services.Emailer.EmailHandler.d__4.MoveNext()

This is with the default timeout set. So if the problem is a high latency then others must be experiencing this exact issue too right? In addition, how would I ping a failed server to know if that is the issue?

DM5PR18CA0060.outlook.office365.com

If this is the server, how would I go about pinging it? I get a host cannot be found if I do: ping DM5PR18CA0060.outlook.office365.com

I don't have access to view: https://status.office365.com/

Any ideas?

jstedfast commented 4 years ago

So if the problem is a high latency then others must be experiencing this exact issue too right?

Depends on if it's a problem with the server's internet connection or yours.

how would I ping a failed server to know if that is the issue?

ping server.com

I get a host cannot be found if I do: ping DM5PR18CA0060.outlook.office365.com

Aaaaaand you found your problem ;-)

DanielDority commented 4 years ago

Are you implying that when we resolve smtp.office365.com, Microsoft is routing us to servers that are unreachable?

DanielDority commented 4 years ago

That can't be right. I took the server that was successful and had the same issue when I attempted to ping it from our server.

SA9PR11CA0023.outlook.office365.com

jstedfast commented 4 years ago

No, I'm suggesting that your router has latency issues.

It's possible that they firewalled ping packets, so you won't get a response. My point was that if you can't connect because the connection times out (see the IOException in your log), you can't blame that on MailKit.

MailKit doesn't implement its own TCP/IP stack, it just uses what is already available in .NET.
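Since ping may be firewalled and a telnet session cannot continue past STARTTLS (the next bytes on the wire are a binary TLS handshake), one way to take MailKit out of the picture is to time each stage using only .NET's own socket and SslStream classes. The following is a minimal sketch under stated assumptions: the class name `StartTlsProbe`, the EHLO name `probe.example`, the 30-second receive timeout, and the choice of TLS 1.2 are illustrative and not taken from the thread.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Net;
using System.Net.Security;
using System.Net.Sockets;
using System.Security.Authentication;
using System.Text;

public static class StartTlsProbe
{
    // Reads one SMTP reply and returns its final line ("250-..." lines continue, "250 ..." ends).
    public static string ReadReply(TextReader reader)
    {
        string line;
        do { line = reader.ReadLine(); }
        while (line != null && line.Length > 3 && line[3] == '-');
        return line;
    }

    static void Send(Stream stream, string command)
    {
        var bytes = Encoding.ASCII.GetBytes(command + "\r\n");
        stream.Write(bytes, 0, bytes.Length);
    }

    public static void Main()
    {
        const string host = "smtp.office365.com";
        const int port = 587;
        var watch = Stopwatch.StartNew();

        foreach (var address in Dns.GetHostAddresses(host))
            Console.WriteLine($"{watch.ElapsedMilliseconds,6} ms  resolved {address}");

        using (var tcp = new TcpClient())
        {
            tcp.ReceiveTimeout = 30000; // assumed limit; a stalled read throws IOException
            tcp.Connect(host, port);
            Console.WriteLine($"{watch.ElapsedMilliseconds,6} ms  TCP connected to {tcp.Client.RemoteEndPoint}");

            var network = tcp.GetStream();
            var reader = new StreamReader(network, Encoding.ASCII, false, 1024, true);

            Console.WriteLine($"{watch.ElapsedMilliseconds,6} ms  {ReadReply(reader)}");
            Send(network, "EHLO probe.example"); // placeholder client name
            Console.WriteLine($"{watch.ElapsedMilliseconds,6} ms  {ReadReply(reader)}");
            Send(network, "STARTTLS");
            Console.WriteLine($"{watch.ElapsedMilliseconds,6} ms  {ReadReply(reader)}");

            // The same SslStream call that timed out in the stack trace above.
            using (var ssl = new SslStream(network, false))
            {
                ssl.AuthenticateAsClient(host, null, SslProtocols.Tls12, false);
                Console.WriteLine($"{watch.ElapsedMilliseconds,6} ms  TLS handshake done ({ssl.SslProtocol})");
            }
        }
    }
}
```

If this probe stalls in `AuthenticateAsClient` on the affected server but completes quickly on the others, that points at the network path from that machine rather than at the mail library.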

DanielDority commented 4 years ago

Sure - Let me route this to our IT Networking team to see what they recommend.