dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.11k stars 4.7k forks source link

SmtpClient not sending QUIT on Dispose? #912

Closed mguinness closed 4 years ago

mguinness commented 5 years ago

Per the documentation at SmtpClient.Dispose this method should send QUIT:

Sends a QUIT message to the SMTP server, gracefully ends the TCP connection, and releases all resources used by the current instance of the SmtpClient class.

However when using in .NET Core, this does not appear to be the case.

In the .NET Reference Source, the underlying SmtpTransport class does this: https://github.com/Microsoft/referencesource/blob/master/System/net/System/Net/mail/SmtpTransport.cs#L35-L58

In the .NET Core Libraries, the same code isn't present in SmtpTransport class: https://github.com/dotnet/corefx/blob/master/src/System.Net.Mail/src/System/Net/Mail/SmtpTransport.cs

Was there a reason this code wasn't ported over to .NET Core?

davidsh commented 5 years ago

Was there a reason this code wasn't ported over to .NET Core?

This is a likely a product bug in .NET Core.

nil4 commented 5 years ago

@davidsh would it make sense to implement DisposeAsync on SmtpClient, given that network packets would be sent (and flushed) when disposed? /cc @stephentoub

davidsh commented 5 years ago

There is no SmtpClient.DisposeAsync() method on the API. In fact, this method was added to .NET Framework long before Task async pattern was available. So, it would need to be implemented as a blocking/synchronous API.

nil4 commented 5 years ago

@davidsh sure, but since dotnet/corefx#32665 introduced IAsyncDisposable, could SmtpClient be updated to implement the async disposal API for frameworks that support it?

davidsh commented 5 years ago

Could SmtpClient be updated to implement the async disposal API for frameworks that support it?

Yes, but that would be a separate issue that would need to be opened. And it requires an API change/review process as well.

mguinness commented 5 years ago

Prior to this being worked on, is it possible to change/remove the XML documentation from the SmtpClient.Dispose() method? Right now it's misleading and adds to confusion.

davidsh commented 5 years ago

Prior to this being worked on, is it possible to change/remove the XML documentation from the SmtpClient.Dispose() method? Right now it's misleading and adds to confusion.

Can you be specific about what is wrong with the current documentation? Are you referring to documentation here? Or someplace else?

mguinness commented 5 years ago

From the \

tag used by Intellisense.

image

davidsh commented 5 years ago

From the

tag used by Intellisense.

Are you showing the screenshot from Visual Studio or some other tool?

In general, the documentation is correct. That is how the API should work. And on .NET Framework, the API works correctly. We don't change the documentation for a bug like this. This is a product bug with .NET Core and will be fixed in a future release.

mguinness commented 5 years ago

From Visual Studio. I understand that this is how the API is supposed to work, but if it doesn't and it's a known bug it should be documented/referenced somewhere. Especially if the fix doesn't make it into the 3.0 release. Just my 2¢ worth.

wfurt commented 5 years ago

is there any practical impact of this @mguinness ? If so, could you post simple repro with description?

davidsh commented 5 years ago

is there any practical impact of this @mguinness ?

We need to fix this same as .NET Framework. Without sending the QUIT, the TCP connection to the server is not closed cleanly. So, this results in connections with CLOSE_WAIT and can eat up local ephemeral ports.

wfurt commented 5 years ago

OK. We can look at .NET Framework. Interesting part is when Dispose() is called while still sending email body.

mguinness commented 4 years ago

You can see the ports that are tied up using the following PowerShell command. Is stopping the process the only way to release these?

Get-NetTCPConnection -State CloseWait -RemotePort 587 | Format-Table -Property OwningProcess,LocalPort,CreationTime

Is there any workaround to explicitly close connections before disposing? Is it possible to use the CloseConnectionGroup(string connectionGroupName) method of the ServicePoint property to achieve this?

davidsh commented 4 years ago

Is stopping the process the only way to release these?

Most likely.

Is there any workaround to explicitly close connections before disposing? Is it possible to use the CloseConnectionGroup(string connectionGroupName) method of the ServicePoint property to achieve this?

Are you using an explicit 'connectionGroupName' with the SmtpClient API? If not, then you won't be able to close the connections easily. And some of those .NET Framework ServicePoint related APIs are internal and not public. And I don't think attempting to use reflection is possible/recommended with Windows PowerShell.

mguinness commented 4 years ago

Thanks, I'll try to restart the process to workaround this.

tmds commented 4 years ago

fwiw, I don't like it when classes do things that may potentially block in Dispose/DisposeAsync, like sending data to a peer. I rather have an explicit method for this, that accepts a CancellationToken.

Without sending the QUIT, the TCP connection to the server is not closed cleanly. So, this results in connections with CLOSE_WAIT and can eat up local ephemeral ports.

I'm not sure why this would behave differently when closed with a QUIT message vs only a TCP close.

scalablecory commented 4 years ago

I too would rather Dispose be abortive. DisposeAsync isn't great here either because you can't cancel it.

Can we instead add a void Shutdown() and ValueTask ShutdownAsync(CancellationToken) method that the user could call for graceful close?

Or, @davidsh, do we think misuse (not calling Shutdown) is enough of a problem for SMTP servers that we should force it in Dispose? There's something to be said about doing it in Dispose making everyone's code more "correct" just by upgrading.

davidsh commented 4 years ago

The use of the 'QUIT' protocol command is important because it notifies the server to begin the TCP close of the socket. We got tons of complaints in .NET Framework about socket connections being left in a half-closed state before we implemented sending the 'QUIT' command.

When we implemented this in .NET Framework, we didn't want to add new APIs at the time. So, we simply did it in the Dispose() call. Yes, it is not the best thing to do since it involves async I/O, sending and receiving protocol commands.

In thinking about this some more, we have an opportunity to fix this the right way. The right way is to add an explicit "Shutdown" method. This is very similar to the approach we took in SslStream when we wanted to change the behavior of SslStream.Close/Dispose in order to send the TLS close-notify alert. For SslStream we added an explicit "Shutdown" method.

So, I think we should consider adding a "Shutdown" method to SmptClient() to do the sending of the 'QUIT' command and wait/parse the server's response. However, this opens up a new question about what is the state of the SmptClient after doing this? In most cases, the server will initiate the TCP close of the socket after receiving the 'QUIT' command. The current connection is basically over. So, should we then mark the SmtpClient as 'disposed'? Or invent a new state for it? We need to handle cases where dev call "Shutdown" but maybe try to then call "Send" afterwards.

If we do add this new method, we will need to document this as an intended change from .NET Framework to .NET Core and educate devs to call this new method.

In terms of the API shape for "Shutdown", we could choose to add more a sync style as well as an async style.

scalablecory commented 4 years ago

Yes, I think calling Shutdown() would be equivalent to a QUIT + Dispose(). Calling additional Send would throw an ObjectDisposedException. This is what I had in mind:

class SmtpClient
{
   public void Shutdown();
   public ValueTask ShutdownAsync(CancellationToken cancellationToken);
}

An alternative would be that, upon calling Dispose(), we kick off a background task that sends RSET+QUIT and closes socket, etc. -- that would solve the issue of users not calling Shutdown. Adds complexity but may be worth it.

MarcoRossignoli commented 4 years ago

In thinking about this some more, we have an opportunity to fix this the right way.

I thought that SmtpClient was an "obsolete" class and that add new apis wasn't "planned" https://github.com/dotnet/platform-compat/blob/master/docs/DE0005.md https://www.infoq.com/news/2017/04/MailKit-MimeKit-Official/ For this reason I tried to keep behaviour netfx compat.

BTW if there is new spot for improvement a better workflow could solve the issue. Personally I'm not a great fun of Shutdown name I prefer something more "standard" Close()(but it's too related to classic Close()/Dispose() I/O apis) or maybe Quit() like protocol command. However if Shutdown will be the new "term" for this kind of behaviours I'm ok. I'm with @scalablecory for the compat side, there are tons of code outside that simply using(SmtpClient client = ...) so this should fix all usage with zero code.

Another question what we do if user forgot to call Shutdown or Dispose networkstream finalizer will close stream but won't send quit, maybe we should wrap with a finalizable custom stream and send on finalizer, but this could be worste...block finalization queue on I/O is not a great idea, maybe is less bad waste ports.

scalablecory commented 4 years ago

Personally I'm not a great fun of Shutdown name I prefer something more "standard" Close()(but it's too related to classic Close()/Dispose() I/O apis)

This won't work -- design guidelines state that if Close() & Dispose() both exist, they should have identical behavior.

MarcoRossignoli commented 4 years ago

Actually if we decide to keep all in sync Dispose/Close will do same thing but one is explicit, right? I mean if I call close dispose will be a no-op regarding send quit, but if I won't call close(current code) I'll send quit inside dispose.

davidsh commented 4 years ago

I thought that SmtpClient was an "obsolete" class and that add new apis wasn't "planned"

I think we need more discussion on this to reach a clearer consensus.

With full transparency here, the marking of the SmptClient class as 'obsolete' (in the docs) happened without a full consensus of all the different platform implementations of the class. It's true that the class has languished a bit and needs some improvement to make it more modern/compatible compared with other SMTP stacks.

davidsh commented 4 years ago

Actually if we decide to keep all in sync Dispose/Close will do same thing but one is explicit, right?

You're suggesting we implement both plans? Do a sync 'Shutdown' in Dispose() and provide a new API as well?

MarcoRossignoli commented 4 years ago

You're suggesting we implement both plans? Do a sync 'Shutdown' in Dispose() and provide a new API as well?

Yep as suggested by Cory...old code will benefit with correct quit with zero code, new one will explicity call shutdown(or close o every other name will be chosen).

davidsh commented 4 years ago

An alternative would be that, upon calling Dispose(), we kick off a background task that sends RSET+QUIT and closes socket, etc. -- that would solve the issue of users not calling Shutdown. Adds complexity but may be worth it.

What is "RSET"?

We definitely don't want to call TCP close before the 'QUIT' command is sent to the server and the server's protocol response comes back.

I'm also not a fan of doing a "fire and forget" of an async operation

MarcoRossignoli commented 4 years ago

happened without a full consensus...It's true that the class has languished a bit and needs some improvement to make it more modern/compatible compared with other SMTP stacks.

Totally agree...when a type(used a lot like smtp, every app send email, first of all for simple exception log) comes from "Microsoft" is trusted.

MarcoRossignoli commented 4 years ago

we kick off a background task

I prefer sync blocking stuff in Dispose() is expected by everyone and won't break legacy code written with this in mind.

Well, we can also implement IAsyncDisposable if we want to have sync and true async way

scalablecory commented 4 years ago

An alternative would be that, upon calling Dispose(), we kick off a background task that sends RSET+QUIT and closes socket, etc. -- that would solve the issue of users not calling Shutdown. Adds complexity but may be worth it.

What is "RSET"?

Tells SMTP server to abort the current message being sent. May not be needed here; I am not intimately familiar with SMTP :).

I'm also not a fan of doing a "fire and forget" of an async operation

I'm not a fan of this either, but I'm also not a fan of Dispose doing sync I/O. My worry is that everyone using this class in an async workflow will introduce some scalability-killing synchronous I/O when they upgrade to 5.0 and Dispose starts doing non-trivial work.

davidsh commented 4 years ago

I'm not a fan of this either, but I'm also not a fan of Dispose doing sync I/O. My worry is that everyone using this class in an async workflow will introduce some scalability-killing synchronous I/O when they upgrade to 5.0 and Dispose starts doing non-trivial work.

Which is why we might want to just introduce the new 'Shutdown' APIs and leave Dispose() method as-is in current .NET Core (i.e. it doesn't send the 'QUIT'). While it will be a behavior change from .NET Framework, it will be an improvement which we can document.

MarcoRossignoli commented 4 years ago

Which is why we might want to just introduce the new 'Shutdown' APIs and leave Dispose() method as-is in current .NET Core

This won't solve Cory concern, if we'll add shutdown and remove wrong _transport.ReleaseConnection(); after send/sendasync, legacy code will block inside Dispose() because in that case socket will be open. Today a sendasync won't block because we close stream and dispose tcpclient after send and dispose it's a no-op

Complete after async send Release in sync Send ReleaseConnection inside connection No-op Dispose

davidsh commented 4 years ago

This won't solve Cory concern, if we'll add shutdown and remove wrong _transport.ReleaseConnection(); after send/sendasync, legacy code will block inside Dispose() because in that case socket will be open.

I think we should separate out the desired API shape versus the correct implementation. The "wrong" _transport.ReleaseConnection(); seems like a separate issue of concern regarding the implementation.

tmds commented 4 years ago

Can someone please explain what is the importance of sending the QUIT command? Why does a regular socket close not implicitly have QUIT semantics?

MarcoRossignoli commented 4 years ago

@tmds it's per RFC https://tools.ietf.org/html/rfc5321#section-4.1.1.10 and it's the current correct behaviour of netfx version and seem that in past was a problem We got tons of complaints in .NET Framework about socket connections being left in a half-closed state before we implemented sending the 'QUIT' command. https://github.com/dotnet/corefx/issues/34868#issuecomment-563456997 Well, server implementation can work also without this, but seem no "clean".

tmds commented 4 years ago

per RFC, no "clean".

ok

behaviour of netfx

ok. Cfr above discussion I'm also of the opinion the QUIT should be a separate method, and not in Dispose. That makes it different from netfx.

that in past was a problem We got tons of complaints in .NET Framework about socket connections being left in a half-closed state before we implemented sending the 'QUIT' command.

This doesn't make sense to me.

mguinness commented 4 years ago

Can someone please explain what is the importance of sending the QUIT command?

SmtpClient in .net 2.0 does not close connection to mail server gives some exceptions that can occur.

I'm also of the opinion the QUIT should be a separate method, and not in Dispose.

Could this library mimic the Disconnect method from MailKit? Seems like a better name than Shutdown.

davidsh commented 4 years ago

This doesn't make sense to me.

When a client just closes the TCP connection, the SMTP server doesn't know that right away. It holds the connection open on its side for a while until it decides to close "idle" connections. That could take a while.

In a graceful "close" exchange, the client will send 'QUIT' to the server. The server will then respond with 'OK' back to the client. The server then closes the TCP connection on its side. The client waits for the 'OK' and then closes the connection.

In a very busy server with many clients sending messages, graceful closes help free up the TCP connections/ports. But when clients don't send 'QUIT', it takes longer for the server to recycle TCP connections/ports. The server can run out of empheral ports, etc.

tmds commented 4 years ago

When a client just closes the TCP connection, the SMTP server doesn't know that right away.

When a client closes the TCP connection, the server's pending receive call returns. It return with a zero byte length indicates the peer closed the connection. The recv can also return as 'ECONNRESET/EPIPE` for 'Connection reset by peer'. This is as 'fast' as sending QUIT.

SmtpClient in .net 2.0 does not close connection to mail server gives some exceptions that can occur.

From the exceptions thrown, it seems the SMTP server/firewall is doing some connection tracking and doesn't like it when connections aren't nicely closed. It may be a security feature.

But when clients don't send 'QUIT', it takes longer for the server to recycle TCP connections/ports. The server can run out of empheral ports, etc.

If that is happening, it's a bug in the server.

MarcoRossignoli commented 4 years ago

FYI RFC treat this specific behaviour https://tools.ietf.org/html/rfc5321#section-4.1.1.5

There are circumstances, contrary to the intent of this
specification, in which an SMTP server may receive an indication that
the underlying TCP connection has been closed or reset.  To preserve
the robustness of the mail system, SMTP servers SHOULD be prepared
for this condition and SHOULD treat it as if a QUIT had been received
before the connection disappeared.

I think that a correct implementation should follow the specification better possible, if I had to implement a smtp server following specification probably I would treat a socket close without QUIT in a different way of a QUIT one, if only to logging perspective.

tmds commented 4 years ago

I'm in favor of SmtpClient being capable of doing an RFC compliant close. My questions are to understand what breaks if it doesn't.

MarcoRossignoli commented 4 years ago

After simple/fast search found some links, but not a lot, maybe @davidsh has got more info/cases https://cr.yp.to/smtp/quit.html https://social.msdn.microsoft.com/Forums/en-US/916333be-f6c2-442a-b89e-bc6870a31c15/smtpclient-quit-command?forum=netfxnetcom

tmds commented 4 years ago

After simple/fast search found some links, but not a lot, maybe @davidsh has got more info/cases

The CLOSE_WAIT issues are from when netfx SmtpClient did not have a Dispose method. .NET Core's SmtpClient does have a Dispose method, which disposes the Socket. That Socket Dispose is enough to avoid the TCP_CLOSE empheral port exhaustion.

scalablecory commented 4 years ago

Not sending QUIT shouldn't be an issue if the socket is otherwise shutdown properly and the server is checking for recv()=0. I can vaguely see a less than robust server behaving unexpectedly when it doesn't receive one and is still processing messages, but we haven't really had reports of incompatibilities so far.

Given that this appears to be non-critical, and that I'd love to avoid spending I/O in a Dispose(), I agree that the most we should do is add a Quit() or Shutdown() method.

mguinness commented 4 years ago

This feels like groundhog day, it caused a lot of issues in .NET Framework 2.0 since SmtpClient not sending QUIT then eventually fixed in .NET Framework 4.0 and still exists in the full framework.

Why it still shows up in the documentation for .NET Core is beyond me as it's not correct and is misleading to say the least.

image

At the end of the day if you aren't going to issue that command in Dispose() please remove that from documentation and please provide a Disconnect() or Shutdown() method so it can be sent manually.

@Tratcher since you were involved in this issue around 10 years back, maybe you have some input?

Tratcher commented 4 years ago

@Tratcher since you were involved in this issue around 10 years back, maybe you have some input?

I was. How did you know?

@scalablecory @tmds @davidsh I know the Dispose design might be a bit outdated, but it's still the best option here.

Being standards compliant here is important, especially since this has been a known source of issues in the wild over a long period. Please address this or allow someone else to do so.

scalablecory commented 4 years ago

@Tratcher I'm not strongly opposed if it's the right thing to do, but I have yet to find an example of this causing an issue. It's not clear that adding QUIT to Dispose() is the best course of action.

Tratcher commented 4 years ago

Here's the original issue and fix: http://vstfdevdiv:8080/DevDiv/Dev10/_queries/edit/703015/?triage=true http://vstfdevdiv:8080/DevDiv/Dev10/_versionControl/changeset/1202965

Customer Scenario (including workaround): Customer creates an SmtpClient object and uses it to send email to an Smtp server. When the underlying Smtp connection times out, it will force-close the connection by sending a RST instead of gracefully closing it by sending a QUIT message as is required by RFC 2822. A customer has no way to deterministically close any open connections; they will only be closed when they time out. There is no workaround.

Description of the Solution: The solution is to implement IDisposable on SmtpClient so that a customer can deterministically free connections. In addition, connections are now always closed by sending a QUIT message under all circumstances except when an Abort is called on an asychronous send. In that case, the state of the connection is indeterminate and a RST must be sent instead to indicate that the send is being cancelled.

The missing QUIT wasn't the primary issue, the more fundamental issue was that the connections weren't being gracefully closed at all, they were timing out in the connection pool and causing RSTs. This can lead to socket exhaustion on the server. QUIT was added for RFC compliance as part of gracefully closing the connections.

Does Core have the same connection lifetime issue?

scalablecory commented 4 years ago

Does Core have the same connection lifetime issue?

No. We do a shutdown()+closesocket() on Dispose().

Tratcher commented 4 years ago

Oh good, that does make this a less severe issue. It also means you already have a non-abortive callstack where it would be appropriate to send the QUIT for RFC compliance so it would be a small fix. https://github.com/dotnet/runtime/blob/9bf91f8d178e09bb81fb1749a82d588cbe8028cf/src/libraries/System.Net.Mail/src/System/Net/Mail/SmtpTransport.cs#L200

SmtpClient has a seperate Abort code path non-graceful disposes.