haraka / Haraka

A fast, highly extensible, and event driven SMTP server
https://haraka.github.io
MIT License

Save the SMTP conversation #945

Closed msimerson closed 8 years ago

msimerson commented 9 years ago

I think I'd like to store a summary of the SMTP conversation, assembled as an array on the connection, something like this:

<-  220 mail.theartfarm.com ESMTP Haraka 2.6.2-msimerson ready
 -> EHLO rmbp.local
<-  250-mail.theartfarm.com Hello wsip-70-168-115-6.ks.ks.cox.net [70.168.115.6], Haraka is at your service.
<-  250-PIPELINING
<-  250-8BITMIME
<-  250-SIZE 35000000
<-  250 STARTTLS
 -> STARTTLS
<-  220 Go ahead.
 ~> EHLO rmbp.local
<~  250-mail.theartfarm.com Hello wsip-70-168-115-6.ks.ks.cox.net [70.168.115.6], Haraka is at your service.
<~  250-PIPELINING
<~  250-8BITMIME
<~  250-SIZE 35000000
<~  250 AUTH PLAIN LOGIN CRAM-MD5
 ~> AUTH CRAM-MD5
<~  334 *********
 ~> ******************==
<~  235 Authentication successful
 ~> MAIL FROM:<matt@tnpi.net>
<~  250 sender <matt@tnpi.net> OK
 ~> RCPT TO:<matt@tnpi.net>
<~  250 recipient <matt@tnpi.net> OK
 ~> DATA
<~  354 go ahead, make my day
<~  250 ok 1430153916 qp 82679 (1A11ED1E-51CF-4512-B1B2-BF8CB1179CC6.1)
 ~> QUIT
<~  221 mail.theartfarm.com closing connection. Have a jolly good day.

so that it could be assembled into a string with connection.conv.join('\n');
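A minimal sketch of how such an array might be built, assuming a small helper object rather than anything that exists in Haraka today. The class name `Conversation`, its `received`/`sent` methods, and the `tls` flag are all hypothetical; only the arrow prefixes and the `join('\n')` step come from the proposal above.

```javascript
// Hypothetical helper: none of these names exist in Haraka.
class Conversation {
    constructor() {
        this.lines = [];
        this.tls = false; // flip to true once the STARTTLS upgrade completes
    }

    // Record a line received from the client (" ->" in plain text, " ~>" over TLS).
    received(line) {
        this.lines.push(`${this.tls ? ' ~>' : ' ->'} ${line}`);
    }

    // Record a line sent by the server ("<- " in plain text, "<~ " over TLS).
    sent(line) {
        this.lines.push(`${this.tls ? '<~ ' : '<- '} ${line}`);
    }

    // Assemble the transcript the same way as connection.conv.join('\n').
    toString() {
        return this.lines.join('\n');
    }
}

const conv = new Conversation();
conv.sent('220 mail.example.com ESMTP ready');
conv.received('EHLO client.example.com');
conv.sent('250 STARTTLS');
conv.received('STARTTLS');
conv.sent('220 Go ahead.');
conv.tls = true;
conv.received('EHLO client.example.com');
console.log(conv.toString());
```

Redacting AUTH exchanges and skipping the DATA payload, as in the sample above, would need extra logic that this sketch leaves out.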

Having this in ES would be quite helpful for support purposes.

Thoughts?

baudehlo commented 9 years ago

Core or plugin? It should be easy to do in a plugin. There's connection.original_string.


smfreegard commented 9 years ago

You probably want to store the pre- and post-transaction conversation in cxn.notes and the transaction part in txn.notes, then merge the two before you write the transaction data to ES. Otherwise, if you get more than one transaction per connection, you'll end up with a potentially large structure containing all of the transactions, which I suspect you don't want.
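The split described above could look roughly like this. It is only a sketch: plain objects and arrays stand in for cxn.notes and txn.notes, and the function name `mergeTranscript` and the `pre`/`post` properties are made up for illustration.

```javascript
// Hypothetical sketch: plain data stands in for cxn.notes and txn.notes.
// connectionLines.pre  holds lines exchanged before any transaction (greeting, EHLO, AUTH)
// connectionLines.post holds lines exchanged after the last transaction (QUIT)
function mergeTranscript(connectionLines, transactionLines) {
    return [
        ...connectionLines.pre,
        ...transactionLines,
        ...connectionLines.post,
    ].join('\n');
}

const connectionLines = {
    pre: ['<-  220 mail.example.com ESMTP ready', ' -> EHLO client.example.com'],
    post: [' -> QUIT', '<-  221 closing connection'],
};

// Two transactions on the same connection, each with its own lines.
const transactions = [
    [' -> MAIL FROM:<a@example.com>', '<-  250 sender OK'],
    [' -> MAIL FROM:<b@example.com>', '<-  250 sender OK'],
];

// One document per transaction, each carrying only its own MAIL FROM exchange.
const docs = transactions.map((txn) => mergeTranscript(connectionLines, txn));
console.log(docs[0]);
```

Each document then stays the same size no matter how many transactions the connection carries.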

msimerson commented 9 years ago

Core or plugin?

I was thinking core.

connection.original_string

That's only half the conversation. I'm not interested in the DATA (as snipped above) or the AUTH details (as snipped above), just the "we said, they said" outline. It might also help capture errors that aren't handled well, such as a STARTTLS negotiation that fails, plugin crashes, and that sort of issue. One of the primary reasons I want this is to make it easy for SMTP-literate support staff to have a TXT blob they can read like swaks output. I would send it to ES as a string and exclude it from being indexed. Having this would, in many cases, avoid having to go log spelunking.

I didn't think this would be "easy" to do in a plugin, because of next(OK), plugin ordering, having to hook the world, and catching all the responses that are sent.

baudehlo commented 9 years ago

Ok as long as it's configurable. I don't want high traffic servers storing more in ram than they need to.


smfreegard commented 9 years ago

I can see this being pretty useful for debugging some of those edge cases that are difficult to trace in production from the log output alone. Because we can't currently set the loglevel to something different for just a single session, this would be the next best thing.

Might I suggest you make it so that it can be enabled globally (e.g. for what you want it for) via something like echo 1 > config/record_session_transcripts OR via a settable connection variable e.g. connection.record_session_transcript = true, so that this could be enabled on a per connection basis by a plugin?
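As a rough illustration of that suggestion, the decision could reduce to a single check. This is a sketch under assumptions: `shouldRecord` is a hypothetical name, the global flag is passed in as a plain boolean instead of being read from config/record_session_transcripts, and connection.record_session_transcript is the proposed property, not an existing one.

```javascript
// Hypothetical sketch: record when the global switch is on OR a plugin
// has opted this particular connection in.
function shouldRecord(globalEnabled, connection) {
    return globalEnabled === true || connection.record_session_transcript === true;
}

// Global switch off, but a plugin flagged this one connection.
const flagged = { record_session_transcript: true };
console.log(shouldRecord(false, flagged));

// Global switch off and nothing set on the connection: nothing is kept in RAM.
console.log(shouldRecord(false, {}));
```

With the global switch off by default, high-traffic servers would only pay the memory cost for connections a plugin explicitly marks.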

celesteking commented 9 years ago

The use case would be as follows:

msimerson commented 9 years ago

My use case

smfreegard commented 9 years ago

staff determines and identifies the issue (go update your SPF record, your server is blacklisted, our mail server has an issue, etc...).

What is the point of storing connection.results and logging the world - if you can't tell why a message was blocked or rejected using it?

Both of you are doing it wrong if you can't instantly determine why a message was blocked using one field per recipient and one field for the overall message (e.g. connection.last_reject) in ES.

Storing the SMTP conversation is useful for the vanishingly rare cases where there is some weird issue, e.g. TLS failures, a path MTU issue causing the session to time out, a badly written SMTP client, or a bug in Haraka that causes a synchronization failure. In 10 years I can count on one hand the times I've needed to trace the SMTP session to find an issue.

Storing every SMTP session of every connection is fine if you don't care how big your ES indexes get or about insert performance (in both Haraka and ES), since the logging plugin has to keep the session open to log this data before Haraka can destroy the connection object.

celesteking commented 9 years ago

Yeah, it's for those rare cases.

msimerson commented 9 years ago

it's for those rare cases.

From the perspective of support workers who typically deal with only the connections that ran afoul, those "rare" cases may not seem all that rare. This request originated in our support department. If adding 28 bytes of extra data to each SMTP document in ES helps them half as much as they think it will, and the cost is maybe an extra ES or Haraka server in high volume environments, that's an easy price to pay.

msimerson commented 8 years ago

Moved to idea section of the wiki.