cyrusimap / cyrus-imapd

Cyrus IMAP is an email, contacts and calendar server
http://cyrusimap.org
Other
530 stars 145 forks source link

smtpclient: more logging when SMTP send fails #2477

Open robn opened 6 years ago

robn commented 6 years ago

It would be good if Cyrus would log the command and response any time a SMTP command fails.

(Right now in JMAP we're getting a smtpProtocolError in response to certain EmailSubmission/set calls, and there's nothing to say what's really going on).

elliefm commented 6 years ago

It looks like it will log something like this any time an SMTP command returns other than what it expected:

https://github.com/cyrusimap/cyrus-imapd/blob/d7eee1f384d45d3a6165142672af2113bac5fa9e/imap/smtpclient.c#L223-L224

But for it to log this, it has to succeed at both constructing and sending the command, and then reading and parsing the response. If any other type of thing goes wrong, it just returns an error number up the stack.

So I wonder: are you seeing "unexpected response" log lines, but can't tell what command provoked them? Or are you not seeing these log lines, in which case it failed before obtaining an smtp response code, and we need more aggressive error logging?

This is my first time sticking my beak into the smtpclient code, so I don't have a gut feel yet of "oh yeah we just need to add a syslog call here". @rsto might have such a gut feel, I think he did a bunch of work on this recently.

robn commented 6 years ago

The particular case we saw was a too-large attachment. We saw the EmailSubmission/set method in the Cyrus httpd log, but only saw a connect & disconnect from Postfix. It's possible that Postfix just dropped the connection, but even then, Cyrus saying "uhh?" would be good.

(I can post logs for you, but they having nothing of interest in them).

elliefm commented 6 years ago

Ah, yeah, so looking at code, I think would've just returned an IMAP_IOERROR up the stack. I wonder where Cyrus would have noticed the disconnect? If it noticed it while sending the command, it'd still know what it was trying to do, and be able to report that. If it noticed it while reading the response, it would either have the read context (and could report the smtp response, or lack thereof) or the write context (and could report the command that was executed), depending on where in the call chain the syslog call occurred, but not both. And I imagine it'd be super annoying to log every command just in case it later fails! 😅

robmueller commented 6 years ago

Is there a way to enable telemetry logging for the smtp engine in some way so you can see the full smtp transaction that cyrus did?

elliefm commented 6 years ago

That's a good question! It uses the lib/prot.c API for all its I/O, and I believe the telemetry implementation comes from that library, so it seems plausible at least!

elliefm commented 6 years ago

Okay, so the various service processes use the telemetry_log() function in lib/telemetry.c to set up a telemetry log for the authenticated user. Under the hood, this basically creates an fd and passes it through to prot_setlog(), and then the prot api does the rest.

So, I'm not currently sure if the telemetry_ api is usable by smtpclient or not, but even if it's not, it should be possible for it to manually prot_setlog() itself when needed.

elliefm commented 6 years ago

Remind me, what's the trigger for enabling telemetry logging for a user? It's just creating the $configdir/log/userid directory, yeah? I can't see a config option for it, looks like it entirely depends on whether the open() succeeds or fails, which depends on whether the directory in the path already exists or not...

Aaaaaand, what do you know:

https://github.com/cyrusimap/cyrus-imapd/blob/a8b4c913991ecdf463b157e306cfce88a5f1f5e9/imap/smtpclient.c#L704

https://github.com/cyrusimap/cyrus-imapd/blob/a8b4c913991ecdf463b157e306cfce88a5f1f5e9/imap/smtpclient.c#L827

Looks like if you create a "$configdir/log/smtpclient.host/" or "$configdir/log/smtpclient.sendmail/" directory (depending on value of smtp_backend imapd.conf option, default "sendmail"), you will get telemetry for smtpclient sessions

The system works 😀

elliefm commented 6 years ago

Not sure why those line-number links didn't expand inline like they usually do, but they show smtpclient.c already calls telemetry_log() :)

[Oh I see, I hadn't linked to a specific revision. Fixed!]

robmueller commented 6 years ago

Yep, it worked. I created .../log/smtpclient.sendmail/ directory, sent an email and got:

---------- smtpclient.sendmail Thu Aug 23 10:14:35 2018

>1535019276>EHLO localhost
<1535019276<250-cyrus_sendmail
<1535019276<250-ENHANCEDSTATUSCODES
<1535019276<250-SIZE 73000000
<1535019276<250 AUTH PLAIN
>1535019276>MAIL FROM:<test1@robmtest.vm> AUTH=test1@robmtest.vm SIZE=345
<1535019276<250 ok
>1535019276>RCPT TO:<foo@bar.com>
<1535019276<250 2.1.5 ok
>1535019276>DATA
<1535019276<354 go ahead
>1535019276>Message-Id: <be150af8-2d05-4b89-a934-195678411eb3@slotv1t01>
User-Agent: Cyrus-JMAP/3.1.5-310-g0f18a51-fmfuture-20180820v5
x-jmap-identity-id: 25
Date: Thu, 23 Aug 2018 10:14:37 +0000
From: Test User <test1@robmtest.vm>
To: foo <foo@bar.com>
Subject: asdgasdg
Content-Type: text/plain

sdgasdgag

-- 
  test user
  test1@robmtest.vm>1535019276>
.
<1535019276<250 2.0.0 sendmail completed
>1535019276>QUIT
<1535019276<221 bye
elliefm commented 6 years ago

@robn is being able to pull telemetry adequate? Or do we still need more logging for understanding the cases we can't reproduce under telemetry?

elliefm commented 4 years ago

Just pushed 5946dfee3 and a45e524be to master, which is loggier than we had! Hopefully it's enough to see the shape of what's going on next time we need it, and we can log more detail later if/when we know what more-detail would be useful?