Closed: oliv3 closed this issue 5 years ago
@MuruganChandrasekar filed #11 which is supposed to address postfix (in general). Unfortunately, I wanted to improve the error handling. You might look at the changes in #11 and see if that helps. Meanwhile, I hope to finish rebasing that patch and applying it... "soon"....
Here's what I get in the docker-compose output:
grafana_1_e9d04bb0896d | t=2019-01-23T22:43:13+0000 lvl=info msg="Sending alert notification to" logger=alerting.notifier.email addresses=[olivier@biniou.info]
grafana_1_e9d04bb0896d | t=2019-01-23T22:43:13+0000 lvl=eror msg="Failed to send alert notification email" logger=alerting.notifier.email error="SMTP not configured, check your grafana.ini config file's [smtp] section."
grafana_1_e9d04bb0896d | t=2019-01-23T22:43:13+0000 lvl=eror msg="failed to send notification" logger=alerting.notifier id=0 error="SMTP not configured, check your grafana.ini config file's [smtp] section."
Grafana/latest is 5.4.3, which is more recent than the version I got when I first used this project (IIRC mail alerts were sent successfully back then).
Still, that doesn't explain why sending mail from the command line also fails.
Don't know if it's related, but hope it can be useful :)
root@d97675a38e1b:/# /etc/my_init.d/postfix.sh
* Stopping Postfix Mail Transport Agent postfix [ OK ]
* Starting Postfix Mail Transport Agent postfix
postfix: Postfix is running with backwards-compatible default settings
postfix: See http://www.postfix.org/COMPATIBILITY_README.html for details
postfix: To disable backwards compatibility use "postconf compatibility_level=2" and "postfix reload"
postfix/postfix-script: fatal: the Postfix mail system is already running
[fail]
@terrillmoore OK hold on, found two issues, will report here for analysis.
I'm looking into this and I'll update you when it is done.
So we have two issues here:
1. Using the default configuration, sending a test alert from Grafana fails:
grafana_1_1bd29478925c | t=2019-01-24T15:58:43+0000 lvl=info msg="Sending alert notification to" logger=alerting.notifier.email addresses=[olivier@biniou.info]
grafana_1_1bd29478925c | t=2019-01-24T15:58:43+0000 lvl=eror msg="Failed to send alert notification email" logger=alerting.notifier.email error="x509: certificate is valid for 27ae0e316043, not kaalut.dogfooding.net"
grafana_1_1bd29478925c | t=2019-01-24T15:58:43+0000 lvl=eror msg="failed to send notification" logger=alerting.notifier id=0 error="x509: certificate is valid for 27ae0e316043, not kaalut.dogfooding.net"
/var/log/mail.log in the Postfix container:
Jan 24 15:58:43 3f20bedddab5 postfix/smtpd[153]: connect from unknown[172.18.0.1]
Jan 24 15:58:43 3f20bedddab5 postfix/smtpd[153]: SSL_accept error from unknown[172.18.0.1]: 0
Jan 24 15:58:43 3f20bedddab5 postfix/smtpd[153]: warning: TLS library problem: error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate:s3_pkt.c:1487:SSL alert number 42:
Jan 24 15:58:43 3f20bedddab5 postfix/smtpd[153]: lost connection after STARTTLS from unknown[172.18.0.1]
Jan 24 15:58:43 3f20bedddab5 postfix/smtpd[153]: disconnect from unknown[172.18.0.1] ehlo=1 starttls=0/1 commands=1/2
This happened because TTN_DASHBOARD_GRAFANA_SMTP_SKIP_VERIFY was not set in the .env file, resulting in GF_SMTP_SKIP_VERIFY="false", so SSL certificate validation failed.
From "x509: certificate is valid for 27ae0e316043, not kaalut.dogfooding.net" I guess there's some issue with certificate generation (27ae0e316043 looks like a Docker container ID, maybe Apache's, but I'm not sure).
Setting TTN_DASHBOARD_GRAFANA_SMTP_SKIP_VERIFY=true makes things a little better; Grafana doesn't complain anymore:
grafana_1_1bd29478925c | t=2019-01-24T16:02:15+0000 lvl=info msg="Sending alert notification to" logger=alerting.notifier.email addresses=[olivier@biniou.info]
Still, no mail gets sent.
2. TTN_DASHBOARD_MAIL_RELAY_IP was also not set, so it got "." as its default, which kept mail from being sent, as we can see in Postfix's /var/log/mail.log:
Jan 24 16:20:58 132788baa5cf postfix/smtpd[168]: connect from unknown[172.18.0.1]
Jan 24 16:20:58 132788baa5cf postfix/smtpd[168]: 57AE5196A38: client=unknown[172.18.0.1]
Jan 24 16:20:58 132788baa5cf postfix/cleanup[172]: 57AE5196A38: message-id=<>
Jan 24 16:20:58 132788baa5cf postfix/qmgr[152]: 57AE5196A38: from=<grafana@kaalut.dogfooding.net>, size=34621, nrcpt=1 (queue active)
Jan 24 16:20:58 132788baa5cf postfix/smtpd[168]: disconnect from unknown[172.18.0.1] ehlo=2 starttls=1 mail=1 rcpt=1 data=1 quit=1 commands=7
Jan 24 16:20:58 132788baa5cf postfix/smtp[173]: fatal: valid hostname or network address required in server description: .
Jan 24 16:20:59 132788baa5cf postfix/qmgr[152]: warning: private/smtp socket: malformed response
Jan 24 16:20:59 132788baa5cf postfix/qmgr[152]: warning: transport smtp failure -- see a previous warning/fatal/panic logfile record for the problem description
Jan 24 16:20:59 132788baa5cf postfix/master[145]: warning: process /usr/lib/postfix/sbin/smtp pid 173 exit status 1
Jan 24 16:20:59 132788baa5cf postfix/master[145]: warning: /usr/lib/postfix/sbin/smtp: bad command startup -- throttling
Jan 24 16:20:59 132788baa5cf postfix/error[174]: 57AE5196A38: to=<olivier@biniou.info>, relay=none, delay=1.2, delays=0.03/1.1/0/0.12, dsn=4.3.0, status=deferred (unknown mail transport error)
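The lines that matter in a log like this are the fatal record from postfix/smtp and the final status=deferred record. A minimal sketch of a filter that pulls those out is shown here; the function name mail_failures is hypothetical, and the pattern is an assumption based only on the records quoted above.

```shell
# Print only the records that explain a failed delivery.
# $1 is the path to the mail log (inside the Postfix container that
# would be /var/log/mail.log).
mail_failures() {
    # "fatal:" marks a Postfix process that aborted; "status=deferred" and
    # "status=bounced" mark messages that were not delivered.
    grep -E 'fatal:|status=(deferred|bounced)' "$1"
}
```

Inside the container it would be called as mail_failures /var/log/mail.log.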
Changing relay_ip in docker-compose.yml solved the issue:
- relay_ip: "${TTN_DASHBOARD_MAIL_RELAY_IP:-.}"
+ relay_ip: "${TTN_DASHBOARD_MAIL_RELAY_IP:-}"
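Docker Compose's ${VAR:-default} interpolation follows the same rule as POSIX shell: the default applies when the variable is unset or empty. The difference between the two lines of the diff can therefore be reproduced in a plain shell. This is only an illustration of the substitution; old_relay and new_relay are hypothetical names, and what the Postfix image does with an empty relay_ip is not shown here.

```shell
# Simulate an .env file that does not define the variable.
unset TTN_DASHBOARD_MAIL_RELAY_IP

# Old default: an unset variable falls back to a literal "."
old_relay="${TTN_DASHBOARD_MAIL_RELAY_IP:-.}"
# New default: an unset variable falls back to the empty string
new_relay="${TTN_DASHBOARD_MAIL_RELAY_IP:-}"

printf 'old relay_ip: [%s]\n' "$old_relay"
printf 'new relay_ip: [%s]\n' "$new_relay"
```

The first line prints [.], which is the "." that postfix/smtp rejected in the log; the second prints [].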
Please bear in mind that I'm in no way a Postfix expert, but things look OK now, at least I receive the test alerts from Grafana.
I'll gladly send a PR for 2. For 1., I don't know: maybe there's some fix for the certificates so that Grafana can use SSL when talking to Postfix, or the default value of GF_SMTP_SKIP_VERIFY could be changed. Your call.
I'd say the documentation needs an update, but you would write it better than I can, since English is not my mother tongue.
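Until the documentation covers this, a pre-flight check along the following lines could flag variables that are missing from .env before the stack is started. It is a rough sketch: the function name missing_env_vars is hypothetical, it assumes plain NAME=value lines, and which variables really need a value depends on the setup.

```shell
# List the variables that have no non-empty NAME=value line in an env file.
# Usage: missing_env_vars FILE NAME...
# e.g.   missing_env_vars .env TTN_DASHBOARD_GRAFANA_SMTP_SKIP_VERIFY TTN_DASHBOARD_MAIL_RELAY_IP
missing_env_vars() {
    env_file=$1
    shift
    for name in "$@"; do
        # A variable counts as set only if a line starts with NAME= and
        # is followed by at least one character.
        grep -Eq "^${name}=.+" "$env_file" || printf '%s\n' "$name"
    done
}
```

Each name it prints is a variable that Compose would replace with its built-in default.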
Thanks for the great research. PRs would be much appreciated.
Regarding point 1, I'm not 100% sure I understand the detailed problem, but: since Grafana <> Postfix is a direct (local) connection, there's no need to use SSL -- they can trust each other, that's an important feature of Docker Compose. That's also why Grafana isn't using SSL for the client-facing connections, it trusts Apache, who then does the SSL for everybody.
Agreed, then GF_SMTP_HOST should default to postfix, not localhost, right?
Or not. It's exposed.
It shouldn't be exposed; it's a docker-compose local network. Grafana should be using host address 'postfix', which will be found in /etc/hosts on the grafana container. There should be a network connection on port 25 available between Grafana and Postfix set up by the Docker Compose file.
Postfix should not be exposing anything to the outside world unless for some reason you want incoming mail to this server. I don't recommend that. Because I got stuck with the error handling in #11, I didn't have a chance to get things working and then inspect things from the point of view of security.
Definitely agreed, but that's what is done at the moment. Can fix this while I'm at it.
> Can fix this while I'm at it.
That would be great.
Postfix has TLS enabled by default (which is a good thing).
From /etc/postfix/main.cf:
smtpd_use_tls=yes
So Grafana will negotiate TLS and verify the certificate (verification is skipped only when GF_SMTP_SKIP_VERIFY is set to true).
Will have to figure out how to disable this in Postfix, probably not so hard to do, I hope.
In postfix/Dockerfile, adding
+run postconf -e smtpd_use_tls=no
seems to do the trick.
Hi, can someone help me troubleshoot this issue? I'm running the latest code from master, and everything works fine except sending mail. I set up a Grafana channel to send alerts to my address, but when I try to send a test message from Grafana I don't receive anything (nothing reaches my mail server). I'm not a Postfix expert, and troubleshooting Postfix running in a container is even harder.
Here's the .env file I'm using (replacing my domain with foobar.net):
Going into the container with some
But this is where my Postfix skills end... I have no clue how to debug this. Sending a mail to myself from inside the container using the mail command doesn't work either, so I think it's probably Postfix going wrong rather than Grafana.
Thanks in advance,