foxcpp / maddy

✉️ Composable all-in-one mail server.
https://maddy.email
GNU General Public License v3.0
4.94k stars 239 forks source link

DSNs not being received #327

Closed reivilibre closed 3 years ago

reivilibre commented 3 years ago

Describe the bug

I don't appear to receive DSNs for any of the following reasons:

Oops, should have tested it :sweat_smile: :).

Steps to reproduce

Log files

submission: incoming message        {"msg_id":"2f6f32e9","sender":"me@home.net","src_host":"[192.168.1.10]","src_ip":"xxx:47740","username":"me@home.net"}
submission: RCPT ok        {"msg_id":"2f6f32e9","rcpt":"nobody@example.com"}
submission: accepted        {"msg_id":"2f6f32e9"}
queue: delivery attempt failed        {"msg_id":"2f6f32e9","rcpt":"nobody@example.com","reason":"Domain does not accept email (null MX)","smtp_code":556,"smtp_enchcode":"5.1.10","smtp_msg":"Domain does not accept email (null MX)"}
queue: not delivered, permanent error        {"msg_id":"2f6f32e9","rcpt":"nobody@example.com"}
queue: generated failed DSN        {"dsn_id":"b77c542c","msg_id":"2f6f32e9"}
msgpipeline: no check action        {"check":"verify_dkim","msg_id":"b77c542c","reason":"No DKIM signatures","smtp_code":550,"smtp_enchcode":"5.7.20","smtp_msg":"No DKIM signatures"}

Configuration file

https://hastebin.com/raw/leriyabube

Environment information

Notable config changes: I use Dovecot for mailbox handling, could this be related?

foxcpp commented 3 years ago

The log message contains the msg_id assigned to the DSN message - b77c542c. Are there any log entries referencing this id?

reivilibre commented 3 years ago

@foxcpp I'm afraid not, that's the end of the log. :/

reivilibre commented 3 years ago

Ah, I'm sorry, the version I specified is wrong. There was a mistake when installing the built maddy (the bin path seems to have changed in build/?). Anyway

maddy -v
maddy 0.4.3+cd1d527 linux/amd64 go1.14.7

(phew, the version I expected).

The behaviour is much the same. Here are some fresh logs — there is an entry concerning the DSN now:

submission: incoming message        {"msg_id":"fc1d3dd9","sender":"me@home.net","src_host":"[192.168.1.20]","src_ip":"84.92.98.201:48228","username":"me@home.net"}
submission: RCPT ok        {"msg_id":"fc1d3dd9","rcpt":"nobody@other.com"}
submission: accepted        {"msg_id":"fc1d3dd9"}
queue: delivery attempt failed        {"msg_id":"fc1d3dd9","rcpt":"nobody@other.com","reason":"\u003cnobody@other.com\u003e: Recipient address rejected: User unknown in virtual mailbox table","remote_server":"mail.other.com.","smtp_code":550,"smtp_enchcode":"5.1.1","smtp_msg":"mail.other.com. said: \u003cnobody@other.com\u003e: Recipient address rejected: User unknown in virtual mailbox table","target":"remote"}
queue: not delivered, permanent error        {"msg_id":"fc1d3dd9","rcpt":"nobody@other.com"}
queue: generated failed DSN        {"dsn_id":"9df9ad3d","msg_id":"fc1d3dd9"}
msgpipeline: no check action        {"check":"check.dkim","msg_id":"9df9ad3d","reason":"No DKIM signatures","smtp_code":550,"smtp_enchcode":"5.7.20","smtp_msg":"No DKIM signatures"}
foxcpp commented 3 years ago

Okay, could you enable debug logging globally (add "debug" directive at the top of config)? This should give more information how DSN is handled.

reivilibre commented 3 years ago

Here is a debug log when resending the message (I also rm -r remote_queue before restarting to try and keep it clean).

submission: incoming message        {"msg_id":"17fec351","sender":"me@home.net","src_host":"[192.168.1.20]","src_ip":"x:48258","username":"me@home.net"}
[debug] smtp/pipeline: sender me@home.net matched by domain rule 'home.net'        {"msg_id":"17fec351"}
[debug] smtp/pipeline: global rcpt modifiers: nobody@example.com => nobody@example.com        {"msg_id":"17fec351"}
[debug] smtp/pipeline: per-source rcpt modifiers: nobody@example.com => nobody@example.com        {"msg_id":"17fec351"}
[debug] smtp/pipeline: recipient nobody@example.com matched by default rule (clean = nobody@example.com)        {"msg_id":"17fec351"}
[debug] smtp/pipeline: per-rcpt modifiers: nobody@example.com => nobody@example.com        {"msg_id":"17fec351"}
[debug] smtp/pipeline: tgt.Start(me@home.net) ok, target = queue:remote_queue        {"msg_id":"17fec351"}
submission: RCPT ok        {"msg_id":"17fec351","rcpt":"nobody@example.com"}
[debug] autobuffer: keeping the message in RAM (read 10 bytes, got EOF)
[debug] modify.dkim: signed        {"domain":"home.net"}
[debug] smtp/pipeline: delivery.Body ok, Delivery object = *msgpipeline.delivery        {"msg_id":"17fec351"}
submission: accepted        {"msg_id":"17fec351"}
[debug] queue: starting delivery for 17fec351
[debug] queue: waiting on delivery semaphore for 17fec351
[debug] queue: delivery semaphore acquired for 17fec351
[debug] submission: reset
[debug] queue: using message ID = 17fec351-6000aa9f        {"msg_id":"17fec351"}
[debug] queue: target.Start OK        {"msg_id":"17fec351"}
[debug] remote: opening new connection        {"cache_ignored":false,"domain":"example.com","msg_id":"17fec351-6000aa9f"}
[debug] queue: delivery.AddRcpt nobody@example.com failed: Domain does not accept email (null MX)        {"msg_id":"17fec351"}
[debug] queue: delivery.Abort (no accepted receipients)        {"msg_id":"17fec351"}
[debug] queue: errors: map[nobody@example.com:Domain does not accept email (null MX)]        {"msg_id":"17fec351"}
queue: delivery attempt failed        {"msg_id":"17fec351","rcpt":"nobody@example.com","reason":"Domain does not accept email (null MX)","smtp_code":556,"smtp_enchcode":"5.1.10","smtp_msg":"Domain does not accept email (null MX)"}
queue: not delivered, permanent error        {"msg_id":"17fec351","rcpt":"nobody@example.com"}
queue: generated failed DSN        {"dsn_id":"feb67180","msg_id":"17fec351"}
[debug] queue/pipeline: sender  matched by default rule        {"msg_id":"feb67180"}
[debug] queue/pipeline: global rcpt modifiers: me@home.net => me@home.net        {"msg_id":"feb67180"}
[debug] queue/pipeline: per-source rcpt modifiers: me@home.net => me@home.net        {"msg_id":"feb67180"}
[debug] queue/pipeline: recipient me@home.net matched by domain rule 'home.net'        {"msg_id":"feb67180"}
[debug] queue/pipeline: per-rcpt modifiers: me@home.net => me@home.net        {"msg_id":"feb67180"}
[debug] msgpipeline: initializing state for check.require_mx_record: (0xc000234640)        {"msg_id":"feb67180"}
[debug] msgpipeline: initializing state for check.dkim: (0xc000210900)        {"msg_id":"feb67180"}
[debug] msgpipeline: initializing state for check.spf: (0xc0002960b0)        {"msg_id":"feb67180"}
[debug] msgpipeline: sender  matched by default rule        {"msg_id":"feb67180"}
[debug] queue/pipeline: tgt.Start() ok, target = msgpipeline:local_routing        {"msg_id":"feb67180"}
[debug] msgpipeline: global rcpt modifiers: me@home.net => me@home.net        {"msg_id":"feb67180"}
[debug] msgpipeline: per-source rcpt modifiers: me@home.net => me@home.net        {"msg_id":"feb67180"}
[debug] msgpipeline: recipient me@home.net matched by domain rule 'home.net'        {"msg_id":"feb67180"}
[debug] msgpipeline: per-rcpt modifiers: me@home.net => me@home.net        {"msg_id":"feb67180"}
[debug] target.lmtp: connected        {"downstream_server":"","msg_id":"feb67180"}
[debug] target.lmtp: connected        {"msg_id":"feb67180","remote_server":""}
[debug] msgpipeline: tgt.Start() ok, target = target.lmtp:local_mailboxes        {"msg_id":"feb67180"}
[debug] check.dkim: no signatures present        {"msg_id":"feb67180"}
msgpipeline: no check action        {"check":"check.dkim","msg_id":"feb67180","reason":"No DKIM signatures","smtp_code":550,"smtp_enchcode":"5.7.20","smtp_msg":"No DKIM signatures"}
reivilibre commented 3 years ago

Dovecot has the following logs at the time:

lmtp(1248553): Connect from local
imap(me@home.net)<><> Connection closed (IDLE running for 0.002 + waiting input for 0.001 secs, 2 B in + 10+10 B out, state=wait-input) in=4172 ou>
imap-login: Login: user=<me@home.net>, method=PLAIN, rip=x, lip=79.143.178.141, mpid=1248673, TLS, session=<9qmAK+K4BsZUXGLJ>

(the lmtp line is at the same time as all the lines in Maddy's log. I think the IMAP lines are my Thunderbird session and not related.)

I don't see any log lines regarding receipt of a message, or of Sieve filters (which it seems to log about normally).

I disabled my account's Sieve rules in case that affected anything and no difference was noticed.

foxcpp commented 3 years ago
[debug] msgpipeline: tgt.Start() ok, target = target.lmtp:local_mailboxes        {"msg_id":"feb67180"}

This particular log record indicates that maddy successfully sends MAIL FROM command to Dovecot and receives a positive response. Not sure what happens next though. It might be hanging while trying to send RCPT TO to Dovecot since there does not seem to be any success or failure message after that. Can you do killall -TRAP maddy after failed delivery and post output somewhere?

reivilibre commented 3 years ago

Thanks, I have posted the output here:

output.txt

reivilibre commented 3 years ago

I decided to also watch what was happening on the LMTP UNIX socket

root@sallie:/var/run/dovecot# socat -t100 -x -v UNIX-LISTEN:lmtp-maddy,mode=777,reuseaddr,fork UNIX-CONNECT:lmtp-maddy.orig
< 2021/01/16 16:49:11.216413  length=50 from=0 to=49
 32 32 30 20 73 61 6c 6c 69 65 2e 6c 69 62 72 65  220 sallie.*****
 70 75 73 68 2e 6e 65 74 20 44 6f 76 65 63 6f 74  ****.net Dovecot
 20 28 55 62 75 6e 74 75 29 20 72 65 61 64 79 2e   (Ubuntu) ready.
 0d 0a                                            ..
--
> 2021/01/16 16:49:11.217082  length=27 from=0 to=26
 4c 48 4c 4f 20 73 61 6c 6c 69 65 2e 6c 69 62 72  LHLO sallie.****
 65 70 75 73 68 2e 6e 65 74 0d 0a                 *****.net..
--
< 2021/01/16 16:49:11.217501  length=109 from=50 to=158
 32 35 30 2d 73 61 6c 6c 69 65 2e 6c 69 62 72 65  250-sallie.*****
 70 75 73 68 2e 6e 65 74 0d 0a                    ****.net..
 32 35 30 2d 38 42 49 54 4d 49 4d 45 0d 0a        250-8BITMIME..
 32 35 30 2d 43 48 55 4e 4b 49 4e 47 0d 0a        250-CHUNKING..
 32 35 30 2d 45 4e 48 41 4e 43 45 44 53 54 41 54  250-ENHANCEDSTAT
 55 53 43 4f 44 45 53 0d 0a                       USCODES..
 32 35 30 2d 50 49 50 45 4c 49 4e 49 4e 47 0d 0a  250-PIPELINING..
 32 35 30 20 53 54 41 52 54 54 4c 53 0d 0a        250 STARTTLS..
--
> 2021/01/16 16:49:11.218409  length=28 from=27 to=54
 4d 41 49 4c 20 46 52 4f 4d 3a 3c 3e 20 42 4f 44  MAIL FROM:<> BOD
 59 3d 38 42 49 54 4d 49 4d 45 0d 0a              Y=8BITMIME..
--
< 2021/01/16 16:49:11.218976  length=14 from=159 to=172
 32 35 30 20 32 2e 31 2e 30 20 4f 4b 0d 0a        250 2.1.0 OK..
--
> 2021/01/16 16:49:11.219279  length=33 from=55 to=87
 52 43 50 54 20 54 4f 3a 3c 6f 6c 69 76 69 65 72  RCPT TO:<*******
 40 6c 69 62 72 65 70 75 73 68 2e 6e 65 74 3e 0d  @*********.net>.
 0a                                               .
--
< 2021/01/16 16:49:11.220727  length=14 from=173 to=186
 32 35 30 20 32 2e 31 2e 35 20 4f 4b 0d 0a        250 2.1.5 OK..
--

So it looks like Dovecot is responding fine (to me, at least).

foxcpp commented 3 years ago

(hastebin.com seems to be down so I am guessing based on the log messages) Your config seems to apply inbound security checks to DSNs (which is a mistake I made in default config some time ago). Moving check { ... } from local_routing into smtp tcp://0.0.0.0:25 like it is done in current default maddy.conf version should fix the problem for you: https://github.com/foxcpp/maddy/blob/5e9d9100db5c74204132e4dea56acc6f2221d04c/maddy.conf#L79-L91

The issue itself seems to be that generated DSN message somehow trips check.spf code and delivery hangs between RCPT TO and DATA:

Here is the check.spf goroutine doing something and waiting forever:

 goroutine 48 [chan receive, 2633 minutes]:
 github.com/foxcpp/maddy/internal/check/spf.(*state).CheckBody(0xc00026c050, 0xf41300, 0xc00009dc50, 0xc0000fe180, 0x9, 0x10, 0xc00059e270, 0xf3e740, 0xc00000e1a0, 0x0, ...)
         github.com/foxcpp/maddy/internal/check/spf/spf.go:366 +0x112
 github.com/foxcpp/maddy/internal/msgpipeline.(*checkRunner).checkBody.func1(0xf43b40, 0xc00026c050, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
         github.com/foxcpp/maddy/internal/msgpipeline/check_runner.go:279 +0xb4
 github.com/foxcpp/maddy/internal/msgpipeline.(*checkRunner).runAndMergeResults.func1(0xc00007e230, 0xf43b40, 0xc00026c050, 0xc0000fe300, 0xc0004d42d0)
         github.com/foxcpp/maddy/internal/msgpipeline/check_runner.go:180 +0x75
 created by github.com/foxcpp/maddy/internal/msgpipeline.(*checkRunner).runAndMergeResults
         github.com/foxcpp/maddy/internal/msgpipeline/check_runner.go:179 +0x107

And here is the delivery goroutine waiting for check.spf to complete, once again, forever:

goroutine 50 [semacquire, 2633 minutes]:
sync.runtime_Semacquire(0xc0000fe36c)
        runtime/sema.go:56 +0x42
sync.(*WaitGroup).Wait(0xc0000fe36c)
        sync/waitgroup.go:130 +0x64
github.com/foxcpp/maddy/internal/msgpipeline.(*checkRunner).runAndMergeResults(0xc0004d42d0, 0xc00009dd10, 0x3, 0x3, 0xc00007e230, 0x10, 0xc00059e270)
        github.com/foxcpp/maddy/internal/msgpipeline/check_runner.go:220 +0x135
github.com/foxcpp/maddy/internal/msgpipeline.(*checkRunner).checkBody(0xc0004d42d0, 0xf41300, 0xc00009dc50, 0xc0002037a0, 0x3, 0x3, 0xc0000fe180, 0x9, 0x10, 0xc00059e270, ...)
        github.com/foxcpp/maddy/internal/msgpipeline/check_runner.go:278 +0x192
github.com/foxcpp/maddy/internal/msgpipeline.(*msgpipelineDelivery).Body(0xc0004fe200, 0xf41300, 0xc00009dc50, 0xc0000fe180, 0x9, 0x10, 0xc00059e270, 0xf3e740, 0xc00000e1a0, 0xc00059e270, ...)
        github.com/foxcpp/maddy/internal/msgpipeline/msgpipeline.go:370 +0x105
github.com/foxcpp/maddy/internal/msgpipeline.(*msgpipelineDelivery).Body(0xc0004fe100, 0xf41300, 0xc00009dc50, 0xc0000fe180, 0x9, 0x10, 0xc00059e270, 0xf3e740, 0xc00000e1a0, 0x0, ...)
        github.com/foxcpp/maddy/internal/msgpipeline/msgpipeline.go:413 +0x6ce
github.com/foxcpp/maddy/internal/target/queue.(*Queue).emitDSN(0xc000215ee0, 0xc00011e460, 0xc0004fe300, 0xd, 0x20, 0xc000202750, 0xc00058c0d0, 0x1, 0x1)
        github.com/foxcpp/maddy/internal/target/queue/queue.go:973 +0xcc0
github.com/foxcpp/maddy/internal/target/queue.(*Queue).tryDelivery(0xc000215ee0, 0xc00011e460, 0xc0004fe300, 0xd, 0x20, 0xc000202750, 0xf3e700, 0xc00049e180)
        github.com/foxcpp/maddy/internal/target/queue/queue.go:424 +0xc84
github.com/foxcpp/maddy/internal/target/queue.(*Queue).dispatch.func1(0xc000215ee0, 0xc00020a2d8, 0x8, 0xc00011e460, 0xc000212390, 0xf3e700, 0xc00049e180)
        github.com/foxcpp/maddy/internal/target/queue/queue.go:330 +0x2fe
created by github.com/foxcpp/maddy/internal/target/queue.(*Queue).dispatch
        github.com/foxcpp/maddy/internal/target/queue/queue.go:290 +0x1e5
reivilibre commented 3 years ago

@foxcpp Thanks, that was spot on! Making the change you suggested (yes, the checks were in the local_routing block) has fixed it and I get a DSN now :).

foxcpp commented 3 years ago

Note to self: This might be related to #322.

foxcpp commented 3 years ago

d275cd044fbe865bb92058ff09ff0f76adb08703 fixes the underlying problem (apparently, two problems).