haraka / Haraka

A fast, highly extensible, and event driven SMTP server
https://haraka.github.io
MIT License

Crashing in current master code if socket times out #1936

Closed superman20 closed 7 years ago

superman20 commented 7 years ago

I decided to give the current master code (actually, the cluster_graceful_stuff branch) a whirl on one of my production installations yesterday. It looks like Haraka will crash when the outbound socket times out upon connection. Below is a little of the log before the crash. The log starts right at the moment outbound is sending for delivery. Note that I also have the following in outbound.ini:

pool_timeout=0
pool_concurrency_max=0

Log with stack dump:

[03:18:40] [DEBUG] [-] [outbound] running send_email hooks
[03:18:40] [DEBUG] [-] [outbound] Sending mail: 1495523920895_1495523920895_0_8000_KjYwWl_4_app1933
[03:18:40] [DEBUG] [-] [outbound] running get_mx hooks
[03:18:40] [DEBUG] [-] [outbound] running get_mx hook in rcpt_to.routes plugin
[03:18:40] [INFO] [-] [outbound] hook=get_mx plugin=rcpt_to.routes function=get_mx params="mydomain.com" retval=OK msg="mail.mydomain.com:25"
[03:18:40] [DEBUG] [-] [outbound] Got an MX from Plugin: mydomain.com => 0 mail.mydomain.com:25
[03:18:40] [INFO] [-] [outbound] Looking up A records for: mail.mydomain.com
[03:18:40] [INFO] [-] [outbound] Attempting to deliver to: 1.2.3.4:25(0) (0)
[03:18:40] [DEBUG] [-] [core] [outbound] host=1.2.3.4 port=25pool_timeout=0 created
[03:19:01] [CRIT] [-] [core] TypeError: Cannot read property 'outbound::25:1.2.3.4:undefined:0' of undefined
[03:19:01] [CRIT] [-] [core]     at pluggableStream.<anonymous> (C:\Haraka\npm\node_modules\Haraka\outbound\client_pool.js:22:30)
[03:19:01] [CRIT] [-] [core]     at Object.onceWrapper (events.js:293:19)
[03:19:01] [CRIT] [-] [core]     at emitOne (events.js:96:13)
[03:19:01] [CRIT] [-] [core]     at pluggableStream.emit (events.js:191:7)
[03:19:01] [CRIT] [-] [core]     at Socket.<anonymous> (C:\Haraka\npm\node_modules\Haraka\tls_socket.js:80:14)
[03:19:01] [CRIT] [-] [core]     at Object.onceWrapper (events.js:293:19)
[03:19:01] [CRIT] [-] [core]     at emitOne (events.js:96:13)
[03:19:01] [CRIT] [-] [core]     at Socket.emit (events.js:191:7)
[03:19:01] [CRIT] [-] [core]     at emitErrorNT (net.js:1279:8)
[03:19:02] [CRIT] [-] [core]     at _combinedTickCallback (internal/process/next_tick.js:80:11)
[03:19:02] [NOTICE] [-] [core] Shutting down

Note the 21-second gap between "Attempting to deliver" and the crash. Normally there is no delay. Also, 21 seconds is the default Windows timeout for making a socket connection. I can reproduce this by enabling a firewall at the outbound destination.

Let me explain my setup, in case it matters. I run Haraka in what I call a "security gateway" mode: one instance of Haraka on a public IP receives (and sends) mail for several domains. Each domain is at a different physical location and runs its own local e-mail server. Delivery to the local domains goes through the outbound queue (via this gist: https://gist.github.com/smfreegard/ff79d02aeb94b9065359) because connectivity to some of them is unreliable, and I'd prefer Haraka to accept the mail and deliver it once the connection is working. The 3 crashes I got all happened while delivering an email to a local domain server (via outbound).

Let me know if I can provide any additional diagnostic info. I've been through the client pool code some, but am not yet familiar enough to figure out a fix.

baudehlo commented 7 years ago

It just needs a condition on the pool existing like other parts of the code do:

if (server.notes.pool && server.notes.pool[name]) {
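For anyone following along, here is a minimal, self-contained sketch of why that condition helps. It is not the actual client_pool.js code: the `server` object, the helper names (`lookupUnguarded`, `removeGuarded`, `simulateConnectError`) and the idea that `server.notes.pool` is simply never created when pooling is disabled are assumptions made for illustration, based only on the "of undefined" in the stack trace above. The key string is copied from the log.

```javascript
'use strict';
const { EventEmitter } = require('events');

// Hypothetical stand-in for Haraka's shared server object. Assumption: with
// pooling disabled, notes.pool is never created, so it stays undefined.
const server = { notes: {} };

// Unguarded lookup: throws a TypeError when server.notes.pool is undefined.
function lookupUnguarded(name) {
    return server.notes.pool[name];
}

// Guarded cleanup, using the condition suggested in this thread.
// Returns true only if an entry existed and was removed.
function removeGuarded(name) {
    if (server.notes.pool && server.notes.pool[name]) {
        delete server.notes.pool[name];
        return true;
    }
    return false;
}

// Simulate a socket whose connection attempt fails. An exception thrown
// inside an 'error' listener propagates out of emit(); in a real server
// nothing catches it, so the process goes down.
function simulateConnectError(handler, name) {
    const socket = new EventEmitter();
    socket.once('error', () => handler(name));
    try {
        socket.emit('error', new Error('connect ETIMEDOUT'));
        return 'handled';
    } catch (err) {
        return err.constructor.name;
    }
}

const name = 'outbound::25:1.2.3.4:undefined:0';
console.log(simulateConnectError(lookupUnguarded, name)); // TypeError
console.log(simulateConnectError(removeGuarded, name));   // handled
```

The try/catch exists only so the sketch can report the outcome; in the crash above there was no such catch around the event handler, which matches the "Shutting down" line in the log.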

(In reply to superman20's report of Tue, May 23, 2017 at 11:11 AM.)

superman20 commented 7 years ago

Thanks. Not that there was any doubt, but I have confirmed that this does indeed fix the issue.