Closed dennisdupont closed 7 years ago
Hi,
I am having a similar issue. When our haraka server is under load (100k msgs at hour), suddenly shutdown with this trace:
2017-03-21T17:35:17.506Z [CRIT] [-] [core] at pluggableStream.write (/usr/local/lib/node_modules/Haraka/tls_socket.js:109:34)
2017-03-21T17:35:17.506Z [CRIT] [-] [core] at ReadStream.ondata (_stream_readable.js:555:20)
2017-03-21T17:35:17.506Z [CRIT] [-] [core] at emitOne (events.js:101:20)
2017-03-21T17:35:17.506Z [CRIT] [-] [core] at ReadStream.emit (events.js:188:7)
Error UncaughtException:
Error: This socket is closed
at Socket._writeGeneric (net.js:680:19)
at Socket._write (net.js:731:8)
at doWrite (_stream_writable.js:334:12)
at writeOrBuffer (_stream_writable.js:320:5)
at Socket.Writable.write (_stream_writable.js:247:11)
at Socket.write (net.js:658:40)
at pluggableStream.write (/usr/local/lib/node_modules/Haraka/tls_socket.js:109:34)
at ReadStream.ondata (_stream_readable.js:555:20)
at emitOne (events.js:101:20)
at ReadStream.emit (events.js:188:7)2017-03-21T17:35:17.508Z [NOTICE] [-] [core] Shutting down
# haraka --version
Haraka.js — Version: 2.8.13
We are using haraka as a router to our pool of smarthosts running postfix.
We are using get_mx
hook to choose what smarthost to use. Following documentation, we configure pool_timeout=0
but was slow processing mails.
So we change to pool_timeout=100
.
Right now our server is in production, and shutting down multiple times a day.
Why haraka is shutting down, because of the other end closing connection. Is there a way to catch this error and not shut down the server?
Best Regards,
Roberto
Yesterday we were processing 500k emails, and the server shut down 5 times.
2017-03-21T20:52:06.887Z [CRIT] [-] [core] at ReadStream.emit (events.js:188:7)
2017-03-21T20:52:06.887Z [CRIT] [-] [core] at readableAddChunk (_stream_readable.js:176:18)
2017-03-21T20:52:06.887Z [CRIT] [-] [core] at ReadStream.Readable.push (_stream_readable.js:134:10)
2017-03-21T20:52:06.887Z [CRIT] [-] [core] at onread (fs.js:2004:12)
2017-03-21T20:52:06.887Z [CRIT] [-] [core] at FSReqWrap.wrapper [as oncomplete] (fs.js:681:17)
Error UncaughtException:
Error: This socket has been ended by the other party
at Socket.writeAfterFIN [as write] (net.js:291:12)
at pluggableStream.write (/usr/local/lib/node_modules/Haraka/tls_socket.js:109:34)
at ReadStream.ondata (_stream_readable.js:555:20)
at emitOne (events.js:101:20)
at ReadStream.emit (events.js:188:7)
at readableAddChunk (_stream_readable.js:176:18)
at ReadStream.Readable.push (_stream_readable.js:134:10)
at onread (fs.js:2004:12)
at FSReqWrap.wrapper [as oncomplete] (fs.js:681:17)2017-03-21T20:52:06.888Z [NOTICE] [-] [core] Shutting down
We have a really bad situation here. Please help!
This sounds pretty bad. There are clearly bugs in the pooling code of outbound. I'm working right now on splitting up the outbound code so it's a lot easier to work on, but fixing this won't be done soon.
The only recommendation I have for now is perhaps send all outbound mail through another local MTA. Either a much earlier version of Haraka (before outbound pool support arrived) or something like Postfix or Exim.
On Thu, Mar 23, 2017 at 8:43 AM, Roberto Matute notifications@github.com wrote:
Shutting down under pressure
Yesterday we were processing 500k emails, and the server shut down 5 times.
2017-03-21T20:52:06.887Z [CRIT] [-] [core] at ReadStream.emit (events.js:188:7) 2017-03-21T20:52:06.887Z [CRIT] [-] [core] at readableAddChunk (_stream_readable.js:176:18) 2017-03-21T20:52:06.887Z [CRIT] [-] [core] at ReadStream.Readable.push (_stream_readable.js:134:10) 2017-03-21T20:52:06.887Z [CRIT] [-] [core] at onread (fs.js:2004:12) 2017-03-21T20:52:06.887Z [CRIT] [-] [core] at FSReqWrap.wrapper [as oncomplete] (fs.js:681:17) Error UncaughtException: Error: This socket has been ended by the other party at Socket.writeAfterFIN [as write] (net.js:291:12) at pluggableStream.write (/usr/local/lib/node_modules/Haraka/tls_socket.js:109:34) at ReadStream.ondata (_stream_readable.js:555:20) at emitOne (events.js:101:20) at ReadStream.emit (events.js:188:7) at readableAddChunk (_stream_readable.js:176:18) at ReadStream.Readable.push (_stream_readable.js:134:10) at onread (fs.js:2004:12) at FSReqWrap.wrapper [as oncomplete] (fs.js:681:17)2017-03-21T20:52:06.888Z [NOTICE] [-] [core] Shutting down
We have a really bad situation here. Please help!
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/haraka/Haraka/issues/1853#issuecomment-288706854, or mute the thread https://github.com/notifications/unsubscribe-auth/AAobY4cXTpNVeBu3Bh4Odz_9dfX0mdrgks5romj7gaJpZM4MaIK1 .
@baudehlo thanks for your response.
Actually we use haraka to connect to a postfix smarthost. I mean the arquitecture is something like:
app -> haraka (plugin with get_mx hook) -> postfix -> Internet
We are using haraka to balance the load between postfix servers, based on some rules (queue size, disk space, from address) for each postfix server.
We will change our version to a pre-pooling version.
Sorry to promote another product here but I'd suggest looking at ZoneMTA https://github.com/zone-eu/zone-mta We use it to send out 500K messages every day with peaks around 80 msg/sec. Not as mature as Haraka or Postfix though.
We already change our haraka to 2.8.4
. It handled 600k emails in 3 hours. I believe the problem was indeed in the pool code.
system info
Expected behavior
Expected Haraka to continue running despite a socket being ended from the far end.
Observed behavior
2017-03-10T16:21:07.766Z [INFO] [-] [core] [outbound] Processing domain: xxx.com 2017-03-10T16:21:07.767Z [PROTOCOL] [439792B7-CFEF-4221-B3AC-1B6B42A4A8C9.35.1] [outbound] S: 250 2.1.0 Sender OK\r\n 2017-03-10T16:21:07.767Z [PROTOCOL] [439792B7-CFEF-4221-B3AC-1B6B42A4A8C9.35.1] [outbound] C: DATA 2017-03-10T16:21:07.767Z [PROTOCOL] [439792B7-CFEF-4221-B3AC-1B6B42A4A8C9.35.1] [outbound] S: 250 2.0.0 Resetting\r\n 2017-03-10T16:21:07.767Z [PROTOCOL] [439792B7-CFEF-4221-B3AC-1B6B42A4A8C9.35.1] [outbound] S: 421 4.7.0 Too many protocol errors (6) on this connection, closing transmission channel.\r\n 2017-03-10T16:21:07.768Z [PROTOCOL] [439792B7-CFEF-4221-B3AC-1B6B42A4A8C9.35.1] [outbound] C: RSET 2017-03-10T16:21:07.768Z [DEBUG] [-] [core] [outbound] release_client: 216.32.180.74:25 to undefined 2017-03-10T16:21:07.768Z [INFO] [-] [core] [outbound] [outbound::25:216.32.180.74:undefined:60] dispense() clients=0 available=7 2017-03-10T16:21:07.768Z [DEBUG] [-] [core] Temp fail for: Upstream error: 421 4.7.0 Resetting Too many protocol errors (6) on this connection, closing transmission channel. 2017-03-10T16:21:07.768Z [DEBUG] [439792B7-CFEF-4221-B3AC-1B6B42A4A8C9.35.1] [outbound] running deferred hooks 2017-03-10T16:21:07.768Z [INFO] [439792B7-CFEF-4221-B3AC-1B6B42A4A8C9.35.1] [outbound] Temp failing 1489162857819_0_1709_1060.xxx.net for 64 seconds: Upstream error: 421 4.7.0 Resetting Too many protocol errors (6) on this connection, closing transmission channel. 2017-03-10T16:21:07.769Z [WARN] [-] [core] [outbound] Socket [outbound::25:216.32.180.74:undefined:60] in pool got FIN 2017-03-10T16:21:07.769Z [DEBUG] [-] [core] [outbound] destroying pool entry for 216.32.180.74:25 2017-03-10T16:21:07.772Z [CRIT] [-] [core] Error: This socket has been ended by the other party 2017-03-10T16:21:07.772Z [CRIT] [-] [core] at TLSSocket.writeAfterFIN [as write] (net.js:291:12) 2017-03-10T16:21:07.772Z [CRIT] [-] [core] at pluggableStream.write (/usr/lib/node_modules/Haraka/tls_socket.js:109:34) 2017-03-10T16:21:07.772Z [CRIT] [-] [core] at ReadStream.ondata (_stream_readable.js:555:20) 2017-03-10T16:21:07.772Z [CRIT] [-] [core] at emitOne (events.js:101:20) 2017-03-10T16:21:07.772Z [CRIT] [-] [core] at ReadStream.emit (events.js:188:7) 2017-03-10T16:21:07.773Z [CRIT] [-] [core] at readableAddChunk (_stream_readable.js:176:18) 2017-03-10T16:21:07.773Z [CRIT] [-] [core] at ReadStream.Readable.push (_stream_readable.js:134:10) 2017-03-10T16:21:07.773Z [CRIT] [-] [core] at onread (fs.js:2004:12) 2017-03-10T16:21:07.773Z [CRIT] [-] [core] at FSReqWrap.wrapper [as oncomplete] (fs.js:681:17) 2017-03-10T16:21:07.774Z [NOTICE] [-] [core] Shutting down
Steps to reproduce