plcpeople / nodeS7

Node.JS library for communication to Siemens S7 PLCs
MIT License
363 stars 122 forks source link

`onISOConnectReply` throwing "write after end" #46

Closed gfcittolin closed 6 years ago

gfcittolin commented 6 years ago

Hi all,

A problem has been reported on netsmarttech/node-red-contrib-s7, where node eventually throws with the following stack trace:

30 Oct 11:42:37 - [red] Uncaught Exception:
30 Oct 11:42:37 - Error: write after end
    at writeAfterEnd (_stream_writable.js:193:12)
    at Socket.Writable.write (_stream_writable.js:244:5)
    at Socket.write (net.js:658:40)
    at NodeS7.onISOConnectReply (/home/pi/.node-red/node_modules/nodes7/nodeS7.js:306:17)
    at Socket.<anonymous> (/home/pi/.node-red/node_modules/nodes7/nodeS7.js:262:26)
    at emitOne (events.js:96:13)
    at Socket.emit (events.js:188:7)
    at readableAddChunk (_stream_readable.js:176:18)
    at Socket.Readable.push (_stream_readable.js:134:10)
    at TCP.onread (net.js:548:20)

I haven't done a full analysis/reproduction yet, but from the stack trace we can already guess the onISOConnectReply is trying to write on a socket that has been already closed (isoclient.end() has been called)

One of the possibilities I've thought about is, if dropConnection() is called, isoclient.end() is also called and the PLC may eventually send some data. If that would be the state where we're waiting for the ISO connect reply, this might happen.

Have you ever seen this?

plcpeople commented 6 years ago

I have not seen this before but all our own applications "run forever", they never call dropConnection and will reconnect automatically if the connection is dropped.

Reading a bit about this I think it is bad practice to just call isoclient.end() without waiting for any active reads/writes to complete, maybe it "usually" works depending on timing but not always. I think the real fix is to go to a dropPending state when dropConnection is called, and when all packets are done sending, then call isoclient.end(). (Or check the state of the socket at least)

Was the user that reported this able to reproduce the issue regularly?

plcpeople commented 6 years ago

We saw an issue similar to this, and I believe it is due to the "on error" handler being removed/disabled and not turned back on until a reconnection is attempted, which can cause this error. Hopefully that is the end of this issue.