Steve-Mcl / node-red-contrib-mcprotocol

A Node-red node for communicating with mitsubishi PLCs using MC Protocol over Ethernet
https://github.com/Steve-Mcl/node-red-contrib-mcprotocol
21 stars 18 forks source link

NodeRed crash, when setting MC connection #8

Open jinx17 opened 3 years ago

jinx17 commented 3 years ago

Every time I set connection parameters NodeRed crashes. When I restart service, everything works as expected.

Some info for debugging - from start of node-red to crash, when I changed setting of connection node (PLC connection) PLEASE NOTE I use port 6551, this is not a mistake or error.

` 27 Jan 16:36:49 - [mcprotocol] adding new connection to pool ~ mcprotocol: {host:'10.144.240.22', port: 6551} [2021-01-27T15:36:49.833Z 793782 ] 'plcType' set is A [2021-01-27T15:36:49.839Z 799843 10.144.240.22] 27 Jan 16:36:49 - [info] Started flows 27 Jan 16:36:49 - [mcprotocol] connected ~ mcprotocol: {host:'10.144.240.22', port: 6551} 27 Jan 16:36:51 - [warn] Communication send error: Error: WebSocket is not open: readyState 3 (CLOSED) 27 Jan 16:37:29 - [info] Stopping modified nodes 27 Jan 16:37:29 - [mcprotocol] closing connection ~ mcprotocol: {host:'10.144.240.22', port: 6551} 27 Jan 16:37:29 - [mcprotocol] deleting connection from pool ~ mcprotocol: {host:'10.144.240.22', port: 6551} 27 Jan 16:37:29 - [info] Stopped modified nodes 27 Jan 16:37:29 - [info] Starting modified nodes 27 Jan 16:37:29 - [mcprotocol] adding new connection to pool ~ mcprotocol: {host:'10.144.240.22', port: 6551} [2021-01-27T15:37:29.569Z 529683 ] 'plcType' set is A [2021-01-27T15:37:29.570Z 530977 10.144.240.22] 27 Jan 16:37:29 - [info] Started modified nodes 27 Jan 16:37:29 - [mcprotocol] error ~ mcprotocol: {host:'10.144.240.22', port: 6551}: Error: connect ECONNREFUSED 10.144.240.22:6551 Error: connect ECONNREFUSED 10.144.240.22:6551 at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1138:16) { errno: -111, code: 'ECONNREFUSED', syscall: 'connect', address: '10.144.240.22', port: 6551 } Error: connect ECONNREFUSED 10.144.240.22:6551 at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1138:16) { errno: -111, code: 'ECONNREFUSED', syscall: 'connect', address: '10.144.240.22', port: 6551 } [2021-01-27T15:37:29.591Z 551323 10.144.240.22] We Caught a connect error ECONNREFUSED 27 Jan 16:37:29 - [mcprotocol] connection closed ~ mcprotocol: {host:'10.144.240.22', port: 6551} 27 Jan 16:37:29 - [error] [MC Write:Zapiši v PLC] Error: Not connected! 27 Jan 16:37:30 - [error] [MC Write:Zapiši v PLC] Error: Not connected! 27 Jan 16:37:31 - [red] Uncaught Exception: 27 Jan 16:37:31 - Error: read ECONNRESET at TCP.onStreamRead (node:internal/stream_base_commons:213:20)

`

Steve-Mcl commented 3 years ago

Hi, thanks for the feedback. I have had tested this many times without issue. Additionally, i use this node in production where the end device often goes offline. In fairness, I rarely use TCP due to the dreadful reliability of Mitsubishi TCP (ports lock up requiring a PLC power cycle to recover.

Can you tell me a bit about your environment (node js version, node-red version, OS, hardware)

jinx17 commented 3 years ago

Hi, sure, node: v15.5.0 NodeRed: v1.2.6 OS: DietPi HW: RPi4 PLC: I am testing a Mitsubishi FX3U "clone" with ethernet from Amsamotion (China)

Since there was a problem also with timeout settings, I have a feeling this could be linked to the node.js version ( I just realised it is version 15!?) and/or node-red version.

jinx17 commented 3 years ago

I did a node.js downgrade to v12.20.1

I also upgraded node-red to the latest version v1.2.7

Again, when changing settings NodeRed crashes:

28 Jan 08:23:27 - [info] Started modified nodes 28 Jan 08:23:27 - [mcprotocol] error ~ mcprotocol: {host:'10.144.240.22', port: 6551}: Error: connect ECONNREFUSED 10.144.240.22:6551 Error: connect ECONNREFUSED 10.144.240.22:6551 at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1144:16) { errno: 'ECONNREFUSED', code: 'ECONNREFUSED', syscall: 'connect', address: '10.144.240.22', port: 6551 } Error: connect ECONNREFUSED 10.144.240.22:6551 at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1144:16) { errno: 'ECONNREFUSED', code: 'ECONNREFUSED', syscall: 'connect', address: '10.144.240.22', port: 6551 } [2021-01-28T07:23:27.060Z 50882 10.144.240.22] We Caught a connect error ECONNREFUSED 28 Jan 08:23:27 - [mcprotocol] connection closed ~ mcprotocol: {host:'10.144.240.22', port: 6551} 28 Jan 08:23:29 - [red] Uncaught Exception: 28 Jan 08:23:29 - Error: read ECONNRESET at TCP.onStreamRead (internal/stream_base_commons.js:209:20)

jinx17 commented 3 years ago

I managed to force this crash with same error, but in another way:

28 Jan 08:51:37 - Error: read ECONNRESET at TCP.onStreamRead (internal/stream_base_commons.js:209:20)

How did I reproduce this error: in succsession quickly sending alternating messages ("connect" and "disconnect") to MC-read node a few times resulted in crash.

jinx17 commented 3 years ago

Update: when there are NO MC-READ or MC-WRITE nodes present, only configuration node, changing the settings does not crash node-red!

So it looks it is connected with starting (opening / closing TCP port) connection to the PLC, as the error also suggests.

BTW: at the same time I use Modbus connection to the same PLC, and this node perfectly connects and disconnects or reconnects (on PLC power off) every time.

Steve-Mcl commented 3 years ago

thanks for doing all this investigation. I was going to suggest moving to NODE v14 but you have already downgraded to V12 (should work on V12)

I will try to get a look at this issue ASAP & let you know.

If you discover any more details or clues, please add to the issue.

cheers.

Steve-Mcl commented 3 years ago

Hi, I pushed a fix for this. Please update & test. V1.2.1 now in flow library

How did I reproduce this error: in succsession quickly sending alternating messages ("connect" and "disconnect") to MC-read node a few times resulted in crash.

I have tested this in V1.2.1 - I cannot recreate it - I suspect it was related to the wierd "msecs" issue?

Please test and let me know.

cheers.

jinx17 commented 3 years ago

Hi, unfortunatelly the problem persists.

I can report another observation: When I first disconnect communication (sending msg.topic = "disconnect" to any of mc nodes), I can modify configuration node without problem and comit change in Node-red. At the same time, when change is accepted, mc-read and mc-write nodes get automatically connected again without problem or crash.

When I change configuration node while in connected state, node-red crash is imminent.

So, in my opinion, there is a problem with TCP port connection/disconnection protocol handling (or waiting (timeout) to resolve issues, before moving forward with connection).

observation 1: updating configuration node while connection is active - nodered crash ** Please look at this - before crashing, there is first error stating ECONNREFUSED for all mc-nodes (read and write) 29 Jan 07:29:37 - [mcprotocol] error ~ mcprotocol: {host:'10.144.240.22', port: 6551}: Error: connect ECONNREFUSED 10.144.240.22:6551 Error: connect ECONNREFUSED 10.144.240.22:6551 at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1144:16) { errno: 'ECONNREFUSED', code: 'ECONNREFUSED', syscall: 'connect', address: '10.144.240.22', port: 6551 }

but at the end, we get error 29 Jan 07:29:39 - [red] Uncaught Exception: 29 Jan 07:29:39 - Error: read ECONNRESET at TCP.onStreamRead (internal/stream_base_commons.js:209:20)

*** Observation 2: updating configuration node while not connected nodered OK. ** [2021-01-29T06:23:30.375Z 304978 10.144.240.22] We Caught a read/write error ECONNRESET - resetting connection 29 Jan 07:23:30 - [mcprotocol] error ~ mcprotocol: {host:'10.144.240.22', port: 6551}: Error: read ECONNRESET Error: read ECONNRESET at TCP.onStreamRead (internal/stream_base_commons.js:209:20) { errno: 'ECONNRESET', code: 'ECONNRESET', syscall: 'read' }

jinx17 commented 3 years ago

UPDATE: I changed settings in mcprotocol.js of keepalive to 15 seconds (original is 2.5 seconds = 2500): line 418: self.netClient.setKeepAlive(true, 15000); // For reliable unplug detection in most cases - although it takes 10 minutes to notify

Observe timestamp. Until here, programm is still working - stating "connection closed" at 8:17:25, and then starts process of connection "self.netClient = net.connect(cParam, function ()". KeepAlive now waits 15 seconds now (until 8:17:40), and then it crashes - probably on this command: "self.onTCPConnect.apply(self, arguments);"

There is an uncaught exeption which should be handled somehow to prevent crash. I don't know how :(

[2021-01-29T07:17:25.247Z 177092 10.144.240.22] We Caught a connect error ECONNREFUSED 29 Jan 08:17:25 - [mcprotocol] connection closed ~ mcprotocol: {host:'10.144.240.22', port: 6551} 29 Jan 08:17:40 - [red] Uncaught Exception: 29 Jan 08:17:40 - Error: read ECONNRESET at TCP.onStreamRead (internal/stream_base_commons.js:209:20)

jinx17 commented 3 years ago

IMPORTANT UPDATE:

I managed to get it working (paritaly), in a way - nodered does not crash on setting connection node!

I simply commented out KeepAlive line of code in mcprotocol.js //self.netClient.setKeepAlive(true, 2500); // For reliable unplug detection in most cases - although it takes 10 minutes to notify

New behaviour I see:

So, the culprit is obviously a process handling KeepAlive. Unfortunatelly I am far from beeing able to debug this :( Not a SW programmer at all!

NEW/OLD problem:

THERE HAS TO BE SOME WAY TO HANDLE "Uncaught Exception"

jinx17 commented 3 years ago

EVEN MORE IMPORTANT UPDATE :)

Is this the sollution?

I somehow managed to make it work, and this is how:

Now, I can edit configuration node and deploy without problems. Also msg.topic "disconnect" and "connect" are working. The only thing that isn't the same as before - when changing setting node and deploy, nodes are still disconnected, and I have to manually send a connect msg.topic for it to start communicating.

I read this: https://stackoverflow.com/questions/9191587/how-to-disconnect-from-tcp-socket-in-nodejs and decided to change socket.end to socket.destroy, and it worked.

With END function, program "Half-closes the socket. i.e., it sends a FIN packet. It is possible the server will still send some data."

I DON'T KNOW HOW THIS AFFECTS PROGRAMM EXECUTIONS OR COMPUTER RESOURCES / MEMORY, but in my opinion this shouldn't provide problems.

STILL SOME PROBLEMS:

Steve-Mcl commented 3 years ago

STILL SOME PROBLEMS:

there is no recovery (node-red can not re-connect) when link to the PLC is lost (I disconnect computer from the network). Even changing settings node doesn't help. Looks like the TCP port is somehow blocked and not released.

Unfortunately, in my 14 years dealing with Mitsi hardware, I have yet to find a good solution. I have had occasional success unplugging the Ethernet (from the PLC) and waiting 20 minutes. I have tried triggering the init sequence in the ladder (A and QnA series). I have tried many things over the years. The only real solution is to power cycle the PLC.

I have expressed my concerns to Mitsubishi reqs and technical support - even to senior management at one time.

This is why I now only use UDP - it is far more reliable on the Mitsi PLCs. If you have the option to use UDP 4E then that would be my strong suggestion.

on two places in mcprotocol.js there are these function calls: "self.netClient.end();" I changed them to "self.netClient.destroy();"

I will look into that - thanks very much for taking the time to find a potential solution.

You posts have given me some ideas for testing

I'll let you know what I find.

jinx17 commented 3 years ago

Hi, it would be usefull to get "unhandled exception" reported to info console of Node-red. There has to be a way to do this, but I have no knowledge about this. I think it has to be done in function: MCProtocol.prototype.connectNow (mcprotocol.js). There are only few declared events.

I read somewhere catching unhandled exception should be done this way (but how to use it in a programm?): process.on('uncaughtException', function (err) { console.error((new Date).toUTCString() + ' uncaughtException:', err.message) console.error(err.stack) process.exit(1) })

jinx17 commented 3 years ago

This is really silly :)

I found a way to trick it to work, in a way. It works if I power off / on a PLC, or if I disconnect ethernet cable! I know this is a "bush fix" but maybe it can help you understand / debug the problem. It is obvious there is a problem how sockets are handled. Now I even don't get an unhandled exception, obviously because I destroy socket before it happens - but of course I get a lot of other errors, when mc-write is trying to send data and socket is destroyed, etc. But node-red still works. I also don't know what is happening to the memory on the OS side where node-red is runninng, and a lot of other stuff.

Bush-fix: On every "strange" socket event I simply destroy a socket. In node-red I check if ms-read or mc-write node sends an error message. If it does in 10 seconds a function node sends a topic to connect back to one mc-read node. these are changes in a function MCProtocol.prototype.connectNow = function (cParam, suppressCallback)

        self.netClient.on('error', function () {
            self.netClient.destroy();                                                                            
//          self.connectError.apply(self, arguments);                                                                           
        });

        self.netClient.on('disconnect', function () {
            self.netClient.destroy();                                                                            
//          self.netClient.reconnect();                                                                        
        });

        self.netClient.on('close', function () {
            self.netClient.destroy();                                                                            
//          self.onClientDisconnect.apply(self, arguments);                                                            
        });

        self.netClient.on('timeout', function () {
            self.netClient.destroy();                                                                            
//          self.onClientDisconnect.apply(self, arguments);                                                            
        });

Next thing is obviously to try and destroy only on error, and check if node-red crushes, then add another event with destroy (ie disconnect) and so on...

Steve-Mcl commented 3 years ago

Hey @jinx17 are you still using the above modifications with success?

I am sorry it has been a while, this issue slipped my mind while real life took over.

Let me know how things stand.

Steve-Mcl commented 3 years ago

@jinx17 do you have any further feedback please?

jinx17 commented 3 years ago

Hi Steve, unfortunatelly in the meantime I used this device for other debugging and tests, so I wasn't running node-red for some months now - and I can not use it right now :(. Therefore I can not give you more informations, except the info I wrote here (bush fix, etc) before.