plcpeople / nodeS7

Node.JS library for communication to Siemens S7 PLCs
MIT License
358 stars 121 forks source link

Sequence number collision between read and write packets #91

Closed gfcittolin closed 4 years ago

gfcittolin commented 4 years ago

We've been experiencing multiple and periodic connection drops in some setups, and after analyzing and simulating the issue, we found out that there can be sequence number collisions between read and write packets. When this happens, the response packet can get assigned to the wrong request (e.g. a write response is identified as a read response or vice-versa), and the other request times out, dropping the connection.

This is clearly verifiable in these logs:

...
[431645,347044484 192.168.15.211 S1] Preparing to WRITE W5 to value false
[431645,347411669 192.168.15.211 S1] Clearing write PacketTimeouts
[431645,348011926 192.168.15.211 S1] Sending Write Packet With Sequence Number 32765
[431645,356699946 192.168.15.211 S1] Received 1 bytes of S7-data from PLC.  Sequence number is 32765
[431645,356971872 192.168.15.211 S1] Time is 0 seconds and 9.4 ms.
[431645,357077288 192.168.15.211 S1] Received all packets in writeResponse
[431645,357149683 192.168.15.211 S1] DB100,X5.0 write completed with quality OK
[431645,357275724 192.168.15.211 S1] We are calling back our writeDoneCallback.
[431645,429899817 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431645,430330752 192.168.15.211 S1] Calling SRP from RAI
[431645,430447522 192.168.15.211 S1] SendReadPacket called
[431645,430878457 192.168.15.211 S1] Sending Read Packet
[431645,438805909 192.168.15.211 S1] Received 10 bytes of S7-data from PLC.  Sequence number is 2
[431645,439044293 192.168.15.211 S1] ReadResponse called
[431645,439204448 192.168.15.211 S1] Time is 0 seconds and 8.6 ms.
[431645,439331687 192.168.15.211 S1] Address DB100,X0.0 has value true and quality OK
[431645,439466061 192.168.15.211 S1] Address DB100,X1.0 has value false and quality OK
[431645,439541113 192.168.15.211 S1] Address DB100,X2.0 has value false and quality OK
[431645,439615123 192.168.15.211 S1] Address DB100,X3.0 has value false and quality OK
[431645,439687101 192.168.15.211 S1] Address DB100,X4.0 has value false and quality OK
[431645,439759809 192.168.15.211 S1] Address DB100,X5.0 has value false and quality OK
[431645,439843663 192.168.15.211 S1] We are calling back our readDoneCallback.
[431645,818854864 192.168.15.211 S1] Preparing to WRITE R2 to value true
[431645,819438663 192.168.15.211 S1] Clearing write PacketTimeouts
[431645,820227200 192.168.15.211 S1] Sending Write Packet With Sequence Number 32766
[431645,827194032 192.168.15.211 S1] Received 1 bytes of S7-data from PLC.  Sequence number is 32766
[431645,827467833 192.168.15.211 S1] Time is 0 seconds and 7.8 ms.
[431645,827578093 192.168.15.211 S1] Received all packets in writeResponse
[431645,827652623 192.168.15.211 S1] DB100,X2.0 write completed with quality OK
[431645,827762623 192.168.15.211 S1] We are calling back our writeDoneCallback.
[431645,827835122 192.168.15.211 S1] Preparing to WRITE R3 to value true
[431645,828185954 192.168.15.211 S1] Clearing write PacketTimeouts
[431645,828733763 192.168.15.211 S1] Sending Write Packet With Sequence Number 32767
[431645,830104275 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431645,830321357 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431645,837611834 192.168.15.211 S1] Received 1 bytes of S7-data from PLC.  Sequence number is 32767
[431645,837892041 192.168.15.211 S1] Time is 0 seconds and 9.5 ms.
[431645,838005269 192.168.15.211 S1] Received all packets in writeResponse
[431645,838080529 192.168.15.211 S1] DB100,X3.0 write completed with quality OK
[431645,838172508 192.168.15.211 S1] We are calling back our writeDoneCallback.
[431645,838244955 192.168.15.211 S1] Preparing to WRITE W4 to value true
[431645,838608495 192.168.15.211 S1] Clearing write PacketTimeouts
[431645,839170054 192.168.15.211 S1] Sending Write Packet With Sequence Number 1
[431645,847700575 192.168.15.211 S1] Received 1 bytes of S7-data from PLC.  Sequence number is 1
[431645,847992084 192.168.15.211 S1] Time is 0 seconds and 9.2 ms.
[431645,848099844 192.168.15.211 S1] Received all packets in writeResponse
[431645,848191145 192.168.15.211 S1] DB100,X4.0 write completed with quality OK
[431645,848259322 192.168.15.211 S1] We are calling back our writeDoneCallback.
[431645,848330624 192.168.15.211 S1] Preparing to WRITE W5 to value true
[431645,848678278 192.168.15.211 S1] Clearing write PacketTimeouts
[431645,849245410 192.168.15.211 S1] Sending Write Packet With Sequence Number 2
[431645,858119783 192.168.15.211 S1] Received 1 bytes of S7-data from PLC.  Sequence number is 2
[431645,858385615 192.168.15.211 S1] ReadResponse called
[431645,858491448 192.168.15.211 S1] WARNING: Received a read response packet that was not marked as sent
[431645,931104291 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431645,931394133 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431646,32192274 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431646,32461543 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431646,133218903 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431646,133496923 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431646,235747712 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431646,236049741 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431646,336414239 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431646,336672727 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431646,437023215 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431646,437306651 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431646,538080417 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431646,538377395 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431646,639184442 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431646,639479545 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431646,739817376 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431646,740162478 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431646,840663174 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431646,840926297 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431646,941263973 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431646,941572252 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431647,41942063 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431647,42223311 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431647,142665517 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431647,142950776 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431647,243743240 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431647,244019592 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431647,344409663 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431647,344671328 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431647,445097805 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431647,445393741 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431647,545805687 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431647,546097039 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431647,646884452 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431647,647166221 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431647,747632906 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431647,747923008 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431647,848435735 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431647,848712035 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431647,949158564 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431647,949443094 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431648,49943945 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431648,50282693 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431648,150643493 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431648,150916148 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431648,251703039 192.168.15.211 S1] Reading All Items (readAllItems was called)
[431648,251997829 192.168.15.211 S1] Waiting to read for all R/W operations to complete.  Will re-trigger readAllItems in 100ms.
[431648,348308706 192.168.15.211 S1] PacketTimeout called with type write and seq 2
[431648,348586725 192.168.15.211 S1] WRITE TIMEOUT on sequence number 2
[431648,348694225 192.168.15.211 S1] ConnectionReset from write packet timeout.
[431648,348798078 192.168.15.211 S1] ConnectionReset has been called to set the reset as pending
[431648,348899848] We must have timed Out - we have no response to process
[431648,348971619] Stopping Processing Write Response Packet due to unrecoverable packet error
[431648,349068025 192.168.15.211 S1] Time is 2 seconds and 500.2 ms.
[431648,349156930 192.168.15.211 S1] Received all packets in writeResponse
[431648,349243232 192.168.15.211 S1] DB100,X5.0 write completed with quality BAD
[431648,349315575 192.168.15.211 S1] Calling reset from writeResponse as there is one pending
[431648,349470782 192.168.15.211 S1] ResetNOW is happening
[431648,349561042 192.168.15.211 S1] Connection cleanup is happening
[431648,350220674 192.168.15.211 S1] Clearing read PacketTimeouts
[431648,350399475 192.168.15.211 S1] Clearing write PacketTimeouts
[431648,351509104 192.168.15.211 S1] <initiating a new connection Mon Dec 23 2019 12:52:40 GMT-0300 (GMT-03:00)>
[431648,351709623 192.168.15.211 S1] Attempting to connect to host...
...

Tracking down the issue, the sequence number of read packets are determined by the time they're being prepared, and from there on, they're always sent with the same sequence number (usually 2, if a read request was the first action). On the other side, the write packets have a new sequence number assigned each time a write request is performed. So, after many writes, the sequence number overflows and wraps to 1, and by the next write it has the same sequence number as the read request. Because the responses are identified by their sequence number and a response is tested for a read request first, the write one never gets acknowledged and times out, as seen on the logs.

The suggested fix (that I'm currently working on, actually) is to set a new sequence number to read packets just before sending them (instead of at creation time), the same way write packets do, eliminating the possibility of collisions.

plcpeople commented 4 years ago

Thanks for the fix, that looks much improved - I will publish it later in the day.