vapor-ware / synse-modbus-ip-plugin

Modbus over TCP/IP plugin for Synse
GNU General Public License v3.0
2 stars 3 forks source link

bugfix: second attempt at re-establishing manager connection on read failure #46

Closed edaniszewski closed 4 years ago

edaniszewski commented 4 years ago

This PR:

It not totally clear to me that this would work, but my thought is that by doing the reconnect after the loop, it would get all managers with errors to reset. Although I think it should have been doing this before, there may have been a leak in that the defer call was being made within the loop, so I'm not actually sure that all resets were being called.

I still want to take more time to look through this and see if I can identify why this error is happening, but just opening this PR for initial review.

MatthewHink commented 4 years ago

One option might be to reboot an EGauge in Wrigley and see if synse v2 recovers. If so, the issue is likely in pkg/devices/common.go

The diff shows me that this is not a straight port to synse v3, this is a massive refactor as well where a lot of things got changed around.

Showing with 3,584 additions and 7,541 deletions: https://github.com/vapor-ware/synse-modbus-ip-plugin/compare/master...v3/staging?expand=1

edaniszewski commented 4 years ago

updated the device manager to no longer cache the client. I suspect the fact that it has been doing thing could be the reason we're seeing stale connection issues that werent resetting. The v2 build does not cache and I haven't seen this problem arise there, so I think this should work. It does make various tests more difficult to write, so I had to disable some. Will circle back later to try and figure out how to write those tests.

Will attempt to test this in a deployment tomorrow morning.

edaniszewski commented 4 years ago

Tested these changes on a v3 deployment and it looks like it fixes the issue.

edaniszewski commented 4 years ago

A version of this branch has been deployed for a few days now and it appears to have cleared up the issue we were seeing with stale client connection. I believe this is ready for merge and a new version bump.