Closed gordonparikh closed 1 year ago
Attempted to replicate the issue on the staging server (NDOTIRIS03). Initially there were no issues: the controller was online, so polling succeeded. Changing the controller's IP address to an unused one (to simulate the controller being offline) and waiting one polling cycle caused the issue to emerge. Restarting the server while the IP was still set to the "offline" address did not clear the issue, but restoring connectivity and then restarting did.
Issue appears to require hitting a "connect timed out" error, at which point polling will break permanently until connectivity is restored and the server is restarted (in that order).
Potential fix c1fa4cf829003073df533db2d5000f7ee06cecd6 tested on development machine and staging. Appears to fix issue but will perform some additional testing.
Added more null pointer checking in 5817596, and overall exception handling to generic polling code in dd1d803.
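For illustration only, here is a minimal sketch of the kind of defensive handling described above: a poll wrapper that checks for nulls and catches timeouts and unexpected runtime exceptions, so that one failed cycle does not stop later cycles. This is not the code from the commits referenced here. `PollingSketch`, `PollOperation`, `pollSafely`, and `runCycles` are hypothetical names invented for this example.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.util.ArrayList;
import java.util.List;

class PollingSketch {

    /** Hypothetical stand-in for a single device poll (not an actual project interface). */
    interface PollOperation {
        String poll() throws IOException;
    }

    /** Run one poll; report failures as a status string instead of letting them escape. */
    static String pollSafely(PollOperation op) {
        if (op == null)
            return "SKIPPED: no operation";
        try {
            String result = op.poll();
            // Guard against a null response rather than dereferencing it later
            return (result != null) ? "OK: " + result : "ERROR: null response";
        } catch (SocketTimeoutException e) {
            // e.g. "connect timed out" while the controller is offline
            return "TIMEOUT: " + e.getMessage();
        } catch (IOException e) {
            return "IO_ERROR: " + e.getMessage();
        } catch (RuntimeException e) {
            // Catch-all (including NullPointerException) so the loop keeps running
            return "ERROR: " + e;
        }
    }

    /** Poll repeatedly; every cycle runs even if earlier cycles failed. */
    static List<String> runCycles(PollOperation op, int cycles) {
        List<String> results = new ArrayList<>();
        for (int i = 0; i < cycles; i++)
            results.add(pollSafely(op));
        return results;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated device: first poll times out, later polls succeed
        PollOperation op = () -> {
            if (calls[0]++ == 0)
                throw new SocketTimeoutException("connect timed out");
            return "status";
        };
        for (String r : runCycles(op, 3))
            System.out.println(r);
    }
}
```

The point of the sketch is that the timeout on the first cycle is recorded and the following cycles still execute. Whether the actual fix is structured this way is an assumption.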
Rare conditions seem to be able to cause polling of Gate NDORv5 devices (and potentially others) to stop working properly. A server restart is then required to resume normal polling. Issue was observed on D6-80-126-EB. This seems to be related to Issue #41.
Details on the issue:
General Notes
Polling History Review
Code Review
Polling Operation Drops
Error Handling
Null Pointer Exceptions