SRF-Consulting-Group-Inc / iris

Intelligent Roadway Information System
GNU General Public License v2.0

Gate polling stops working correctly #42

Closed by gordonparikh 1 year ago

gordonparikh commented 2 years ago

Under rare conditions, polling of Gate NDORv5 devices (and potentially others) can stop working properly. A server restart is then required to resume normal polling. The issue was observed on D6-80-126-EB and seems to be related to Issue #41.

Details on the issue:

General Notes

Polling History Review

Code Review

Polling Operation Drops

Error Handling

Null Pointer Exceptions

gordonparikh commented 2 years ago

Attempted to replicate the issue on the staging server (NDOTIRIS03). Initially there were no issues: the controller was online, so polling was successful. Changing the IP address to an unused one (to simulate the controller being offline) and waiting one polling cycle caused the issue to emerge. Restarting the server (without changing the IP from the "offline" one) did not clear the issue, but restoring connectivity and then restarting did.

The issue appears to require hitting a "connect timed out" error, after which polling stays broken until connectivity is restored and the server is restarted (in that order).
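For context, here is one general Java behavior that can make a single timeout stop periodic work for good. It is not confirmed to be the mechanism in IRIS, whose polling code is not shown in this thread. If a task scheduled with `ScheduledExecutorService.scheduleAtFixedRate` throws, all later executions are suppressed. The class name, method name, and the 10 ms period below are made up for illustration.

```java
import java.net.SocketTimeoutException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch; not taken from the IRIS code base. */
public class SuppressedPollSketch {

    /**
     * Runs a periodic "poll" that throws on the given poll number and
     * returns how many polls actually executed.
     */
    public static int countPolls(int failOnPoll) {
        ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();
        AtomicInteger polls = new AtomicInteger();
        ScheduledFuture<?> job = timer.scheduleAtFixedRate(() -> {
            if (polls.incrementAndGet() == failOnPoll) {
                // Runnable cannot throw checked exceptions, so the
                // simulated timeout is wrapped in an unchecked one.
                throw new IllegalStateException(
                    new SocketTimeoutException("connect timed out"));
            }
        }, 0, 10, TimeUnit.MILLISECONDS);
        try {
            try {
                job.get(); // blocks until the periodic task throws
            } catch (ExecutionException expected) {
                // The exception ended the periodic task.
            }
            // Leave room for roughly ten more polls if it were still scheduled.
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            timer.shutdownNow();
        }
        return polls.get();
    }

    public static void main(String[] args) {
        System.out.println("polls executed: " + countPolls(3));
    }
}
```

With `failOnPoll` set to 3, the count stays at 3 even though the executor is left running for another 100 ms, which is the "breaks and never recovers" pattern in miniature.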

gordonparikh commented 2 years ago

Potential fix c1fa4cf829003073df533db2d5000f7ee06cecd6 tested on development machine and staging. Appears to fix issue but will perform some additional testing.

gordonparikh commented 2 years ago

Added more null pointer checks in 5817596, and overall exception handling in the generic polling code in dd1d803.
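A minimal sketch of the kind of guard described above: null checks plus catch blocks around each poll, so one failing device does not stop the rest of the cycle. This is not the code from 5817596 or dd1d803; `PollGuardSketch`, `PollOperation`, and `runCycle` are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names come from the IRIS code base. */
public class PollGuardSketch {

    /** A single poll of one device (hypothetical interface). */
    public interface PollOperation {
        String poll() throws IOException;
    }

    /** Runs every operation once and returns one log line per operation. */
    public static List<String> runCycle(List<PollOperation> ops) {
        List<String> log = new ArrayList<>();
        for (PollOperation op : ops) {
            if (op == null) {
                // Null check: skip instead of throwing NullPointerException.
                log.add("skipped: null operation");
                continue;
            }
            try {
                String status = op.poll();
                log.add(status == null ? "no status" : "ok: " + status);
            } catch (IOException e) {
                // Covers SocketTimeoutException ("connect timed out").
                log.add("comm error: " + e.getMessage());
            } catch (RuntimeException e) {
                // Last-resort guard so the cycle keeps going.
                log.add("unexpected: " + e.getClass().getSimpleName());
            }
        }
        return log;
    }

    public static void main(String[] args) {
        List<PollOperation> ops = new ArrayList<>();
        ops.add(() -> "gate arm closed");
        ops.add(() -> { throw new SocketTimeoutException("connect timed out"); });
        ops.add(null);
        ops.add(() -> { throw new NullPointerException(); });
        ops.add(() -> "gate arm open");
        for (String line : runCycle(ops)) {
            System.out.println(line);
        }
    }
}
```

In this sketch the last operation still runs after a timeout, a null entry, and an unexpected runtime exception earlier in the same cycle.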