thegridelectric / gw-scada-spaceheat-python

GridWorks SCADA for space heating
MIT License

Remove race condition in tracking 'fully subscribed' #132

Closed anschweitzer closed 1 year ago

anschweitzer commented 1 year ago

The data structures tracking whether all subscriptions had been acked were modified in two threads without locking and, unsurprisingly, a race condition sometimes caused a failure to resolve comm state. We saw this failure in CI often enough to be aggravating. These data structures are now modified only in the main message processing thread.

The most commonly observed symptom was that FragmentRunner.await_connect() would fail and a message such as "waiting for gridworks connect" would be present. The specific observed cause was that the response to the subscription message (and therefore the call to on_subscribe(mid)) arrived before the paho client subscribe() call had even returned, so the data structure containing the sent message ids was still empty in on_subscribe(mid) and the suback was therefore ignored.
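For readers unfamiliar with the pattern, here is a minimal sketch of the general idea of confining subscription tracking to a single thread. It is not the code in this PR: `SubscriptionTracker`, its method names, and the handling of early subacks are all hypothetical, and no paho calls are made. Other threads only enqueue events; one processing thread owns and mutates the state, so a suback that arrives before its "sent" record is remembered instead of ignored.

```python
import queue
import threading


class SubscriptionTracker:
    """Hypothetical tracker: state is mutated only inside drain()."""

    def __init__(self, expected):
        self.expected = expected      # number of subscriptions we wait for
        self.pending_mids = set()     # sent, not yet acked
        self.early_subacks = set()    # acked before the send was recorded
        self.acked = 0
        self._events = queue.Queue()  # thread-safe hand-off to the owner

    # The next two methods may be called from any thread (for example an
    # MQTT client callback thread). They never touch tracking state.
    def subscribe_sent(self, mid):
        self._events.put(("sent", mid))

    def suback_received(self, mid):
        self._events.put(("suback", mid))

    def drain(self):
        """Apply all queued events. Call only from the processing thread."""
        while True:
            try:
                kind, mid = self._events.get_nowait()
            except queue.Empty:
                return
            if kind == "sent":
                if mid in self.early_subacks:
                    # The suback won the race; count it now.
                    self.early_subacks.discard(mid)
                    self.acked += 1
                else:
                    self.pending_mids.add(mid)
            elif kind == "suback":
                if mid in self.pending_mids:
                    self.pending_mids.discard(mid)
                    self.acked += 1
                else:
                    # Do not ignore it: the send may not be recorded yet.
                    self.early_subacks.add(mid)

    @property
    def fully_subscribed(self):
        return self.acked == self.expected


def demo():
    """Reproduce the ordering from the description: suback before send."""
    tracker = SubscriptionTracker(expected=1)
    callback_thread = threading.Thread(target=tracker.suback_received, args=(7,))
    callback_thread.start()
    callback_thread.join()      # the suback is queued first
    tracker.subscribe_sent(7)   # the send is recorded afterwards
    tracker.drain()             # single-threaded state update
    return tracker.fully_subscribed


if __name__ == "__main__":
    print(demo())
```

The queue is the only object shared between threads, so no explicit lock is needed around the sets or the counter.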