Closed NickWaterton closed 3 years ago
I have been having difficulties with this for a few days now. I have one light that gives errors quite often, and so have implemented a retry mechanism.
However, there is a problem. When an error message is received, like this:
<?xml version="1.0"?><Event seqnum="264" sid="uuid:116"><control>ERR</control><action>1</action><node>22 17 16 1</node><eventInfo></eventInfo></Event>
This triggers a status_event
notification like this:
[2020-12-11 09:58:19,067][DEBUG](pyisy ) CHANGE Notification Received {'address': '22 17 16 1', 'status': 0, 'last_changed': datetime.datetime(2020, 12, 11, 9, 58, 19, 67420), 'last_update': datetime.datetime(2020, 12, 11, 9, 58, 19, 67401)}
However the status value is incorrect. this is because when you send the node.turn_off()
command, the "hint" mechanism updates the node._status
value to 0, even though the status of the node has not actually changed.
So, if you send a turn_off()
command, and there is an error, you get an error notification, which triggers a changed event, with status 0, indication that the light has turned off, when in fact what you are receiving is an error notification that the light did not respond. The same happens with turn_on()
or any other command. You can see how this is a problem.
With the new Asyncronous library, I can actually see no use for the hint mechanism at all. Either the light turns on/off and the status gets updated when it responds, or it doesn't, in which case the status is correct not to change.
I have disabled the hint update mechanism in node
, and everything now works as expected. The status reflect the actual ight status, and a retry mechanism is easy to implement, and does in fact work perfectly with my problem light that needs 2 or 3 retries before it responds.
Are there some circumstances where the hint mechanism is needed? say the light does actually change, but you get a communications error, so you assume the command worked? Just wondering what the hint mechanism is for.
I would suggest removing the hint mechanism in favor of real-time updates.
Thanks for the input. Sorry I haven't had much time to dig in to this yet, but I hope to in the next few weeks.
I will play with turning off the hint again. I tried it on the original take of v2, but I hated that the toggles would 'bounce' in Home Assistant's Lovelace interface. Hopefully things are a bit quicker in V3, but I may also just try reworking the hint mechanics all together to provide an internal 'pending' status that doesn't trigger the events.
Are there some circumstances where the hint mechanism is needed?
Expanding on this... The biggest reason I think that hint was added by @rmkraus originally would be interface lag. You click a toggle, it should go to ON, but the status hasn't updated back from the ISY so the interface moves it back to OFF, then the ISY updates the status so it turns to ON. This is the 'bounce' I mentioned above.
This does have its pitfalls for when the ISY doesn't actually do what you commanded and your interface is out of sync.
This would be easier if the ISY's response to the REST command told you if the end device actually responded to the command, then you could just watch for the response, and update accordingly. However an 200 OK
response just tells you the ISY accepted the command and is processing it/sending it the devices.
This is where I think there's some room for a pending status/command timeout structure to take the place of the hint. Basically we'd assume the status changes on a successful command to the ISY (200 OK
), but do a better job of watching for communication errors reported by the ISY, and revert the status if it failed.
I actually implemented a retry mechanism. You send the command, and add it to a pending_commands dictionary. If the change state event comes in, the state is published and the command removed from the pending_command dict. If the state has not updated after 5 seconds, the command is resent. There are 5 retries before I give up.
Nick Waterton P.Eng Sent from my iPhone
On Dec 18, 2020, at 8:20 PM, shbatm notifications@github.com wrote:
Are there some circumstances where the hint mechanism is needed?
Expanding on this... The biggest reason I think that hint was added by @rmkraushttps://github.com/rmkraus originally would be interface lag. You click a toggle, it should go to ON, but the status hasn't updated back from the ISY so the interface moves it back to OFF, then the ISY updates the status so it turns to ON. This is the 'bounce' I mentioned above.
This does have its pitfalls for when the ISY doesn't actually do what you commanded and your interface is out of sync.
This would be easier if the ISY's response to the REST command told you if the end device actually responded to the command, then you could just watch for the response, and update accordingly. However an 200 OK response just tells you the ISY accepted the command and is processing it/sending it the devices.
This is where I think there's some room for a pending status/command timeout structure to take the place of the hint. Basically we'd assume the status changes, but do a better job of watching for communication errors reported by the ISY, and revert the status if it failed.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/automicus/PyISY/issues/145#issuecomment-748396796, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ACYJIZ6UWVXMESXLJHONUNLSVP5S5ANCNFSM4UQWYYVQ.
Unsubscribe from GE's commercial electronic messages: http://sc.ge.com/*casl-unsubscribe Désabonner des messages électroniques commerciaux de GE: http://sc.ge.com/*lcap-desabonnement
I'll take a look at your branch when I get time to play around.
An expanded command queue with retries for both REST call and comms errors is a good idea.
I’ll post it somewhere, I don’t think it’s up yet, I just subclass pyisy.
Nick Waterton P.Eng Sent from my iPhone
On Dec 18, 2020, at 8:36 PM, shbatm notifications@github.com wrote:
I'll take a look at your branch when I get time to play around.
An expanded command queue with retries for both REST call and comms errors is a good idea.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/automicus/PyISY/issues/145#issuecomment-748399022, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ACYJIZ6WCILIG2X2MVF7K5TSVP7Q3ANCNFSM4UQWYYVQ.
Unsubscribe from GE's commercial electronic messages: http://sc.ge.com/*casl-unsubscribe Désabonner des messages électroniques commerciaux de GE: http://sc.ge.com/*lcap-desabonnement
My implementation is here https://github.com/NickWaterton/ISY_MQTT/tree/main
As I say it just subclasses pyisy (but in my copy of pyisy the hint mechanism is disabled)
@NickWaterton - I opened a new PR for the V3.x fix, which just removes the hint feature entirely and relies on the ISY sending an update in the event stream (or a manual node.update()
call) to update the status of the node.
@bdraco - We talked about this a while back on Discord. Seems like a net zero effect in Home Assistant, Lovelace now has debounce on the UI toggles so everything seems to work fine without the hint. If a command fails from the ISY to the end device, the toggle will eventually (~3s) fall back to the old status (similar to my Zigbee2mqtt devices when they fail). Let me know if you think the subsequent communication error should be elevated to an exception to generate a toast message, right now it's just an error and an ERR
attribute will appear in the aux_properties
.
If the action the user is performing fails, we should raise an exception to feed back to the UI.
If the action fails later we don't really have a way to feed it back to the user except logging.
I think its fine to remove the optimistic state change since the websocket is fast enough to deliver the event back to the state machine.
XREF: shbatm/hacs-isy994#85 cc: @rccoleman
Thanks for the pointer to this discussion. It does seem like the retry mechanism that you're discussing would help with the issue that I'm seeing, but unrelated to the issue above. The typical case for me is that turning on a bunch of lights causes the ISY to sporadically respond with "404" (not found) errors via REST, resulting in the light not being turned on and the error mentioned in that issue. It's been happening forever with no indication that UDI will change the behavior, and it seems like an inappropriate error message to indicate that the Insteon network is "busy" (as UDI has diagnosed). In any case, I look forward to having commands retried if they fail.
Closing with new issue #184 for adding enhancement for retry commands and #154 to fix this original issue.
This looks to be true of the 2.x version as well, but I'm on 3.x.x.
When
auto_update
is enabled, oruse_websocket
is enabled, you get two notifications for every command sent. This doesn't happens withoutauto_update
enabled, because thenode.status
is not updated with thehint
.When you send a command eg
node.turn_on()
, and you are subscribed tostatus_events
for that node, you get twochanged
events triggered.This is because the
turn_on()
command updates thenode.status
with thehint
value, which triggers a notify event fromnodebase.py
:Then you get a second trigger when the actual device changes to the new state, and you get an ST notification, which triggers a
node.update_state
This triggers another notification event, because updating
node.status
does not updatenode._formatted
, soupdate_state
findsstate.formatted != self._formatted
isTrue
and triggers anotherstatus_events.notify
.I believe the error in in
node.py
async def update
which is supposed to updateself._status
notself.status
.ie:
Which works as expected.