Closed alexruffell closed 2 years ago
Having exactly the same problem on version core-2021.11.5. When using emulated Hue I had no problems like this, may have to cancel Nabu Casa and go back to Hue.
I had exactly the same problem with NabuCasa.
This is why I switched to alexa integration. After switching, I have not had any unresponsive devices. It also works much faster than before. And it is cheaper.
The only disadvantages:
Same issue, with a nearly identical setup to others in the linked thread. I have found a method to reduce the frequency. I created a webhook-based scene in HA and make a curl request to it from a Linux VM every minute. After about two weeks of testing, this appears to reduce the frequency of the issue somewhat, and of course the amount of family / spousal noise has been reduced. 😊
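For anyone who wants to try the same workaround without curl, here is a minimal Python sketch of a periodic webhook ping. It assumes the automation is reachable at a URL of the form `<base>/api/webhook/<webhook_id>`. The base URL and the webhook id in the usage note are placeholders, and the helper names (`webhook_url`, `ping`, `keep_alive`) are made up for illustration. This is not a confirmed fix, only a way to reproduce the "ping every minute" experiment described above.

```python
import time
import urllib.request


def webhook_url(base_url: str, webhook_id: str) -> str:
    """Build the webhook URL (assumed layout: <base>/api/webhook/<id>)."""
    return f"{base_url.rstrip('/')}/api/webhook/{webhook_id}"


def ping(url: str, timeout: float = 10.0) -> int:
    """POST an empty body to the webhook and return the HTTP status code."""
    request = urllib.request.Request(url, data=b"", method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def keep_alive(url, interval_s=60.0, max_pings=None, send=ping, sleep=time.sleep):
    """Ping the webhook every interval_s seconds.

    Runs forever when max_pings is None. Returns the number of pings
    that did not raise. `send` and `sleep` are injectable for testing.
    """
    ok = 0
    sent = 0
    while max_pings is None or sent < max_pings:
        try:
            send(url)
            ok += 1
        except OSError as err:  # urllib's URLError/HTTPError derive from OSError
            print(f"keep-alive ping failed: {err}")
        sent += 1
        # Only wait if another ping is still to come.
        if max_pings is None or sent < max_pings:
            sleep(interval_s)
    return ok
```

Usage would be something like `keep_alive(webhook_url("https://example.invalid", "keep-alive"))`, with your own base URL and webhook id substituted for the placeholders.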
Hey there @home-assistant/cloud, mind taking a look at this issue as it has been labeled with an integration (cloud) you are listed as a code owner for? Thanks!
(message by CodeOwnersMention)
Same issue here! It’s extremely annoying and seems to be getting worse.
Same identical issue here... and this has been happening for a long time, about a year and a half. No solution so far, unfortunately.
I have the same issue. It worked fine for nearly 2 years. Now this issue gets worse every day.
Same issue here. I never had any real issues and have been using NabuCasa for over a year. Now I get the same as OP; I can also see ‘device not responsive’ within the device list in the Alexa app. Issuing a command or refreshing resolves the glitch and returns it to ‘responding’.
How many devices do you all have exposed to Alexa? Last I checked I had 220. Could it be a timeout related to a high number of devices? I have had even more exposed without this issue but maybe something changed at Amazon or Nabu Casa
How many devices do you all have exposed to Alexa? Last I checked I had 220. Could it be a timeout related to a high number of devices? I have had even more exposed without this issue but maybe something changed at Amazon or Nabu Casa
I have less than 60 exposed to Alexa
I had a similar thought and reduced the amount from 112 to 88 and the issue remains. The only 'relief' I have found that appears to help for reasons unknown is a 'keep-alive' webhook automation I call from an Alexa routine every 30 minutes. The only reason I know it helps is there has been far less complaining from the spouse and kids since I put it into place when the issue began to manifest.
I agree with the others that this feels like a cold start issue with the compute host for Nabu Casa (Lambda? ECS/EKS?) or a rate limit issue on one or both sides.
-Ken
I have only 35 entities exposed and it's been happening to me too since at least 2021.5.0. At least that's when I set up the cloud integration for the first time. Has anyone tried to collect some logs like this yet?
logger:
  default: info
  logs:
    hass_nabucasa: debug
    homeassistant.components.cloud: debug
@mklenk I added the last two I was missing. Where do I get the logs? Under Supervisor / Core?
@KenKilty Sounds like installs with 35 to 220 devices are having the "device is not responding" issue, so I am guessing it has nothing to do with the number of devices exposed to Alexa. I don't know how to implement the keep-alive webhook but would be willing to try.
However, it would be nice to hear from someone at Nabu Casa on this issue. They have over 1000 tickets open so it may be a while :( Maybe we could ping superstar @frenck (sorry I know you are busy...) in case he knows who owns this...
@mklenk I added the last two I was missing. Where do I get the logs? Under Supervisor / Core?
Depends on your setup: https://www.home-assistant.io/integrations/logger/ (at the bottom). Btw. I just stole that snippet from https://github.com/home-assistant/core/issues/26437#issuecomment-536856280 so I'm not sure if those are even the correct logs.
I'm currently in the middle of a zigbee network migration. Once I've finished that, I can try to take a look myself too.
We've been trying to get to the bottom of this but have not been able to figure it out yet. We've made some tweaks that made some things better for some, but we still hear that people are having issues.
@balloob Please let us (us in this thread) know what we can do to help troubleshoot the issue. It is very disruptive to family acceptance, and Alexa gets yelled at a lot ;) The issue is very repeatable and some have found that it is related to a "keep alive" (see above). I wonder whether other factors such as VLANs, Pi-Hole, or more exotic networking setups/settings may be contributing (shooting in the dark here). I am glad to follow any suggestions you may have to help debug this issue.
Edit: at the very least, knowing what matters in this type of issue would help. Is networking setup/settings even relevant? Number of Echo devices? etc
I don't think it has anything to do with DNS and thus Pihole or the like. I use NextDNS (ex pihole user here).
Aggregate of all DNS queries in the past 30 days below from HA... Zero blocked requests using the default NextDNS block and tracker lists. Alexa calls seem abnormally high. Looking back farther there does appear to be a gradual increase in total calls. I have state reporting enabled and I've disabled noisy devices like smart plugs reporting power stats. I also run pfsense so maybe I'll try overriding the TTL to make the queries quicker.
Root Domains: aggregate of all queries made for root domains and all their subdomains.
.amazon.com 30,344
.amazonalexa.com 24,920
.home-assistant.io 24,691
.ecobee.com 19,914
.myq-cloud.com 14,912
.co2signal.com 7,967
Blocked Domains Domains blocked by a Security, Privacy and/or Parental Control setting or because they were manually denied. No domains yet.
We think it is due to communication between Alexa and Home Assistant Cloud servers. The report state is about sending state updates, not about responding to Alexa.
@balloob If the glitch were solely due to communication issues between Alexa (I assume you are referring to Amazon's Alexa servers) and Home Assistant Cloud servers, then wouldn't this issue affect 100% of Nabu Casa subscribers using the NC to integrate Alexa? If it doesn't affect everyone, then it might help to figure out what the affected parties have in common. Just brain storming...
I've been experiencing this issue quite a bit lately, too. My guess was that Nabu Casa might be using a serverless infrastructure like AWS Lambda, and that workers might need to be spun up to handle requests. That would explain the initial delay, and also why making keep-alive requests might help (by preventing request handlers from sleeping).
I've never seen Lambda take multiple seconds to spin up.
I'm also facing a huge delay between the command to Alexa and the action. Check this video made a few minutes ago: https://youtu.be/BU01SiYzhos I've never faced such a problem in the past... If I manually switch on the lights from the Home Assistant interface, everything runs in real time without delays.
@jason0x43 Today I was trying to cause the issue and after turning on/off a number of devices I ran into one (upstairs bathroom light) which caused the issue to happen a couple times in several attempts. I got the detailed log output thanks to @mklenk . I don't know what the output reveals so I won't post it here, but I would be glad to share it with @balloob if it can help. Oddly, the same device has a higher occurrence of this issue but it may be a mere coincidence.
EDIT: Adding the log output for what I believe to be a single failed command (turn off upstairs light). I removed the tokens and IDs. Is this of any use? If so, I can cause the issue and collect more logs.
`2021-12-01 09:29:19 DEBUG (MainThread) [hass_nabucasa.iot] Received message:
{'handler': 'alexa',
'msgid': 'msgid1-removed',
'payload': {'directive': {'endpoint': {'cookie': {},
'endpointId': 'light#upstairs_bathroom_light',
'scope': {'token': 'token_removed',
'type': 'BearerToken'}},
'header': {'correlationToken': 'token_removed',
'messageId': 'removed',
'name': 'TurnOn',
'namespace': 'Alexa.PowerController',
'payloadVersion': '3'},
'payload': {}}}}
2021-12-01 09:29:19 DEBUG (MainThread) [hass_nabucasa.iot] Publishing message:
{'msgid': 'msgid1-removed',
'payload': {'context': {'properties': [{'name': 'powerState',
'namespace': 'Alexa.PowerController',
'timeOfSample': '2021-12-01T15:29:19.0Z',
'uncertaintyInMilliseconds': 0,
'value': 'OFF'},
{'name': 'brightness',
'namespace': 'Alexa.BrightnessController',
'timeOfSample': '2021-12-01T15:29:19.0Z',
'uncertaintyInMilliseconds': 0,
'value': 0},
{'name': 'connectivity',
'namespace': 'Alexa.EndpointHealth',
'timeOfSample': '2021-12-01T15:29:19.0Z',
'uncertaintyInMilliseconds': 0,
'value': {'value': 'OK'}}]},
'event': {'endpoint': {'cookie': {},
'endpointId': 'light#upstairs_bathroom_light',
'scope': {'token': 'token_removed',
'type': 'BearerToken'}},
'header': {'correlationToken': 'token_removed',
'messageId': 'removed',
'name': 'Response',
'namespace': 'Alexa',
'payloadVersion': '3'},
'payload': {}}}}
2021-12-01 09:29:21 DEBUG (MainThread) [hass_nabucasa.iot] Received message:
{'handler': 'alexa',
'msgid': 'msgid2_removed',
'payload': {'directive': {'endpoint': {'cookie': {},
'endpointId': 'light#upstairs_bathroom_light',
'scope': {'token': 'token_removed',
'type': 'BearerToken'}},
'header': {'correlationToken': 'token_removed',
'messageId': 'removed',
'name': 'ReportState',
'namespace': 'Alexa',
'payloadVersion': '3'},
'payload': {}}}}
2021-12-01 09:29:21 DEBUG (MainThread) [hass_nabucasa.iot] Publishing message:
{'msgid': 'msgid2_removed',
'payload': {'context': {'properties': [{'name': 'powerState',
'namespace': 'Alexa.PowerController',
'timeOfSample': '2021-12-01T15:29:21.0Z',
'uncertaintyInMilliseconds': 0,
'value': 'OFF'},
{'name': 'brightness',
'namespace': 'Alexa.BrightnessController',
'timeOfSample': '2021-12-01T15:29:21.0Z',
'uncertaintyInMilliseconds': 0,
'value': 0},
{'name': 'connectivity',
'namespace': 'Alexa.EndpointHealth',
'timeOfSample': '2021-12-01T15:29:21.0Z',
'uncertaintyInMilliseconds': 0,
'value': {'value': 'OK'}}]},
'event': {'endpoint': {'cookie': {},
'endpointId': 'light#upstairs_bathroom_light',
'scope': {'token': 'token_removed',
'type': 'BearerToken'}},
'header': {'correlationToken': 'token_removed',
'name': 'StateReport',
'namespace': 'Alexa',
'payloadVersion': '3'},
'payload': {}}}}`
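To check whether the delay is inside Home Assistant at all, a small script can pair each "Received message" entry with the "Publishing message" entry that has the same `msgid` and report the gap. The sketch below assumes the debug log format shown in the excerpt above and that redacted `msgid` values are still distinct per request. The function name `response_gaps` is made up for illustration. Note that the log timestamps only have one-second resolution, and that this measures only the time Home Assistant takes to answer, not the time spent between Alexa and the cloud servers.

```python
import re
from datetime import datetime

# One hass_nabucasa.iot debug entry: timestamp, direction, and the first
# 'msgid' that follows it (entries may span several lines, hence DOTALL).
ENTRY = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) DEBUG \(MainThread\) "
    r"\[hass_nabucasa\.iot\] (Received|Publishing) message:"
    r".*?'msgid': '([^']+)'",
    re.DOTALL,
)


def response_gaps(log_text):
    """Return {msgid: seconds between 'Received' and 'Publishing'}."""
    received = {}
    gaps = {}
    for stamp, kind, msgid in ENTRY.findall(log_text):
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if kind == "Received":
            received[msgid] = when
        elif msgid in received:
            # Publishing entry that answers an earlier Received entry.
            gaps[msgid] = (when - received.pop(msgid)).total_seconds()
    return gaps
```

In the excerpt above, both entries of each pair carry the same timestamp, so the gap would come out as 0 seconds at this resolution.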
Is our issue possibly related to this HomeBridge issue:
https://github.com/NorthernMan54/homebridge-alexa/issues/459
It doesn't sound identical but also not too different. There are many comments on cloud tweaks (keepalive, etc) to mitigate the issue.
@aclock81 I would add the extra debugging logs temporarily as detailed by this post:
https://github.com/home-assistant/core/issues/60250#issuecomment-982699687
What I would be curious about is to confirm whether the delay is in sending the command (audio clip), getting the response from Alexa, or in HASS acting on it. The delays I have are random... sometimes everything is lightning fast, other times it is very slow and I get the annoying "device is not responding message" right when the device then switches on/off. You are not getting the error, and from your comment/video it appears your delay is consistent. If so, it could be a different issue.
(Greetings! I live in the USA but I grew up in Rome!)
(Hi dear Alex, greetings from windy Trieste! :) ) Sorry for my late reply, but I was very busy and had no time to check. So, I have exactly the same issue as you: sometimes everything runs perfectly and sometimes there are huge delays and/or errors. It is clearly the same issue that users had with homebridge (solved with a shorter keep-alive).
I paste here some logs (censored tokens of course).
The log below is for the command "accendi luci zona bar" ("turn on the bar area lights"), which controls 3 Zigbee Sonoff Mini + 1 Wi-Fi Sonoff Mini with Tasmota firmware, all installed in the wall and driving 4 lights. Usually this command has a long delay, but this time it ran normally (1 or 2 seconds delay, which is definitely acceptable):
**2021-12-07 12:00:26 DEBUG (MainThread) [hass_nabucasa.iot] Received message: {'handler': 'alexa', 'msgid': 'removed', 'payload': {'directive': {'endpoint': {'cookie': {}, 'endpointId': 'switch#lucespecchio', 'scope': {'token': REMOVED, 'type': 'BearerToken'}}, 'header': {'correlationToken': 'REMOVED', 'name': 'TurnOn', 'namespace': 'Alexa.PowerController', 'payloadVersion': '3'}, 'payload': {}}}}
2021-12-07 12:00:26 DEBUG (MainThread) [hass_nabucasa.iot] Publishing message: {'msgid': 'REMOVED', 'payload': {'context': {'properties': [{'name': 'powerState', 'namespace': 'Alexa.PowerController', 'timeOfSample': '2021-12-07T11:00:26.0Z', 'uncertaintyInMilliseconds': 0, 'value': 'OFF'}, {'name': 'connectivity', 'namespace': 'Alexa.EndpointHealth', 'timeOfSample': '2021-12-07T11:00:26.0Z', 'uncertaintyInMilliseconds': 0, 'value': {'value': 'OK'}}]}, 'event': {'endpoint': {'cookie': {}, 'endpointId': 'switch#lucespecchio', 'scope': {'token': 'REMOVED', 'type': 'BearerToken'}}, 'header': {'correlationToken': 'REMOVED', 'name': 'Response', 'namespace': 'Alexa', 'payloadVersion': '3'}, 'payload': {}}}}
2021-12-07 12:00:26 INFO (SyncWorker_14) [homeassistant.components.command_line.switch] Running command: /usr/bin/curl -X POST http://192.xxx.xxx.xx/cm?cmnd=Power%20On
2021-12-07 12:00:26 INFO (SyncWorker_14) [homeassistant.components.command_line.switch] Running state value command: /usr/bin/curl -X GET http://192.xxx.xxx.xx/cm?cmnd=Power
2021-12-07 12:00:28 DEBUG (MainThread) [hass_nabucasa.iot] Received message: {'handler': 'alexa', 'msgid': 'REMOVED', 'payload': {'directive': {'endpoint': {'cookie': {}, 'endpointId': 'switch#luce_porta_secondaria', 'scope': {'token': 'REMOVED', 'type': 'BearerToken'}}, 'header': {'correlationToken': 'REMOVED', 'name': 'TurnOn', 'namespace': 'Alexa.PowerController', 'payloadVersion': '3'}, 'payload': {}}}}
2021-12-07 12:00:28 DEBUG (MainThread) [hass_nabucasa.iot] Publishing message: {'msgid': 'REMOVED', 'payload': {'context': {'properties': [{'name': 'powerState', 'namespace': 'Alexa.PowerController', 'timeOfSample': '2021-12-07T11:00:28.0Z', 'uncertaintyInMilliseconds': 0, 'value': 'OFF'}, {'name': 'connectivity', 'namespace': 'Alexa.EndpointHealth', 'timeOfSample': '2021-12-07T11:00:28.0Z', 'uncertaintyInMilliseconds': 0, 'value': {'value': 'OK'}}]}, 'event': {'endpoint': {'cookie': {}, 'endpointId': 'switch#luce_porta_secondaria', 'scope': {'token': 'REMOVED', 'type': 'BearerToken'}}, 'header': {'correlationToken': 'REMOVED', 'name': 'Response', 'namespace': 'Alexa', 'payloadVersion': '3'}, 'payload': {}}}}
2021-12-07 12:00:28 DEBUG (MainThread) [hass_nabucasa.iot] Received message: {'handler': 'alexa', 'msgid': 'REMOVED', 'payload': {'directive': {'endpoint': {'cookie': {}, 'endpointId': 'switch#bancone', 'scope': {'token': 'REMOVED'}}, 'header': {'correlationToken': 'REMOVED', 'name': 'TurnOn', 'namespace': 'Alexa.PowerController', 'payloadVersion': '3'}, 'payload': {}}}}
2021-12-07 12:00:28 DEBUG (MainThread) [hass_nabucasa.iot] Publishing message: {'msgid': 'REMOVED', 'payload': {'context': {'properties': [{'name': 'powerState', 'namespace': 'Alexa.PowerController', 'timeOfSample': '2021-12-07T11:00:28.0Z', 'uncertaintyInMilliseconds': 0, 'value': 'OFF'}, {'name': 'connectivity', 'namespace': 'Alexa.EndpointHealth', 'timeOfSample': '2021-12-07T11:00:28.0Z', 'uncertaintyInMilliseconds': 0, 'value': {'value': 'OK'}}]}, 'event': {'endpoint': {'cookie': {}, 'endpointId': 'switch#bancone', 'scope': {'token': 'REMOVED', 'type': 'BearerToken'}}, 'header': {'correlationToken': 'REMOVED', 'name': 'Response', 'namespace': 'Alexa', 'payloadVersion': '3'}, 'payload': {}}}}
2021-12-07 12:00:28 DEBUG (MainThread) [hass_nabucasa.iot] Received message: {'handler': 'alexa', 'msgid': 'REMOVED', 'payload': {'directive': {'endpoint': {'cookie': {}, 'endpointId': 'switch#tavolo', 'scope': {'token': 'REMOVED', 'type': 'BearerToken'}}, 'header': {'correlationToken': 'REMOVED', 'messageId': 'REMOVED', 'name': 'TurnOn', 'namespace': 'Alexa.PowerController', 'payloadVersion': '3'}, 'payload': {}}}}
2021-12-07 12:00:28 DEBUG (MainThread) [hass_nabucasa.iot] Publishing message: {'msgid': 'REMOVED', 'payload': {'context': {'properties': [{'name': 'powerState', 'namespace': 'Alexa.PowerController', 'timeOfSample': '2021-12-07T11:00:28.0Z', 'uncertaintyInMilliseconds': 0, 'value': 'OFF'}, {'name': 'connectivity', 'namespace': 'Alexa.EndpointHealth', 'timeOfSample': '2021-12-07T11:00:28.0Z', 'uncertaintyInMilliseconds': 0, 'value': {'value': 'OK'}}]}, 'event': {'endpoint': {'cookie': {}, 'endpointId': 'switch#tavolo', 'scope': {'token': 'REMOVED', 'type': 'BearerToken'}}, 'header': {'correlationToken': 'REMOVED', 'messageId': 'REMOVED', 'name': 'Response', 'namespace': 'Alexa', 'payloadVersion': '3'}, 'payload': {}}}}
2021-12-07 12:00:33 INFO (SyncWorker_19) [homeassistant.components.command_line.switch] Running state value command: /usr/bin/curl -X GET http://192.xxx.xxx.xx/cm?cmnd=Power
2021-12-07 12:00:33 INFO (SyncWorker_28) [homeassistant.components.command_line.switch] Running state value command: /usr/bin/curl -X GET http://192.xxx.xxx.xx/cm?cmnd=Power
2021-12-07 12:00:34 WARNING (MainThread) [homeassistant.helpers.template] Template variable warning: 'dict object' has no attribute 'power_outage_memory' when rendering '{{ value_json.power_outage_memory }}'**
I really hope we can find a way to keep our Alexa voice commands alive around the clock. All the automations related to the Home Assistant system (like PIR + light activation on the stairs), warnings, etc. are working perfectly and without delays. I also have a washing machine and a dryer on 2 Zigbee plugs, so when the washing or drying cycle is finished, a voice message plays on all Alexa speakers. I have never had a failure in this; it all runs perfectly. Maybe this can help to find the problem.
Again, thanks all for the help you can provide. My wife is threatening to trash it all away if I don't resolve this, and she's not completely wrong, especially when my little daughter is asleep and in the middle of the night the Alexa Speaker warns that it can't reach the device. This is really frustrating!
@alexruffell - The root cause of this issue with homebridge-alexa is the tuning of the connection between the plugin and the homebridge cloud servers that I did a few months ago to reduce the load on the MQTT server. I found that the MQTT clients were sending MQTT keepalives every minute, and if you have thousands of clients you end up DDoS'ing the server. The discussion in that issue was about adjusting the MQTT keepalive setting to work around people's firewalls/routers closing open inactive TCP sessions after about 10 minutes. Not sure if this detail helps, but my guess is that it is likely not related.
@NorthernMan54 Thank you for the details. Hopefully @balloob can review them to see if they have anything to do with our issue. In summary, Alexa says "device not responding" even though she ends up executing the command, or in some cases, she does so immediately after the same command is issued again following her error message.
Coincidentally, yesterday (after wiping out my 200+ devices using alexa.amazon.com, disabling/re-enabling skill and discovering them again - did this as the app was telling me that I had to enable the skill even though it already was), I noticed that the device not responding error was now given on the screen of my Show. I had not noticed that before. The result is less frustration when it actually ends up working but makes me wonder whether someone is working to address this issue at amazon. Hope is last to die ;)
Just a small update: I didn't touch anything in my network settings or in Home Assistant and, for the past 2 days, there have been no errors and no delays. Everything is running perfectly. What is the situation for you guys? Did anybody on the Nabu Casa team change some settings, maybe?
@aclock81 Same here but I noticed that my Echo Show says device not responding sometimes... it just doesn't say it anymore. Overall it seems to be happening less frequently but hard to say as I am rarely looking at the Echo screen if the one picking up my command even has one. About 2 days ago I did have some Alexa issues (amazon app kept telling me the home assistant skill was not enabled even though it was) so I deleted all my devices using alexa.amazon.com, disabled the skill, disabled the integration and then re-enabled everything.
Hi dear @alexruffell, I'm glad to know that your system behaves in a similar way, so most likely this is a server-side problem that does not depend on our configurations. I really hope somebody is working on it and that this annoying problem will be solved once and for all. I also hope other users will give their feedback on this, just to be 100% sure this issue does not depend on individual configuration.
@aclock81 Mine was working well for a day or maybe two, but now back to saying device isn't responding. I do think it's become less frequent though, so I'm holding out hope that it's at least being worked on even though not acknowledged (that I'm aware of).
@alexruffell - After watching this discussion, is your Alexa device constantly checking the device status ? One behaviour I have noticed with my homebridge-alexa service is that a small subset of users are constantly requesting device status updates, and on the cloud server it looks like this
Those spikes are caused by a handful of users requesting device status for all their devices in sequence.
@NorthernMan54 I have no idea how to check that or even trigger it to keep polling and/or sync programmatically. I am guessing that the frequency is determined by Nabu Casa. All I can do is select what entities are sync'd and trigger a manual sync. I have 222 exposed and 1282 that are hidden. I can likely reduce it further but not by a significant number.
Maybe @balloob can comment on the frequency and what drives it.
@alexruffell The state reporting frequency is controlled by Amazon and not by the client, so there is not much you can do. But I wonder if it's an Alexa device setting or config triggering it. I don't have any Show devices, but was wondering if it has a setting that triggers the repeated checks.
@NorthernMan54 With over 200 entities exposed to amazon I am guessing that my system reports state to amazon frequently. Unless they are rate limiting me, this should keep the connection alive and I should therefore not get the "unresponsive device error", right?
I looked around the Alexa Show settings and its settings are super boring and none seem to have anything to do with state reporting.
Same issue, with a nearly identical setup to others in the linked thread. I have found a method to reduce the frequency. I created a webhook-based scene in HA and make a curl request to it from a Linux VM every minute. After about two weeks of testing, this appears to reduce the frequency of the issue somewhat, and of course the amount of family / spousal noise has been reduced. 😊
Any details on this please? :)
In HA Configuration -> Automations, create a new scene automation triggered by 'webhook'. Give the scene a name like 'Keep Nabu Casa Warmed Up' (in the attached image you see 'keep-alive') and you'll see a new webhook URI in the Home Assistant Cloud configuration. The GUI does the work. From there, sync entities with Alexa to get the scene to show up in Alexa. From Alexa, call this scene via a routine that runs regularly during whatever hours your users are sensitive to it failing. 😁 It's not a perfect fix, but it does seem to keep either the code on the client HA Raspberry Pi warmed up, or the Nabu Casa service-side endpoint, or both. Interestingly, in the logs the invocation time of this do-nothing scene automation seems to vary widely. I think this is a symptom of whatever the real underlying issue is, but I'm not sure. However, I've never once had this scene invocation fail, which is also something I don't understand, unless the code paths for a voice command and a scheduled scene invocation are really different.
On a related topic, are there any HA client metrics (not log text, but scalar values like metrics) available to hook into? I would love to send them to a TSDB and see what the trends are across the various subsystems. Something like this https://pypi.org/project/applipy-metrics/
Unfortunately there is no image attached in your post. And where is the linux VM gone? :-D Replaced by Alexa routine?
Yes, I replaced the curl request with an Alexa routine after reading the speculation from others that the issue might be with the Alexa service itself and its interaction with Nabu Casa. I'm on mobile, so I edited the post above and hopefully the image is attached now.
@KenKilty unfortunately I see no images.
FWIW, I see the image...
Thank you, now I see it!
Dear @KenKilty, first of all thanks for your time. Unfortunately I'm not an expert/advanced user so, despite your detailed guide, I have no idea how to implement this. If I set up the automation like you did in the image, Home Assistant returns the following error message: "Message malformed: Integration '' not found". Are you able to do a mini-guide for dummies from A to Z? Again, thanks a lot for your help on this!
Hi @KenKilty Thanks so far. But what do you do as an action when using the webhook? And I'm not quite sure how to get this into alexa afterwards. Which scene do you mean? Wouldn't it be easier to just sync an input_boolean to alexa and toggle this hourly to make alexa do anything?
Hey @monotonus, yes, an input boolean would work too. Originally I used a curl request from Linux, so I needed a URI, but since moving to an Alexa routine a scene isn't absolutely necessary. To your question, the action is simply a wait for trigger. FWIW, I'm more convinced the number of devices synced to Alexa is the core issue. The more I trim the list synced, the better things appear to get overall.
I don't think my issue has gone away, because I sometimes have to wait quite a bit and/or repeat the command, but it is no longer saying "device not responding". Anyhow, as mentioned in a previous post, I once saw the device not responding error flash on the screen of my Echo Show. In the areas where we give the most commands we have Echo Shows, so I am guessing the error was made less aggravating by printing it on screen (when available?). I need to sit in a room with a regular Echo to see if I can get her to say the error to confirm this.
Related to the keep alive... I was able to replicate the issue by firing a sequence of commands back to back controlling different devices until it gave the error. Wouldn't this be equivalent to a keep alive? If so, there may be more to the problem.
Can you show the YAML please? Looking at your pictures makes no sense as it doesn't look like it actually does anything.
One change I did make (which may have cut down on my errors) recently in a separate effort to optimize my zwave mesh, was to split all the values reported by my house energy monitor into 3 groups that had a delay between them so that they would not all be in a queue to TX at once. If Alexa was receiving all these updates as well, I wonder whether it was causing some sort of rate limiting or race condition.
The HEM was spewing 12 values every 300 seconds. Now they are split in 3 uneven groups with 300, 330, and 360s delays.
The amount of data is tiny but I still wonder whether system activity may be to blame.
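A quick arithmetic sketch of what the staggering does, assuming the three figures above are repeating report intervals rather than one-off start offsets (the post does not say which). Under that assumption, the three groups only transmit at the same instant once every least common multiple of the intervals. The function names are made up for illustration.

```python
from functools import reduce
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def coincidence_period(intervals_s):
    """Seconds between moments when all timers fire at the same time."""
    return reduce(lcm, intervals_s)


def reports_in_window(intervals_s, window_s):
    """How many times each timer fires within window_s seconds."""
    return {interval: window_s // interval for interval in intervals_s}


period = coincidence_period([300, 330, 360])
print(period, reports_in_window([300, 330, 360], period))
```

With intervals of 300, 330, and 360 seconds the period is 19800 seconds (5.5 hours), so under this assumption the burst of all 12 values at once would happen a few times a day instead of every 5 minutes.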
The problem
For a few months now, I often get the "Device is not responding" error when trying to control something via Alexa. Sometimes, as soon as she says that, the device does what it was supposed to but in many cases, you have to repeat the command. In 100% of the cases where a 2nd command is necessary, it is executed instantly.
I have seen many identical reports on the forums so I am not alone.
I don't have a copy of the error message from Core's log, but I had the impression that when I get this error, the log prints out something about not being able to access SniTun server and/or that there was some issue with the expected bits. However I just got the error message and there is nothing Alexa related in the logs, and/or any new logs when the error occurred.
One of the forum threads about this issue:
https://community.home-assistant.io/t/alexa-device-not-responding-on-first-try/326784
What version of Home Assistant Core has the issue?
core-2021.11.5
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant OS
Integration causing the issue
Nabu Casa Alexa Integration
Link to integration documentation on our website
No response
Example YAML snippet
No response
Anything in the logs that might be useful for us?
No response
Additional information
No response