Closed KruseLuds closed 4 months ago
Hey there @balloob, @bieniu, @thecode, @chemelli74, @bdraco, mind taking a look at this issue as it has been labeled with an integration (shelly) you are listed as a code owner for? Thanks!
(message by CodeOwnersMention)
shelly documentation shelly source (message by IssueLinks)
Please enable Debug for Shelly integration, wait for a device to become unavailable, disable debug and attach the log.
Note: it is better to drag the log into the comment (which will add it as an attachment) and not copy paste as it is hard to read logs in GitHub.
Thanks
(See the bottom most comment from me, that includes the log and what I found in it).
I am not sure if I can do that, as turning on that debug logging makes my syslog go absolutely nuts with many thousands of lines per minute (e.g., "[aioshelly.rpc_device.wsrpc]") as I have 41 WiFi Shelly devices acting as passive Bluetooth scanners for 11 Shelly BLU motion sensors. I will try however... (Y.I.K.E.S.)
Just as an FYI so you have the details here, I created an automation that reloads the config entry for any of my Shelly Motion 2 sensors when it becomes "unavailable". It would run endlessly when I had it set to parallel with 1000 (causing some kind of race condition or loop, I guess), so I changed it to queued with 30 and it works like a charm. In checking the traces I also noticed:
@ 8:26:59pm it ran for the Bedroom 3 Shelly Motion 2 Sensor
@ 8:27:00pm - Den Shelly Motion Sensor
@ 8:27:00pm - Bathroom Shelly Motion 2 Sensor
@ 8:27:00pm - Basement Stairs Shelly Motion 2
@ 8:27:01pm - Kitchen Main Shelly Motion Sensor
FYI, below yaml syntax is kludgy but works fine - as I wanted to use the trigger ID in the actual command for the reload to get rid of all of those if statements at the bottom but I couldn't get the syntax right. I will post the debug data here tomorrow -
`alias: Any Shelly Motion 2 Becomes Unavailable -> Reload It's Config Entry description: "" trigger:
Here is the HUGE log. I checked the automation, which showed that the "Den Shelly Motion Sensor" went offline at 11:47:38pm; the automation reloaded its config entry and it was then (and is still now) online. The details in Home Assistant about the device show this in the log as well (the red rectangle). Below that I have attached the log file:
The IP address for the Den Shelly Motion Sensor (which is a Shelly Motion 2) you should see in the log would be 192.168.10.53.
2024-06-06 23:47:38.589 ERROR (MainThread) [homeassistant.components.shelly] Error fetching Den Shelly Motion 2 data: Sleeping device did not update within 3600 seconds interval
...ended up triggering the automation:
homeassistant.components.automation.any_shelly_motion_2_becomes_unavailable_reload_it_s_config_entry
Pay attention to these times in the log file:
23:47:38.994 23:47:38.995 23:47:38.996 23:50:00.047
(I had to remove a gazillion lines from the file to get it from 457MB down to 14MB so that it now fits within the 25MB upload limit):
[Uploading home-assistant_shelly_2024-06-07T04-07-00.181Z.log…]()
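For anyone else stuck with a debug log too large to upload, here is a minimal sketch of trimming it down before attaching it. It streams the file line by line and keeps only the lines that mention the affected device or the Shelly component. The function names `filter_lines` and `filter_file` and the default substrings are assumptions made for illustration (the IP and logger name are the ones quoted in this thread); this is not part of Home Assistant or aioshelly.

```python
from typing import Iterable, Iterator

# Assumption: these substrings are enough to pick out the relevant lines.
# The IP address and logger name come from this thread; adjust for your setup.
DEFAULT_NEEDLES = ("192.168.10.53", "[homeassistant.components.shelly]")


def filter_lines(
    lines: Iterable[str], needles: tuple[str, ...] = DEFAULT_NEEDLES
) -> Iterator[str]:
    """Yield only the lines that contain at least one of the given substrings."""
    for line in lines:
        if any(needle in line for needle in needles):
            yield line


def filter_file(
    src_path: str, dst_path: str, needles: tuple[str, ...] = DEFAULT_NEEDLES
) -> int:
    """Copy matching lines from src_path to dst_path and return how many were kept.

    The source is read one line at a time, so a multi-gigabyte log never has
    to fit in memory.
    """
    kept = 0
    with open(src_path, encoding="utf-8", errors="replace") as src, open(
        dst_path, "w", encoding="utf-8"
    ) as dst:
        for line in filter_lines(src, needles):
            dst.write(line)
            kept += 1
    return kept
```

A call such as `filter_file("home-assistant.log", "shelly-only.log")` (both file names are placeholders) would write the reduced log. One limitation of this simple substring match: continuation lines of multi-line tracebacks are dropped unless they happen to contain one of the substrings.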
Thank you for your help, I look forward to hearing back (your fix will not make me disable that automation though, it is good insurance for me :-) )
FYI my automation still makes things go berserk sometimes, I had to disable it. So, HALP!
I have a similar problem. Almost every day the Shellies become unavailable. That also affects my energy statistics (VERY annoying). It happens at exactly the same timestamps. I have tested different firmware versions as well.
Same here with a Shelly Pro 3EM. In the Shelly app it is available all the time.
Shelly Pro 3EM is using a different protocol and can't be the same, please create a new issue with logs. Thanks
as I have 41 WiFi shelly devices acting as passive bluetooth scanners for 11 Shelly BLU motion sensors.
Out of curiosity, do you have only motion sensors as gen1 devices ?
This is happening since 2024.5 already #116948
No, I have almost 60 Shelly devices; they are Gen 1 and Gen 2. The only device type experiencing this issue is the Shelly Motion 2 (which is Gen 1: Hardware: gen1 (SHMOS-02)). These are the kinds of devices I have:
Shelly Motion 2 (12), Shelly Plus Plug (8), Shelly +1 (11), Shelly 1L (7), Shelly Pro 3EM (1), Shelly +2PM (1), Shelly BLU Motion (14), Shelly Dimmer 2 (3)
So in fact, every Shelly Motion device I have is Gen 2 - and FYI they are updated with latest released production (non-beta) firmware.
This is happening since 2024.5 already #116948
This might not be the same issue that I am having however, as I did in this thread post the list of my devices and counts for them, and of that list of device types this issue is only happening with the Shelly Motion 2's (it may include affecting one of your device types that I do not have however of course).
I only have Shelly Motions
My Shelly Pro 3EM becomes available and unavailable without any manual actions of me. "It just happens" But it is available all the time in the shelly app with fresh data.
No, I have almost 60 shelly devices, they are Gen 1 and Gen 2. The only one device type experiencing this issue is the Shelly Motion 2's (which are Gen 2).
Shelly Motion 2 (12), Shelly Plus Plug (8), Shelly +1 (11), Shelly 1L (7), Shelly Pro 3EM (1), Shelly +2PM (1), Shelly BLU Motion (14), Shelly Dimmer 2 (3)
so you have 3 types of gen1 devices:
Yes @chemelli74 I stand corrected, all of my Shelly Motion 2's ARE Gen 1:
Hardware: gen1 (SHMOS-02)
We suspect that the problem may be caused by blocking the event loop by another integration (probably custom one). The CoIoT packet with status reaches the HA server but cannot be processed correctly. To check this, please enable HA built-in debug mode, restart HA and attach here the log file after few hours.
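To illustrate what "blocking the event loop" means here, the following is a minimal standalone sketch, not Home Assistant code. A heartbeat coroutine stands in for any task that must run on time (such as handling an incoming status packet), and `blocking_work` stands in for a synchronous call made inside the loop. All function names are hypothetical and chosen only for this example.

```python
import asyncio
import time


async def heartbeat(interval: float, beats: int) -> float:
    """Sleep `interval` seconds `beats` times; return the worst observed lateness."""
    worst = 0.0
    for _ in range(beats):
        start = time.monotonic()
        await asyncio.sleep(interval)
        worst = max(worst, time.monotonic() - start - interval)
    return worst


def blocking_work(seconds: float) -> None:
    """Stand-in for synchronous I/O, e.g. reading a file with open()."""
    time.sleep(seconds)


async def run_demo(offload: bool, block_for: float = 0.3) -> float:
    """Run the heartbeat next to the blocking work; return the heartbeat's worst delay."""
    hb = asyncio.create_task(heartbeat(0.05, 4))
    await asyncio.sleep(0)  # yield once so the heartbeat task gets started
    if offload:
        # Runs in a worker thread, so the loop keeps servicing the heartbeat.
        await asyncio.to_thread(blocking_work, block_for)
    else:
        # Runs on the loop thread: nothing else is scheduled until it returns.
        blocking_work(block_for)
    return await hb


if __name__ == "__main__":
    print("blocking inside the loop:", asyncio.run(run_demo(offload=False)))
    print("offloaded to a thread:  ", asyncio.run(run_demo(offload=True)))
```

In the first case the heartbeat is late by roughly the length of the blocking call; in the second it stays close to on time. Python's own asyncio debug mode (for example `asyncio.run(main(), debug=True)`) logs callbacks that run longer than the loop's `slow_callback_duration`, which is the kind of evidence a debug-mode log can provide.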
I am having the same issue, since upgrading to 2024.6.x all my Shelly Motion devices go regularly offline, only reloading them helps. Also, Shelly Smoke devices constantly report expired credentials, which is probably another issue with Shelly integration. 😭
@bieniu I already attached the log, what is the status?
Where is the log?
I also have multiple Shelly devices, and I have (I think) the same issue with the motion 2 but not with the others. In my case, the motion 2 starts going offline shortly after restarting HA and remains unstable for several hours, requiring multiple integration reloads. However, after a few hours, the motion 2 stabilizes and remains reliable until the next restart of HA.
Yes this drives me crazy. I have similar with Shelly 3EM. Do you have Unifi wifi?
Yes, I have Unifi WiFi, but that hasn't changed recently. Like the OP, my issue with the Shelly integration started in May, although I think it was earlier than 2024.5.5.
I have Shelly Motions doing the same and I do not have Unifi; I am using Asus ZenWifi routers, so I doubt that is the problem. Besides that, the devices stay connected to Wi-Fi and are reachable by IP in a web browser.
Where is the log?
Scroll up, I uploaded the file! It looks like this:
First, the link is broken. Besides that, I asked about a log with the asyncio debug mode enabled. I doubt you would have posted such a log before I asked for it.
@bieniu, I have the same issue. I just enabled HA debug logging and this is the log file: home-assistant_2024-06-24T11-28-57.878Z.log
I have 5 Shelly Motion 2 devices and they randomly become unavailable, but usually a couple of them at almost the same time, like in the attached log file:
2024-06-24 13:22:20.123 ERROR (MainThread) [homeassistant.components.shelly] Error fetching shelly_ruch_jadalnia data: Sleeping device did not update within 3600 seconds interval
2024-06-24 13:22:21.211 ERROR (MainThread) [homeassistant.components.shelly] Error fetching shelly_kuchnia_ruch data: Sleeping device did not update within 3600 seconds interval
2024-06-24 13:23:52.282 ERROR (MainThread) [homeassistant.components.shelly] Error fetching shelly_parter_hall_ruch data: Sleeping device did not update within 3600 seconds interval
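To make the "almost at the same time" pattern easier to spot in a long log, the timeout errors above can be extracted and grouped by time. The following is a rough sketch that assumes the exact message format shown in this thread; `parse_timeouts`, `cluster` and the 5-second window are hypothetical choices for illustration, not anything provided by Home Assistant.

```python
import re
from datetime import datetime, timedelta

# Matches the error format quoted in this thread.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) ERROR .*?"
    r"Error fetching (?P<name>.+?) data: "
    r"Sleeping device did not update within \d+ seconds interval"
)


def parse_timeouts(lines):
    """Return a time-sorted list of (timestamp, device name) for each timeout error."""
    events = []
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            ts = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S.%f")
            events.append((ts, match["name"]))
    return sorted(events)


def cluster(events, window=timedelta(seconds=5)):
    """Group events so that each one is within `window` of the previous one in its group."""
    groups = []
    for ts, name in events:
        if groups and ts - groups[-1][-1][0] <= window:
            groups[-1].append((ts, name))
        else:
            groups.append([(ts, name)])
    return groups


if __name__ == "__main__":
    with open("home-assistant.log", encoding="utf-8", errors="replace") as fh:  # placeholder path
        for group in cluster(parse_timeouts(fh)):
            print(group[0][0], [name for _, name in group])
```

Since the error fires after an interval without updates, devices that time out within seconds of each other probably also sent their last accepted update close together, so it may be worth checking what else the log shows around one interval earlier.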
Also, please find the diagnostic logs for one of those devices: config_entry-shelly-78d71ea582fecd1dbf26fe814675ee08.json
From my perspective, I don't see any integrations that could block the event loop around the time the devices became unavailable. In my case there is one custom integration (smartir) that can block the event loop (see the log file), but I think the only time it reads files using with open() is while setting up (it gets the IR codes from configuration files). And looking at the logs, it happened at 12:10, then everything was just fine, and then at 13:22 three devices became unavailable.
@anybody84 As far as I know, blitzortung also blocks the event loop. Could you test with HA in safe mode (disabled all custom integrations)?
And show us a screenshot of the unicast configuration for Motion 2.
@bieniu, is this what you meant?
I just restarted HA in safe mode. I will post the log file separately, when it happens again (later today, I guess).
is this what you meant?
Yes. I assume that this hidden IP address is the address of your HA server.
Yes, of course. IP address of the host machine HA is running on.
I have the same issue. I have many different kinds of shellys and only my shelly motion 2 devices are affected. The description of the problem that @KruseLuds provides matches very much my observations. I did not change my home assistant set-up for quite some time, but I keep home assistant up to date. The issue started to appear in the past few weeks, e.g. with home assistant 2024.5 or 2024.6
No blitzortung usage here and no Unifi here and having exactly the same
I’m not using blitzortung but still have the issue…
@rhoddan @smarthomefamilyverrips Your comments do not contribute anything to this discussion. Share logs with safe mode enabled or HA built-in debug mode enabled if you want to help.
@bieniu It should help you, because now you know that it is not related to "blitzortung" and also that Unifi WiFi is not causing it, as some comments possibly suggested. But sure, if you prefer to spend your time following illogical explanations, then sorry that we try to help with what we can.... Anyway, this has been going on since the 2024.5 updates (see the other issue mentioned) and was never an issue before. For now, a one-time reload of the integration for the motion sensors after an HA restart solves the problem, so I will keep using this workaround in an automation and not "bother" you anymore with "in your eyes useless" information to try to help in the ways we are able to in our personal situations (maybe not all of us are able to share logs; guess this thought never occurred to you).
@bieniu, just one more observation. I have multiple battery-powered Shelly devices like Shelly Motion 2, Shelly Button 1 and Shelly Flood. All devices are configured using the CoIoT protocol in the exact same way (I always copy-paste it from one device to another). But I noticed that Shelly Button 1 and Shelly Flood devices work correctly and they don't become unavailable. Somehow, the problem seems to be related only to Shelly Motion 2 devices.
For the record, this is the CoIoT configuration for my Shelly Button 1 device:
This is my setting for my 3EM (not battery powered)
Another observation on my part.
Since HA version 2024.6, I have also had a noticeable number of interruptions on my Shelly Motion 2. My WiFi hardware is Unifi. I have explicitly switched off the 5 GHz WiFi for my IoT network and operate it exclusively on 2.4 GHz. Since the changeover, I have noticed no more interruptions and greater stability of the connection on my Shelly Motion 2.
This is probably not the cause of the error, but could help as a workaround :-)
Firmware 2.2.4 has just been released for Motion/Motion 2 and TRV. The only point in the changelog is "Update WF200 firmware to a possible fix for powersave issue". Please update the firmware and report if it somehow helps with this problem.
The firmware upgrade does not fix this issue.
Also, I would like to point out that it is clear that the issue is not related to Wi-Fi: not the brand (e.g., Unifi) and not the frequency (e.g., turning off 5 GHz). At the time the device becomes unavailable, and throughout the time it is unavailable, it is still accessible directly by web browser, and it shows no errors.
Can you run tcpdump on the HA server?
If so, please run:
tcpdump -v -A host <IP> and udp port 5683
where <IP> is the IP address of the Motion 2 device.
Firmware 2.2.4 has just been released for Motion/Motion 2 and TRV. The only point in the changelog is "Update WF200 firmware to a possible fix for powersave issue". Please update the firmware and report if it somehow helps with this problem.
Giving it a shot -
Please create new issue if you still experience problems after updating to core 2024.7.0 (in beta now and released next week). Make sure to provide diagnostics and logs as explained in https://github.com/home-assistant/core/issues/119002#issuecomment-2153189138
This is still an issue. When I turn on the debugging, my logs are like multiple gigs, and then I can't upload them; I struggle to whittle them down, and then the upload fails anyway. Please don't just sweep this under the rug if others are saying they still have the issue as well.
No one is sweeping anything; we added some logging which will show even without enabling debug logging (to overcome this problem). However, without someone providing logs from core 2024.7.x we cannot make progress. Home Assistant 2024.7 is in beta now, so fixes are not added to 2024.6.x.
You can update to the latest beta now and create a new issue with data from 2024.7, or wait for it to be released next week, but as soon as someone provides new data we can try to investigate the problem.
I apologize for implying anyone might sweep anything under the rug. I will wait until 2024.7 next week, thank you!
This problem seems to have been resolved (there is a different issue now, I will log a separate ticket.)
@KruseLuds on 2024.7.x? And what issue appears now?
The problem
Starting with I believe core v. 2024.5.5 all of my Shelly Motion 2 devices (I have 11 of them) become "unavailable". They are all "Hardware: gen1 (SHMOS-02)" and it happens only with them (none of my other Shelly devices). The problem is always resolved by calling the service "Home Assistant Core: Reload Config Entry" for each device that is unavailable. I am currently running a healthy supported version of Home Assistant Supervised with everything up to date as shown below.
What version of Home Assistant Core has the issue?
core-2024.6.0
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant Supervised
Integration causing the issue
Shelly
Link to integration documentation on our website
https://www.home-assistant.io/integrations/shelly
Diagnostics information
There is no data to show, only the information already given
Example YAML snippet
No response
Anything in the logs that might be useful for us?
Additional information
System Information
version | core-2024.6.0
-- | --
installation_type | Home Assistant Supervised
dev | false
hassio | true
docker | true
user | root
virtualenv | false
python_version | 3.12.2
os_name | Linux
os_version | 6.1.0-21-arm64
arch | aarch64
timezone | America/New_York
config_dir | /config

Home Assistant Community Store

GitHub API | ok
-- | --
GitHub Content | ok
GitHub Web | ok
GitHub API Calls Remaining | 4845
Installed Version | 1.34.0
Stage | running
Available Repositories | 1455
Downloaded Repositories | 28

AccuWeather

can_reach_server | ok
-- | --
remaining_requests | 18

Home Assistant Cloud

logged_in | false
-- | --
can_reach_cert_server | ok
can_reach_cloud_auth | ok
can_reach_cloud | ok

Home Assistant Supervisor

host_os | Debian GNU/Linux 12 (bookworm)
-- | --
update_channel | stable
supervisor_version | supervisor-2024.06.0
agent_version | 1.6.0
docker_version | 26.1.4
disk_total | 915.4 GB
disk_used | 36.1 GB
healthy | true
supported | true
host_connectivity | true
supervisor_connectivity | true
ntp_synchronized | true
virtualization |
supervisor_api | ok
version_api | ok
installed_addons | AdGuard Home (5.1.0), Log Viewer (0.17.0), Home Assistant Google Drive Backup (0.112.1), File editor (5.8.0), Terminal & SSH (9.14.0), Core DNS Override (0.1.1), Matter Server (6.1.0), Cloudflared (5.1.10), Mosquitto broker (6.4.1), Ring-MQTT with Video Streaming (5.6.4)

Dashboards

dashboards | 9
-- | --
resources | 20
views | 43
mode | storage

Recorder

oldest_recorder_run | May 8, 2024 at 5:30 AM
-- | --
current_recorder_run | June 6, 2024 at 12:23 PM
estimated_db_size | 4084.03 MiB
database_engine | sqlite
database_version | 3.44.2