home-assistant / core

Wemo switches unavailable in HASS but still controllable by mobile Wemo app #62224

Closed DavidRBailey closed 1 year ago

DavidRBailey commented 2 years ago

The problem

Some Wemo switches in Home Assistant gray out in the dashboard showing they are unavailable, yet they remain available and controllable from the Wemo iOS/Android app.

Restarting the Wemo switch resolves the problem, but it eventually happens again.

The network is reporting no errors with good connectivity to the switches.

The issue seems most common with the Wemo Light Switch 2nd Gen or V2 (WLS040) and Wemo Smart Light Switch 3-Way (WLS0403), and doesn't seem to be common with the Wemo Smart Dimmer Switch (WDS060).

What version of Home Assistant Core has the issue?

2021.11.5

What was the last working version of Home Assistant Core?

No response

What type of installation are you running?

Home Assistant OS

Integration causing the issue

Wemo

Link to integration documentation on our website

https://www.home-assistant.io/integrations/wemo/

Example YAML snippet

No response

Anything in the logs that might be useful for us?

2021-12-12 17:14:18 ERROR (Wemo Events Thread) [pywemo.ouimeaux_device] Unable to reconnect with Bathroom Fan

2021-12-12 17:14:18 WARNING (Wemo Events Thread) [pywemo.subscribe] Resubscribe error for <Subscription basicevent "Downstairs Bathroom Fan"> (HTTPConnectionPool(host='192.168.7.97', port=49153): Max retries exceeded with url: /upnp/event/basicevent1 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f5dfd9610>: Failed to establish a new connection: [Errno 111] Connection refused'))), will retry in 60s

2021-12-12 17:14:18 ERROR (Wemo Events Thread) [pywemo.ouimeaux_device] Unable to re-probe wemo <WeMo LightSwitch "Downstairs Bathroom Fan"> at 192.168.7.97

2021-12-12 17:14:23 ERROR (Wemo Events Thread) [pywemo.ouimeaux_device] Unable to reconnect with Downstairs Bathroom Fan

2021-12-12 17:14:29 WARNING (SyncWorker_7) [urllib3.connectionpool] Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f89acd640>: Failed to establish a new connection: [Errno 111] Connection refused')': /upnp/control/basicevent1

2021-12-12 17:14:29 WARNING (SyncWorker_7) [pywemo.ouimeaux_device.api.service] Error communicating with Downstairs Bathroom Fan at 192.168.7.97:49153, HTTPException(MaxRetryError("HTTPConnectionPool(host='192.168.7.97', port=49153): Max retries exceeded with url: /upnp/control/basicevent1 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f53027760>: Failed to establish a new connection: [Errno 111] Connection refused'))")) retry 1

2021-12-12 17:14:29 ERROR (SyncWorker_7) [pywemo.ouimeaux_device] Unable to re-probe wemo <WeMo LightSwitch "Downstairs Bathroom Fan"> at 192.168.7.97

2021-12-12 17:14:34 ERROR (SyncWorker_7) [pywemo.ouimeaux_device] Unable to reconnect with Downstairs Bathroom Fan

2021-12-12 17:14:34 WARNING (SyncWorker_7) [urllib3.connectionpool] Retrying (Retry(total=5, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f5a6f29d0>: Failed to establish a new connection: [Errno 111] Connection refused')': /upnp/control/basicevent1

2021-12-12 17:14:38 WARNING (SyncWorker_7) [urllib3.connectionpool] Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f8d543940>: Failed to establish a new connection: [Errno 111] Connection refused')': /upnp/control/basicevent1

Additional information

See other similar reports documented here: https://community.home-assistant.io/t/wemo-switches-unavailable-in-hass-but-can-still-be-controlled-by-mobile-wemo-app/360147

probot-home-assistant[bot] commented 2 years ago

Hey there @esev, mind taking a look at this issue as it has been labeled with an integration (wemo) you are listed as a code owner for? Thanks! (message by CodeOwnersMention)


wemo documentation wemo source (message by IssueLinks)

esev commented 2 years ago

Connection refused is odd.

Are these devices on the same LAN/subnet as Home Assistant? Are the devices being discovered automatically, or are these devices using static configuration?

mlohus93 commented 2 years ago

Though I am not the original poster, I have the same problem as described, though rebooting HA rarely resolves the unavailable devices. Occasionally they wake up by themselves, but the only consistent way to get them back is to reboot the device. Agreed, it's the V2s; in my case it's the V2 single switch and the V2 3-way switches. I do not have the Wemo dimmers.

I have mine using the static configuration in my configuration.yaml file, and all of my Wemos are defined with DHCP reservations. I had the problem with my Apple AirPort Extreme network (one as router, two more set up as access points with a wired connection to the router). I recently replaced it with an Eero WiFi 6 mesh system (router and two nodes). The same problems exist. I saw in the Wemo integration a recommendation to have port 8989 open on your firewall. Unfortunately, I cannot open it across all/multiple IPs (or even all Wemo IPs); I can only forward a port to a single IP/MAC address. It's the same scenario with the Apple network and the Eero network. Finally, yes, all are on the same LAN/subnet as Home Assistant.

Incidentally, the Wemos which have gone unavailable are not only alive and kicking in the Wemo app as mentioned by the OP, but they are also fully functional via HomeKit all the time they are unavailable in Home Assistant.

I am running HA Core 2021.12.2, Supervisor 2021.12.2, and Home Assistant OS 7.0 on a Home Assistant Blue (ODroid). I have a backup HA system that stays powered down and every couple of months is brought up long enough to update and then shut back down; it is an Intel box running Debian 11 Supervised HA. However, a couple of months ago (and again a couple of weeks ago) I ran it instead of my HA Blue for a couple of days, only to find it has the same problem with Wemos going unavailable. Thought maybe that info might be helpful.

esev commented 2 years ago

@mlohus93 are you seeing the same Failed to establish a new connection: [Errno 111] Connection refused in the log messages? Or are you seeing Connection to <<ip_address>> timed out?

DavidRBailey commented 2 years ago

> Connection refused is odd.
>
> Are these devices on the same LAN/subnet as Home Assistant? Are the devices being discovered automatically, or are these devices using static configuration?

In my case, they are on the same LAN/subnet. Home Assistant is on an ethernet cable and the switches are on Wifi on the same subnet. They are being automatically discovered, but have static DHCP assignments, so their IP addresses don't change.

I wonder if HASS has a lower bar for deciding a device is unavailable than the Wemo app does? Perhaps the switch doesn't always respond within a specified timeframe?

The crazy thing is HASS shows the switch as unavailable, and even if I try to switch it, it doesn't work, while at the same time, I can go into the Wemo app and turn the light or fan on and off using the same network connection.

esev commented 2 years ago

Connection refused means the device is connected to wifi on the local subnet and is refusing requests on the local UPnP API. It's responding with the equivalent of "I don't support this service".

My guess is the device itself is in some kind of hung state where it's not responding on the local network and not responding to physical presses, but is still communicating with Belkin's cloud service.

AFAIK there is no way to communicate with the device when it gets into this state. And unfortunately, Belkin's cloud service does not have a published API.
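
To tell "connection refused" apart from a timeout outside of Home Assistant, a small TCP probe can help. The following is only a sketch: `probe_port` is a hypothetical helper name, and the host and port in the usage comment are simply the values that appear in the log excerpt above. "refused" means the host answered but nothing is listening on that port, while "timeout" means no answer came back at all.

```python
import socket


def probe_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify one TCP connection attempt.

    Returns "open", "refused", "timeout", or "error" (any other OS-level failure).
    """
    try:
        # create_connection performs the TCP handshake and raises on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host is reachable but actively rejected the connection (Errno 111 on Linux).
        return "refused"
    except socket.timeout:
        # No reply within the timeout: host down, filtered, or unreachable.
        return "timeout"
    except OSError:
        return "error"


# Example usage (address and port taken from the log lines above):
# print(probe_port("192.168.7.97", 49153))
```

If the probe reports "refused" while the Wemo app still controls the switch, that would be consistent with the local UPnP listener being down while the cloud connection stays up.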

DavidRBailey commented 2 years ago

So what you're saying is the app is using a different method of communication with the switches than HASS? Is there a way to leverage the other method?

esev commented 2 years ago

That's correct. The WeMo devices have at least two communications methods:

  1. There is a UPnP API that can be accessed on the device itself.
  2. And there is a cloud service that Belkin runs that the device connects to.

Home Assistant uses (1) and the Belkin WeMo app uses (2). Belkin does not make their cloud service available publicly. So there is no way for Home Assistant (or pywemo) to use that.

It looks like when the device gets into this state, the local UPnP service gets disabled somehow on the device. The only way to restart it may be to restart the device itself. It may be that the device is defective too, or Belkin has shipped them with buggy firmware.

DavidRBailey commented 2 years ago

This is a long shot, but is there any way to remotely request a switch restart that might resolve the issue without requiring someone to walk to the switch and manually reset it?

Maybe something like- device.reset(data=False, wifi=False)

https://pypi.org/project/pywemo/#device-reset-and-setup

esev commented 2 years ago

> This is a long shot, but is there any way to remotely request a switch restart that might resolve the issue without requiring someone to walk to the switch and manually reset it?
>
> Maybe something like- device.reset(data=False, wifi=False)
>
> https://pypi.org/project/pywemo/#device-reset-and-setup

I wish it would work. But pywemo uses the local UPnP API. And it's this API that has the Connection refused issue.

DavidRBailey commented 2 years ago

I read here that the Wemo UPnP service is buggy and can crash if an invalid argument is sent to it. Is it possible that there could be bad commands going to the Wemo switches causing the service to fail?

esev commented 2 years ago

That is possible, yes. If that is the case there should be a log message about it. Do you see any other log messages related to pywemo/wemo?

DavidRBailey commented 2 years ago

I'm opening a ticket with Wemo. I'll reset the switches and recheck the logs.

DavidRBailey commented 2 years ago

I can confirm @mlohus93's report that the Wemo Smart Light Switch 3-Way (WLS0403) is also showing this same behavior.

@esev As far as HASS goes, the errors in the log just stop occurring once the Wemo switch is restarted and HASS can again communicate with it. Is there a debug mode I should enable in HASS to get more information?

At this point, the system apparently functions fine until the next Wemo switch UPnP service failure.

DavidRBailey commented 2 years ago

The Wemo case number is #14310240. If you're having similar problems you might want to contact Wemo and refer to this case.

I had to explain multiple times that the failure in the service is on the Wemo switch, not the third-party home automation software. BTW- I had a similar problem on Homebridge that I used to run before I switched over to HASS.

DavidRBailey commented 2 years ago

A longer log file that might have useful information. I've removed messages from a few other integrations that have nothing to do with this.

home-assistant.log .

mlohus93 commented 2 years ago

@esev - you asked "are you seeing the same Failed to establish a new connection: [Errno 111] Connection refused in the log messages? Or are you seeing Connection to <<ip_address>> timed out?" The answer is both. Looking back at one of my switches which went unavailable around 22:31 ET tonight: around 22:26 ET I got an error line stating that the switch had a Resubscribe error, which included a comment about max retries exceeded and ended with the timed out error you mentioned. The following lines in the error log included the "Failed to establish a new connection: [Errno 111] Connection refused" errors about every 60 seconds thereafter. Oh, and some "Unable to re-probe wemo" errors and "unable to get description" errors too. It is a hot mess. This is just one of many V2 switches I have which act this way. Attaching an excerpt of the log for this one particular Wemo.

Foyer-Wemo-Errors.txt .
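
When a log mixes both failure types like this, a quick tally can show which one dominates. This is a rough sketch, not part of pywemo or Home Assistant: `classify` and `tally` are hypothetical helper names, and the matching is a plain substring check on the phrases "Connection refused" and "timed out" quoted in this thread, so it assumes the log lines contain those phrases verbatim.

```python
from collections import Counter
from typing import Iterable, Optional


def classify(line: str) -> Optional[str]:
    """Return "refused", "timeout", or None for a single log line."""
    lowered = line.lower()
    if "connection refused" in lowered:
        return "refused"
    if "timed out" in lowered:
        return "timeout"
    return None


def tally(lines: Iterable[str]) -> Counter:
    """Count how many lines fall into each failure category."""
    counts: Counter = Counter()
    for line in lines:
        kind = classify(line)
        if kind is not None:
            counts[kind] += 1
    return counts


# Example usage (the file name is the attachment mentioned above; the path is an assumption):
# with open("Foyer-Wemo-Errors.txt", encoding="utf-8") as handle:
#     print(tally(handle))
```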

esev commented 2 years ago

@DavidRBailey could you double-check something for me? Are all of these dimmers? If any of them are switches, or three-way switches, please let me know. This is unrelated to the issue you're having, just want to make sure what I've done properly gets rid of those Failed to enable long press support for device error messages in your log. I'm expecting them all to be the WDS060 model.

esev commented 2 years ago

@DavidRBailey to get more debugging, you can change the log level within pywemo with the logger.set_level service.

https://www.home-assistant.io/integrations/logger/#service-set_level

service: logger.set_level
data: 
  pywemo: debug

Set it back to warning or error if it causes too much logging.

esev commented 2 years ago

Kind of a long-shot here, but if either of you are comfortable with modifying the files inside your Home Assistant instance, could you try this? https://github.com/pywemo/pywemo/issues/275#issuecomment-927145197

The downside is that, when you press the light switch manually, it can take a while before the change shows up in Home Assistant. Beware though that this specifically has a bad impact on WeMo motion detectors; Home Assistant may not even see motion being triggered with that change.

DavidRBailey commented 2 years ago

> @DavidRBailey could you double-check something for me? Are all of these dimmers? If any of them are switches, or three-way switches, please let me know. This is unrelated to the issue you're having, just want to make sure what I've done properly gets rid of those Failed to enable long press support for device error messages in your log. I'm expecting them all to be the WDS060 model.

All my lights, with the exception of the outdoor ones and the stairway 3-way switches are on dimmers, so yes. These are all WDS060 Smart Dimmer switches.

DavidRBailey commented 2 years ago

> service: logger.set_level
> data:
>   pywemo: debug

When I put this into the top of my configuration.yaml file, I get the following error when trying to restart Core.

Failed to restart Home Assistant Core

The system cannot restart because the configuration is not valid: Integration error:
service Integration 'service' not found. Integration error: data Integration 'data'
not found.

DavidRBailey commented 2 years ago

> Kind of a long-shot here, but if either of you are comfortable with modifying the files inside your Home Assistant instance, could you try this? pywemo/pywemo#275 (comment)
>
> The downside is that, when you press the light switch manually, it can take a while before the change shows up in Home Assistant. Beware though that this specifically has a bad impact on WeMo motion detectors; Home Assistant may not even see motion being triggered with that change.

Sure. I'll give that a try.

Where can I find wemo_device.py? (Forgive me, I recently migrated from HOOBs to Home Assistant.)

DavidRBailey commented 2 years ago

FYI- I spoke with a second-level tech at Wemo/Belkin regarding case number #14310240.

They first tried to tell me that it was the problem of "third-party software" (Home Assistant) that they don't support. I told them it was ALL third-party software, including Homebridge that was affected. Then they told me Wemo only supports Apple HomeKit, Google Home, and Alexa.

I then told them that it didn't matter which software I used, that even command-line UPnP tools were failing once the switch's UPnP service stopped responding, and that the only way to reliably fix it was to restart/reboot the switch. They then tried to tell me that the UPnP API was no longer being used on the Wemo switches once they had been added to the Wemo Cloud (i.e., it's just an initial setup service).

I pointed them to this page which says that the UPnP interface is the method that all third-party developers are integrating with Wemo switches- http://developers.belkin.com/wemo/sdk

They said they'd get back to me. At this point, I wouldn't be surprised if they take down the page.

Who wants to take on reverse engineering the Wemo firmware so we can create an open source version?

esev commented 2 years ago

> When I put this into the top of my configuration.yaml file,

Oh oops. That wasn't meant to go in the configuration.yaml file. It was meant to be called via the Developer Tools -> SERVICES tab.

Here is what would go into configuration.yaml:

logger:
  default: info
  logs:
    pywemo: debug

esev commented 2 years ago

> Where can I find wemo_device.py? (Forgive me, I recently migrated from HOOBs to Home Assistant.)

Probably the easiest way to find it would be to run: find / -name wemo_device.py

> They said they'd get back to me. At this point, I wouldn't be surprised if they take down the page.

Doh! :)

> Who wants to take on reverse engineering the Wemo firmware so we can create an open source version?

There are some tips on how to get started here: https://github.com/pywemo/pywemo/wiki/WeMo-Firmware

DavidRBailey commented 2 years ago

> Probably the easiest way to find it would be to run: find / -name wemo_device.py

I tried that with no result, and trying sudo or su didn't seem to help.

I'm running HASSIO core-2021.11.5.

esev commented 2 years ago

> > Probably the easiest way to find it would be to run: find / -name wemo_device.py
>
> I tried that with no result.

Oh! I'm not sure then. I use Home Assistant Docker not Home Assistant OS. If they're the same, the file lives here on my Docker instance: /usr/src/homeassistant/homeassistant/components/wemo/wemo_device.py
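
Since `find` only sees the filesystem of the shell it runs in, another way to locate the file is to ask Python where it would load the module from. This is a sketch under two assumptions: it has to run inside the same Python environment as Home Assistant (for example, inside the Core container), and the dotted module name in the usage comment is inferred from the path above. `module_path` is a hypothetical helper name. Note that `importlib.util.find_spec` imports the parent package as a side effect.

```python
import importlib.util
from typing import Optional


def module_path(dotted_name: str) -> Optional[str]:
    """Return the file a module would be loaded from, or None if it cannot be found."""
    try:
        # For dotted names, find_spec imports the parent package first.
        spec = importlib.util.find_spec(dotted_name)
    except ModuleNotFoundError:
        # Raised when a parent package does not exist in this environment.
        return None
    if spec is None:
        return None
    return spec.origin


# Example usage (module name inferred from the path above; an assumption):
# print(module_path("homeassistant.components.wemo.wemo_device"))
```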

DavidRBailey commented 2 years ago

Hmmm.

[core-ssh ~]$ find / -iname homeassistant
/config/blueprints/automation/homeassistant
/config/blueprints/script/homeassistant

They must hide the files from command-line access.

esev commented 2 years ago

@DavidRBailey That's really odd! I see the same path as my docker container in your logs too. Weird.

Here is the path from your home-assistant.log

Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/components/wemo/wemo_device.py", line 152, in async_register_device
    await hass.async_add_executor_job(wemo.ensure_long_press_virtual_device)

DavidRBailey commented 2 years ago

Aha!

https://community.home-assistant.io/t/how-to-get-access-at-damn-host-system/96549/7

mlohus93 commented 2 years ago

> FYI- I spoke with a second-level tech at Wemo/Belkin regarding case number #14310240.
>
> They first tried to tell me that it was the problem of "third-party software" (Home Assistant) that they don't support. I told them it was ALL third-party software, including Homebridge that was affected. Then they told me Wemo only supports Apple HomeKit, Google Home, and Alexa.

@esev and @DavidRBailey Something I have noticed... devices which go Unavailable in Home Assistant ALSO go unavailable in Alexa at the same time (whereas HomeKit and Wemo App remain fully operational to those same devices). So it's a problem with one of the things they do support... Alexa. That said, allow me to clarify that I do not use the Wemo skill for Alexa, I let Alexa discover my Wemos on her own (which she does quite well) because I was getting duplicate Wemo entities in Alexa until I ditched the Alexa/Wemo skill.

Makes me think that maybe Alexa uses the same UPnP API as Home Assistant does?

esev commented 2 years ago

> Makes me think that maybe Alexa uses the same UPnP API as Home Assistant does?

@mlohus93 it totally does. In fact there are a few software libraries that emulate WeMo devices for exactly this reason. It allows those devices to be easily supported by Alexa.

I'm working on some unintended consequences of such emulators right now in #62259. :)

esev commented 2 years ago

Wait... @DavidRBailey do you have an Alexa too? I wonder if the WeMo doesn't like it when more than one device tries to subscribe to state push events.

DavidRBailey commented 2 years ago

Nope. I did attach them to HomeKit for a while, but I reset them to put them into Home Assistant and would prefer to do it through the Home Assistant HomeKit gateway because of the better functionality.

mlohus93 commented 2 years ago

> Kind of a long-shot here, but if either of you are comfortable with modifying the files inside your Home Assistant instance, could you try this? pywemo/pywemo#275 (comment)

@esev I couldn't find the wemo_device.py on my HA Blue either, so I shut it down, brought up my Debian 11 Supervised HA machine, restored it from last night's backup of the HA Blue, and commented out the three lines in the wemo_device.py file. I restarted my Debian 11 HA box and restarted all my V2 (1-way and 3-way) switches. Currently all switches are showing available. I will report back on how my switches behave (or don't). Thanks!

mlohus93 commented 2 years ago

> Wait... @DavidRBailey do you have an Alexa too? I wonder if the WeMo doesn't like it when more than one device tries to subscribe to state push events.

I do have Alexa and I wondered about that too, though David says he doesn't and we have similar problems. And honestly there is no way, short of unplugging all the Alexas, to get them to stop grabbing my Wemos. Like I said, I don't even have the Alexa skill and they still manage to find my Wemos.

mlohus93 commented 2 years ago

@esev one more thought: this unavailability problem on the V2s is a fairly recent problem. I can't say what recent is (6 months maybe?), just that it wasn't always this way, or if it was, it has gotten dramatically worse, with more frequent failures. I honestly wondered if it was a Belkin/Wemo firmware update on the V2s that caused the problem. Not because I started noticing it after a firmware update, but because it was only the V2s that were on the struggle-bus. I have been using HA since spring of 2020 (good COVID project, and the Wemo integration is why I started), I have had Wemos for years, and Alexas since they came out, long before I jumped on HA.

esev commented 2 years ago

@mlohus93 Firmware changes could cause this too. Good call.

I'm planning to make a change soon to display the firmware version in Home Assistant (along with a few other debugging details like wifi signal strength & uptime). Then it'll be easier to compare. FWIW my WLS040 & WLS0403 are on WeMo_WW_2.00.11563.PVT-OWRT-LIGHTV2

mlohus93 commented 2 years ago

@esev - update on my Wemo world after commenting out the three lines in wemo_device.py on my Debian 11 Supervised HA installation (it’s my backup/alternate HA device). I completed that exercise yesterday evening, but wanted it to “soak” for a while. A little more than 24 hours has passed and my observations are:

  1. Not a single V2 1-way or 3-way switch has gone unavailable since the lines were commented out; every one has continued to be available and operational. This has not happened in months.
  2. After about 20 hours my Wemo Outdoor switch went unavailable for about an hour and then returned to life.
  3. You are correct, it takes about 15-20 seconds from when a light switch is toggled manually at the switch until it registers in HA. However, when I toggle from within HA, the switch reacts immediately. I don’t have any Wemo motion detectors, so that lag in appearance within HA isn’t a concern to me.
  4. I still haven’t unlocked the magic for getting into my HA Blue box to where I can find/edit the wemo_device.py file, but I also haven’t had time to try again.

esev commented 2 years ago

@mlohus93 That's great news. A while back, I made a PR so you can enable/disable the subscription through the integration configuration. That would make it so you no longer need to modify the files directly. It hasn't been reviewed yet though. #56972. Let's give it a few more days. If we can say conclusively that this fixes the issue, I'll modify that PR so it is considered a bug fix. Might get more attention that way.

It's good that we've potentially narrowed the issue down. I wonder what it is about the subscriptions that causes these switches to fail?

mlohus93 commented 2 years ago

> I'm planning to make a change soon to display the firmware version in Home Assistant (along with a few other debugging details like wifi signal strength & uptime). Then it'll be easier to compare. FWIW my WLS040 & WLS0403 are on WeMo_WW_2.00.11563.PVT-OWRT-LIGHTV2

@esev this is an OUTSTANDING idea! Just the WiFi signal info alone would be huge (sorely missing from Wemo's own app), but uptime and firmware info within HA would be great!

Because nothing existed and I had a feeling this was a real problem, I have had a couple of automations in place since 10/31 for the purpose of tracking uptime (well, really downtime). After a Wemo (of any kind) goes "unavailable" for 3 minutes, HA sends me a push notification and logs (concatenates) the date/time/switch/status info in a file for later review. Likewise, when it comes back online (whether on its own or at my hand with a reboot of the Wemo), I get a push and an entry is logged that it is alive. This helped me figure out the problem isn't in my head, it isn't rare, and it wasn't getting better. Your uptime mechanism would sure be welcomed.
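
The downtime bookkeeping described above can be approximated offline from a list of state changes. The sketch below is not the automation itself: `outages` is a hypothetical function, the input is assumed to be `(timestamp, state)` pairs for a single switch, and the 3-minute threshold mirrors the one mentioned above.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

Event = Tuple[datetime, str]


def outages(
    events: Iterable[Event],
    min_duration: timedelta = timedelta(minutes=3),
) -> List[Tuple[datetime, datetime]]:
    """Return (start, end) pairs where the state stayed "unavailable" for at least min_duration."""
    result: List[Tuple[datetime, datetime]] = []
    down_since = None
    for when, state in sorted(events):
        if state == "unavailable":
            # Remember only the first moment the switch dropped out.
            if down_since is None:
                down_since = when
        elif down_since is not None:
            # Any other state means the switch is back; keep long outages only.
            if when - down_since >= min_duration:
                result.append((down_since, when))
            down_since = None
    return result
```

An outage that is still ongoing at the end of the list is not reported, because there is no recovery time to pair it with.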

DavidRBailey commented 2 years ago

I just got off the line with Belkin/Wemo regarding case #14310240. They explained that UPnP service is going away and third-party integration is not supported. Wemo Cloud is the only way supported for integration and only with authorized partners. No further software fixes are planned for the UPnP service.

I explained that the information I obtained at the time of purchase did not specify this, and all official communications made it appear as though the (UPnP) service allowing for third-party integration was going to be maintained. I have requested return merchandise authorization (RMA) for all Wemo switches I own that have this problem and that I purchased in the last six months. I will be finding another switch that allows third-party integration. I'm going to keep the dimmers because they seem not to have the immediate problem, and I'm not keen on swapping out the twenty or more of them I've purchased in the last six months. However, I will be refusing all future firmware updates to all Wemo devices to keep them functioning this way until they are replaced. To express my dissatisfaction, I plan to no longer purchase Wemo/Belkin products.

I'll continue to help with this troubleshooting ticket so we can tiptoe around the flakiness of the UPnP service, but frankly, it looks like the only supported way in the future to communicate with these devices is going to be through HomeKit, Alexa, or Google Home bridges/integrations. This makes me sad- Wemo has had its issues, but I really liked some of their features, and other than this issue, they have been quite stable for me. Wemo Cloud can't begin to be as fast or stable as a direct network connection.

esev commented 2 years ago

It's sad to see things going this way. I bought mine before there was a cloud service specifically because of the UPnP local control. That was the "stand-out" feature that made me choose WeMo from the start.

The newer models of the plug-in switches already do not conform well to the UPnP standards. And the most recent dimmers also dropped support for accessing the Long Press feature over UPnP.

I wish they were using an ESP chip so they could be flashed with ESPHome. Unfortunately I don't see any other wifi-based switches/dimmers that have strong support for local control. Z-Wave/Zigbee seems to be the better / future-proof solution.

DavidRBailey commented 2 years ago

Connectivity Standards Alliance (CSA) Matter shows that vendor lock-in is not the way of the future: interoperable, open standards are. I think Belkin/Wemo is cooking their goose.

When you get Apple, Google, Amazon, Samsung and the largest IoT chipmakers, smart home vendors and resellers all on the same page, it is the direction that industry is going. Belkin is going to lose out unless they join in that industry direction.

mlohus93 commented 2 years ago

@esev As I approach the 48-hour point with the three lines commented out of wemo_device.py, I am happy to say it's FAR better than it has been, but sadly not bulletproof. This afternoon I have had my first two V2 switches go unavailable; both are 3-way light switches. Please be aware that the timestamp below is the point where my HA automation recognized that the switch had been unavailable for at least 3 minutes, so it won't correspond perfectly with the actual point in time when the switch took a dive. It ought to be close though:

I copy/pasted my log file into Excel and filtered out just the entries referencing the specific devices because I cannot imagine you would want my whole log file... the two files are attached if they provide any clues.

Seriously though, I literally was having at least five or six of my eleven V2 switches take a dive on me every day... and if I rebooted the unavailable switches, the odds were half would fail again that same day (not always the same ones, fairly random)... so only two in 48 hours is a huge improvement.

LOG Family Room Overhead Light.txt

LOG Living Room Stand Light.txt

mlohus93 commented 2 years ago

Wemo bailing on UPnP service and thus the most direct route for HA is really upsetting. I have more Wemos than I might want to admit to, been really pretty satisfied with them as being the best answer out there for me... and really at the time the only company to make a true 3-way switch.

I suppose when the HA-to-Wemo path finally stops working I will have to replace the remainder of my oldest Wemo switches with later ones that have HomeKit built in, and then rely on the Home Assistant HomeKit bridge integration to tie HA to my Wemos. Not pretty, but I just have not been dazzled by my steps into Zigbee and Z-Wave so far, and I'm not sure I want to make another big investment.

So frustrating.

mlohus93 commented 2 years ago

@esev Final Parting thoughts for the night, if there is value in the info...

I am attaching two more files. The Backyard Tree file are HA Log entries related to a new Wemo Outside Plug device I recently bought... it has gone in and out on its own a couple of times, but not requiring a manual reboot.

The All Other Wemo Messages file is about 1500 lines long, and contains all the Wemo-related messages my HA Log contains which are not in one of the files I have already provided tonight. Maybe some insight into the wacky messages/behavior of the Wemos (even if they aren't actually going unavailable)... and many are not V2's, they are a mix of plugs and V1 and V2 switches.

Backyard Tree.txt

All Other Wemo Messages.txt

mlohus93 commented 2 years ago

@esev Not a lot to report tonight beyond what are likely self-inflicted injuries.

Out of curiosity, late last night I turned on an Advanced Security feature of my Eero network to "prevent access to sites that host malicious content or viruses, botnets, phishing sites, and more." When I got up this morning I was met with a laundry list of pretty much every Wemo switch, Wemo plug and Wemo Maker (V1, V2, everything) which had been flipping unavailable and back online all night long. Thinking that Eero "feature" caused it, I turned it back off (even though it claimed it hadn't actually blocked any traffic/requests). The same problems continued. I rebooted the network. The frenzy of problems slowed a bit but definitely continued.

I noticed there was a core update to 2021.12.4, which I went ahead and applied (which naturally restarted HA). The problems continued. I rebooted every switch, plug, and Maker in the house. The problems continued; again they seemed to occur more slowly, but things were still going up and down.

Finally it dawned on me to check the wemo_device.py file, and sure enough the three lines you had me comment out were back to normal (not commented out). I commented them out again, restarted HA, and everything immediately settled down with no more unavailable bouncing. At that point I had one 3-way V2 switch (foyer lights) which didn't return on its own, so I rebooted it (a couple of times) and everything was up, stable, and happy again. I did get a chuckle out of the Foyer Lights error in HA: "Error communicating with Foyer Lights after 3 attempts. Giving up." I get that, I have felt that way myself lately.

Honestly, I wish I had checked the three lines in the py file first; I just didn't think of that. I don't know whether something uncommented them last night, or whether the Eero security feature caused the problems and the uncommenting happened as a result of the 2021.12.4 update. Not the grandest troubleshooting methodology; sorry about that. However, since I got the world settled down, everything has again been 100% stable and nothing has gone unavailable.

Incidentally, I am likely going to revert to my HA Blue primary box tomorrow night. This Debian box doesn't have a Zigbee gateway "plug", so there are a couple of things which are not working, like an Aqara button on the garage wall for opening/closing the garage door and a leak sensor. My wife would really appreciate once again being able to put up the door before she gets in the car. I will try again to find a way to get to the wemo_device.py file in the HA Blue box; it is elusive. I probably wouldn't have gotten it if I had known how extremely restrictive it would be, but here we are.

mlohus93 commented 2 years ago

@esev just an observation tonight, one which shows how little I understand about Docker I am sure.

I have been gone since early this morning and just arrived home to find about 5 of my V2 switches unavailable. I have done nothing to HA or my network since yesterday afternoon when I got everything working again, so this was surprising to me. When I checked, one of the three instances of wemo_device.py on my Debian 11 HA machine had the three lines you had me comment out uncommented again. I don't know why anything would have changed, but something did. I rebooted the switches and expect it will be happier.