homebridge-plugins / homebridge-roomba2

Homebridge plugin to connect iRobot Roomba devices with Homebridge/HomeKit.
MIT License

Roomba disconnects from Homekit #133

Open fascpt opened 1 year ago

fascpt commented 1 year ago

I have a Roomba 980 which I have been using with this plugin for a while. I have it on a fixed IP address, and it generally works well, except that every 2 or 3 days, after completing a clean job (scheduled from the app), it loses communication with the plugin (it shows up as "device not available" in HomeKit). The only way I’ve been able to address this is to go into the iRobot app and make it reboot (which proves it has internet connectivity).

The only weird log which I can see on Homebridge is the following: 00975855-034A-4855-9E8E-E84181CEF950

Has anyone ever come across something similar? If so, how did you address it?

Thanks!

asweet commented 1 year ago

I'm having a similar experience with a Roomba 960. I captured some info and debug logs here. Unfortunately, my solution is the same as yours: rebooting the Roomba (and also rebooting the Homebridge server, because why not).

karlvr commented 1 year ago

Please try the latest beta 1.5.0-beta.0 (instructions for installation at https://github.com/homebridge-plugins/homebridge-roomba2/wiki/Pre-release-versions) as I have changed some of the connection logic to be more robust.

asweet commented 1 year ago

On it. It can take a couple of runs before the timeout errors appear.

fascpt commented 1 year ago

Just installed the beta, I'll keep it running for the week. Usually it needs 2 or 3 reboots per week, so by the end of this one I should have a good idea if it's working. Thanks!

fascpt commented 1 year ago

Issue is still there, now with logs slightly different: IMG_5973

fascpt commented 1 year ago

Ok, this time I think my router was misbehaving, because I couldn’t reboot the Roomba until finally I restarted the router. This doesn’t happen frequently, and I saw other devices offline too, so I think the Plugin was not at fault. I’ll keep tracking the status and update here, sorry for the confusion.

fascpt commented 1 year ago

OK, today it failed again, and this time it was not the Wi-Fi (I could reboot the Roomba normally from the app). Here are the logs: IMG_5979

karlvr commented 1 year ago

@fascpt OK, thank you, very interesting. So... what to make of this. As I understand it, the plugin successfully communicates with the Roomba, and checks its status every 15 minutes or so for a few days, and then suddenly the connection to the Roomba starts timing out and you reboot your Roomba to fix it.

Is that correct?

fascpt commented 1 year ago

@karlvr yes, and it usually seems to happen AFTER a daily scheduled vacuum (weekdays only, configured in the iRobot app). Nevertheless, the iRobot app usually doesn't lose its connection to the Roomba when it's unavailable in HomeKit, since I can easily push a reboot from the app and the Roomba picks it up immediately (it makes a rebooting sound). Another interesting part is that since the update to the beta, it seems the plugin connection was lost even before the scheduled job (it runs at 11h, but the latest logs I shared suggest it lost communication before that). Also, the logs are now different from the initial one ("Releasing an unexpected Roomba instance"); I'm not sure if that changed in the beta version (I didn't check the code changes you pushed, to be honest). Is this expected?

As I shared before, yesterday I needed to reboot it, but today, even though it's past 11h (the scheduled job ran normally), the Roomba is still available to be controlled from HomeKit. I do see a timeout in the logs from when the Roomba finished the job, but it's still working after that event (without needing a reboot): E65F70E7-DF03-4CF9-9EF0-099E05D0458C_1_201_a

karlvr commented 1 year ago

@fascpt thanks for this info. I did make changes to the connection code to avoid the wrong instance messages.

The latest result is that we see some timeouts in the logs, but the plug-in appears to successfully reconnect again?

asweet commented 1 year ago

This is looking good on my end. I haven't been running it in verbose mode, but other than needing a full reboot at first (reboot roomba, reboot homebridge server, force quit the roomba app), I have seen zero of the previous errors.

fascpt commented 1 year ago

Sorry for the late reply, I've been sick since last week.

I continue to see the same behaviour overall: Roomba starts timing out typically after the scheduled job, typically 2 / 3 times per week.

This time I did something different: because I have a cron job to reboot Homebridge every day at 03h, I left my Roomba without forcing a reboot to check whether the plugin would recover. So I left it timing out:

Screenshot 2023-05-04 at 10 27 38

And a few days later I tested if I can ping the Roomba from the network:

Screenshot 2023-05-04 at 10 28 42

So, the ONLY thing that makes the Roomba responsive to the plugin again (even after a full reboot of the Raspberry Pi and of Homebridge alone; I tested both) is actually rebooting the Roomba. It's weird that it still replies to pings, but not to the plugin... It's also interesting that this typically starts only AFTER a scheduled job run via the app. Is there any alternative port available on the Roomba to try to send the command requests? The reason I ask is that the iRobot app works flawlessly regardless of the plugin timeouts.

My next step is to remove the scheduled jobs from the app and run them via automation in HomeKit, but I think I tried this before on a previous version of the plugin.

Any other ideas on how I can continue to troubleshoot?

karlvr commented 1 year ago

@fascpt I'm sorry, I'm at a loss as to what's going on. As you write, it seems that the Roomba is blocking local connections after the scheduled job. It sounds like a Roomba fault. It will be interesting to see what happens if you use HomeKit automations instead.

I think the iRobot app might use cloud-based access to the Roomba in this case, which is why it is able to connect. Although I wonder what would happen if you tried connecting to Roomba from a different IP address when it's in this state. As you note, it's pingable, maybe it's blocking our IP. If you're able, perhaps try the npm run getlastcommand robot command (see README) from your Homebridge IP and another IP to see if you can connect to Roomba?
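One way to separate "pingable" from "accepting local connections" is to attempt a plain TCP connect from each machine. The sketch below is not part of the plugin, and `probe_tcp` is a hypothetical helper name. It assumes the Roomba's local control channel is MQTT over TLS on TCP port 8883, and it only tests whether the TCP handshake completes (it does not speak TLS or MQTT).

```python
import socket

# Assumption: local Roomba control traffic uses TCP port 8883 (MQTT over TLS).
ASSUMED_ROOMBA_PORT = 8883


def probe_tcp(host: str, port: int = ASSUMED_ROOMBA_PORT, timeout: float = 5.0) -> str:
    """Attempt a plain TCP connect and classify the outcome.

    Returns "open" if the handshake completes, "refused" if the peer
    actively rejects the connection, "timeout" if nothing answers within
    `timeout` seconds, or "error: ..." for any other socket failure.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"
```

Calling `probe_tcp("<roomba-ip>")` from the Homebridge host and again from another machine while the Roomba is unresponsive would show whether the behaviour depends on the source IP. A "refused" result means something answered and rejected the connection, whereas "timeout" means the connection attempt got no reply at all.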

fascpt commented 1 year ago

So the ping was actually from another device and not from the Raspberry Pi, so next time I'll try both the ping and getlastcommand from the Pi. Today was the first day running the automation from HomeKit, and I got the following errors as soon as it docked. It eventually recovered connectivity and the Roomba is still available in HomeKit, without any manual intervention:

Screenshot 2023-05-10 at 13 05 16

Question: Are these connection refused errors normal? I see them every once in a while in the logs, and not necessarily right after a cleaning job.

I'll keep an eye on the logs over the next few days, and the next thing I can try is rebooting the routers (I have a meshed network) to see if the Roomba goes back to accepting requests without needing a reboot itself. This should help determine whether it's a weird router / Roomba combination that is not working well, or whether it's really a Roomba bug on the 980 model...

karlvr commented 1 year ago

@fascpt I'm not sure whether the connection refused errors are normal; it seems odd, but if it recovers eventually then it's just like Roomba was sleeping or something, and eventually woke back up. Given that the iRobot app seems able to get through all the time I feel like we're missing a trick with respect to waking Roomba up or something, but it doesn't seem like anyone has cracked it ¯\_(ツ)_/¯
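For transient refusals that clear up on their own, a generic retry with exponential backoff is one common way to ride them out. This is only an illustrative sketch and not the plugin's actual connection logic. The names `backoff_delays` and `retry` are hypothetical, and the delay values are arbitrary.

```python
import time
from typing import Callable, Iterator, Optional, TypeVar

T = TypeVar("T")


def backoff_delays(base: float = 1.0, factor: float = 2.0, cap: float = 60.0) -> Iterator[float]:
    """Yield base, base*factor, base*factor**2, ... capped at `cap` seconds."""
    delay = base
    while True:
        yield min(delay, cap)
        delay *= factor


def retry(action: Callable[[], T], attempts: int = 5,
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `action` until it returns, or re-raise after `attempts` OSErrors."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delays = backoff_delays()
    last_error: Optional[OSError] = None
    for attempt in range(attempts):
        try:
            return action()
        except OSError as exc:  # covers ConnectionRefusedError and timeouts
            last_error = exc
            if attempt < attempts - 1:
                sleep(next(delays))  # wait longer before each new attempt
    assert last_error is not None
    raise last_error
```

The `sleep` parameter is injectable so the behaviour can be checked without real waiting. Backoff only helps with refusals that recover by themselves; it would not help in the state where only a Roomba reboot restores connectivity.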

fascpt commented 1 year ago

I've been in radio silence as I tested multiple options and troubleshot my networking environment. I've come to the conclusion that my Roomba and router are not a great match, and somehow the Roomba loses communication with the plugin on the Raspberry Pi in a very predictable fashion. During these outage periods I still have ICMP (ping) connectivity between the Raspberry Pi and the Roomba, but npm run getlastcommand doesn't work during the outage either. I've found that 90% of the outage periods start after a successful clean job (either manually initiated from the iRobot or Home apps, or scheduled in the iRobot one).

Since I've got a meshed solution with 2 Asus RT-AX92U configured as 1 master router + 1 AiMesh node, I've created a non-roaming binding rule between the downstairs router and the Roomba (it only cleans downstairs), in the hope that roaming attempts might be part of the issue, but unfortunately the behaviour didn't change. The only workarounds I've found for these outages are either to reboot the Roomba via the app or to reboot the routers manually. As I haven't been able to solve the issue so far, I've scheduled a daily reboot of both routers (it was previously set to weekly), and this has limited the impact overall, but it's very far from ideal.
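For anyone wanting to check the same "outages start after a clean job" pattern on their own setup, the correlation can be computed from two lists of timestamps. This is a minimal sketch: `fraction_after_job` is a hypothetical helper, the timestamps would have to be collected by hand from the Homebridge logs and the iRobot app history, and the one-hour window is an arbitrary assumption.

```python
from datetime import datetime, timedelta
from typing import Sequence


def fraction_after_job(outage_starts: Sequence[datetime],
                       job_ends: Sequence[datetime],
                       window: timedelta = timedelta(hours=1)) -> float:
    """Return the fraction of outages that began within `window` after a job ended."""
    if not outage_starts:
        return 0.0
    hits = sum(
        1
        for start in outage_starts
        # An outage counts if at least one job finished shortly before it began.
        if any(timedelta(0) <= start - end <= window for end in job_ends)
    )
    return hits / len(outage_starts)
```

A value close to 1.0 would support the observation that the outages follow cleaning jobs, while a value near 0.0 would point to something unrelated to the job schedule.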

If no one else is suffering from the same problems I have above, I'll take it as a routing issue between the router and the Roomba and would advise to close this issue. I hope my description above can be useful for anyone having the same issues as I have in the future.

karlvr commented 1 year ago

@fascpt thanks Filipe, what a frustrating situation. I suspected the Roomba itself isn't playing nicely with the MQTT socket, but it's interesting that rebooting the router resolves the issue. I guess given that rebooting the router / wifi hotspot forces the Roomba to reconnect to the WiFi that that is clearing whatever situation exists. Still frustrating!

white8398 commented 1 year ago

Hello [fascpt], I am also seeing the same issue with a totally different router than yours, and was hoping that a solution had been found.

Irobot issue

Tried running in Debug but not much info:

Irobot issue1

karlvr commented 1 year ago

@white8398 thanks for reporting that. I do suspect that it's an issue with the Roomba itself... maybe there will be a firmware update...

fascpt commented 1 year ago

I wouldn't hold my breath on that one! 😁 Thanks everyone, seems clearly a Roomba firmware bug.

@white8398 which Roomba do you have? Mine's a 980.

CezaryCiosek commented 8 months ago

Restarting the Roomba by holding the start cleaning button + return to home button helps for a day or two :( Roomba 980