EHylands / homebridge-boschcontrolpanel_bgseries

Homebridge plugin for Bosch Control Panels
MIT License

Legacy Mode - Panel Reconnection Error #15

Closed brycetaplin closed 1 month ago

brycetaplin commented 1 year ago

Hi @EHylands

The plug-in has been working brilliantly since we last spoke; however, just recently I have been losing connection to the panel.

I've seen two different error messages, both with the same result; see below.

[10/02/2023, 20:08:27] [homebridge-boschcontrolpanel_bgseries] Control Panel Connection Error (read ECONNRESET)
[10/02/2023, 20:09:27] [homebridge-boschcontrolpanel_bgseries] Trying to reconnect ....

and

[11/02/2023, 14:47:51] [homebridge-boschcontrolpanel_bgseries] Control Panel Connection Error (Timeout)
[11/02/2023, 14:48:51] [homebridge-boschcontrolpanel_bgseries] Trying to reconnect ....

No configuration changes were made to either the plug-in or Homebridge or the panel itself.

I updated to the new 0.7.0 and got the same error:

[11/02/2023, 23:08:16] [homebridge-boschcontrolpanel_bgseries] Control Panel Connection Error (Timeout)
[11/02/2023, 23:09:16] [homebridge-boschcontrolpanel_bgseries] Trying to reconnect ....

Restarting the Bridge reconnects it every time.

Thoughts? Any tests I can run for you to get more information?

For information, the connection was up for about 2.5 hrs before that error was reported.

[11/02/2023, 20:45:42] [homebridge-boschcontrolpanel_bgseries] Loaded homebridge-boschcontrolpanel_bgseries v0.7.0 child bridge successfully

EHylands commented 1 year ago

Hi @brycetaplin, would you mind sharing the start of your log file when run in debug mode? Knowing your panel type and plugin configuration will help me work through this issue.

You say this issue is happening with both the previous 0.6.9 version and the new 0.7.0?

This error comes straight from the plugin's TLS socket timing out. Any recent changes to your physical network or Homebridge server?

In normal push notification mode (US panels), the panel sends an empty confidence message every 2 min to keep the connection alive. In legacy mode (AU panels), panel information is updated close to every 2 seconds, so the socket should not time out either.

EHylands commented 1 year ago

@brycetaplin Also, does your plugin successfully reconnect to your panel after the timeout period?

brycetaplin commented 1 year ago

Hi @EHylands

Yes, the issue occurred in both 0.6.9 and 0.7.0.

Log file details are below. No changes were made to the network or Homebridge server.

And no, it never reconnects. The only way I can get it to reconnect is to reset the bridge.

[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] -----------------------------------------
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] Bosch Control Panel Information
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] -----------------------------------------
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] Panel Type: Solution3000
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] Firmware: 2.0
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] RPS Version: 5.2.0
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] Intrusion Protocol Version: 2.3.0
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] Execute Protocol Version: 2.22.0
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] Panel Max Areas: 2
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] Panel Max Points: 16
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] Panel Max Outputs: 3
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] Panel Max Users: 32
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] Panel Max Keypads: 4
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] Panel Max Doors: 0
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] Panel Legacy Mode: true
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] Panel Using Subscriptions: false
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] Area1: House
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries]   Point1: Front Door
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries]   Point2: Lounge Room
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries]   Point3: Laundry Door
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries]   Point4: Laundry
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries]   Point5: Sitting Door
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries]   Point6: Sitting Room
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries]   Point7: Level 1
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries]   Point8: Garage Tilt
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries]   Point9: Garage Door
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries]   Point10: Garage
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] Area2: Studio
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries]   Point12: Studio
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] -----------------------------------------
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] Configuring Homebridge plugin accessories
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] -----------------------------------------
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] Security System: Area1 - House
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] ContactSensor : Point1 - Front Door
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] MotionSensor : Point2 - Lounge Room
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] ContactSensor : Point3 - Laundry Door
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] MotionSensor : Point4 - Laundry
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] ContactSensor : Point5 - Sitting Door
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] MotionSensor : Point6 - Sitting Room
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] MotionSensor : Point7 - Level 1
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] ContactSensor : Point8 - Garage Tilt
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] ContactSensor : Point9 - Garage Door
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] MotionSensor : Point10 - Garage
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] Burglary Alarm (Monitoring all Panel Areas) - Master Burglary Alarm
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] -----------------------------------------
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] Starting Control Panel Operation
[12/02/2023, 10:10:06] [homebridge-boschcontrolpanel_bgseries] -----------------------------------------
[12/02/2023, 10:10:07] [homebridge-boschcontrolpanel_bgseries] Panel: Point1(Front Door): Open

EHylands commented 1 year ago

I have been thinking about your issue since yesterday. Nothing quite clear comes to mind ...

Does the plugin work as expected, but suddenly stop responding at some point?

Is Homebridge still accessible when the plugin fails? Do other plugins continue working?

When you say you have to reset your bridge, are you restarting the Homebridge process on your server or rebooting the physical server running Homebridge?

Are you running your Homebridge server on a wifi link or over a wired connection?

Have you tried power cycling your Solution panel? Another user had some uncleared alarms in panel memory that would not show up on the keypad and were messing with the plugin polling sequence (should be fixed in 0.7.0 though).

brycetaplin commented 1 year ago

I have been thinking about your issue since yesterday. Nothing quite clear comes to mind ...

Does the plugin work as expected, but suddenly stop responding at some point?

Yes, that's correct. It will work for a few hours and then we get the error indicating it has lost connection.

Is Homebridge still accessible when the plugin fails? Do other plugins continue working?

Yes, homebridge and all other plug-ins are still accessible and working.

When you say you have to reset your bridge, are you restarting the Homebridge process on your server or rebooting the physical server running Homebridge?

I run child-bridges for each of the different plug-ins on my homebridge server. I can either restart the child-bridge for this plug-in or I can restart the homebridge server (but not the actual box it is running on).

Either of those actions will enable the plug-in to re-establish a connection with the panel.

Are you running your Homebridge server on a wifi link or over a wired connection?

I am running my homebridge server via a wi-fi connection but the panel is on a wired connection.

Have you tried power cycling your Solution panel? Another user had some uncleared alarms in panel memory that would not show up on the keypad and were messing with the plugin polling sequence (should be fixed in 0.7.0 though).

Not yet, although I can do that and will see if it makes a difference.

Another point of note is that while the plug-in is working correctly, I cannot access the panel via the RSC+ app on my phone. I understand this is expected as the panel does not allow simultaneous connections. When the plug-in fails I can re-establish a connection with the RSC+ app. Expected behaviour but noting in case it helps with narrowing down causes.

EHylands commented 1 year ago

This issue is very confusing ...

Either the connection is timing out because of a network link or interface problem unrelated to Homebridge, and the plugin is then unable to reconnect to the Solution panel.

Or the polling loop is being interrupted and the socket times out from lack of network activity (there is no confidence message in legacy mode).

Finally, the timeout may only be a consequence of the whole plugin/child bridge crashing ...

I still can't explain why the reconnection process is not successful. I tested the reconnection sequence on my B panel, which allows for more than one simultaneous connection. Maybe we need to wait more than 60 sec on Solution panels before reconnecting, to make sure that the initial socket on the panel has been properly destroyed.
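
To make the "wait longer before reconnecting" idea concrete, here is a minimal TypeScript sketch. It is not the plugin's actual code: the function name `reconnectDelayMs`, the 60 s starting delay (chosen to match the gap between the error and the reconnect attempt in the logs above), the doubling on each failed attempt (plain exponential backoff) and the 5 min cap are all assumptions for illustration.

```typescript
// Hypothetical helper, not part of the plugin: returns how long to wait (in ms)
// before reconnection attempt number `attempt` (0-based).
// Assumed defaults: 60 s base delay, capped at 5 min.
function reconnectDelayMs(
  attempt: number,
  baseMs: number = 60000,
  maxMs: number = 300000,
): number {
  // Treat negative or fractional attempt counts as a sane whole number.
  const safeAttempt = Math.max(0, Math.floor(attempt));
  // Double the wait after each failed attempt (exponential backoff);
  // the exponent is bounded so the intermediate value stays small.
  const delay = baseMs * Math.pow(2, Math.min(safeAttempt, 10));
  // Never wait longer than the cap.
  return Math.min(delay, maxMs);
}
```

A reconnect loop could then schedule each attempt with `setTimeout(connect, reconnectDelayMs(attempt))`, where `connect` stands for whatever routine opens the socket. Waiting longer after each failure gives a panel that accepts only one connection more time to drop the old socket.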

I have never used the child bridge option before. I just set up my development Homebridge server with the plugin's child bridge enabled. I will be looking for crashes, timeouts and memory leaks!

Is it possible to try reproducing the issue with your Homebridge server on a wired connection?

sanjay900 commented 1 year ago

Interestingly, I have had people report similar issues with my Home Assistant add-on in the past, where their connections just die after a while and a reboot of Home Assistant is needed. If you do work this one out, let me know; I'd be interested to hear if it is something panel related!

EHylands commented 1 year ago

I think I also found an issue with plugin reconnection when the plugin is run in legacy mode with data polling. Will do further testing and report back.

EHylands commented 1 year ago

@brycetaplin I still can't explain why your connection between the plugin and panel is timing out.

I was able to reproduce the plugin not automatically reconnecting to the panel in Legacy Mode after the connection had timed out, and found the cause: I was not clearing the command queue in the reconnection sequence.
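
A rough TypeScript sketch of what clearing the command queue on reconnect can look like. The plugin's real queue is not shown in this thread, so the `Command` type, the `CommandQueue` class and all method names are hypothetical. The sketch also assumes a queue that sends one command at a time and waits for its reply, which is one way a stale entry from the dead socket could block everything queued after it.

```typescript
// Hypothetical types and names, for illustration only.
type Command = { id: number; payload: Uint8Array };

class CommandQueue {
  private pending: Command[] = [];
  private inFlight: Command | null = null;

  // Add a command to the back of the queue.
  enqueue(cmd: Command): void {
    this.pending.push(cmd);
  }

  // Returns the next command to send, or null while a reply is still awaited
  // (or when the queue is empty).
  sendNext(): Command | null {
    if (this.inFlight !== null) {
      return null;
    }
    const next = this.pending.shift();
    this.inFlight = next === undefined ? null : next;
    return this.inFlight;
  }

  // Call when the panel answers the in-flight command.
  onResponse(): void {
    this.inFlight = null;
  }

  // Call from the reconnection sequence: drop everything tied to the old socket,
  // including the in-flight command whose reply will never arrive.
  clear(): void {
    this.pending = [];
    this.inFlight = null;
  }

  // Number of commands still queued or awaiting a reply.
  size(): number {
    return this.pending.length + (this.inFlight !== null ? 1 : 0);
  }
}
```

In this model, without `clear()` the in-flight command from the timed-out socket is never answered, so `sendNext()` keeps returning null after the reconnect and nothing more is sent.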

You can install the alternate beta version 0.7.1 to see if the plugin now automatically reconnects to the panel after a failure.

Please let me know afterward !

brycetaplin commented 1 year ago

Hi @EHylands. Thanks for that. I will try the beta version now. I've left the plug-in running for a few days since the last connection was lost and it has not automatically reconnected. I was going to try a panel restart, but I will install the new plug-in version now and report back once I see how it performs.

brycetaplin commented 1 year ago

Hi @EHylands. Status update: the plug-in has not lost connection with the panel since installing 0.7.1. I will keep waiting for a lost connection to see if it will reconnect, but the behaviour is a little odd, as previously it went for at most 5 hrs before losing connection. Will keep you updated.

brycetaplin commented 1 year ago

Hi @EHylands. Got a panel connection error this morning and it reconnected almost immediately with 0.7.1. So while this doesn't answer why it loses connection, it at least re-establishes it.

[15/02/2023, 08:19:53] [homebridge-boschcontrolpanel_bgseries] Control Panel Connection Error (Timeout)
[15/02/2023, 08:20:53] [homebridge-boschcontrolpanel_bgseries] Trying to reconnect ....
[15/02/2023, 08:21:03] [homebridge-boschcontrolpanel_bgseries] -----------------------------------------
[15/02/2023, 08:21:03] [homebridge-boschcontrolpanel_bgseries] Bosch Control Panel Information
[15/02/2023, 08:21:03] [homebridge-boschcontrolpanel_bgseries] -----------------------------------------
[15/02/2023, 08:21:03] [homebridge-boschcontrolpanel_bgseries] Panel Type: Solution3000
[15/02/2023, 08:21:03] [homebridge-boschcontrolpanel_bgseries] Firmware: 2.0
[15/02/2023, 08:21:03] [homebridge-boschcontrolpanel_bgseries] RPS Version: 5.2.0
[15/02/2023, 08:21:03] [homebridge-boschcontrolpanel_bgseries] Intrusion Protocol Version: 2.3.0
[15/02/2023, 08:21:03] [homebridge-boschcontrolpanel_bgseries] Execute Protocol Version: 2.22.0

I do note that occasional disconnections for other plug-ins appear normal in my HomeKit set up.

Let me know if you want any other information from me on this.

EHylands commented 1 year ago

I'm glad we could solve the reconnection issue.

For the timeout issue, Homebridge needs to maintain a persistent TCP connection to the panel. I don't know how resilient that connection would be in a saturated wifi environment. I didn't have any problem myself running my dev Homebridge server over wifi for a few hours, but I get barely any interference from other devices or neighbours' networks.

As a first step, I would try running Homebridge over a wired connection to make sure the issue is not wifi related.

The other possibility is the polling routine stalling and causing the connection to time out from lack of network activity. I can write a watchdog function that detects any stall and relaunches the polling process after 5-10 sec without receiving an update.
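
As a rough illustration of that watchdog idea, here is a minimal TypeScript sketch. It is not the plugin's implementation: the class name `PollingWatchdog`, its methods and the `restartPolling` callback are hypothetical, and timestamps are passed in explicitly (rather than read from `Date.now()` inside the class) so the stall logic can be checked without real timers.

```typescript
// Hypothetical watchdog, for illustration only.
class PollingWatchdog {
  private stallAfterMs: number;
  private restartPolling: () => void;
  private lastUpdateMs: number;

  // stallAfterMs: how long without an update counts as a stall (e.g. 5000-10000).
  // restartPolling: callback that relaunches the polling process.
  // nowMs: current time in ms when the watchdog is created.
  constructor(stallAfterMs: number, restartPolling: () => void, nowMs: number) {
    this.stallAfterMs = stallAfterMs;
    this.restartPolling = restartPolling;
    this.lastUpdateMs = nowMs;
  }

  // Call every time an update is received from the panel.
  noteUpdate(nowMs: number): void {
    this.lastUpdateMs = nowMs;
  }

  // Call periodically. Relaunches polling and returns true if no update
  // has arrived for at least stallAfterMs; otherwise returns false.
  check(nowMs: number): boolean {
    if (nowMs - this.lastUpdateMs < this.stallAfterMs) {
      return false;
    }
    // Reset the reference time so a single stall triggers a single restart.
    this.lastUpdateMs = nowMs;
    this.restartPolling();
    return true;
  }
}
```

In a running plugin, `check(Date.now())` could be called from a `setInterval` every second or so, with `noteUpdate(Date.now())` called wherever panel responses are handled.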