jens-maus / RaspberryMatic

:house: A feature-rich but lightweight, buildroot-based Linux operating system alternative for your CloudFree CCU3/ELV-Charly 'homematicIP CCU' IoT smarthome central. Running as a pure virtual appliance (ProxmoxVE, Home Assistant, LXC, Docker/OCI, Kubernetes/K8s, etc.) or on a dedicated embedded device (RaspberryPi, Tinkerboard, IntelNUC, etc.)
https://raspberrymatic.de
Apache License 2.0
1.5k stars 184 forks

hs485d (BidCos-Wired) doesn't reconnect after a connection failure. #2737

Closed topperharly closed 1 month ago

topperharly commented 1 month ago

Describe the issue you are experiencing

Sometimes (I don't know why) the RaspberryMatic hs485d daemon loses its connection to the HS485ControllerLGW (BidCos-Wired LAN gateway). The loss of the connection itself is not part of this bug report.

The problem is that the daemon is not able to reconnect afterwards.

I ignored this issue for more than two years and periodically restarted my Raspberry Pi 3+.

But in the last few weeks I lost the connection to my wired components every day, so I analyzed the issue a little deeper.

In conclusion, this issue is not specific to the current version of RaspberryMatic.

Describe the behavior you expected

Once the device is available again, hs485d should reconnect automatically and the wired status should switch back to "online".

Steps to reproduce the issue

  1. Enable debug logging for hs485d: in /etc/init.d/S60hs485d, change the log level from 1 to 5.

  2. Restart hs485d: /etc/init.d/S60hs485d restart

  3. Force a disconnect of the LAN gateway (pull the LAN cable) and wait for the following messages in /var/log/messages:

    May 11 00:14:44 homematic-raspi user.debug hs485d: HSSXmlRpcEventDispatcher::Handle send 1 events
    May 11 00:14:44 homematic-raspi user.debug hs485d: HSSXmlRpcEventDispatcher::Handle send completed
    May 11 00:15:14 homematic-raspi user.debug hs485d: Event: CENTRAL.PONG="nr"
    May 11 00:15:14 homematic-raspi user.debug hs485d: HSSXmlRpcEventDispatcher::Handle send 1 events
    May 11 00:15:14 homematic-raspi user.debug hs485d: HSSXmlRpcEventDispatcher::Handle send completed
    May 11 00:15:21 homematic-raspi user.err hs485d: response timeout
    May 11 00:15:21 homematic-raspi user.err hs485d: HS485ControllerLGW::keepAliveMsgThreadFunction(): Did not get an answer
    May 11 00:15:34 homematic-raspi user.debug hs485d: LGWPortWrapper::reconnect(): Unable to find device with serial 'YOUR Serial Device ID
  4. Reconnect the LAN cable to the LAN gateway.

The device is available again: it answers pings on its fixed IP address and is also found by the NetFinder tool. But hs485d still logs messages like these:

May 11 03:55:46 homematic-raspi user.debug hs485d: Event: CENTRAL.PONG="nr"
May 11 03:55:46 homematic-raspi user.debug hs485d: HSSXmlRpcEventDispatcher::Handle send 1 events
May 11 03:55:46 homematic-raspi user.debug hs485d: HSSXmlRpcEventDispatcher::Handle send completed
May 11 03:56:16 homematic-raspi user.debug hs485d: Event: CENTRAL.PONG="nr"
May 11 03:56:16 homematic-raspi user.debug hs485d: HSSXmlRpcEventDispatcher::Handle send 1 events
May 11 03:56:16 homematic-raspi user.debug hs485d: HSSXmlRpcEventDispatcher::Handle send completed
May 11 03:56:38 homematic-raspi user.debug hs485d: LGWPortWrapper::reconnect(): Unable to find device with serial PEQ18
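
To see which serial hs485d keeps failing to find, the value can be pulled out of the log. This is a minimal sketch, assuming the message text matches the excerpts above; the function name `failed_serials` is made up for illustration and is not part of RaspberryMatic.

```shell
#!/bin/sh
# Sketch only: list the serials that hs485d reported as "Unable to find".
# Reads syslog lines on stdin. The message text is copied from the log
# excerpts in this report; the function name is hypothetical.
failed_serials() {
    # Keep only reconnect failures, strip everything but the serial
    # (an optional leading quote is skipped), and de-duplicate.
    sed -n "s/.*LGWPortWrapper::reconnect(): Unable to find device with serial '\{0,1\}\([A-Za-z0-9]*\).*/\1/p" | sort -u
}
```

Possible usage: `failed_serials < /var/log/messages`. Comparing the printed value with the serial shown on the gateway or in the NetFinder tool is a quick way to rule out a configuration mismatch.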

What is the version this bug report is based on?

3.75.7.20240420

Which base platform are you running?

rpi3 (RaspberryPi3, ARM64/aarch64)

Which HomeMatic/homematicIP radio module are you using?

n/a

Anything in the logs that might be useful for us?

See the log excerpts in the reproduction steps above.

Additional information

As a workaround, restarting hs485d is enough to make it work properly again and reconnect to the LGW.

So why not implement a monitrc check? When the bidcos.wired status (I think it is stored in /var/status/PEQ${ID}.constat) switches from "NO_ERROR" to "ERROR", restart hs485d (at most 5 times, with a 30-second pause between retries).

This is only a workaround, not a fix for the root cause of the missing reconnect, but I think it is easy to implement.
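
The proposed check could look roughly like the following shell sketch, which a monit or cron job might call. Everything here is an assumption for illustration: the function names are hypothetical, the status file location and its "ERROR"/"NO_ERROR" content are only the guess stated above, and nothing like this ships with RaspberryMatic.

```shell
#!/bin/sh
# Hypothetical watchdog sketch, not an existing RaspberryMatic feature.
# Assumption: the status file contains either NO_ERROR or ERROR.

# wired_status FILE: print "ok", "error" or "unknown".
wired_status() {
    # Missing or unreadable file: report "unknown" rather than guessing.
    [ -r "$1" ] || { echo unknown; return; }
    # Test NO_ERROR first, because it also contains the string ERROR.
    if grep -q 'NO_ERROR' "$1"; then
        echo ok
    elif grep -q 'ERROR' "$1"; then
        echo error
    else
        echo unknown
    fi
}

# restart_until_ok FILE [MAX_TRIES] [PAUSE_SECONDS]
# Restarts hs485d while the status is "error", at most MAX_TRIES times.
# Returns 1 if the status is still "error" after the last attempt.
restart_until_ok() {
    file=$1 max=${2:-5} pause=${3:-30} try=0
    while [ "$(wired_status "$file")" = "error" ]; do
        [ "$try" -lt "$max" ] || return 1
        try=$((try + 1))
        # RESTART_CMD can be overridden, e.g. for a dry run.
        ${RESTART_CMD:-/etc/init.d/S60hs485d restart}
        sleep "$pause"
    done
    return 0
}
```

A call would pass the actual status file, for example `restart_until_ok /path/to/status-file 5 30`. An "unknown" status deliberately triggers no restart, so a wrong guess about the file format does not cause a restart loop.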

I can provide a PR for that, if that helps.

topperharly commented 1 month ago

After submitting this bug report, I noticed these messages in the log:

Lan Device Information: Protocol-Version: 1 Product-ID: eQ3-HMW-LGW Firmware-Version: 1.0.5 Serial Number: PEQ1856177
 LGWPortWrapper::connect(): Desired and determined serial numbers do not match. Storing desired serial number: PEQ1856711

So I corrected the mixed-up digits in the serial number and checked again...


Reconnect works as expected when there are no mixed-up digits or typos in the serial number.
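
A typo like this is easy to miss by eye. As a small sketch, the two serial numbers from the log can be compared character by character; the helper name `first_mismatch` is made up for this example.

```shell
#!/bin/sh
# Sketch only: find where two serial numbers start to differ.
# first_mismatch A B prints the 1-based position of the first differing
# character, or 0 if both strings are identical.
first_mismatch() {
    a=$1 b=$2 pos=1
    while [ -n "$a" ] || [ -n "$b" ]; do
        # Take the leading character of each remaining string.
        ca=${a%"${a#?}"} cb=${b%"${b#?}"}
        [ "$ca" = "$cb" ] || { echo "$pos"; return 1; }
        # Drop the compared character and move on.
        a=${a#?} b=${b#?} pos=$((pos + 1))
    done
    echo 0
}
```

For the serials in the log above, `first_mismatch PEQ1856177 PEQ1856711` prints 8, which points at the digits that were entered in the wrong order.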