openhab / openhab-addons

Add-ons for openHAB
https://www.openhab.org/
Eclipse Public License 2.0
1.86k stars 3.57k forks source link

[homematic] Could not reinitialize the device #7019

Closed nRaecheR closed 2 years ago

nRaecheR commented 4 years ago

I'm trying to bind my fully working Homematic installation to my OpenHab 2.50-1 running on a RPI3. OpenHab2 is running without issues a lot of other bindings (Hue, Weather, ZWave etc.) but I have major problems with the Homematic binding. I've added the bridge and created an item "Duty Cycle", it updates frequently and normally. When I add another HmIP thing, e.g. a HmIP-SWDO I experience problems. After some hours the bridge will marked as offline and the switch item is offline too. Even the web interface of the Homematic bridge will be slow as a snail and every action takes seconds.

The Homematic Bridge is RaspberryMatic running 3.49.17.20200131 (latest version). 192.168.46.9 ist the OpenHAB2 installation, 192.168.46.8 RaspberryMatic

Authentication on RaspberryMatic is disabled, the firewall too, all ports are open.

Here are the log entries from OpenHab2:

2020-02-16 15:38:51.233 [WARN ] [ternal.handler.HomematicThingHandler] - Could not reinitialize the device '00109709ADBCC0': Total timeout 15000 ms elapsed
java.io.IOException: Total timeout 15000 ms elapsed
        at org.openhab.binding.homematic.internal.communicator.client.XmlRpcClient.sendMessage(XmlRpcClient.java:98) ~[?:?]
        at org.openhab.binding.homematic.internal.communicator.client.XmlRpcClient.sendMessage(XmlRpcClient.java:102) ~[?:?]
        at org.openhab.binding.homematic.internal.communicator.client.XmlRpcClient.sendMessage(XmlRpcClient.java:67) ~[?:?]
        at org.openhab.binding.homematic.internal.communicator.client.RpcClient.setChannelDatapointValues(RpcClient.java:201) ~[?:?]
        at org.openhab.binding.homematic.internal.communicator.AbstractHomematicGateway.setChannelDatapointValues(AbstractHomematicGateway.java:517) ~[?:?]
        at org.openhab.binding.homematic.internal.communicator.CcuGateway.setChannelDatapointValues(CcuGateway.java:108) ~[?:?]
        at org.openhab.binding.homematic.internal.communicator.AbstractHomematicGateway.loadChannelValues(AbstractHomematicGateway.java:487) ~[?:?]
        at org.openhab.binding.homematic.internal.handler.HomematicThingHandler.loadHomematicChannelValues(HomematicThingHandler.java:397) ~[?:?]
        at org.openhab.binding.homematic.internal.handler.HomematicThingHandler.updateStatus(HomematicThingHandler.java:415) ~[?:?]
        at org.openhab.binding.homematic.internal.handler.HomematicThingHandler.deviceLoaded(HomematicThingHandler.java:578) ~[?:?]
        at org.openhab.binding.homematic.internal.handler.HomematicBridgeHandler.onDeviceLoaded(HomematicBridgeHandler.java:342) ~[?:?]
        at org.openhab.binding.homematic.internal.handler.HomematicBridgeHandler.onNewDevice(HomematicBridgeHandler.java:304) ~[?:?]
        at org.openhab.binding.homematic.internal.communicator.AbstractHomematicGateway.newDevices(AbstractHomematicGateway.java:769) ~[?:?]
        at org.openhab.binding.homematic.internal.communicator.server.RpcResponseHandler.handleNewDevice(RpcResponseHandler.java:108) ~[?:?]
        at org.openhab.binding.homematic.internal.communicator.server.RpcResponseHandler.handleMethodCall(RpcResponseHandler.java:57) ~[?:?]
        at org.openhab.binding.homematic.internal.communicator.server.XmlRpcServer$ResponseHandler.handle(XmlRpcServer.java:125) ~[?:?]
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) ~[?:?]
        at org.eclipse.jetty.server.Server.handle(Server.java:494) ~[?:?]
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:374) ~[?:?]
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:268) ~[?:?]
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) ~[?:?]
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103) ~[?:?]
        at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117) ~[?:?]
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336) ~[bundleFile:9.4.20.v20190813]
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313) ~[bundleFile:9.4.20.v20190813]
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171) ~[bundleFile:9.4.20.v20190813]
        at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:135) ~[bundleFile:9.4.20.v20190813]
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:782) [bundleFile:9.4.20.v20190813]
        at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:918) [bundleFile:9.4.20.v20190813]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_222]
Caused by: java.util.concurrent.TimeoutException: Total timeout 15000 ms elapsed
        at org.eclipse.jetty.client.TimeoutCompleteListener.onTimeoutExpired(TimeoutCompleteListener.java:51) ~[?:?]
        at org.eclipse.jetty.io.CyclicTimeout$Wakeup.run(CyclicTimeout.java:282) ~[?:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_222]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_222]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_222]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[?:1.8.0_222]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_222]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_222]
        ... 1 more

The RaspberryMatic add these entries to the hmserver.log:

Feb 16 15:37:55 de.eq3.cbcs.legacy.bidcos.rpc.LegacyServiceHandler INFO  [vert.x-worker-thread-1] (un)registerCallback on LegacyServiceHandler called from url: http://192.168.46.9:9125 
Feb 16 15:37:55 de.eq3.cbcs.legacy.bidcos.rpc.LegacyServiceHandler INFO  [vert.x-worker-thread-1] Removed InterfaceId: HMIP-3014F711A000041709ADFA65 
Feb 16 15:37:55 de.eq3.ccu.virtualdevice.service.internal.rega.VirtualDeviceHandlerRega INFO  [vert.x-eventloop-thread-5] (un)registerCallback on VirtualDeviceHandlerRega called from url: http://192.168.46.9:9125 
Feb 16 15:37:55 de.eq3.ccu.virtualdevice.service.internal.rega.VirtualDeviceHandlerRega INFO  [vert.x-eventloop-thread-5] Removed InterfaceId: null 
Feb 16 15:37:55 de.eq3.cbcs.legacy.bidcos.rpc.internal.InterfaceInitializer ERROR [vert.x-worker-thread-3] IO Exception: Could not add interface: HMIP-3014F711A000041709ADFA65 
de.eq3.cbcs.legacy.communication.rpc.RpcIOException: org.apache.http.NoHttpResponseException: 192.168.46.9:9125 failed to respond
        at de.eq3.cbcs.legacy.communication.rpc.internal.transport.http.HttpTransport.sendRequest(HttpTransport.java:110)
        at de.eq3.cbcs.legacy.communication.rpc.internal.rpc.RpcClient.sendRequest(RpcClient.java:94)
        at de.eq3.cbcs.legacy.communication.rpc.internal.rpc.RpcClient.invoke(RpcClient.java:82)
        at com.sun.proxy.$Proxy22.newDevices(Unknown Source)
        at de.eq3.cbcs.legacy.bidcos.rpc.internal.LegacyBackendClient.newDevices(LegacyBackendClient.java:157)
        at de.eq3.cbcs.legacy.bidcos.rpc.internal.DeviceUtil.synchronizedBackendDevices(DeviceUtil.java:215)
        at de.eq3.cbcs.legacy.bidcos.rpc.internal.InterfaceInitializer.handle(InterfaceInitializer.java:109)
        at de.eq3.cbcs.legacy.bidcos.rpc.internal.InterfaceInitializer.handle(InterfaceInitializer.java:26)
        at io.vertx.core.eventbus.impl.HandlerRegistration.deliver(HandlerRegistration.java:212)
        at io.vertx.core.eventbus.impl.HandlerRegistration.handle(HandlerRegistration.java:191)
        at io.vertx.core.eventbus.impl.EventBusImpl.lambda$deliverToHandler$3(EventBusImpl.java:505)
        at io.vertx.core.impl.ContextImpl.lambda$wrapTask$2(ContextImpl.java:337)
        at io.vertx.core.impl.TaskQueue.lambda$new$0(TaskQueue.java:60)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.http.NoHttpResponseException: 192.168.46.9:9125 failed to respond
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143)
        at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
        at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:261)
        at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283)
        at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:259)
        at org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:232)
        at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:272)
        at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:124)
        at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:686)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:488)
        at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:884)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:107)
        at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
        at de.eq3.cbcs.legacy.communication.rpc.internal.transport.http.HttpTransport.sendRequest(HttpTransport.java:106)
        ... 15 more
Feb 16 15:38:04 de.eq3.cbcs.legacy.bidcos.rpc.LegacyServiceHandler INFO  [vert.x-worker-thread-1] (un)registerCallback on LegacyServiceHandler called from url: http://192.168.46.9:9125 
Feb 16 15:38:04 de.eq3.cbcs.legacy.bidcos.rpc.LegacyServiceHandler INFO  [vert.x-worker-thread-3] init finished 
Feb 16 15:38:04 de.eq3.cbcs.legacy.bidcos.rpc.internal.InterfaceInitializer INFO  [vert.x-worker-thread-2] Added InterfaceId: HMIP-3014F711A000041709ADFA65 
Feb 16 15:38:04 de.eq3.ccu.virtualdevice.service.internal.rega.VirtualDeviceHandlerRega INFO  [vert.x-eventloop-thread-5] (un)registerCallback on VirtualDeviceHandlerRega called from url: http://192.168.46.9:9125 
Feb 16 15:38:04 de.eq3.ccu.virtualdevice.service.internal.rega.VirtualDeviceHandlerRega INFO  [vert.x-eventloop-thread-5] Added InterfaceId: GROUP-3014F711A000041709ADFA65 
Feb 16 15:38:04 de.eq3.ccu.virtualdevice.service.internal.rega.BackendWorker INFO  [vert.x-worker-thread-18] Execute BackendCommand: de.eq3.ccu.virtualdevice.service.internal.rega.BackendUpdateDevicesCommand 
Feb 16 15:38:04 de.eq3.cbcs.legacy.bidcos.rpc.internal.DeviceUtil INFO  [vert.x-worker-thread-2] updateDevicesForClient HMIP-3014F711A000041709ADFA65 -> 289 device addresses will be added 

I had to remove the item to get the RaspberryMatic working as expected. The bridge only gets online after restarting the OpenHAB2 RPi.

Any ideas how to fix this issue and make the Homematic binding working?

MHerbst commented 4 years ago

I have a similar configuration but without any problems. Are you only using HmIP components or do you have some "classic" devices, too? Can you check the firewall settings on you openHAB Raspi and make sure that the ports 9125 and 9126 are open.

nRaecheR commented 4 years ago

The Firewall is set to All ports open and it works for days when I just poll the bridge item Duty Cycle. So the communication itself shouldn't be the problem. I've around 25 devices connected, all are HmIP wireless.

MHerbst commented 4 years ago

As far as I know, the request to retrieve the duty cycle works differently because it is a system variable. That means the binding has to send requests to the CCU to get the value.

Communication with "real" devices works differently. Then if a data point's value changes the CCU sends an event to the binding. Can you enable the TRACE log output and add at least one device? Then please restart openHAB and wait for the communication error. Hopefully, the trace output gives us the necessary information.

nRaecheR commented 4 years ago

Okay, I presumed you mean the communication from OpenHAB2 to RaspberryMatic. This should be the same (IP, Ports, Protocols) for DutyCyle and other items. Isn't it?

I'll enable the trace as soon as there is another maintenance window later in the week. Thank you.

nRaecheR commented 4 years ago

Here is the log to a case where the HmIP device won't leave the uninitialized state:

https://athena.bockholdt.com/s/c9HEeqYNtJfd5zj

The device is the device with ID 0001D3C99C310F

MHerbst commented 4 years ago

Hmm, I can't see any error message in the log and no messages about status changes. The binding received a last event for the mentioned device at 10:45:29. After that time there are only ping/pong messages that are used by the binding to test for the availability of the CCU and events from other devices. I can be some sort of timing issue. The mentioned device has 1.200 data points. I need to look closer into the code to understand what's happening here.

nRaecheR commented 4 years ago

Yes, you're right. The symptoms for this specific device are the same as described in my first post but the bridge itself and the other configured device is still functioning. The error messages haven't appeared too.

Thank you, Martin.

nRaecheR commented 4 years ago

It seems that the HM-Binding has a problem with a disappearing Homematic Bridge (e.g. a Restart of it). The Bridge thing will be marked as offline and stays offline until restarting OpenHAB2.

MHerbst commented 4 years ago

Normally, a restart of the CCU should not be a problem. The binding sends periodic "ping" requests to the CCU and the CCU answers with "pong". (you can see them in the openhab.log). After the CCU was restarted it answers again with "pong" responses and then the Bridge should become online. This works perfectly in my environment.

dmalorny commented 2 years ago

I have the exact same issues. I have a setup with around 50 HM classic devices and 3 HmIP devices. Only the three HmIP devices are causing the exception.

RaspberryMatic v. 3.61.7.20220115 on Raspberry Pi 3 OpenHAB v. 2.5.2 as Docker-Container on Synology NAS

MHerbst commented 2 years ago

There haven been several improvements in OH 3.2. Therefore, I highly recommend upgrading your OH installation.

nRaecheR commented 2 years ago

Yes, OH 3.1 fixed the HM-Bridge Online status, during a reboot of RaspberryMatic OH detects the change of status with a OFFLINE and then a ONLINE message. But the things/items are staying offline after that, or at least, they won't receive any updates to their states. So I still need to restart OH after a reboot/update of RaspberryMatic.

dmalorny commented 2 years ago

Had several problem when I first tried switching from OH 2.5 to OH 3.1:

But It looks like I have to try it again with OH 3.2...

MHerbst commented 2 years ago

But the things/items are staying offline after that, or at least, they won't receive any updates to their states.

This has been improved in 3.2. Maybe not completely because some HmIP devices behave a bit strange ...

nRaecheR commented 2 years ago

I haven't seen any real improvement after updating from OH 3.1 to 3.2, the reboot problem still remains. Here're my HmIP devices that I've connected to my RM instance:

14x HmIP-BROLL 1x HmIP-FROLL 4x HmIP-BBL
3x HmIP SWDO-I 3x HmIP-BDT 1x HMIP-PSM 1x HMIP-PS 2x HmIP-SWD 1x HmIP-DSD-PCB 1x HmIP-SWO-PR

MHerbst commented 2 years ago

That's strange. Normally there are only one or two devices with problems. Please create a TRACE log and attach it here.

nRaecheR commented 2 years ago

Here it is, the TRACE of a reboot of RM seen from my OH 3.2 instance:

https://athena.bockholdt.com/s/yGEPKy7NZaN4HZL

Thank you in advance!

MHerbst commented 2 years ago

The trace looks a bit strange. There is a mixture of requests that are executed without any problems and then timeouts for other requests. Looks like timing issues. These problems are nearly impossible to debug because every execution looks different, and it's not really funny if you have to restart the CCU (or RM) permanently ...

Meanwhile, I have the feeling that the issue can only be solved by rewriting the connection handling. With every attempt to solve the problem, the coding gets more complicated.

BTW: on what kind of machine is your OH installation running?

nRaecheR commented 2 years ago

Ah, okay, I understand. The binding seems to be generally unreliable. I've got three HmIP-BDT, one of them is working fine with OH 3.2, the others won't work at all. I can add them as things but they won't follow any command I've send through OH. Same with the HmIP-BBLs ("Raff Stores"), I had to reset the actors to factory default after trying to add them to OH. Last time I tried that was with OH 3.0 though.

My OH installation is currently running version 3.2 Release, Openhabian, on a Raspberry Pi 3 Model B Rev 1.2. But I would like to migrate it to one of my amd64 servers as Docker container in the near future.

MHerbst commented 2 years ago

Your configuration is similar to mine. OH is also running on an Raspi 3 but I only have got 4 HmIP devices. All other devices are non IP devices. The problems regarding the re-connect only occur with HmIP devices. In my opinion, the cause of the problems is not only the binding but also the service for the HmIP devices.

Regarding your issues with the BDT and BBL devices. As far as I remember there have been some fixes for these devices and I don't know about any current problems.

nRaecheR commented 2 years ago

I can live with the reboot, it's inconvenient but it works after a reboot. And I will give it another try with the BDT and BBL devices with OH 3.2. Thank you !

hmerk commented 2 years ago

closed due to inactivity