openhab / openhab-addons

Add-ons for openHAB
https://www.openhab.org/
Eclipse Public License 2.0
1.88k stars 3.59k forks source link

Velux binding should be able to reboot KLF 200 device #8462

Closed EjvindHald closed 3 years ago

EjvindHald commented 4 years ago

I suggest that the Velux binding is expanded, so it can reboot the KLF 200 device. This is becuase the KLF 200 device is not always stable and needs a power recycling with regular intervals.

It seems that a reboot api is available in the SLIP protocol - search for GW_REBOOT_REQ. I guess the binding should then go in wait and after a while perform a login again when KLF 200 is ready. Thanks.

andrewfg commented 4 years ago

Hmm. I was also thinking about implementing this myself. The reason for not doing so is -- chicken and egg -- if the KLF is in a locked up state where it needs rebooting, then it is in a state where it cannot respond to commands like GW_REBOOT_REQ !!

However, I made a lot of recent improvements to the binding -- as described here -- in order to prevent the KLF ever getting into a locked up state (see points 3, 4, 5 in the pull request), so this might "solve" your problem.

EjvindHald commented 4 years ago

Thank you very much for your contribution - it is highly appreciated.

I am aware that it does not help, if the device is already locked. However, with binding version 2.5.5 I experienced quite consistently a freeze every approx. 10 days. In such a scenario the functionality would allow a daily or weekly reboot to fix it.

I hope the recent updates will make this functionality not needed.

andrewfg commented 4 years ago

Ok @EjvindHald I understand what you are saying.

Personally I would like to invest time in trying to prevent the KLF from locking up, before investing time in developing a work around.

Therefore I would be delighted if you can help me by testing the new version, and giving me feedback (if your lock ups still persist) concerning details of your setup and the respective logs etc.

And if, finally, we together cannot fix it, then I can always implement the reboot command as you suggest.

EjvindHald commented 4 years ago

I have been running 2.5.9 for some time, and it seems to be better. However, I cannot tell for sure.

Sometimes in 2.5.5 I have also experienced it after a restart of OH, where it cannot login to the Velux device. Only a restart of KLF200 could fix that.

I can see your stuff is now in 2.5.10, so I will upgrade to that version soon.

andrewfg commented 4 years ago

Hi @EjvindHald many thanks for the feedback..

I can see your stuff is now in 2.5.10, so I will upgrade to that version soon.

The version that is in the official v2.5.10 is in fact earlier than my current modifications (sorry for the confusion). In fact the real latest version is on the link below. (You need to uninstall the official version using PaperUI, and drop the below mentioned jar into your addons folder).

https://github.com/andrewfg/openhab-addons/blob/velux-backport/bundles/org.openhab.binding.velux/org.openhab.binding.velux-2.5.10-SNAPSHOT.jar

andrewfg commented 3 years ago

Hi @EjvindHald I added a "Rule Action" rebootBridge() to the latest build (you can get the new JAR file in the same link as above). I would appreciate if you could test it before I officially add it to a Pull Request. -- Note: use carefully!

rule "Reboot KLF 200"
when
    ...
then
    val bridgeActions = getActions("velux","velux:klf200:myhubname")
    val isRebooting = bridgeActions.rebootBridge()
end
EjvindHald commented 3 years ago

Thanks. However, the binding seems to be very stable now, so there is no need for reeboting anymore. Is the reboot supposed to be triggered by a new rule in the rule file as you showed? Nothing else is required?

andrewfg commented 3 years ago

Is the reboot supposed to be triggered by a new rule in the rule file as you showed? Nothing else is required?

Correct.

EjvindHald commented 3 years ago

This is a really cool feature. However, I am a bit busy these days, so I will try this Friday evening or in the weekend. Thanks.

andrewfg commented 3 years ago

Ok @EjvindHald please let me know when you have had time to run a test. It works for me, but it would be good to have independent confirmation..

EjvindHald commented 3 years ago

Hi I have now tested it, and the functionality seems to be working :-)

However, I have a problem with a message like this below appears every approx 2 minutes and is filling up my log. 2020-11-14 12:49:40.062 [INFO ] [g.discovery.internal.PersistentInbox] - Added new thing 'velux:klf200:192_168_1_15' to inbox.

My Velux KLF200 has ip 192.168.1.15 and everything is described in a Things file untouched from previously Velux binding.

I have tried to reset cache and tmp folders with no result, and I cannot figure out how to stop this message. If it cannot be stopped, I need to revert to the old binding.

Also, I get this warning from the rules file, which I would like not to have: 2020-11-14 12:41:21.542 [INFO ] [el.core.internal.ModelRepositoryImpl] - Validation issues found in configuration model 'velux.rules', using it anyway: The value of the local variable isRebooting is not used

I am running OH 2.5.9 in a Docker container.

Please remember to include a final rule example in the documentation - otherwise the functionality is not known.

Thanks.

andrewfg commented 3 years ago

@EjvindHald thank you for the feedback..

a message like this below appears every approx 2 minutes and is filling up my log. My Velux KLF200 has ip 192.168.1.15 and everything is described in a Things file untouched from previously Velux binding.

This is the new auto discovery feature for finding KLF hubs. However if a Thing is already instantiated, it should not 'discover' it again. => Can you please send me your Things definition?

Also, I get this warning from the rules file .. The value of the local variable isRebooting is not used

Then don't declare the variable. :)

EjvindHald commented 3 years ago

This is my things file: Bridge velux:klf200:home [ ipAddress="192.168.1.15", tcpPort=51200, password="velux123", timeoutMsecs=4000, retries=10 ] { Thing actuator ThKLF200Vindue "Velux - Vindue" @ "Velux" [serial="53:07:09:26:0C:0D:00:5F"] Thing actuator ThKLF200Rulleskodde "Velux - Rulleskodde" @ "Velux" [serial="56:23:3D:5A:0A:21:1A:46"] Thing actuator ThKLF200Gardin "Velux - Gardin" @ "Velux" [serial="53:14:1E:32:13:25:04:39"] }

and I simply copied your rule in. So I guess I need an example of how to invoke the reboot without declaring the variable as in your example. Thanks.

andrewfg commented 3 years ago

Thanks for the Things file; I will look into it..

I guess I need an example of how to invoke the reboot without declaring the variable

rule "Reboot KLF 200"
when
    ...
then
    val bridgeActions = getActions("velux","velux:klf200:myhubname")
    bridgeActions.rebootBridge()
end

or

rule "Reboot KLF 200"
when
    ...
then
    val bridgeActions = getActions("velux","velux:klf200:myhubname")
    val isRebooting = bridgeActions.rebootBridge()
    logWarn("Rules", "Rebooting: " + isRebooting)
end
andrewfg commented 3 years ago

PS I am really intrigued that anything works with the default wifi password?? Or is that just to hide the real one?? But please don't send me your actual password!!

image

andrewfg commented 3 years ago

PPS Between the latest v2.5.10-SNAPSHOT jar and the official v2.5.9 release jar, the thing-description-xml file had to be changed to include a representation property whose purpose is precisely to prevent duplicated discovery entries in the Inbox. In my observation it can take the OH core some time to notice the thing-description-xml change and update itself. => So could you let it run for a while, and let me know if the problem goes away on its own?

andrewfg commented 3 years ago

PPPS it might need a reboot..

andrewfg commented 3 years ago

... and, if you have the v2.5.10-SNAPSHOT jar in the add-ons folder and also have the official v2.5.9 jar loaded via PaperUI I suppose it is possible that the OH core could get confused about which thing-description-xml file it should be using (???) => To make it clean, I would recommend using Paper UI to uninstall the v2.5.9 version, and then drop the v2.5.10-SNAPSHOT in the add-ons folder, and finally do a reboot..

EjvindHald commented 3 years ago

Thanks for your quick reply.

I did exactly as you mentioned. Removed it in PaperUI and afterwards added in addons folder. It was yesterday evening with numerus reboot and it ran all night logging the same error every approx 2 minutes. I have also rebooted today.

Using Karaf console I verified that I have only one correct Velux instance running by using this command: bundle:list org.openhab.binding.velux

Can I manually edit or verify this thing-description-xml file? I am using 2.5.9 while you are probably using 2.5.10 - is that relevant?

EjvindHald commented 3 years ago

During startup of OH I see these 3 messages: `2020-11-14 16:39:08.865 [INFO ] [.internal.handler.VeluxBridgeHandler] - Initializing Velux Bridge 'velux:klf200:home'.

2020-11-14 16:39:08.973 [INFO ] [nternal.bridge.slip.io.SSLconnection] - Shutting down Velux bridge connection.

2020-11-14 16:39:09.015 [INFO ] [nternal.bridge.slip.io.SSLconnection] - Starting velux bridge connection.`

I am just surprised there is shut down of the connection?

EjvindHald commented 3 years ago

Also every second minutes these 2 messages appear in the event log: `2020-11-14 16:45:03.646 [me.event.InboxRemovedEvent] - Discovery Result with UID 'velux:klf200:192_168_1_15' has been removed.

2020-11-14 16:45:21.057 [home.event.InboxAddedEvent] - Discovery Result with UID 'velux:klf200:192_168_1_15' has been added. `

Please note approx 15-20 seconds between.

andrewfg commented 3 years ago

During startup of OH I see these 3 messages: I am just surprised there is shut down of the connection?

EDIT: Don't worry about this message, it means nothing. Before I open a new connection I always call the close() method to make sure any past connection is completely cleared away. In the case of a new start, there is nothing to clear away, so the message should not be logged. I will change this in the next build.

Also every second minutes these 2 messages appear in the event log:

Ok. That is another symptom of the same problem.

Can I manually edit or verify this thing-description-xml file?

No. It is embedded inside the jar file.

I am using 2.5.9 while you are probably using 2.5.10 - is that relevant?

No.

EjvindHald commented 3 years ago

I am running OH in Docker - not as a ordinary service. So stopped the container for 2 minutes and started again, but everything remains the same. Please note 2.5.9 Velux binding does not appear in PaperUI as installed. image

andrewfg commented 3 years ago

Ok. The ball is in my court. Let me think about it overnight..

andrewfg commented 3 years ago

@EjvindHald what settings do you have for the Inbox?

image

andrewfg commented 3 years ago

... and what values do you see for a) the Config Param "ipAddress" versus b) the Property "ipAddress" ?? ..

Snag_f80968e

Snag_f8112f2

EjvindHald commented 3 years ago

The inbox settings are identical to your example. See below: image

Both the configuration and property ip address is 192.168.1.15. So they are identical as in your example. image image

andrewfg commented 3 years ago

@EjvindHald many thanks for helping me to solve this problem.

Currently I am "flummoxed"...

Background: The OH core uses something called a RepresentationProperty as a tag to prevent duplicate discovery entries from being added to the Inbox. This tag must be a unique attribute of the bridge (in case there may be more than one bridge) so I am using the IP address as the RepresentationProperty. The OpenHAB core uses some auto-magic code that checks if the Thing has either 1) a matching Config Param with the same name as the RepresentationProperty, or 2) a matching Property with the same name as the RepresentationProperty, and if either of these applies, and the respective Param/Property value matches then duplicates should not be added.

Now the Velux binding (oddly) has both a Config Param and a Property with the same name, so it should be twice happy that the newly found Thing is a duplicate. :) And indeed that is the case on my system. But not apparently on yours. I am guessing that the issue is indeed because the binding has both a Config Param and a Property with the same name.

Could you please tell me what you see for the JSON entry for the KLF200 in your org.eclipse.smarthome.config.discovery.DiscoveryResult.json file? Mine is shown below..

  "velux:klf200:192_168_1_142": {
    "class": "org.eclipse.smarthome.config.discovery.internal.DiscoveryResultImpl",
    "value": {
      "thingUID": {
        "segments": [
          "velux",
          "klf200",
          "192_168_1_142"
        ]
      },
      "thingTypeUID": {
        "segments": [
          "velux",
          "klf200"
        ]
      },
      "properties": {
        "ipAddress": "192.168.1.142"
      },
      "representationProperty": "ipAddress",
      "flag": "IGNORED",
      "label": "Velux Bridge (192.168.1.142)",
      "timestamp": 1605372993408,
      "timeToLive": -1
    }
  },
EjvindHald commented 3 years ago

Thanks again. I found the file, but there is no entry for Velux at all. Only for Hue, which is also in my inbox.

Can I manually add an entry based on the same template as yours? Not sure about the timestamp though.

In PaperUI it does not appear in my inbox.

If you are interested I can text you a VPN login code + I can create a temporary userid for you.

andrewfg commented 3 years ago

there is no entry for Velux at all

Hmm. I have been digging into the source code of the OH core. The JSON file is the image on disk of the discovery Inbox database, so there is conflicting news..

It seems your inbox database is a cousin of Schroedinger's cat -- the bridge is both in, and not in, it. :)

I wonder if your database is corrupted, or cached, or something? (You could perhaps try deleting the file, and/or its analogs in the backup sub folder. But do please keep a copy of those files "just in case").

EjvindHald commented 3 years ago

Do you mean stop OH, delete the file org.eclipse.smarthome.config.discovery.internal.DiscoveryResultImpl and start OH again?

In Karaf I have already tried "smarthome:inbox clear" which empty the inbox. But I can try to remove the file completely as well.

EjvindHald commented 3 years ago

I have just done as shown, and it does not change anything... The message shown is: 2020-11-15 15:35:24.790 [INFO ] [g.discovery.internal.PersistentInbox] - Added new thing 'velux:klf200:192_168_1_15' to inbox.

But the Velux does not appear in the Inbox in PaperUI.

Can it be in another json file or similar in OH, where the contents is corrupt?

andrewfg commented 3 years ago

the Velux does not appear in the Inbox in PaperUI

Just for the avoidance of doubt, did you press the SHOW IGNORED button?

EjvindHald commented 3 years ago

Good catch - you were complete right. I must have pressed this one at some point.

This means it is now showing in Inbox, and it is also in the file, which looks like this:

"velux:klf200:192_168_1_15": { "class": "org.eclipse.smarthome.config.discovery.internal.DiscoveryResultImpl", "value": { "thingUID": { "segments": [ "velux", "klf200", "192_168_1_15" ] }, "thingTypeUID": { "segments": [ "velux", "klf200" ] }, "properties": { "ipAddress": "192.168.1.15" }, "representationProperty": "ipAddress", "flag": "NEW", "label": "Velux Bridge (192.168.1.15)", "timestamp": 1605457404940, "timeToLive": -1 } },

But I still do not get, why it arrives in this file. It IS defined in a Things file already ???

andrewfg commented 3 years ago

But I still do not get, why it arrives in this file. It IS defined in a Things file already ???

Well it should be there; because it has been discovered. But it should have the flag ‘ignored’; because there is an instantiated Thing already. And it should not be being Added and Removed repeatedly. I am still puzzled about this..

andrewfg commented 3 years ago

@EjvindHald am I correct in saying that the Inbox clearing did not stop the repeating Added/Removed log messages? Or is it fixed now?

EjvindHald commented 3 years ago

No, it is still the same with messages every approx. 2 minutes.

EjvindHald commented 3 years ago

It is added and removed from Inbox every 2 minutes. And when it is added, it has status ignored. So I can oonly see it by pressing “Show Ignored”.

andrewfg commented 3 years ago

You say it is “ignored” in the Inbox, but yet your JSON File has the "flag": "NEW" — is that correct?

PS I am beginning to think that there is a bug in the OH core..

EjvindHald commented 3 years ago

Yes, that is correct.

How about adding the option to disregard Discovery as it is done in the RFXCOM binding?

See screen shot from PaperUI from that binding: image

andrewfg commented 3 years ago

I am not convinced that would be a good idea. I prefer to solve the problem rather than hide it.

andrewfg commented 3 years ago

@EjvindHald I could not find anything dramatically wrong in the Velux binding code, but nevertheless I made a few tweaks that just might resolve your issue. So I would appreciate it if you could test it.

The new build is HERE

And if it does not resolve your issue, could you kindly turn on logging for the OH core discovery module (as follows) and send me the results?

log:set TRACE org.openhab.core.config.discovery
EjvindHald commented 3 years ago

Thanks. I downloaded the new build and replaced it in the folder. That forced a restart of the binding in the openhab.log file, which is fine. After that I started the TRACE using Karaf console. However, this seems to have no impact at all - no additional messages from discovery is given in either openhab.log or events. log. Only messages identical to those I already published in this thread already with removal from inbox and later add to in-box.

Unfortunately, the new version does not changes anything. I also tried to reboot openHAB, but the result is still the same.

EjvindHald commented 3 years ago

This evening I have tried all kinds of things including removing / add Thing file, clear cache while openhab is down, openhab rest api to verify status on Things. Regardless of what I am doing it keeps add to inbox in intervals of 2-7 minutes.

I have also tried to disable AutoDiscovery for Velux using Karaf console, but I cannot make it work.

I agreed that the error should not be hidden as you wrote. I do not know how much effort it is to add such as switch in PaperUI to enable/diasable AutoDiscovery, but it would be nice to have in this interim period, where the root cause has not been identified.

andrewfg commented 3 years ago

@EjvindHald thank you for your help. Here's the thing: There are two processes 1) something is adding the bridge to the Inbox, and 2) something (else) is removing it from the Inbox. The former process is definitely a part of the Velux binding code; the latter is not.

So what process could be removing things from your Inbox? There are only a few possibilities..

  1. An older version of the Velux binding jar: The official release version of the JAR had code in it to remove entries from the Inbox. However all that code has been deleted from the new jar (link sent to you above). => So we need to be absolutely sure which version you are running. Perhaps there is an older version of the jar in cache? What is your system clock time? Perhaps it does not see the new jar as being newer? Or is the cache folder write protected?

  2. The OH core 'per se': I think this is unlikely since I, and various other testers, have the same core, and we do not see this issue.

  3. The OH core, mediated by something very particular to your own OH setup: => Can you please provide more information about your setup?

  4. Something in one of your Rules: => Can you please send me your rules file?

  5. Another binding in your system (i.e. one that I do not have installed): => What other bindings do you have installed?

andrewfg commented 3 years ago

@EjvindHald I have been digging around in the OH core code, and may have found a bug therein; please see this ISSUE

As a work around, I have completely rewritten the Velux discovery implementation so that it does not use the MDNSDiscoveryService class any more. This may solve your problem. The new build is HERE

EjvindHald commented 3 years ago

Hi Thank you. I have tried your new build - unfortunately the result is the same. So I upgraded to 2.5.10, but the problem is still there.

The response to Karaf Console command "bundle:list org.openhab.binding.velux" shows version 2.5.10.202011121827 as the only one Active. Please check this corresponds to your latest version. Are you changing the version number after each build? I think this is important, because otherwise OH might do some caching.

Several times I have stopped openhab, and removed all files under folders /cache and /tmp and when I started openhab new files were added again to these folders. Therefore, I assume all permissions are ok.

The time of the server is GMT+1, which is CET time (Copenhagen time). My setup is Docker, so everything is running in a container as described on this link: https://www.openhab.org/docs/installation/docker.html

I am using these bindings: hue,mqtt,modbus,rfxcom,ihc and resol from MarketPlace. Hue discovers automaticly my bulbs and that is working fine. Finally, I have of course the Velux binding.

I have 9 rule files - to begin with I can mail you the Velux one only.


I am working in a large company and the networking guys really do not like mDNS and Apple Bonjour, which uses this one. This is because it is continusly spamming the network with broadcast messages, so they have a hard time getting good performance on the network due to this.

I do not know how you are making the autodiscovery for Velux, but if you are continuesly searching for devices, I would really like a way of turning that off to avoid network traffic not needed. My KLF 200 has a fixed ip adress, and it is defined in the Thing file, so I will never need this autodiscovery.

EjvindHald commented 3 years ago

By the way, openhab comes with these messages upon startup: `2020-11-18 19:24:02.648 [WARN ] [g.eclipse.smarthome.core.net.NetUtil] - Found multiple local interfaces - ignoring 172.17.0.1

2020-11-18 19:24:02.649 [WARN ] [g.eclipse.smarthome.core.net.NetUtil] - Found multiple local interfaces - ignoring 192.168.1.234

2020-11-18 19:24:03.840 [WARN ] [g.eclipse.smarthome.core.net.NetUtil] - Found multiple local interfaces - ignoring 172.17.0.1

2020-11-18 19:24:03.841 [WARN ] [g.eclipse.smarthome.core.net.NetUtil] - Found multiple local interfaces - ignoring 192.168.1.234`

The ipadress of my server is 192.168.1.234 which is the same subnet as KLF 200. The subnet 172.17.0.1 is some local network which Docker adds - I do not know so much about this, and I also do not know if it is related. The configuration is simply made according to the openhab Docker page as shown.

andrewfg commented 3 years ago

shows version 2.5.10.202011121827

That is the jar version from one week ago. i.e several builds ago. The latest jar version is 2.5.10.202011181619

https://github.com/andrewfg/openhab-addons/blob/velux-backport/bundles/org.openhab.binding.velux/org.openhab.binding.velux-2.5.10-SNAPSHOT.jar

EjvindHald commented 3 years ago

Thank you very much - eveything is finally working as intended.