Closed: MMax2 closed this issue 8 years ago
Same behavior with Jung KNX IP-Router REG IPR 100 REG. Massimo
Hi @MMax2 could you please start in debug mode (start_debug.xxx) and post the openhab.log of a clean session where you've walked through the steps mentioned above? Thanks, Thomas E.-E.
Hi! Sorry for the delay. I investigated this issue further and prepared a minimal test environment that reproduces it.
File openhab.cfg:
knx:ip=192.168.1.52
knx:type=TUNNEL
knx:port=3671
knx:pause=50
knx:autoReconnectPeriod=30
File test.items:
Switch Luce_1 {knx="1.001:0/0/1+<0/7/1"}
File test.sitemap:
sitemap test label="test" {
    Frame label="Home" {
        Switch item=Luce_1
    }
}
In the webapps directory I added a "test" directory with jquery 1.11.0 and the following files.
File index.html:
<!DOCTYPE html>
<html>
<head>
    <title></title>
    <script src="jquery-1.11.0.min.js" type="text/javascript"></script>
    <script src="index.js" type="text/javascript"></script>
</head>
<body>
    <button id="btnStart">Start</button>
</body>
</html>
File index.js:
$(function () {
    $("#btnStart").on("click", function () { start(); });
});
function start() {
    var request = $.ajax({
        type: "POST",
        url: "http://localhost:8080/rest/items/Luce_1",
        data: "ON",
        headers: { 'Content-Type': 'text/plain' }
    });
}
The function start() in index.js is exactly the same as in the REST Samples.
Now I follow these steps:
If you try the same steps with Classic UI, it works correctly, but the server doesn't shut down when you stop it with CTRL-C: it says Stopped REST API and Stopped Classic UI, but the DOS window continues to receive KNX events.
Thank you Massimo
Same result with OH 1.5.0. Massimo
Hi Massimo, thanks for investigating further! Something seems to block the whole eventing mechanism here. Did you already have a look into the knx binding code? Any idea what could cause this blocking behaviour? Best, Thomas E.-E.
In my opinion, issue #851 is different from #1068. In issue #851 we were asking for KNX to reconnect even on startup. At present KNX reconnects only if the first connection on startup is successful; otherwise it never reconnects. In issue #1068, instead, I noticed this strange blocking behaviour when KNX tries to reconnect. So I ask you to reopen issue #851. As far as I'm concerned, I'll try to have a look into the knx binding code to understand where the block is, but I don't have time to do so quickly right now. So please be patient!
Issue #851 differs IMHO from #1068. The binding tries to connect for the first time when it receives an update message from OSGi for its config data. If the connection fails at this point, no further attempt is made. I'm currently implementing a timer-based reconnect.
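For readers following along, a timer-based reconnect of the kind mentioned above could look roughly like the sketch below. This is only an illustration under assumptions, not the code from the actual fix: the class name `ReconnectSketch`, the `connectAttempt` callback and the millisecond period are all hypothetical, and the period merely plays the role of the `knx:autoReconnectPeriod` setting from the test config.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of a timer-based reconnect; not the knx binding code. */
class ReconnectSketch {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final AtomicBoolean connected = new AtomicBoolean(false);
    private final BooleanSupplier connectAttempt; // assumed: returns true on success
    private final long periodMillis;              // assumed: delay between attempts

    ReconnectSketch(BooleanSupplier connectAttempt, long periodMillis) {
        this.connectAttempt = connectAttempt;
        this.periodMillis = periodMillis;
    }

    /** First attempt, e.g. when the configuration arrives. */
    void start() {
        tryConnect();
    }

    /** Tries once; on failure, schedules another try instead of giving up. */
    private void tryConnect() {
        boolean ok;
        try {
            ok = connectAttempt.getAsBoolean();
        } catch (RuntimeException e) {
            ok = false; // a failed attempt must not cancel the retry
        }
        connected.set(ok);
        if (!ok) {
            try {
                scheduler.schedule(this::tryConnect, periodMillis, TimeUnit.MILLISECONDS);
            } catch (RejectedExecutionException e) {
                // scheduler was shut down in the meantime; stop retrying
            }
        }
    }

    boolean isConnected() {
        return connected.get();
    }

    void shutdown() {
        scheduler.shutdownNow();
    }
}
```

The point of the sketch is only that a failed first attempt schedules the next one, so the gateway being unavailable at startup no longer ends the retries.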
I've provided a fix for #851 for branch 1.5.1, which tackles the problem of KNX being unavailable at startup for TUNNEL connections.
But I couldn't reproduce the original described erroneous behavior (with a Siemens IP interface, though). Even when I tried a fresh install (1.5.1) with your test configs. Could you (by any chance) provide a debug log based on 1.5.1?
The behavior I'm seeing is that after a connection is lost, it takes a while until the IP interface gets its internal state sorted, and a few reconnects fail. After that I always get a working connection.
Thank you for your fix. I can't test quickly because I'm involved in another project at the moment, but when I get back to openHAB I will certainly provide the debug log. Sorry, Massimo
Guess I found the real issue. Calimero seems not to be thread safe (at least not the way it's currently being used by the knx binding). When the initial connection is lost, a timer thread is started that tries to reconnect. It appears that after reconnecting, when the connection is lost again, Calimero waits for the (new) thread to terminate, which is not the intent of this thread. No idea yet how to solve this one.
Wow! Very hard-to-find issue, congratulations! In your opinion, is it a Calimero issue or is it an openHAB issue due to the way it's using Calimero?
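To illustrate the kind of blocking suspected here, the sketch below shows the generic pattern in plain Java: a "close" step that waits for a worker thread to terminate, called from that same worker thread. It is an assumption-laden illustration only. None of the names come from Calimero or openHAB, and it does not claim to show what Calimero actually does internally.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Generic illustration only; no name here comes from Calimero or openHAB. */
class BlockingCloseSketch {

    /**
     * Simulates a close step that waits up to timeoutMillis (must be > 0)
     * for a worker thread to terminate, invoked from that same worker.
     * Returns true if the wait ended because of the timeout.
     */
    static boolean closeCalledFromWorker(long timeoutMillis) {
        AtomicBoolean timedOut = new AtomicBoolean(false);
        Thread worker = new Thread(() -> {
            Thread self = Thread.currentThread();
            try {
                // A thread cannot terminate while it is waiting for itself,
                // so this join can only end via the timeout. Without a
                // timeout it would block forever.
                self.join(timeoutMillis);
                timedOut.set(self.isAlive());
            } catch (InterruptedException e) {
                self.interrupt();
            }
        }, "reconnect-worker");
        worker.start();
        try {
            worker.join(); // the caller just waits for the demo to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return timedOut.get();
    }
}
```

If a library waits like this without a timeout for a thread that is meant to keep running, everything behind that wait stalls, which would be consistent with the eventing mechanism appearing blocked.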
Not really sure. Could be both, since Calimero docs don't seem to touch the issue of multi-threading (at least I couldn't find anything).
Hi all! Today I came across the issue described here, after I made some software updates on my Fritzbox yesterday. While reviewing my openHAB server today, I found that the KNX connection was lost yesterday and did not reconnect by itself after my Fritzbox was back up.
I'm on 1.5.1. As far as I can see, the issue was removed from the 1.6.0 milestone, so I think the problem still exists in 1.6.1. Will it be analyzed further? Thank you very much!
I don't understand: on what version did you find the issue? 1.5.1 or 1.6.1?
Hi @MMax2, I'm on 1.5.1 right now! I saw above that @teichsta removed the issue from the 1.6.0 milestone, so I thought it is still not fixed in 1.6.1. Am I right? Thx!
Hi,
@Snickermicker recently sent a fix for #851 for 1.5.1 with PR https://github.com/openhab/openhab/pull/1483. It seems we missed cherry-picking this fix into 1.6.0. I am no longer sure whether he sent a second PR for 1.6.0, too. I've asked for a short update on this.
Best, Thomas E.-E.
Yes, I merged that fix into 1.6.0 with #1517. But as I wrote before, this only partly fixes the problem. I saw a problem when the connection is lost in gateway mode. At first glance it looked as if the binding sporadically gets stuck in the Calimero lib. Contacting the Calimero maintainers didn't yield any insights. So, this is currently unfixed.
OK … so we already have a partial fix on 1.6, but it did not entirely fix the problem.
@Snickermicker could you please add the link to the Calimero issue here as a reference? Hope to get them moving a bit.
Thanks, Thomas E.-E.
Sure thing: calimero #14
Thanks! Have you had the time to follow their suggestions (check method fire())?
Yes, but this didn't help me.
@Snickermicker has this been solved in the meantime? I have not seen any activity for more than a year and would rather close this issue.
I tested it with a Jung IP-Schnittstelle IPS 200 REG (LAN cable) on the openHAB 1.4.0 stable version. I find the problem very easy to reproduce: once the whole system is up and running, I unplug the LAN cable and try to send a command to KNX from the UI. The server says:
KNX link has been lost (reason: maximum send attempts on object link 192.168.1.27:3671 tunnelling mode (closed), TP1 hopcount 6) - reconnecting...
And after 16 ms:
Error connecting to KNX bus: null
Then:
KNX link has been lost!
And:
KNX link will be retried in 30 seconds.
After that, all other bindings stop working. Then I reconnect the cable, but the other plugins still don't work. After 30 seconds the server says:
Established connection to KNX bus on 192.168.1.27:3671 in mode TUNNEL.
Now, if for example I change the state of a switch, nothing is displayed on the server. I think the knx binding stops working too, like the other plugins. It almost never reconnects correctly; most of the time it fails. Massimo