mjg59 / wink-relay-handler

Use a Wink Relay as a generic MQTT sensor, switch and control device

MQTT transmissions stall after a while #3

Open okow opened 6 years ago

okow commented 6 years ago

Hello,

First of all, a big "Thank you" for this piece of software. It's exactly what I need to use a Wink Relay together with FHEM (a home automation package quite common here in Germany).

I managed to install it on my relay (after some fiddling: the relay did not always accept adb root connections, and I could not work out when it would and when it would not), and it works, but only for a while (a few minutes).

Then it stops transmitting the sensor/switch values and no longer receives relay actions. The screen control, however, still works (both touch and proximity), so I suppose the handler is still running in the background.

Any idea what's going wrong? Where can I start debugging? I would love to help improve this package ;-)

Regards Olaf

okow commented 6 years ago

I played with the relay a little more in the meantime. It looks like it simply loses the network socket to the MQTT server after a relatively short period without messages (= traffic). Since all of my testing so far went over a VPN line, that might be (partially) the cause; over the weekend I'll try it entirely within a local network. I would also like to supplement the code with a kind of heartbeat, maybe an increasing counter sent out every minute? May I suggest "Relay/status/heartbeat", perhaps made optional by another config entry? I have already downloaded the SDK and NDK and will try to code the necessary changes.

Let me know what you think...

Regards Olaf

rs1932 commented 6 years ago

Olaf, did you figure this problem out? I have the same problem, and everything is on local servers. If the MQTT server crashes or restarts, the Wink Relay loses connectivity and is not able to send messages. Only a reboot of the relay helps. Will a heartbeat help in this situation? Thx, RS

mjg59 commented 6 years ago

A heartbeat would be easy enough, but I'll need to look at the spec to figure out what the best way to handle it is. @rs1932 I suspect this may not be the same issue (although it could be), so could you open a separate issue and we'll see if the same thing ends up fixing both?

okow commented 6 years ago

In the meantime I tested some more and dug a little further into the code and the MQTT specs.

On the broker I found log entries "Client Wink_Relay1 has exceeded timeout, disconnecting.", so the TCP connection evidently gets closed by the broker (in accordance with the MQTT protocol).

So in my eyes we have two things to do:

Since this is all new to me, I'm not yet ready to compile a test version on my own (I have to understand the workflow in Android Studio, and also the code, before I change it ;-) ). @mjg59, could you look in this direction if you have time?

BTW: now that I have learned there is a built-in keep-alive mechanism in the protocol, I withdraw my suggestion to implement a heartbeat, because it would be a kind of duplicate. I think at the moment our efforts are best spent on debugging the keep-alive...
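To make the keep-alive behaviour concrete, here is a minimal sketch of the timing rule as I understand it from the MQTT 3.1.1 spec: the client should send a PINGREQ if it has sent nothing for one keep-alive interval, and the broker drops the client if it hears nothing for more than 1.5 times that interval. The function names are made up for illustration; they are not part of the handler or of the Paho library, and the 60-second value in the note below is just an example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only. All times are in seconds
 * on the same monotonic clock. keepalive_s == 0 means keep-alive is off. */

/* Client side: a PINGREQ is due once nothing has been sent for a full
 * keep-alive interval. */
bool pingreq_due(uint32_t now_s, uint32_t last_sent_s, uint16_t keepalive_s)
{
    assert(now_s >= last_sent_s);
    if (keepalive_s == 0)
        return false;
    return now_s - last_sent_s >= keepalive_s;
}

/* Broker side: the client is considered dead once the silence exceeds
 * 1.5 times the keep-alive interval. */
bool broker_times_out(uint32_t now_s, uint32_t last_received_s,
                      uint16_t keepalive_s)
{
    assert(now_s >= last_received_s);
    if (keepalive_s == 0)
        return false;
    /* Compare 2 * elapsed > 3 * keepalive to stay in integer arithmetic. */
    return 2u * (uint64_t)(now_s - last_received_s) > 3u * (uint64_t)keepalive_s;
}
```

With a keep-alive of 60 seconds, a client that stays silent gets a ping due at 60 seconds and is disconnected by the broker some time after 90 seconds. If the client never sends the PINGREQ, that would be consistent with the "exceeded timeout" log entry above.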

okow commented 6 years ago

Aarghh... I can't manage the build-process?! Could someone please give me some hints...

I installed Android Studio, added the NDK via "Settings-->SystemSettings-->Android SDK-->SDK Tools" and then imported the sources/the project via git into the IDE (as offered on the startup screen of AS).

But if I now try to build the code via "Build-->Make Module", the event log tells me "All files are up-to-date" even after I have made changes (even intentional typos don't change anything), and consequently no binary is compiled.

mjg59 commented 6 years ago

Yeah I'm afraid you can't build it under Android Studio - it's an actual native Linux app, rather than an Android app with native code. You'll need a Linux system with the NDK installed, then set the ANDROID_NDK environment variable to the path the NDK was installed under and just run "make". It's possible that the MQTT library I'm using doesn't do heartbeats automatically, but I'll look into what's required to make that happen.

rs1932 commented 6 years ago

Was looking a little into the source code and the MQTT library you used, I think you used the https://github.com/eclipse/paho.mqtt.embedded-c/blob/master/MQTTClient-C/src/MQTTClient.c version?

I believe this is the embedded version of MQTT and according to the website (http://www.eclipse.org/paho/downloads.php) the embedded version doesn't support automatic reconnects which would explain why when the server goes down the relay stops communicating to the server.

Not sure if this is the issue, but could you take a look and confirm? Thanks

thatkide commented 6 years ago

I have this disconnect issue as well. I have 3 relays, and eventually they all stop talking to the MQTT broker, even though I can still ping all of them.

okow commented 6 years ago

@mjg59 OK, understood. So I downloaded and installed the NDK for Linux. But now I get a lot of garbage on the screen when I run make. After a look at the Makefile I suppose (because of CC= .... arm-linux-androideabi-gcc) that you develop on an ARM machine (raspi?). Right? So to make it work on Ubuntu/x86-64, would I have to adjust the cross-compiler binary? Do you know if there are Makefile variables for this purpose (the architecture of the build system)? I could only find examples for different targets...

mjg59 commented 6 years ago

No, the arm-linux-androideabi-gcc means that it's a version of gcc that cross-compiles to the arm architecture. What are the errors you get?

okow commented 6 years ago

I have opened a new issue, since my build environment has nothing to do with the original problem...

DanielGalle commented 6 years ago

First of all, I want to thank you for this awesome piece of software as well. It works fine for me right after the Relay has booted, but after some time (it varies) I get disconnected as well :(

It would be awesome if you could fix this so that the WAF will rise again. Haha

Merry X-Mas!!

roch23 commented 6 years ago

I’m not familiar enough with the implementation to know whether it is related, but the Relay buttons stop controlling the load entirely if it loses its connection to the broker for 24 hours or so. Even after a reset the buttons still wouldn’t respond. I noticed this when I had to take HASS offline for a while.

nicecube commented 6 years ago

My Wink Relay keeps disconnecting from the MQTT server, and I have to restart it 2-3 times a day. My server stays online without interruptions, and all my other devices work well with MQTT, so I do not understand why my relay loses its connection. Have you found a solution to keep the connection stable?

marthoc commented 6 years ago

@mjg59 I have some ideas about how to solve this.

First, I think that part of the problem is that inside while (1), when MQTTYield returns < 0, you call mqtt_connect but then don't subscribe to the topics again. I think calling MQTTConnect with data.cleansession = 1 inside mqtt_connect means that the broker's not going to remember the subs. So probably a first solution is to move the subscriptions inside mqtt_connect after MQTTConnect returns 0 so that when there is a disconnect/reconnect, we pick up the subscriptions again.
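To illustrate the point about subscriptions, here is a self-contained sketch that models the behaviour with stand-ins rather than the real library. `fake_broker`, `fake_connect`, `fake_subscribe`, `fake_drop` and `connect_and_subscribe` are all hypothetical names, not the Paho API and not the handler's actual functions. The only assumption carried over from above is that a clean-session connect makes the broker forget earlier subscriptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in types and functions for illustration only; none of these
 * names exist in Paho or in wink-relay-handler. */

#define MAX_SUBS 8

struct fake_broker {
    bool connected;
    size_t sub_count;   /* subscriptions the broker currently holds */
};

/* Models a clean-session connect: any earlier subscriptions are gone. */
int fake_connect(struct fake_broker *b)
{
    b->connected = true;
    b->sub_count = 0;
    return 0;
}

/* Models a subscribe call; fails when not connected or when full. */
int fake_subscribe(struct fake_broker *b, const char *topic)
{
    if (!b->connected || topic == NULL || b->sub_count >= MAX_SUBS)
        return -1;
    b->sub_count++;
    return 0;
}

/* Models the connection going away (broker restart, timeout, ...). */
void fake_drop(struct fake_broker *b)
{
    b->connected = false;
}

/* The pattern under discussion: every (re)connect is followed by
 * re-issuing all subscriptions, so a reconnect restores them. */
int connect_and_subscribe(struct fake_broker *b,
                          const char *const *topics, size_t n)
{
    assert(b != NULL);
    int rc = fake_connect(b);
    if (rc != 0)
        return rc;
    for (size_t i = 0; i < n; i++) {
        rc = fake_subscribe(b, topics[i]);
        if (rc != 0)
            return rc;
    }
    return 0;
}
```

In this model, calling only `fake_connect` after a drop leaves the client connected with zero subscriptions, which would match the symptom of the relay no longer receiving relay actions. Calling `connect_and_subscribe` from the reconnect path restores them.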

I think updating the paho embedded-c library code is another step forward (there are a few improvements to MQTTYield, at least), but I've had mixed results in testing after compiling against the newest paho embedded master. I think the issue is in MQTTYield but I haven't tracked it down. The newest Paho has a library function MQTTIsConnected() which returns a truth value if connected, so the tail end of the code in while (1) could become:

MQTTYield(&c, 100);

if (!MQTTIsConnected(&c))
    mqtt_connect(&n, &c, buf, readbuf);

I'm going to open a WIP PR to show you what I'm thinking.

mjg59 commented 6 years ago

Could you try https://github.com/jimpastos/wink-relay-manager as a replacement? It's using a more up to date MQTT stack and has more functionality than this code.