knxd / knxd

GNU General Public License v2.0

knxd stops "operation" after 10..20 minutes #234

Closed hilddav closed 7 years ago

hilddav commented 7 years ago

Dear all,

I have compiled "knxd" version 0.12.14:e6a68c8 on Raspberry Pi 2 with Raspbian Jessie. As KNX interface I use the following product:

I have started the daemon with several different sets of parameters as a test, e.g.: KNXD_OPTIONS="--eibaddr=0.0.1 --client-addrs=0.0.2:4 -c -d -D -T -R -S -i --listen-local=/tmp/knx -b usb:"

Initially the daemon works fine and I can connect to my KNX network via OpenRemote, which also runs on this Raspberry Pi. This was also the case when I previously used "eibd", but this is a completely new installation. During execution I see the following error: "knxd: W00000035: SendError". The daemon keeps running, but at some point there is no more connection/routing to the KNX network. For example, "knxtool groupswrite ip:localhost 3/1/35 1" is executed without any error message, yet nothing reaches the bus.

Does somebody have an idea what I am doing wrong and how I can fix this problem? My family and I would be very happy if our smart home were running properly again.

Many thanks in advance and a big thank you to Matthias (smurfix) for his engagement.

Best regards, I hope to read some answers, David

smurfix commented 7 years ago

Looks like the USB transfer ran into an error; the code to restart it is probably buggy (stuff like that tends to be not well tested). I'll have a look.

smurfix commented 7 years ago

There's a number after the SendError. Please tell me that. Also please run with -t1023 (directly in front of the -b usb:), reproduce the problem, and send me some context before+after that message; I'd like to see the transmission attempt(s) after hitting this error.

The current code looks correct; I might have to add some code to restart the interface.

stuckmcx commented 7 years ago

I have an off-topic question: I have the same interface, but I don't receive anything from the wall switches. Do you receive something when you press a wall switch?

hilddav commented 7 years ago

@stuckmcx : Yes, this is working and I also see it in the log file, e.g.:

knxd: Layer 2 [ 7:usb: 101.706] Recv L_Data low from 1.1.33 to 3/1/0 hops: 06 T_DATA_XXX_REQ A_GroupValue_Write (small) 01
knxd: Layer 9 [ 7:usb: 101.706] Queue L_Data low from 1.1.33 to 3/1/0 hops: 06 T_DATA_XXX_REQ A_GroupValue_Write (small) 01
knxd: Layer 3 [ 2:layer3 101.707] Route L_Data low from 1.1.33 to 3/1/0 hops: 05 T_DATA_XXX_REQ A_GroupValue_Write (small) 01
knxd: Layer 8 [ 4:mcast:knxd 101.707] Send_Route L_Data low from 1.1.33 to 3/1/0 hops: 05 T_DATA_XXX_REQ A_GroupValue_Write (small) 01
knxd: Layer 2 [ 7:usb: 105.762] Recv L_Data low from 1.1.33 to 3/1/0 hops: 06 T_DATA_XXX_REQ A_GroupValue_Write (small) 00
knxd: Layer 9 [ 7:usb: 105.762] Queue L_Data low from 1.1.33 to 3/1/0 hops: 06 T_DATA_XXX_REQ A_GroupValue_Write (small) 00
knxd: Layer 3 [ 2:layer3 105.763] Route L_Data low from 1.1.33 to 3/1/0 hops: 05 T_DATA_XXX_REQ A_GroupValue_Write (small) 00
knxd: Layer 8 [ 4:mcast:knxd 105.763] Send_Route L_Data low from 1.1.33 to 3/1/0 hops: 05 T_DATA_XXX_REQ A_GroupValue_Write (small) 00

hilddav commented 7 years ago

@smurfix : I started the process almost one hour ago, but the program is still running without problems. My start command is:

> sudo knxd -t1023 -e 0.0.1 -E 0.0.2:4 -c -D -T -R -S -i --listen-local=/tmp/knx -b usb: > 2017-03-15_KNXD.logs

I will keep an eye open and come back as soon as it stops...

stuckmcx commented 7 years ago

@hilddav I don't understand it any more ... same Raspberry Pi, same interface, and I receive nothing when I press a wall switch ... Can you give me installation instructions describing what you did?

hilddav commented 7 years ago

For installation I have just used this script: http://michlstechblog.info/blog/download/electronic/install_knxd_systemd.sh from the following website: http://michlstechblog.info/blog/raspberry-pi-eibknx-ip-gateway-and-router-with-knxd/

but I checked its content beforehand and made one change. You have to remove the package "libsystemd-daemon-dev" in line 72, because it is already covered by "libsystemd-dev"; otherwise the script just stops with an error message.

Afterwards you may start the program with the command I listed above and it should already work. (Now my knx daemon stopped. I will check the log file).

hilddav commented 7 years ago

@smurfix : Please find the log file attached. Unfortunately I was not able to pinpoint the area where it stopped working. Interestingly, I did not have problems while the log was written to the file and shown on my screen at the same time. Maybe a timing issue? As soon as I only wrote the log file, knxd stopped operating quite fast.

2017-03-15_KNXD2.log.zip

hilddav commented 7 years ago

So, for easier reading I converted the log file and trimmed it considerably. I also marked the command I sent (to address 3/1/35). @smurfix I hope it will help you to identify the problem.

log-short.docx

smurfix commented 7 years ago

knxd: Layer 1 [ 7:usb: 2718.573] Send(064): 01 13 11 00 08 00 09 01 01 00 00 11 08 00 00 19 23 D1 00 81 00 …

This says quite plainly that the message was sent to USB. However, the strange part is that the message ends up with a zeroed-out source address (the two bytes before "19 23"), which I can't understand at all.
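For readers following along, the four bytes under discussion can be decoded by hand. The sketch below is not knxd code; it only assumes the standard KNX layouts (individual addresses as 4/4/8 bits, 3-level group addresses as 5/3/8 bits), and the function names are made up for illustration. It shows that "19 23" is the group address 3/1/35 the reporter wrote to, and that "00 00" reads as the individual address 0.0.0.

```python
def decode_individual_address(hi: int, lo: int) -> str:
    """Decode a 16-bit KNX individual address (4/4/8 bits) as area.line.device."""
    return f"{hi >> 4}.{hi & 0x0F}.{lo}"


def decode_group_address(hi: int, lo: int) -> str:
    """Decode a 16-bit KNX group address in 3-level style (5/3/8 bits)."""
    return f"{(hi >> 3) & 0x1F}/{hi & 0x07}/{lo}"


# The four bytes discussed above, copied from the trace line.
source_hi, source_lo, dest_hi, dest_lo = 0x00, 0x00, 0x19, 0x23

print(decode_individual_address(source_hi, source_lo))  # 0.0.0
print(decode_group_address(dest_hi, dest_lo))           # 3/1/35
```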

Could you test the v0.14 branch?

smurfix commented 7 years ago

Closed due to reporter inactivity. Please re-open as appropriate.

hilddav commented 7 years ago

Hello Smurfix,

I was on a trip and had no time to follow up on this issue, and it is also a little tricky to catch the right moment in the (extremely large) log file. Now I have caught it again and hope you will help me to close this issue. The knxd software has been updated to the v0.14 branch, but the situation is still the same. Please find the new log file attached. Look for group address 3/1/35 and you will directly see the failed transmission of the command.
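Searching a very large trace for one group address is easier with a small filter that keeps only the matching lines and a little context around them. The following is a rough sketch, not part of knxd: the function name and defaults are hypothetical, and it assumes the address appears as plain text (as in the "to 3/1/0" lines quoted earlier in this thread) rather than only as hex bytes.

```python
from collections import deque
from typing import Iterable, Iterator


def lines_with_context(lines: Iterable[str], needle: str,
                       before: int = 2, after: int = 2) -> Iterator[str]:
    """Yield each line containing needle plus up to before/after neighbouring lines."""
    history = deque(maxlen=before)  # most recent non-matching lines
    remaining_after = 0
    for line in lines:
        if needle in line:
            yield from history
            history.clear()
            yield line
            remaining_after = after
        elif remaining_after > 0:
            yield line
            remaining_after -= 1
        else:
            history.append(line)


# Made-up sample lines; a real run would iterate over the open log file instead.
sample = ["line a", "line b", "Recv ... to 3/1/35 ...", "line c", "line d", "line e"]
for line in lines_with_context(sample, "to 3/1/35", before=1, after=1):
    print(line)
```

With a real log, passing the open file object as `lines` avoids loading the whole file into memory.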

knxd.log.026.txt

smurfix commented 7 years ago

Thank you. Got it. That's a USB send transfer time-out which frankly should not happen unless you have flaky hardware (marginal power supply, interference, whatever) or a very busy USB (on the Pi, this includes networking).

knxd will be able to recover from this soon(ish); I'm already working on it. In the meantime you have a couple of options:

Feel free to share which change worked.

smurfix commented 7 years ago

Also, commit 6703d3e (written right now) should retry transmission up to three times without resetting the whole interface. This change may or may not work in your situation; please give it a try.
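The general idea of a bounded retry can be sketched as follows. This is only an illustration of the pattern described in the comment, not the code in that commit (knxd itself is not written in Python); `SendTimeout`, `send_with_retry` and `make_flaky_sender` are hypothetical names.

```python
from typing import Callable


class SendTimeout(Exception):
    """Hypothetical error standing in for a USB send transfer time-out."""


def send_with_retry(send: Callable[[bytes], None], frame: bytes,
                    max_attempts: int = 3) -> int:
    """Call send(frame) until it succeeds; return the number of attempts used.

    Re-raises the last SendTimeout if every attempt fails, at which point
    a caller could fall back to resetting the whole interface.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            send(frame)
            return attempt
        except SendTimeout:
            if attempt == max_attempts:
                raise
    raise ValueError("max_attempts must be at least 1")


def make_flaky_sender(failures: int) -> Callable[[bytes], None]:
    """Build a fake sender that times out a fixed number of times, then succeeds."""
    remaining = failures

    def send(frame: bytes) -> None:
        nonlocal remaining
        if remaining > 0:
            remaining -= 1
            raise SendTimeout("simulated time-out")

    return send


print(send_with_retry(make_flaky_sender(2), b"frame"))  # 3
```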

hilddav commented 7 years ago

Those are quite interesting points. You are right, my power supply might not be the best, so I replaced it with a new 2A one, but I cannot easily increase the voltage. I do have a variable voltage source, but only for testing, so I would like to try the new supply first.

I also have a persistence oscilloscope to track the voltage over time. The place is not very convenient for setting it up, but we could do it if there is strong interest. In addition to the USB interface for KNX, a second USB port is used for 1-Wire with a Wiregate adapter.

It's maybe not the best approach to mix things up, but I also compiled your revised SW version and had a look at your changes. Unfortunately I am not yet very familiar with GitHub, and I am surprised that after compilation I still get the SW version knxd 0.12.14:e6a68c8 when I call "knxd -V". When I recheck my selected branch with gitk, I see your latest changes and the master is set to the newest one. Anyhow, I will keep it running until tomorrow evening and report back on my experience.

hilddav commented 7 years ago

Just a short update (already done yesterday):

  1. I changed the power supply (2A), but the transmission stopped after a while
  2. I compiled the new SW version (6703d3e), but still ran the "old / initial" one, because I still had a copy of knxd in /usr/local/bin

Now, a new try with the correct version. Currently "knxd" is still in operation. Is there a way to reduce the size of the log file or to focus it on the issue we have? Unfortunately it is getting so big that it is difficult to keep knxd running while waiting for the event covered by your SW change in order to see the details.

smurfix commented 7 years ago

Well … start with systemd and you get automatic logfile rotation. Alternately, write the log to a file (Option "-d filename"), then if you rename the file and send a SIGHUP to knxd it'll close and re-open it.
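The rename-then-SIGHUP step can be scripted. The sketch below assumes knxd was started with "-d filename" as described above and that its process ID is known; the function name, the timestamp suffix, and how the PID is obtained are assumptions made for illustration.

```python
import os
import signal
import time


def rotate_log(log_path: str, knxd_pid: int) -> str:
    """Rename the active log, then signal the daemon to re-open log_path.

    Returns the path of the rotated file.
    """
    rotated_path = f"{log_path}.{time.strftime('%Y%m%d-%H%M%S')}"
    # Rename first: the daemon keeps writing to the renamed file until it re-opens.
    os.rename(log_path, rotated_path)
    # SIGHUP asks the daemon to close the old file and open log_path again.
    os.kill(knxd_pid, signal.SIGHUP)
    return rotated_path
```

Called periodically (for example from a timer) with the log path and the daemon's PID, this keeps each file small enough to search once the failure shows up.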

hilddav commented 7 years ago

I will do so. "knxd" is running well now and since yesterday I have not experienced any problems. I will keep an eye open and will inform you if I identify something irregular.

THANKS A LOT FOR YOUR GREAT SUPPORT and the effort you have put in. Great job!!