openhab / openhab1-addons

Add-ons for openHAB 1.x
Eclipse Public License 2.0
3.43k stars 1.71k forks source link

ZWave message duplication on RPi #1564

Closed cdjackson closed 7 years ago

cdjackson commented 9 years ago

Messages from the zwave controller are duplicated on the RPi.

See https://groups.google.com/d/msg/openhab/_Wh_iurrIIo/6QH5gLimgNEJ

For example, right at the beginning of the log, we send out this request -:

2014-04-13 20:13:44 DEBUG o.o.b.z.i.p.ZWaveController$ZWaveSendThread[:763]- Sending Message = 01 03 00 07 FB 

This is the only time that this request is sent, but the response comes in 4 times

2014-04-13 20:13:44 DEBUG o.o.b.z.i.p.ZWaveController$ZWaveReceiveThread[:913]- Receive Message = 01 2B 01 07 … 
2014-04-13 20:13:46 DEBUG o.o.b.z.i.p.ZWaveController$ZWaveReceiveThread[:913]- Receive Message = 01 2B 01 07 … 
2014-04-13 20:13:49 DEBUG o.o.b.z.i.p.ZWaveController$ZWaveReceiveThread[:913]- Receive Message = 01 2B 01 07 … 
2014-04-13 20:13:50 DEBUG o.o.b.z.i.p.ZWaveController$ZWaveReceiveThread[:913]- Receive Message = 01 2B 01 07 … 

The problem gets worse as I think there’s a cascade effect. The response to the initial request would normally cause the subsequent message to be queued, but because the initial message gets 4 responses, the next message gets requested 4 times - and I think each on of these 4 requests gets 4 responses and by the time we get to the MessageSerialApiGetInitData, we have 30 or so responses….

cdjackson commented 9 years ago

If someone can test the latest version on the HABmin site (https://github.com/cdjackson/HABmin/blob/master/addons/org.openhab.binding.zwave-1.7.0-SNAPSHOT.jar) I'd appreciate any feedback if this problem still exists. I fixed a problem where if it takes longer than 2 seconds to process a frame, the zstick would retry. We now send the ACK immediately and queue all received frames for later procesing thus separating the two, and ensuring we are responsive with communications to the zstick.

avdleeuw commented 9 years ago

I'm running on a Banana Pi currently, which is the more powerful version of the RPi. I'm running the new binding since last friday (Jan. 9). When did you fix this issue?

I will try to look back through the logs to see if this issue occured for me, and if it did, I will try to see if it got fixed with this change.

cdjackson commented 9 years ago

I made the change yesterday - probably uploaded mid afternoon. I'd be really interested to see a logfile from your current system, and then another using the new binding to see what difference this has made.

The reason for picking up on this is I realised that under certain circumstances, I was also getting duplication in my logs - mainly during initialisation when the message that tells us what nodes are in the network gets processed. This takes 5 seconds on my computer, and with the way the binding was written, it didn't send an ACK to the stick, so it kept sending this message every 2 seconds which then causes problems for a while to come...

cdjackson commented 9 years ago

@gilles76 - I think I've seen duplication in your logs recently - can you confirm that this is fixed with the latest test release?

avdleeuw commented 9 years ago

@cdjackson Chris, I sent you the first logfile by email. Now running with the new binding and will send the current logfile later.

cdjackson commented 9 years ago

Got it - thanks. I'll take a look when I get home - it will be interesting to see how the next log looks :)

ghost commented 9 years ago

Not easy to find that a message is received once but I can give you a log after a restart withnew binding. I will give you with another log with a light device to see if it's ok. Chris I'm not on RPI no more for openhab. I use LXC virtualization and an ubuntu distribution on an atom processor. A D525 I think.

I could have a look on a cubie board 2 if you want. I've just mail you two logs.

ghost commented 9 years ago

Just another remark, all my zwave events are in double. Is there a common point between events and binding ?

cdjackson commented 9 years ago

Just another remark, all my zwave events are in double. Is there a common point between events and binding ?

You mean in your events.log? I just looked and I don’t see that here - I’m not sure why that would be - is it all events, or just certain ones?

ghost commented 9 years ago

Yes in events.log.

Ho I'm a lier. It seems it's ok.

Here is an example, I activate a light and another device.:

Lumiere_Salon received command ON 2015-01-12 21:57:03 - Prise_Salon received command ON 2015-01-12 21:57:03 - Prise_Salon state updated to ON 2015-01-12 21:57:07 - arrosage_voie_1 state updated to OFF 2015-01-12 21:57:10 - arrosage_voie_2 state updated to OFF 2015-01-12 21:57:14 - Lumiere_Salon state updated to OFF 2015-01-12 21:57:14 - Lumiere_Salon received command OFF 2015-01-12 21:57:21 - Prise_Salon received command OFF 2015-01-12 21:57:21 - Prise_Salon state updated to OFF

So it's ok.

Do you think it's the binding ?

cdjackson commented 9 years ago

So it's ok.

Good :)

Do you think it's the binding ?

You now mean the ‘double node 71’ problem? I’m not sure - I need to have a look at it to see where this could be. Can you send me the XML for this node, or is the log you sent earlier from a startup of the binding? If so, it might capture the information I need so I can look at that (but maybe not tonight - I have some things to do as I go to Paris on Wednesday).

tudstudent commented 9 years ago

Is there a status update concerning these binding issues?

Was test running on Windows and hopped over this evening to my RPI (bought for this purpose before reading this) and I have the issue with the double messages etc. Not a programmer, but...

cdjackson commented 9 years ago

Please advise what version you are using.

tudstudent commented 9 years ago

Latest snapshot (.zip) from https://openhab.ci.cloudbees.com/job/openHAB/ 1.7.0

cdjackson commented 9 years ago

Ok - no change is expected then. Can you try the test version here -: https://github.com/cdjackson/HABmin/blob/master/addons/org.openhab.binding.zwave-1.7.0-SNAPSHOT.jar

I’ve made some changes that I hope will help, so I’d appreciate any feedback.

tudstudent commented 9 years ago

Chris,

Currently testing. Will give you the test setup:

On a RPI B+ with raspbina Wheezy (all up-to-date and firmware as well).

Result, it is not loading the binding?!? - no idea why, tomorrow evening I have time to further investigate (to see if it is is the jar I downloaded or my config -> did the download twice so it is definitely not the download itself what made the file e.g. corrupt):

pi@raspberrypi /opt/openhab_1.7 $ sudo ./start.sh
Launching the openHAB runtime...
osgi> 2014-12-24 15:56:20.726 [INFO ] [.o.core.internal.CoreActivator] - openHAB runtime has been started (v1.7.0).
2014-12-24 15:56:55.630 [INFO ] [o.o.i.s.i.DiscoveryServiceImpl] - mDNS service has been started
2014-12-24 15:56:57.229 [INFO ] [o.o.i.s.i.DiscoveryServiceImpl] - Service Discovery initialization completed.
2014-12-24 15:57:16.254 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'switchrelay.items'
2014-12-24 15:57:21.874 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'multisensor.items'
2014-12-24 15:57:22.493 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'networktime.items'
2014-12-24 15:57:50.297 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'motion.sitemap'
2014-12-24 15:58:16.749 [INFO ] [penhab.io.rest.RESTApplication] - Started REST API at /rest
2014-12-24 15:58:24.243 [INFO ] [.o.u.w.i.servlet.WebAppServlet] - Started Classic UI at /openhab.app
2014-12-24 15:58:50.116 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'motion.rules'
2014-12-24 15:58:54.480 [INFO ] [.service.AbstractActiveService] - NTP Refresh Service has been started
2014-12-24 15:58:55.413 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-24T15:58:55
2014-12-24 15:59:55.565 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-24T15:59:55
2014-12-24 16:00:55.744 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-24T16:00:55
....
cdjackson commented 9 years ago

Probably a stupid point, but from the link I sent you, make sure you click on the RAW button to download - otherwise the download an get screwed up.=

tudstudent commented 9 years ago

Retried, with https://github.com/cdjackson/HABmin/raw/master/addons/org.openhab.binding.zwave-1.7.0-SNAPSHOT.jar Which is not the "blob" link, let assume I had a long day yesterday :)

However, just starting openHAB. Underneath the code: (looks like it is not working)

pi@raspberrypi /opt/openhab_1.7/addons $ ls
org.openhab.binding.ntp-1.7.0-SNAPSHOT.jar  org.openhab.binding.zwave-1.7.0-SNAPSHOT.jar  org.openhab.io.habmin-1.7.0-SNAPSHOT.jar  README
pi@raspberrypi /opt/openhab_1.7/addons $ cd ..
pi@raspberrypi /opt/openhab_1.7 $ sudo ./start.sh
Launching the openHAB runtime...
osgi> 2014-12-24 16:13:01.158 [INFO ] [.o.core.internal.CoreActivator] - openHAB runtime has been started (v1.7.0).
2014-12-24 16:13:39.061 [INFO ] [o.o.i.s.i.DiscoveryServiceImpl] - mDNS service has been started
2014-12-24 16:13:40.862 [INFO ] [o.o.i.s.i.DiscoveryServiceImpl] - Service Discovery initialization completed.
2014-12-24 16:13:58.512 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'switchrelay.items'
2014-12-24 16:14:07.618 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'multisensor.items'
2014-12-24 16:14:08.307 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'networktime.items'
2014-12-24 16:14:34.580 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'motion.sitemap'
2014-12-24 16:15:06.385 [INFO ] [penhab.io.rest.RESTApplication] - Started REST API at /rest
2014-12-24 16:15:15.323 [INFO ] [.o.u.w.i.servlet.WebAppServlet] - Started Classic UI at /openhab.app
2014-12-24 16:15:31.577 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'motion.rules'
2014-12-24 16:15:54.292 [INFO ] [.o.io.habmin.HABminApplication] - Started HABmin REST API at /services/habmin
2014-12-24 16:16:00.132 [INFO ] [.service.AbstractActiveService] - NTP Refresh Service has been started
2014-12-24 16:16:01.499 [INFO ] [.z.internal.ZWaveActiveBinding] - Update config, port = /dev/ttyUSB0
2014-12-24 16:16:01.550 [INFO ] [.service.AbstractActiveService] - ZWave Refresh Service has been started
2014-12-24 16:16:01.609 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-24T16:16:00
2014-12-24 16:16:02.157 [INFO ] [b.z.i.protocol.ZWaveController] - Starting Z-Wave controller
2014-12-24 16:16:02.164 [INFO ] [b.z.i.protocol.ZWaveController] - Z-Wave timeout is set to 5000ms. Soft reset is false.
2014-12-24 16:16:02.171 [INFO ] [b.z.i.protocol.ZWaveController] - Connecting to serial port /dev/ttyUSB0
2014-12-24 16:16:03.180 [INFO ] [b.z.i.protocol.ZWaveController] - Serial port is initialized
2014-12-24 16:16:04.541 [WARN ] [ApplicationCommandMessageClass] - NODE 2: Not initialized yet, ignoring message.
2014-12-24 16:16:04.658 [WARN ] [ApplicationCommandMessageClass] - NODE 2: Not initialized yet, ignoring message.
2014-12-24 16:16:04.805 [WARN ] [ApplicationCommandMessageClass] - NODE 2: Not initialized yet, ignoring message.
2014-12-24 16:16:06.858 [INFO ] [rialApiGetInitDataMessageClass] - NODE 1: Node found
2014-12-24 16:16:06.866 [INFO ] [rialApiGetInitDataMessageClass] - NODE 2: Node found
2014-12-24 16:16:06.872 [INFO ] [rialApiGetInitDataMessageClass] - NODE 3: Node found
2014-12-24 16:16:06.882 [INFO ] [rialApiGetInitDataMessageClass] - ZWave Controller using Controller API
2014-12-24 16:16:06.888 [INFO ] [rialApiGetInitDataMessageClass] - ZWave Controller is Primary Controller
2014-12-24 16:16:06.899 [INFO ] [rialApiGetInitDataMessageClass] - ------------Number of Nodes Found Registered to ZWave Controller------------
2014-12-24 16:16:06.908 [INFO ] [rialApiGetInitDataMessageClass] - # Nodes = 3
2014-12-24 16:16:06.917 [INFO ] [rialApiGetInitDataMessageClass] - ----------------------------------------------------------------------------
2014-12-24 16:16:18.386 [ERROR] [b.z.i.protocol.ZWaveController] - NODE 2: Restore from config: Error deserialising XML file. com.thoughtworks.xstream.converters.ConversionException: null : null
---- Debugging information ----
cause-exception     : java.lang.RuntimeException
cause-message       : null
class               : java.util.HashMap
required-type       : java.util.HashMap
converter-type      : com.thoughtworks.xstream.converters.collections.MapConverter
path                : /node/supportedCommandClasses/entry[3]/multiLevelSensorCommandClass/sensors/sensorType
line number         : 37
class[1]            : org.openhab.binding.zwave.internal.protocol.commandclass.ZWaveMultiLevelSensorCommandClass
converter-type[1]   : com.thoughtworks.xstream.converters.reflection.ReflectionConverter
class[2]            : org.openhab.binding.zwave.internal.protocol.ZWaveNode
version             : 1.4.6
-------------------------------
2014-12-24 16:16:20.384 [WARN ] [.o.b.z.i.p.c.ZWaveCommandClass] - NODE 3: Unsupported command class SWITCH_ALL
2014-12-24 16:16:27.670 [ERROR] [WaveController$ZWaveSendThread] - NODE 255: Timeout while sending message. Requeueing - 2 attempts left!
2014-12-24 16:16:50.400 [ERROR] [WaveController$ZWaveSendThread] - NODE 2: Timeout while sending message. Requeueing - 2 attempts left!
2014-12-24 16:16:50.600 [ERROR] [b.z.i.p.s.SendDataMessageClass] - NODE 2: Got an error while sending data. Resending message.
2014-12-24 16:17:01.750 [ERROR] [WaveController$ZWaveSendThread] - NODE 2: Timeout while sending message. Requeueing - 2 attempts left!
2014-12-24 16:17:01.920 [ERROR] [b.z.i.p.s.SendDataMessageClass] - NODE 2: Got an error while sending data. Resending message.
2014-12-24 16:17:02.014 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-24T16:17:01
2014-12-24 16:17:07.741 [WARN ] [WaveController$ZWaveSendThread] - Receive queue TIMEOUT:
2014-12-24 16:17:12.795 [WARN ] [WaveController$ZWaveSendThread] - Receive queue TIMEOUT:
2014-12-24 16:17:17.895 [WARN ] [WaveController$ZWaveSendThread] - Receive queue TIMEOUT:
2014-12-24 16:17:23.002 [WARN ] [WaveController$ZWaveSendThread] - Receive queue TIMEOUT:
2014-12-24 16:17:28.094 [WARN ] [WaveController$ZWaveSendThread] - Receive queue TIMEOUT:
2014-12-24 16:17:33.991 [WARN ] [WaveController$ZWaveSendThread] - Receive queue TIMEOUT:
2014-12-24 16:17:37.452 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-24 16:17:38.390 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 312
2014-12-24 16:17:38.920 [ERROR] [eController$ZWaveReceiveThread] - Protocol error (CAN), resending
2014-12-24 16:17:44.647 [ERROR] [WaveController$ZWaveSendThread] - NODE 2: Timeout while sending message. Requeueing - 1 attempts left!
2014-12-24 16:17:44.769 [ERROR] [b.z.i.p.s.SendDataMessageClass] - NODE 2: Got an error while sending data. Resending message.
2014-12-24 16:17:47.894 [WARN ] [b.z.i.p.s.SendDataMessageClass] - NODE 2: Already processed another send data request for this callback Id, ignoring.
2014-12-24 16:17:48.273 [ERROR] [eController$ZWaveReceiveThread] - Protocol error (CAN), resending
2014-12-24 16:17:50.216 [WARN ] [b.z.i.p.s.SendDataMessageClass] - NODE 2: Already processed another send data request for this callback Id, ignoring.
2014-12-24 16:17:55.400 [ERROR] [WaveController$ZWaveSendThread] - NODE 2: Timeout while sending message. Requeueing - 2 attempts left!
2014-12-24 16:17:55.579 [ERROR] [b.z.i.p.s.SendDataMessageClass] - NODE 2: Got an error while sending data. Resending message.
2014-12-24 16:18:00.681 [WARN ] [WaveController$ZWaveSendThread] - Receive queue TIMEOUT:
2014-12-24 16:18:06.550 [ERROR] [eController$ZWaveReceiveThread] - Protocol error (CAN), resending
2014-12-24 16:18:12.419 [ERROR] [WaveController$ZWaveSendThread] - NODE 2: Timeout while sending message. Requeueing - 1 attempts left!
2014-12-24 16:18:12.609 [ERROR] [b.z.i.p.s.SendDataMessageClass] - NODE 2: Got an error while sending data. Resending message.
2014-12-24 16:18:17.676 [WARN ] [WaveController$ZWaveSendThread] - Receive queue TIMEOUT:
2014-12-24 16:18:23.130 [ERROR] [WaveController$ZWaveSendThread] - NODE 2: Timeout while sending message. Requeueing - 0 attempts left!
2014-12-24 16:18:23.195 [ERROR] [b.z.i.p.s.SendDataMessageClass] - NODE 2: Got an error while sending data. Resending message.
2014-12-24 16:18:29.481 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 312
2014-12-24 16:18:29.970 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 312
2014-12-24 16:18:30.370 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 312
2014-12-24 16:18:30.771 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 35
2014-12-24 16:18:30.984 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 35
2014-12-24 16:18:31.083 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 35
2014-12-24 16:18:31.063 [WARN ] [WaveController$ZWaveSendThread] - NODE 2: Too many retries. Discarding message: Message: class = SendData (0x13), type = Request (0x00), payload = 02 02 84 05
2014-12-24 16:18:31.196 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 35
2014-12-24 16:18:31.293 [INFO ] [runtime.busevents             ] - sensor_1_temp state updated to 22.3
2014-12-24 16:18:31.381 [INFO ] [runtime.busevents             ] - sensor_1_temp state updated to 22.3
2014-12-24 16:18:31.516 [INFO ] [runtime.busevents             ] - sensor_1_temp state updated to 22.3
2014-12-24 16:18:31.575 [INFO ] [runtime.busevents             ] - sensor_1_temp state updated to 22.3
2014-12-24 16:18:31.713 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-24T16:18:02
2014-12-24 16:18:31.861 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-24 16:18:31.918 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 354
2014-12-24 16:18:32.472 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 354
2014-12-24 16:18:32.970 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 354
2014-12-24 16:18:33.450 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 354
2014-12-24 16:18:33.900 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 35
2014-12-24 16:18:33.993 [INFO ] [runtime.busevents             ] - sensor_1_temp state updated to 22.3
2014-12-24 16:18:39.131 [INFO ] [runtime.busevents             ] - sensor_1_motion state updated to CLOSED
2014-12-24 16:18:41.439 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-24 16:18:44.350 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 412
2014-12-24 16:18:47.030 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 35
2014-12-24 16:18:49.645 [ERROR] [.o.b.z.i.p.c.ZWaveCommandClass] - Error extracting value - length=9, offset=6, size=3.
2014-12-24 16:18:52.399 [ERROR] [WaveController$ZWaveSendThread] - NODE 2: Timeout while sending message. Requeueing - 2 attempts left!
2014-12-24 16:18:52.475 [ERROR] [b.z.i.p.s.SendDataMessageClass] - NODE 2: Got an error while sending data. Resending message.
2014-12-24 16:18:57.545 [WARN ] [WaveController$ZWaveSendThread] - Receive queue TIMEOUT:
2014-12-24 16:19:02.566 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-24T16:19:02
2014-12-24 16:19:02.632 [WARN ] [WaveController$ZWaveSendThread] - Receive queue TIMEOUT:
2014-12-24 16:19:05.341 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-24 16:19:05.620 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 463
2014-12-24 16:19:05.871 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 463
2014-12-24 16:19:06.128 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 463
2014-12-24 16:19:06.580 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 463
2014-12-24 16:19:07.254 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 35
2014-12-24 16:19:07.332 [ERROR] [eController$ZWaveReceiveThread] - Protocol error (CAN), resending
2014-12-24 16:19:07.522 [INFO ] [runtime.busevents             ] - sensor_1_temp state updated to 22.2
cdjackson commented 9 years ago

I agree it doesn't look promising :( Can you generate a debug log and email it to me (chris -at- cd-jackson.com).

Thanks. Chris

avdleeuw commented 9 years ago

I have been wondering if this could be related to the java-rxtx library... I have seen some comments about its implementation around the web, but never really gotten around to comparing things.

cdjackson commented 9 years ago

I had a similar thought a while ago (in fact, that was my working assumption). Since it didn't seem to occur on other systems, I assumed this was to do with the RXTX lib. I did look around at other options, but it's limited unfortunately - at least this seems to have the best portability and support for the most systems...

Anyway, I recently found that in some instances on my machine I also got duplicates. This was associated with a specific message which took the binding about 5 seconds to process (it's dependant on network size - I have about 25 nodes).

I then found that the binding didn't ACK messages from the stick immediately, so if it took 5 seconds to process a message, then we send the ACK after 5 seconds. This is a problem since the stick has a retry mechanism - if it doesn't get the ACK within 2 seconds, it resends the last message.

I was hoping that with a slow computer (ie the Pi) that this might be happening a little more often. Also, its a bit of a cascade effect - if you can't process something within 2 seconds, and the stick retries, then you spend a bunch of time processing this, and it's retries, and messages start to queue up...

This should now be fixed, and ACKs are sent immediately, but there may still be some refinement. The Receive queue TIMEOUT message I see above is associated with this and is something I added to solve this problem. I forget exactly what it means, but the receive queue was added to separate the receive communication thread with the stick, and the processing, so I need to look at this.

avdleeuw commented 9 years ago

One more question on this issue... I'm using the Razberry zWave device myself and I'm seeing this duplication of messages.

What zWave controllers are other people using who are seeing this duplication on RPi like boards?

tudstudent commented 9 years ago

Chris,

I hope to generate the reports this afternoon/evening. However is it enough to send you the zwave logging, or would you like to have the log from openhab as well.

Zwave logging i found here: https://github.com/openhab/openhab/wiki/Z-Wave-Binding

Openhab logging (not to syslog) i have not yet found. Link?

From my earlier post, where does node 255 come from?

2014-12-24 16:16:27.670 [ERROR] [WaveController$ZWaveSendThread] - NODE 255: Timeout while sending message. Requeueing - 2 attempts left!

Because I only have node 1, 2 and 3: Node 1: Z-Stick aeon Node 2: Aeon Multisensor Node 3: Fibaro Switch

cdjackson commented 9 years ago

Just the zwave logs is probably ok. Everything will go in there unless there's an uncaught exception which ends up in the openhab log (this is generated automatically when you start using the run_debug, or start_debug file).

Node 255 is the controller. Note that the controller sort of has 2 nodes - it has it's Zwave node (probably 1 in your case), and this is used for association notifications etc (it's how the node is known within the network). Node 255 is a fake node that is used for special controller commands.

tudstudent commented 9 years ago

Just started with logging. So turned on the pi. Added the config script in the log.xml as stated on above link. Turned on the logging (in log specified "TRACE"). Then I had only the battery as error (1 duplicated entry):

root@raspberrypi:/opt/openhab_1.7# ./start.sh
Launching the openHAB runtime...
osgi> 2014-12-24 16:30:14.176 [INFO ] [.o.core.internal.CoreActivator] - openHAB runtime has been started (v1.7.0).
2014-12-24 16:30:49.759 [INFO ] [o.o.i.s.i.DiscoveryServiceImpl] - mDNS service has been started
2014-12-24 16:30:51.247 [INFO ] [o.o.i.s.i.DiscoveryServiceImpl] - Service Discovery initialization completed.
2014-12-24 16:31:11.861 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'switchrelay.items'
2014-12-24 16:31:18.607 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'multisensor.items'
2014-12-24 16:31:19.168 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'networktime.items'
2014-12-24 16:31:34.347 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'motion.sitemap'
2014-12-24 16:32:06.136 [INFO ] [penhab.io.rest.RESTApplication] - Started REST API at /rest
2014-12-24 16:32:12.690 [INFO ] [.o.u.w.i.servlet.WebAppServlet] - Started Classic UI at /openhab.app
2014-12-24 16:32:33.012 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'motion.rules'
2014-12-24 16:32:47.634 [INFO ] [.o.io.habmin.HABminApplication] - Started HABmin REST API at /services/habmin
2014-12-24 16:32:53.479 [INFO ] [.service.AbstractActiveService] - NTP Refresh Service has been started
2014-12-24 16:32:54.326 [INFO ] [.service.AbstractActiveService] - ZWave Refresh Service has been started
2014-12-24 16:32:54.858 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-24T16:32:54
2014-12-24 16:33:32.693 [INFO ] [runtime.busevents             ] - switch_fibaro_01 state updated to OFF
2014-12-24 16:33:33.178 [INFO ] [runtime.busevents             ] - switch_fibaro_02 state updated to OFF
2014-12-24 16:33:41.348 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-24 16:33:41.700 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 80
2014-12-24 16:33:45.433 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 39
2014-12-24 16:33:45.450 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 81
2014-12-24 16:33:47.456 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 39
2014-12-24 16:33:48.286 [INFO ] [runtime.busevents             ] - sensor_1_motion state updated to OPEN
2014-12-24 16:33:50.625 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-24 16:33:54.997 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-24T16:33:54
2014-12-24 16:33:57.780 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-24 16:34:00.052 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command ON
2014-12-24 16:34:00.599 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command ON
2014-12-24 16:34:08.067 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-24 16:34:41.332 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-24 16:34:41.613 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 78
2014-12-24 16:34:41.987 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command ON
2014-12-24 16:34:42.008 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 39
2014-12-24 16:34:42.133 [INFO ] [runtime.busevents             ] - switch_fibaro_01 state updated to ON
2014-12-24 16:34:42.324 [INFO ] [runtime.busevents             ] - sensor_1_temp state updated to 22.1
2014-12-24 16:34:56.816 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-24T16:34:56
2014-12-24 16:35:41.360 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-24 16:35:41.553 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 79
2014-12-24 16:35:41.947 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 39
2014-12-24 16:35:42.078 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command ON
2014-12-24 16:35:42.245 [INFO ] [runtime.busevents             ] - switch_fibaro_01 state updated to ON
2014-12-24 16:35:42.586 [INFO ] [runtime.busevents             ] - sensor_1_temp state updated to 22.1
2014-12-24 16:35:57.532 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-24T16:35:57
2014-12-24 16:36:41.376 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-24 16:36:41.562 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 61
2014-12-24 16:36:41.836 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 39
2014-12-24 16:36:41.853 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command ON
2014-12-24 16:36:41.988 [INFO ] [runtime.busevents             ] - switch_fibaro_01 state updated to ON
2014-12-24 16:36:42.126 [INFO ] [runtime.busevents             ] - sensor_1_temp state updated to 22.1
2014-12-24 16:36:58.793 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-24T16:36:58
2014-12-24 16:37:41.359 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-24 16:37:41.546 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 49
2014-12-24 16:37:41.840 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command ON
2014-12-24 16:37:41.868 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 39
2014-12-24 16:37:41.967 [INFO ] [runtime.busevents             ] - switch_fibaro_01 state updated to ON
2014-12-24 16:37:42.149 [INFO ] [runtime.busevents             ] - sensor_1_temp state updated to 22
2014-12-24 16:37:59.777 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-24T16:37:59
2014-12-24 16:38:04.601 [INFO ] [runtime.busevents             ] - sensor_1_motion state updated to CLOSED
2014-12-24 16:38:19.286 [INFO ] [runtime.busevents             ] - sensor_1_motion state updated to OPEN
2014-12-24 16:38:41.414 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-24 16:38:41.646 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 894
2014-12-24 16:38:42.236 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 39
2014-12-24 16:38:42.496 [INFO ] [runtime.busevents             ] - sensor_1_temp state updated to 21.9
2014-12-24 16:38:43.299 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command OFF
2014-12-24 16:38:43.387 [INFO ] [runtime.busevents             ] - switch_fibaro_01 state updated to OFF
2014-12-24 16:39:01.194 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-24T16:39:01
2014-12-24 16:39:41.477 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-24 16:39:41.669 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 890
2014-12-24 16:39:41.994 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command OFF
2014-12-24 16:39:42.016 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 40
2014-12-24 16:39:42.109 [INFO ] [runtime.busevents             ] - switch_fibaro_01 state updated to OFF
2014-12-24 16:39:42.291 [INFO ] [runtime.busevents             ] - sensor_1_temp state updated to 21.9
2014-12-24 16:40:01.764 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-24T16:40:01
^C2014-12-24 16:40:08.694 [INFO ] [penhab.io.rest.RESTApplication] - Stopped REST API
^C2014-12-24 16:40:14.048 [INFO ] [.o.u.w.i.servlet.WebAppServlet] - Stopped Classic UI
2014-12-24 16:40:14.097 [INFO ] [.o.io.habmin.HABminApplication] - Stopped HABmin REST API

Now the same but "DEBUG"

root@raspberrypi:/opt/openhab_1.7# ./start.sh
Launching the openHAB runtime...
osgi> 2014-12-24 16:42:46.628 [INFO ] [.o.core.internal.CoreActivator] - openHAB runtime has been started (v1.7.0).
2014-12-24 16:43:20.999 [INFO ] [o.o.i.s.i.DiscoveryServiceImpl] - mDNS service has been started
2014-12-24 16:43:22.416 [INFO ] [o.o.i.s.i.DiscoveryServiceImpl] - Service Discovery initialization completed.
2014-12-24 16:43:40.654 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'switchrelay.items'
2014-12-24 16:43:47.430 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'multisensor.items'
2014-12-24 16:43:48.196 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'networktime.items'
2014-12-24 16:44:14.460 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'motion.sitemap'
2014-12-24 16:44:39.892 [INFO ] [penhab.io.rest.RESTApplication] - Started REST API at /rest
2014-12-24 16:44:47.018 [INFO ] [.o.u.w.i.servlet.WebAppServlet] - Started Classic UI at /openhab.app
2014-12-24 16:45:01.992 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'motion.rules'
2014-12-24 16:45:28.688 [INFO ] [.o.io.habmin.HABminApplication] - Started HABmin REST API at /services/habmin
2014-12-24 16:45:35.398 [INFO ] [.service.AbstractActiveService] - ZWave Refresh Service has been started
RXTX Warning:  Removing stale lock file. /var/lock/LCK..ttyUSB0
2014-12-24 16:45:39.847 [INFO ] [.service.AbstractActiveService] - NTP Refresh Service has been started
2014-12-24 16:45:41.604 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-24T16:45:41
2014-12-24 16:46:19.853 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-24 16:46:22.609 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 887
2014-12-24 16:46:29.468 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 39
2014-12-24 16:46:29.498 [INFO ] [runtime.busevents             ] - sensor_1_motion state updated to OPEN
2014-12-24 16:46:41.718 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-24T16:46:41
2014-12-24 16:46:45.898 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-24 16:46:46.158 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 874
2014-12-24 16:46:46.499 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 874
2014-12-24 16:46:46.857 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 874
2014-12-24 16:46:47.285 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 39
2014-12-24 16:46:47.692 [INFO ] [runtime.busevents             ] - sensor_1_temp state updated to 22.5
2014-12-24 16:46:48.361 [INFO ] [runtime.busevents             ] - sensor_1_motion state updated to OPEN
2014-12-24 16:46:57.843 [INFO ] [runtime.busevents             ] - switch_fibaro_01 state updated to OFF
2014-12-24 16:46:59.154 [INFO ] [runtime.busevents             ] - switch_fibaro_02 state updated to OFF
2014-12-24 16:47:12.336 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command OFF
2014-12-24 16:47:13.342 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command OFF
2014-12-24 16:47:14.004 [INFO ] [runtime.busevents             ] - switch_fibaro_02 state updated to OFF
2014-12-24 16:47:14.797 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command OFF
2014-12-24 16:47:15.921 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command OFF
^C^C2014-12-24 16:47:28.637 [INFO ] [penhab.io.rest.RESTApplication] - Stopped REST API
2014-12-24 16:47:31.819 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-24 16:47:33.872 [INFO ] [.o.u.w.i.servlet.WebAppServlet] - Stopped Classic UI
2014-12-24 16:47:33.949 [INFO ] [.o.io.habmin.HABminApplication] - Stopped HABmin REST API

Furthermore why can I see ^m in the files? I downloaded as zip (directly on Linux and extracted as such) I thought this was windows file behavior -> DOS

Besides that I have mailed Chris the files. Please let me know to test or supply you with any other thing.

cdjackson commented 9 years ago

@tudstudent - the logs looked fine - I've only had a quick look, but there's no obvious duplication at all and everything looked like it was working pretty well. I've got to go out soon and will take a better look tomorrow or Friday, but compared to some of the logs I've seen with massive duplication problems, this is perfect!

@avdleeuw - you said you had the duplication problem. I thought when I looked at your logs recently that it looked ok?

tudstudent commented 9 years ago

I really do not understand... There were initially errors, but it looks like they are vanished...

Well I will keep testing the next couple days. But I do not see anything strange so far.

mebj commented 9 years ago

I am running a Pi with the Razberry board and it has been running well with only few duplications for a couple of months. But today this has changed and the event log shows heaps of duplications. No changes have been made to the system, all this started without any known trigger from outside the system.

Most of it in the log below seems to be bound to the AEON 4 in 1 sensor but there are duplicates from others as well. I will restart the Aeon sensor to see if anything changes.

Examples from the event log are: 2015-01-28 22:54:50 - Lum_FF_Corridor state updated to 4 2015-01-28 22:54:50 - Lum_FF_Corridor state updated to 4 2015-01-28 22:54:50 - Lum_FF_Corridor state updated to 4 2015-01-28 22:54:50 - Lum_FF_Corridor state updated to 4 2015-01-28 22:54:50 - Lum_FF_Corridor state updated to 4 2015-01-28 22:54:50 - Lum_FF_Corridor state updated to 4 2015-01-28 22:54:50 - Lum_FF_Corridor state updated to 4 2015-01-28 22:54:50 - Lum_FF_Corridor state updated to 4 2015-01-28 22:54:50 - Lum_FF_Corridor state updated to 4 2015-01-28 22:54:51 - Lum_FF_Corridor state updated to 4 2015-01-28 22:54:51 - Hum_FF_Corridor state updated to 23 2015-01-28 22:54:51 - Hum_FF_Corridor state updated to 23 2015-01-28 22:54:51 - Hum_FF_Corridor state updated to 23 2015-01-28 22:54:51 - Hum_FF_Corridor state updated to 23 2015-01-28 22:54:51 - Hum_FF_Corridor state updated to 23 2015-01-28 22:54:51 - Hum_FF_Corridor state updated to 23 2015-01-28 22:54:51 - Hum_FF_Corridor state updated to 23 2015-01-28 22:54:51 - Hum_FF_Corridor state updated to 23 2015-01-28 22:54:51 - Hum_FF_Corridor state updated to 23 2015-01-28 22:54:51 - Hum_FF_Corridor state updated to 23 2015-01-28 22:54:51 - Temp_FF_Corridor state updated to 22.3 2015-01-28 22:54:51 - Temp_FF_Corridor state updated to 22.3 2015-01-28 22:54:51 - Temp_FF_Corridor state updated to 22.3 2015-01-28 22:54:52 - Temp_FF_Corridor state updated to 22.3 2015-01-28 22:54:52 - Temp_FF_Corridor state updated to 22.3 2015-01-28 22:54:52 - Temp_FF_Corridor state updated to 22.3 2015-01-28 22:54:52 - Temp_FF_Corridor state updated to 22.3 2015-01-28 22:54:52 - Temp_FF_Corridor state updated to 22.3 2015-01-28 22:54:52 - Temp_FF_Corridor state updated to 22.3 2015-01-28 22:54:52 - Temp_FF_Corridor state updated to 22.3 2015-01-28 22:54:52 - Temp_FF_Corridor state updated to 22.3 2015-01-28 22:55:33 - Temp_living state updated to 22.3 2015-01-28 22:55:33 - Temp_dining state updated to 21.4 2015-01-28 22:56:37 - Temp_dining state updated to 21.5 2015-01-28 22:57:16 - Lum_FF_Corridor state updated to 9 2015-01-28 22:57:16 - Lum_FF_Corridor state updated to 9 2015-01-28 22:57:16 - Lum_FF_Corridor state updated to 9 2015-01-28 22:57:16 - Lum_FF_Corridor state updated to 9 2015-01-28 22:57:16 - Lum_FF_Corridor state updated to 9 2015-01-28 22:57:16 - Lum_FF_Corridor state updated to 9 2015-01-28 22:57:16 - Lum_FF_Corridor state updated to 9 2015-01-28 22:57:17 - Lum_FF_Corridor state updated to 9 2015-01-28 22:57:17 - Lum_FF_Corridor state updated to 9 2015-01-28 22:57:17 - Hum_FF_Corridor state updated to 23 2015-01-28 22:57:17 - Hum_FF_Corridor state updated to 23 2015-01-28 22:57:17 - Hum_FF_Corridor state updated to 23 2015-01-28 22:57:17 - Hum_FF_Corridor state updated to 23 2015-01-28 22:57:17 - Hum_FF_Corridor state updated to 23 2015-01-28 22:57:17 - Hum_FF_Corridor state updated to 23 2015-01-28 22:57:17 - Hum_FF_Corridor state updated to 23 2015-01-28 22:57:17 - Hum_FF_Corridor state updated to 23 2015-01-28 22:57:17 - Hum_FF_Corridor state updated to 23 2015-01-28 22:57:17 - Hum_FF_Corridor state updated to 23

cdjackson commented 9 years ago

But why do I get this on Linux (RPI):

2014-12-24 16:46:46.158 [INFO ] [runtime.busevents ] - sensor_1_luminance state updated to 874 2014-12-24 16:46:46.499 [INFO ] [runtime.busevents ] - sensor_1_luminance state updated to 874 2014-12-24 16:46:46.857 [INFO ] [runtime.busevents ] - sensor_1_luminance state updated to 874

I don’t know - all I’m going on is the log you sent - that finishes at 16:40 and I don’t see any obvious duplication. The log viewer looks for duplication and marks them as such, and this isn’t happening. I also went through and checked the incoming data, and again see no duplication.

cdjackson commented 9 years ago

Note that this issue is specifically looking at duplicate messages from the controller to the binding. There are other reasons that you can get duplication - if there are RF issues between the controller and a device, then its possible to get duplications. That said, I doubt it would result in so many duplications, but as always, it's really not possible to say what is happening without a log file - sorry.

I'm happy to look at a log, but the other thing to remember is that what I'm hoping is that the changes I've made a week or two back have helped - if this isn't the case then we're probably back to square one unfortunately...

tudstudent commented 9 years ago

Running a complete night did not give me any duplication. What I will do is do a reboot and try again.

If that is OK, I will do a "reinstall" and try again, because I am 100% sure it was not working at a specific moment (duplicates).

avdleeuw commented 9 years ago

Chris, I sent you two emails with logs on Jan 12. One with the duplication issues before deploying the new jar, and one with duplication issues after deploying the new jar.

Happy to send a new version if you like, I think I'm currently running with the Jan 14 snapshot (I think)

mebj commented 9 years ago

Even if it will not contribute to solving the issue I can mention that my installation went back to usual behavior late last night (again without any known trigger). Only very few duplications until the morning. Then I downloaded and installed latest binding and since then I have not seen any duplication at all. The previous binding (also 1.7.0) was installed around Dec 20th and the current file is dated Jan 27. Seems like there is hope :-) but keeping in mind that it has a habit of coming and going it may be early to feel confident.

cdjackson commented 9 years ago

@mebj thanks for the feedback. I forget exactly when the change was made, but mid Jan… Let me know how it goes - it’s the best hope that I’ve had to solve this so far, so fingers crossed…

cdjackson commented 9 years ago

Chris, I sent you two emails with logs on Jan 12. One with the duplication issues before deploying the new jar, and one with duplication issues after deploying the new jar.

Happy to send a new version if you like, I think I'm currently running with the Jan 14 snapshot (I think)

I have the files, but I thought that there was no duplication - at least not what I refer to as duplication based on what I’ve seen before. If I remember correctly, neither file had any real issues, but I’ll run them through the processor if I can (I think they were very large so I might need to trim them down).=

tudstudent commented 9 years ago

Just to log my test examples. I have at this moment no idea if it works or not. But after my vacation I have a fresh start. This is what i do/did:

  1. Create a new openHAB dir in /opt
  2. Download the snapshot runtime
  3. Download the snapshot addons
  4. Clone the latest GIT Habmin
  5. Create folder openhab_01 (which can be cloned when configuring is done)
  6. Extract runtime to openhab_01 (midnight commander cannot open this zip?, while it can open the addons zip...) Extracted with "unzip"
  7. Copy HABmin to webaps as habmin
  8. Copy z-wave binding from habmin webaps folder
  9. Copy ntp binding from addons zip
  10. Copy habmin binding from habmin webaps folder
  11. Copy my old items, rules, sitemaps, transform, openhab.cfg to config folder (and subfolders)
  12. Create a copy of the openhab_01 folder since all configuring is done (this is the start)
  13. Start openhab_01 without logging enabled (if it jams/duplicates we go to the next level)

Remarks: While starting I am experiencing some errors, see underneath, but after the end of this (last update at 2014-12-25 05:38) All is continuing perfectly. Is this normal?

Launching the openHAB runtime...
osgi> 2014-12-25 05:32:20.971 [INFO ] [.o.core.internal.CoreActivator] - openHAB runtime has been started (v1.7.0).
2014-12-25 05:32:53.682 [INFO ] [o.o.i.s.i.DiscoveryServiceImpl] - mDNS service has been started
2014-12-25 05:32:55.374 [INFO ] [o.o.i.s.i.DiscoveryServiceImpl] - Service Discovery initialization completed.
2014-12-25 05:33:16.361 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'switchrelay.items'
2014-12-25 05:33:23.038 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'multisensor.items'
2014-12-25 05:33:23.535 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'networktime.items'
2014-12-25 05:33:39.008 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'motion.sitemap'
2014-12-25 05:34:11.154 [INFO ] [penhab.io.rest.RESTApplication] - Started REST API at /rest
2014-12-25 05:34:18.191 [INFO ] [.o.u.w.i.servlet.WebAppServlet] - Started Classic UI at /openhab.app
2014-12-25 05:34:38.772 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'motion.rules'
2014-12-25 05:34:59.613 [INFO ] [.o.io.habmin.HABminApplication] - Started HABmin REST API at /services/habmin
2014-12-25 05:35:08.324 [INFO ] [.z.internal.ZWaveActiveBinding] - Update config, port = /dev/ttyUSB0
2014-12-25 05:35:08.543 [INFO ] [.service.AbstractActiveService] - ZWave Refresh Service has been started
2014-12-25 05:35:09.463 [INFO ] [b.z.i.protocol.ZWaveController] - Starting Z-Wave controller
2014-12-25 05:35:09.470 [INFO ] [b.z.i.protocol.ZWaveController] - Z-Wave timeout is set to 5000ms. Soft reset is false.
2014-12-25 05:35:09.482 [INFO ] [b.z.i.protocol.ZWaveController] - Connecting to serial port /dev/ttyUSB0
2014-12-25 05:35:10.309 [INFO ] [b.z.i.protocol.ZWaveController] - Serial port is initialized
2014-12-25 05:35:12.748 [INFO ] [.service.AbstractActiveService] - NTP Refresh Service has been started
2014-12-25 05:35:14.461 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-25T05:35:14
2014-12-25 05:35:15.624 [INFO ] [rialApiGetInitDataMessageClass] - NODE 1: Node found
2014-12-25 05:35:15.635 [INFO ] [rialApiGetInitDataMessageClass] - NODE 2: Node found
2014-12-25 05:35:15.645 [INFO ] [rialApiGetInitDataMessageClass] - NODE 3: Node found
2014-12-25 05:35:15.661 [INFO ] [rialApiGetInitDataMessageClass] - ZWave Controller using Controller API
2014-12-25 05:35:15.671 [INFO ] [rialApiGetInitDataMessageClass] - ZWave Controller is Primary Controller
2014-12-25 05:35:15.681 [INFO ] [rialApiGetInitDataMessageClass] - ------------Number of Nodes Found Registered to ZWave Controller------------
2014-12-25 05:35:15.694 [INFO ] [rialApiGetInitDataMessageClass] - # Nodes = 3
2014-12-25 05:35:15.720 [INFO ] [rialApiGetInitDataMessageClass] - ----------------------------------------------------------------------------
2014-12-25 05:35:20.875 [WARN ] [WaveController$ZWaveSendThread] - Receive queue TIMEOUT:
2014-12-25 05:35:26.011 [WARN ] [.o.b.z.i.p.c.ZWaveCommandClass] - NODE 3: Unsupported command class SWITCH_ALL
2014-12-25 05:35:26.787 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-25 05:35:26.966 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 704
2014-12-25 05:35:26.975 [ERROR] [eController$ZWaveReceiveThread] - Protocol error (CAN), resending
2014-12-25 05:35:27.306 [ERROR] [eController$ZWaveReceiveThread] - Protocol error (CAN), resending
2014-12-25 05:35:37.639 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 37
2014-12-25 05:35:37.685 [INFO ] [runtime.busevents             ] - sensor_1_temp state updated to 17.2
2014-12-25 05:35:39.502 [ERROR] [WaveController$ZWaveSendThread] - NODE 2: Timeout while sending message. Requeueing - 2 attempts left!
2014-12-25 05:35:39.942 [ERROR] [b.z.i.p.s.SendDataMessageClass] - NODE 2: Got an error while sending data. Resending message.
2014-12-25 05:35:44.986 [WARN ] [WaveController$ZWaveSendThread] - Receive queue TIMEOUT:
2014-12-25 05:35:50.012 [WARN ] [WaveController$ZWaveSendThread] - Receive queue TIMEOUT:
2014-12-25 05:35:55.032 [WARN ] [WaveController$ZWaveSendThread] - Receive queue TIMEOUT:
2014-12-25 05:36:02.635 [INFO ] [runtime.busevents             ] - sensor_1_motion state updated to OPEN
2014-12-25 05:36:04.551 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-25 05:36:07.382 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 37
2014-12-25 05:36:10.097 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 757
2014-12-25 05:36:12.740 [ERROR] [.o.b.z.i.p.c.ZWaveCommandClass] - Error extracting value - length=9, offset=6, size=7.
2014-12-25 05:36:13.503 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command OFF
2014-12-25 05:36:13.903 [WARN ] [.b.z.i.c.ZWaveConverterHandler] - NODE 3: No command class found for item = switch_fibaro_01. Class = switch_binary(SWITCH_BINARY), endpoint = 1. Ignoring command.
2014-12-25 05:36:14.575 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-25T05:36:14
2014-12-25 05:36:14.914 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command OFF
2014-12-25 05:36:14.967 [WARN ] [.b.z.i.c.ZWaveConverterHandler] - NODE 3: No command class found for item = switch_fibaro_01. Class = switch_binary(SWITCH_BINARY), endpoint = 1. Ignoring command.
2014-12-25 05:36:15.243 [ERROR] [WaveController$ZWaveSendThread] - NODE 2: Timeout while sending message. Requeueing - 2 attempts left!
2014-12-25 05:36:15.263 [ERROR] [b.z.i.p.s.SendDataMessageClass] - NODE 2: Got an error while sending data. Resending message.
2014-12-25 05:36:21.410 [ERROR] [.o.b.z.i.p.c.ZWaveCommandClass] - Error extracting value - length=9, offset=6, size=6.
2014-12-25 05:36:24.074 [WARN ] [.o.b.z.i.p.c.ZWaveCommandClass] - NODE 3: Unsupported command class MULTI_INSTANCE_ASSOCIATION
2014-12-25 05:36:24.096 [WARN ] [.o.b.z.i.p.c.ZWaveCommandClass] - NODE 3: Unsupported command class SWITCH_ALL
2014-12-25 05:36:24.109 [WARN ] [.o.b.z.i.p.c.ZWaveCommandClass] - NODE 3: Unsupported command class FIRMWARE_UPDATE_MD
2014-12-25 05:36:24.122 [WARN ] [.o.b.z.i.p.c.ZWaveCommandClass] - NODE 3: Unsupported command class POWERLEVEL
2014-12-25 05:36:28.386 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-25 05:36:28.441 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 710
2014-12-25 05:36:28.539 [ERROR] [eController$ZWaveReceiveThread] - Protocol error (CAN), resending
2014-12-25 05:36:28.558 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 710
2014-12-25 05:36:28.661 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 37
2014-12-25 05:36:28.828 [INFO ] [runtime.busevents             ] - sensor_1_temp state updated to 17.4
2014-12-25 05:36:29.232 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command OFF
2014-12-25 05:36:29.295 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command OFF
2014-12-25 05:36:29.343 [INFO ] [runtime.busevents             ] - switch_fibaro_01 state updated to OFF
2014-12-25 05:36:29.388 [INFO ] [runtime.busevents             ] - switch_fibaro_01 state updated to OFF
2014-12-25 05:36:29.438 [WARN ] [.z.internal.ZWaveActiveBinding] - NODE 3: No item bound for event, endpoint = 0, command class = SWITCH_BINARY, value = 0, ignoring.
2014-12-25 05:36:29.933 [INFO ] [runtime.busevents             ] - switch_fibaro_01 state updated to OFF
2014-12-25 05:36:30.156 [INFO ] [runtime.busevents             ] - switch_fibaro_02 state updated to OFF
2014-12-25 05:36:45.459 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-25 05:37:14.760 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-25T05:37:14
2014-12-25 05:37:26.509 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-25 05:37:26.566 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 3
2014-12-25 05:37:26.672 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 37
2014-12-25 05:37:26.824 [INFO ] [runtime.busevents             ] - sensor_1_temp state updated to 17.6
2014-12-25 05:37:28.207 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command ON
2014-12-25 05:37:28.255 [INFO ] [runtime.busevents             ] - switch_fibaro_01 state updated to ON
2014-12-25 05:38:14.948 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-25T05:38:14
2014-12-25 05:38:26.523 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100

I saw this error appearing in the output on screen, how does this happen? Is it because the pi is to slow?

2014-12-25 06:23:43.008 [INFO ] [runtime.busevents             ] - switch_fibaro_02 received command ON
2014-12-25 06:23:43.312 [ERROR] [.r.i.filter.PollingDelayFilter] - null
2014-12-25 06:23:43.358 [INFO ] [runtime.busevents             ] - switch_fibaro_02 state updated to ON
2014-12-25 06:23:51.513 [INFO ] [runtime.busevents             ] - switch_fibaro_02 received command OFF
2014-12-25 06:23:51.899 [INFO ] [runtime.busevents             ] - switch_fibaro_02 state updated to OFF
2014-12-25 06:23:52.810 [ERROR] [.r.i.filter.PollingDelayFilter] - null
^C2014-12-25 06:24:26.853 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-25 06:24:27.018 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 763
  1. Clone openhab_02 to openhab_03
  2. Enable trace logging on openhab_02
  3. Start openhab_02

What I find interesting is that the errors I got in the first run are gone... Explanations? Furthermore I see in the firsy 2 minutes duplicate messages, but after that is runs normally:

Launching the openHAB runtime...
osgi> 2014-12-25 06:39:54.330 [INFO ] [.o.core.internal.CoreActivator] - openHAB runtime has been started (v1.7.0).
2014-12-25 06:40:27.497 [INFO ] [o.o.i.s.i.DiscoveryServiceImpl] - mDNS service has been started
2014-12-25 06:40:28.811 [INFO ] [o.o.i.s.i.DiscoveryServiceImpl] - Service Discovery initialization completed.
2014-12-25 06:40:49.345 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'switchrelay.items'
2014-12-25 06:40:54.078 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'multisensor.items'
2014-12-25 06:40:54.570 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'networktime.items'
2014-12-25 06:41:09.916 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'motion.sitemap'
2014-12-25 06:41:39.466 [INFO ] [penhab.io.rest.RESTApplication] - Started REST API at /rest
2014-12-25 06:41:46.181 [INFO ] [.o.u.w.i.servlet.WebAppServlet] - Started Classic UI at /openhab.app
2014-12-25 06:42:09.834 [INFO ] [c.internal.ModelRepositoryImpl] - Loading model 'motion.rules'
2014-12-25 06:42:23.466 [INFO ] [.o.io.habmin.HABminApplication] - Started HABmin REST API at /services/habmin
2014-12-25 06:42:29.388 [INFO ] [.service.AbstractActiveService] - NTP Refresh Service has been started
2014-12-25 06:42:30.730 [INFO ] [.service.AbstractActiveService] - ZWave Refresh Service has been started
2014-12-25 06:42:30.844 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-25T06:42:30
RXTX Warning:  Removing stale lock file. /var/lock/LCK..ttyUSB0
2014-12-25 06:43:27.542 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-25 06:43:34.289 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 766
2014-12-25 06:43:34.321 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 34
2014-12-25 06:43:34.333 [INFO ] [runtime.busevents             ] - sensor_1_temp state updated to 19.8
2014-12-25 06:43:34.354 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-25T06:43:30
2014-12-25 06:43:34.372 [INFO ] [runtime.busevents             ] - sensor_1_motion state updated to CLOSED
2014-12-25 06:43:35.188 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 790
2014-12-25 06:43:40.914 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 34
2014-12-25 06:43:41.620 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 34
2014-12-25 06:43:43.648 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-25 06:43:56.137 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-25 06:44:07.663 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command OFF
2014-12-25 06:44:08.268 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command OFF
2014-12-25 06:44:08.700 [INFO ] [runtime.busevents             ] - sensor_1_motion state updated to OPEN
2014-12-25 06:44:18.503 [INFO ] [runtime.busevents             ] - switch_fibaro_01 state updated to OFF
2014-12-25 06:44:19.103 [INFO ] [runtime.busevents             ] - switch_fibaro_02 state updated to OFF
2014-12-25 06:44:27.466 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-25 06:44:27.617 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-25 06:44:29.723 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 34
2014-12-25 06:44:30.584 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-25T06:44:30
2014-12-25 06:44:38.395 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-25 06:45:27.376 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-25 06:45:27.681 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 703
2014-12-25 06:45:28.148 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 34
2014-12-25 06:45:28.337 [INFO ] [runtime.busevents             ] - switch_fibaro_01 state updated to OFF
2014-12-25 06:45:28.344 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command OFF
2014-12-25 06:45:28.582 [INFO ] [runtime.busevents             ] - sensor_1_temp state updated to 19.9
2014-12-25 06:45:30.740 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-25T06:45:30
2014-12-25 06:46:27.303 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100

From the last run I have a trace log.

cdjackson commented 9 years ago

Remarks: While starting I am experiencing some errors, see underneath, but after the end of this (last update at 2014-12-25 05:38) All is continuing perfectly. Is this normal?

I’ve just installed some meters into my network, and found that this causes problems when the binding starts. I think that messages are being queued up in the stick while the binding isn’t running, and then when we go online we get a bunch of errors until the backlog is cleared. It’s hard to tell if this is your problem or not as the logging isn’t debug, but it’s probably the case (the CAN errors are a good indicator of this issue).

I’ve managed to fix this by adding a delay before sending the initialisation sequence - I’ve only run it a couple of times, but there were no further issues. I used a delay of 2 seconds which ought to be ok, but I might increase it slightly. I’ll upload a new version with this change later today.

tudstudent commented 9 years ago

Hi Chris, what I will do is turn openHAB off, but let my sensors on. This way they will que up some data. When you upload a new revision I will download and check (by that time there will probably be a que) so I can test the "error" again. Will do this on the openhab_02 so I can send you trace logs when applicable.

cdjackson commented 9 years ago

Hi Chris, what I will do is turn openHAB off, but let my sensors on. This way they will que up some data.

Ok - I’ll increase the time shortly and load a new version in about 30 minutes (midday if you’re in Europe)

cdjackson commented 9 years ago

New file is uploaded… Let me know if it looks better...

tudstudent commented 9 years ago

Just tested (and it is still running), trace is on. Seeing duplicates right after starting:

2014-12-25 07:26:45.073 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-25T07:26:44
2014-12-25 07:27:45.078 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-25T07:27:45
2014-12-25 07:27:55.358 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-25 07:27:55.714 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 259
2014-12-25 07:28:05.809 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 259
2014-12-25 07:28:07.244 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 33
2014-12-25 07:28:07.288 [INFO ] [runtime.busevents             ] - sensor_1_temp state updated to 20.1
2014-12-25 07:28:42.289 [INFO ] [runtime.busevents             ] - sensor_1_motion state updated to OPEN
2014-12-25 07:28:45.290 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-25T07:28:45
2014-12-25 07:28:48.508 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command OFF
2014-12-25 07:28:49.287 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command OFF
2014-12-25 07:28:53.993 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-25 07:28:54.403 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 282
2014-12-25 07:28:54.882 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 33
2014-12-25 07:28:55.066 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command OFF
2014-12-25 07:28:55.300 [INFO ] [runtime.busevents             ] - sensor_1_temp state updated to 20.2
2014-12-25 07:29:45.521 [INFO ] [runtime.busevents             ] - network_datetime state updated to 2014-12-25T07:29:45
2014-12-25 07:29:54.544 [INFO ] [runtime.busevents             ] - sensor_1_battery state updated to 100
2014-12-25 07:29:54.833 [INFO ] [runtime.busevents             ] - sensor_1_luminance state updated to 244
2014-12-25 07:29:55.207 [INFO ] [runtime.busevents             ] - sensor_1_humidity state updated to 33
2014-12-25 07:29:55.528 [INFO ] [runtime.busevents             ] - switch_fibaro_01 state updated to OFF
2014-12-25 07:29:55.541 [INFO ] [runtime.busevents             ] - switch_fibaro_01 received command OFF
2014-12-25 07:29:55.697 [INFO ] [runtime.busevents             ] - sensor_1_temp state updated to 20.2
2014-12-25 07:30:33.722 [INFO ] [runtime.busevents             ] - switch_fibaro_02 state updated to OFF
2014-12-25 07:30:36.119 [INFO ] [runtime.busevents             ] - switch_fibaro_02 state updated to OFF
2014-12-25 07:30:37.105 [INFO ] [runtime.busevents             ] - switch_fibaro_02 state updated to OFF
2014-12-25 07:30:38.159 [INFO ] [runtime.busevents             ] - switch_fibaro_02 state updated to OFF
2014-12-25 07:30:39.521 [INFO ] [runtime.busevents             ] - switch_fibaro_02 state updated to OFF
2014-12-25 07:30:40.635 [INFO ] [runtime.busevents             ] - switch_fibaro_02 state updated to OFF

It looks like it is only doing this in the first couple of minutes, after that it returns to "normal"

cdjackson commented 9 years ago

Feel free to email me the log when you get a chance and I’ll take a look….

tudstudent commented 9 years ago

Send, it can be that it includes 2 sessions (1 of this morning and 1 from this afternoon). See time reference on this post above with the duplicate lines.

cdjackson commented 9 years ago

Thanks. There is still some errors with the protocol - the binding is sending data when it shouldn't... I'll try and find a fix for this today - I don't think it's a major contributor to the duplicate problem though...

However, going back to the event log above - this isn't 'duplication' - the binding is requesting this data because it's not getting the responses it wants. I've not looked at why yet, but it is ok (well, maybe not ok, but there's no duplication as per the RPi issue we're looking at here).

tudstudent commented 9 years ago

Oke, let me know when I can retest. When we get this solved I can also try in my windows environment, but I am almost 100% sure I did not get the same duplication (as you described above).

cdjackson commented 9 years ago

I am almost 100% sure I did not get the same duplication (as you described above). I’d be very surprised as this is definitely not related to any issue with the Pi - it’s quite clear in the log that it’s looping through the dynamic stage multiple times… This is why it’s only happening during initialisation...

tudstudent commented 9 years ago

I am not going to argue with you ;-) it has been some time ago since I ran it on windows. But lets first fix this and then I will test on windows as well

cdjackson commented 9 years ago

I am not going to argue with you ;-) it has been some time ago since I ran it on windows. But lets first fix this and then I will test on windows as well

Yep - I suspect that this CAN problem will be less of an issue on a faster computer. The issue is there’s a small amount of time between when the frame is received, and the ACK sent. I recently added synchronisation between the queues to try and stop this, but it left a small gap while the checksum was calculated - I’ll try and plug that, but on the Pi, this ‘small’ gap is probably long enough that the other thread jumps in and ruins the party!

I’ve moved the synchronisation - so just testing it to make sure I don’t cause a lockout!

I’ll also look at why your node 3 isn’t completing initialision properly - is it a new Fibaro by any chance?

tudstudent commented 9 years ago

node3= V2.1 fibaro fgs221 Do you need additional info (if yes, please specify the source)

tudstudent commented 9 years ago

@cdjackson did you make any updates recently? I saw an update on habmin github. Is this an adjusted version which I can test (different than the one I replied about on mail to you)