maxwellhadley / node-red-contrib-rfxcom

node-RED nodes to access an RFXtrx433 transceiver
BSD 2-Clause "Simplified" License
22 stars 13 forks source link

RFXCOM: ACK - transmit delayed (1) #40

Closed Mastiffen closed 6 years ago

Mastiffen commented 6 years ago

I keep getting a bunch of these, plus some "RFXCOM: Timed out waiting for response (6)". Do you have any idea what this can be? I have 20 temp sensors that sends, and I am sending out AC commands to six switches. Is it because this is too much? I did not have any problems with this when I had the RFXtrx connected directly to EventGhost. I am now running a Pi setup with the RFX, Tellstick and Z-Stick controlled by Node-RED and then MQTT to EventGhost, where the "hevy lifting" of the scripts is done (pretty heavy, my heating control script is 600 lines). In Node-RED it's just this script to convert Viking sensors to MQTT messages:

`var protocol = "Viking";

var tmp = msg.payload.temperature.value;

var id = msg.topic.split('/')[1].split('x')[1];

id = parseInt(id, 16);

msg.payload = protocol+','+id+','+tmp;

return msg;` (Without the extra line feeds, for some reason the code view insisted to putt everything on one line here with only one LF between the lines.) And then a small script that changes the MQTT messages from EG to Node-RED into signals to the AC switches. So there really should be no reason to think that the Pi is overloaded.

maxwellhadley commented 6 years ago

It's not the Pi that is overloaded, it's the RFXtrx433E!

The RFXCOM: ACK - transmit delayed (1) message is one of the possible response codes of the RFXtrx433E to a command message sent to it from the Pi. I have never seen one happen myself, but as I understand it, it means that the command message has been received OK, and is no longer in the (400-byte-ish) RFXtrx433E serial input buffer, but it cannot be transmitted yet because the radio channel is not clear - either the RFXtrx433E is hearing other devices transmitting, or previously-received commands are still being transmitted by the RFXtrx433E itself. This should be OK, it is not an error. Presumably, the message is transmitted eventually...

The RFXCOM: Timed out waiting for response (6) message is generated by my software if the Pi has not received a response from the RFXtrx433E in reply to a message (generally a command message, but there are a couple of other message types this applies to as well) after 10 seconds. In theory, this should never happen, but it is possible the RFXtrx433E may fail to respond in the time allowed if the channel is very busy. My software never sends more than 3 messages to the RFX433E at a time: it will wait for either a response, or for the timeout, before sending the 4th message. This avoids over-running the RFXtrx433E's serial input buffer. Everything should just recover by itself after a timeout, but it is possible the message that was not responded to didn't get processed correctly by the RFXtrx433E - which would be a firmware bug.

If you have the 'retransmit after a delay' option enabled on your rfx-lights-out node, that will of course increase the number of transmitted messages, making it more likely.

Mastiffen commented 6 years ago

OK, thanks. Weird, I never saw anything like this in my EG setup, but that may be because the plug-in wasn't set to show them. It doesn't stop transmitting or receiveing, the messages just comes about once a minute. But they are all marked with an input node. The input node where it comes in, is the one for my Viking sensors. It's set to only listen to the pattern TEMP7. Also there's another one that I am not sure if is related. This one comes on the same input node:

RFXCOM: Command unknown or not supported by this device (4)

And the Temp2, as I said it's kind of difficult to get to. I have to wait another hour before I can get in there. I'll let you know in the other issue thread about that one.

maxwellhadley commented 6 years ago

The node that is referenced in the messages is just whichever is the first node-rfx-xxx node to be created when your flow is deployed, as that is the node that causes the node-rfxcom object to be created, which in turn opens the RFXtrx433E device.

The RFXCOM: Command unknown or not supported by this device (4) message usually means the command sent requires the 'other' firmware type - i.e. you have sent a command message that requires type 1 firmware and you have type 2 installed, or the other way round. Or that you have sent a command requiring the 'E' hardware to an old-style RFXtrx433 (actually there are a few commands that only work on the old hardware, but node-red-contrib-rfxcom doesn't support them).

Mastiffen commented 6 years ago

Aha, I see! I thought that was the node it was coming from. I have an E myself. But I find the unknown command rather strange, because as far as I know I'm sending exactly the same command all the time. I may have to do a bit of digging in the scripts to see where that comes from.

maxwellhadley commented 6 years ago

It may, of course, be a bug...

maxwellhadley commented 6 years ago

Are you still having trouble? If not, I would like to close this issue

Mastiffen commented 6 years ago

I still have the messages both at the instance in the house and at the cabin, but I am not really sure if it is a problem. But you can close this. I'll just let you know if I find a problem with it.

maxwellhadley commented 6 years ago

OK thanks!

maxwellhadley commented 6 years ago

I just discovered that recent versions of the RFXCOM firmware have changed the allowable limits for one parameter in the command message for the LightwaveRF protocol, so commands that used to work OK may now return this error. I don't know if you are using that protocol, but if you give me a list of all the output types you are using, I will check to see if there any other changes.

If you can re-start Node-RED with the environment variable set: RED_DEBUG=rfxcom all the communication with the RFXtrx433E will be dumped on the console. You can search through that & look for the error message text - the offending command will then be immediately identifiable. The fourth byte in the 'Sent' line will be the same value as the Command message number, as in this example of a correctly understood message:

[rfxcom] on /dev/cu.usbserial-A1R1A6A - Queued  : 0A,14,0A,03,00,01,23,01,03,00,00
[rfxcom] on /dev/cu.usbserial-A1R1A6A - Sent    : 0A,14,0A,03,00,01,23,01,03,00,00
[rfxcom] on /dev/cu.usbserial-A1R1A6A - Received: 04,02,01,03,00
[rfxcom] on /dev/cu.usbserial-A1R1A6A - Response: Command message 03, ACK - transmit OK
Mastiffen commented 6 years ago

Actually that's not difficult to say. :) The only output I'm using is AC switches on/off. But I will run it with debug and see what happens. At the cabin I have a Type 1 with firmware 95, while at home I have an Ext 2 with 1022 firmware. I'll be back with the info from the debug run.

Mastiffen commented 6 years ago

Seems like I didn't know how to do that... I thought it was like this:

node-red-start RED_DEBUG=rfxcom

But it isn't. So I'm afraid I'll have to show what a total newbie I am again and ask how I put in that variable.

maxwellhadley commented 6 years ago

Nearly right! the environment variable comes before the command:

RED_DEBUG=rfxcom node-red-start
Mastiffen commented 6 years ago

Weird. Illogical to (with the variable before the command), but also weird: I don't see any debug information. Not in the browser, and not in the console on the Pi. And I have stuff coming in and going out all the time. Is there something else I need to activate?

maxwellhadley commented 6 years ago

Ah - you are running node-red from systemd. In that case the console output goes to the system log file, and you can view it using:

sudo journalctl -f -u nodered -o cat

But I don't think that will help, because only the node-red-start script will get the RED_DEBUG environment variable, and I don't think it will pass it on. Have a look at the log file, and if you can't see any lines that contain the text [rfxcom] then stop Node-RED, and run it interactively using the node-red-pi command

Mastiffen commented 6 years ago

You're right, the file didn't show anything. But with RED_DEBUG=rfxcom node-red-pi it does. The Pi console is writing like crazy. I'll let that run until Ihave one of those error messages in the logger in the browser, and then I'll scroll back to find the correct command.

Mastiffen commented 6 years ago

The first one came here: rfxcom] on /dev/ttyUSB0 - Sent : 0B,11,00,13,00,8D,30,3A,09,00,00,00 [rfxcom] on /dev/ttyUSB0 - Received: 0B,11,00,E7,00,8D,30,3A,09,00,00,60 [rfxcom] on /dev/ttyUSB0 - Received: 04,02,01,13,01 [rfxcom] on /dev/ttyUSB0 - Response: Command message 13, ACK - transmit delayed 18 Mar 14:07:41 - [warn] [rfx-sensor:Viking-sensorer] RFXCOM: ACK - transmit delayed (1)

But it actually seems like this is a receive, not a send. Can this be an error from MQTT, and not from RFX?

Mastiffen commented 6 years ago

And here the second one came. I see queued there, is that related?

[rfxcom] on /dev/ttyUSB0 - Received: 0B,11,00,04,00,8D,30,3A,05,00,00,60 [rfxcom] on /dev/ttyUSB0 - Queued : 0B,11,00,16,00,8D,30,3A,05,00,00,00 [rfxcom] on /dev/ttyUSB0 - Sent : 0B,11,00,16,00,8D,30,3A,05,00,00,00 [rfxcom] on /dev/ttyUSB0 - Received: 04,02,01,16,01 [rfxcom] on /dev/ttyUSB0 - Response: Command message 16, ACK - transmit delayed 18 Mar 14:10:26 - [warn] [rfx-sensor:Viking-sensorer] RFXCOM: ACK - transmit delayed (1) [rfxcom] on /dev/ttyUSB0 - Received: 0B,11,00,05,00,8D,30,3A,05,00,00,60 [rfxcom] on /dev/ttyUSB0 - Received: 08,50,07,06,EE,00,80,BF,69 [rfxcom] on /dev/ttyUSB0 - Received: 08,50,07,07,99,00,00,CC,69

maxwellhadley commented 6 years ago

Those look to me as if something else is trying to talk to the same devices, possibly you have a radio 'booster' repeater?

In the second example, the first line is a received message, packet type 0x11, subtype 0x00 (KlikAanKlikUit?) to device address 0x8D303A/5, 'turn off'. This has the (received) message sequence number 0x04. On the next line the same message is being queued for sending to the RFXtrx433E, this time with (transmit) message sequence number 0x16. This is then sent out (line 3) and the RFXtrx433E responds with a 'transmit delayed' code - you see this three times, once as the actual bytes coming in; then once as the warning message generated by node-rfxcom; then finally once more from node-RED. Then the next thing that happens is that this same message is received again(!) with sequence number 0x05, followed by two Viking temperature sensor messages from different addresses with sequence numbers 0x06 & 0x07 and readings -19.1 degrees C & +20.4 degrees C.

Somehow or other you have created a loop. Command messages received by an rfx-lights-in node from something else are being sent to an rfx-lights-out node and being retransmitted. The something else then retransmits them again! Probably these messages are just going round and round continuously - no wonder you get so much output, and that is probably why you get these transmit delayed response codes. It might explain the bad command and timeout messages as well. note that the RFXtrx433E does not 'hear' the messages it transmits itself.

Is the little orange light on your RFXtrx433E flashing all the time?

Edit to add:

I see you have been using a Tellstick as well the RFXtrx433E. If that is still active it may be the culprit, especially if both are talking to MQTT

maxwellhadley commented 6 years ago

Any news?

Mastiffen commented 6 years ago

Sorry, I have been so extremely busy with work the last days. I'm a freelance translator, so I have to work like crazy when I get any. I do not have a booster or anything, btw. I have a Telldus Tellstick Duo, that receives the same signals, but that doesn't send any Viking stuff. I will try to check the RFXtrx today or tomorrow to see if the light flashes like crazy, but I have around 15 Viking sensors in one of my locations and 20 in the other, so there will be a lot of stuff going through the air. Can it be an error on a sensor?

As for KlikAanKlikUit I have no idea what that is... But the received message may very well be the Tellstick sending, I have that sending the same on and off signals as the RFX as a backup.

maxwellhadley commented 6 years ago

KlikAanKlikUit is just the brand name of a range of AC-protocol mains sockets, one among many.

If the Tellstick is sending the same messages as the RFXCOM, the RFXCOM will receive the Tellstick's transmissions. If your flow sends these back to the Tellstick, either directly, or more probably via MQTT, then they will end up being echoed indefinitely round and round and round... The same thing would happen if the Tellstick received the RFXCOM's transmissions. I think this is what is going on, in both directions.

If so, I'm amazed it works at all!

You can set the rfx-lights-out node to retransmit messages automatically to improve reliability, either once after a selectable delay, or repeatedly at a selectable interval. However this does not provide the spatial diversity you get by using the Tellstick, assuming it is in a different place to the RFXCOM. If you can find and remove the loop in your flow somehow, it should be OK to keep using the Tellstick.

Mastiffen commented 6 years ago

I actually had the same idea right after I answered over. :) So what I did was to change the script, so it fired the Tellstic AFTER the RFX, instead of before. Also I have put in half a second of pause between the RFX send and the Tellstick send. The Tellstick sends each command more then once too, so I'm sure that has messed up something. In the time that's gone since I did that I have had no error messages. Crossing my fingers! If I see error messages I'll put in a bit longer pause, but I think this may have fixed it.

Mastiffen commented 6 years ago

No, it comes in at times still. Not as often. Here's one, if this tells you anything:

rfxcom] on /dev/ttyUSB0 - Response: Command message F6, ACK - transmit OK [rfxcom] on /dev/ttyUSB0 - Received: 08,50,07,B8,08,00,00,A4,69 [rfxcom] on /dev/ttyUSB0 - Queued : 0B,11,00,F7,00,8D,30,3A,07,00,00,00 [rfxcom] on /dev/ttyUSB0 - Sent : 0B,11,00,F7,00,8D,30,3A,07,00,00,00 [rfxcom] on /dev/ttyUSB0 - Received: 0B,11,00,B9,00,8D,30,3A,08,00,00,60 [rfxcom] on /dev/ttyUSB0 - Received: 0B,11,00,BA,00,8D,30,3A,07,00,00,60 [rfxcom] on /dev/ttyUSB0 - Received: 0B,11,00,BB,00,8D,30,3A,08,00,00,60 [rfxcom] on /dev/ttyUSB0 - Received: 04,02,01,F7,01 [rfxcom] on /dev/ttyUSB0 - Response: Command message F7, ACK - transmit delayed 21 Mar 15:13:21 - [warn] [rfx-sensor:Viking-sensorer] RFXCOM: ACK - transmit delayed (1) [rfxcom] on /dev/ttyUSB0 - Received: 0B,11,00,BC,00,8D,30,3A,07,00,00,60 [rfxcom] on /dev/ttyUSB0 - Received: 0B,11,00,BD,00,8D,30,3A,08,00,00,60 [rfxcom] on /dev/ttyUSB0 - Received: 0B,11,00,BE,00,8D,30,3A,07,00,00,60 [rfxcom] on /dev/ttyUSB0 - Received: 0B,11,00,BF,00,8D,30,3A,07,00,00,60 [rfxcom] on /dev/ttyUSB0 - Received: 0B,11,00,C0,00,8D,30,3A,07,00,00,60 [rfxcom] on /dev/ttyUSB0 - Received: 08,50,07,C1,FC,00,00,AE,69 [rfxcom] on /dev/ttyUSB0 - Received: 0D,54,01,C2,B3,01,00,D7,20,02,03,EC,00,70 [rfxcom] on /dev/ttyUSB0 - Received: 08,50,07,C3,D5,00,00,F3,69 [rfxcom] on /dev/ttyUSB0 - Received: 08,50,07,C4,99,00,00,C8,59 [rfxcom] on /dev/ttyUSB0 - Received: 08,50,07,C5,41,00,00,35,59

maxwellhadley commented 6 years ago

I don't think that one is a problem. The Tellstick seems to send 5 repeats of each command message, and it appears the RFXCOM was unable to transmit the message [0B,11,00,F7,00,8D,30,3A,07,00,00,00] immediately because the Tellstick was still transmitting the last two repeats of the previous message [0B,11,00,B9,00,8D,30,3A,08,00,00,60]. The Tellstick seems to wait quite a while between sending each repeat.

Everything else looks normal to me. I think that since this response (with code 1) is not actually an error, I shouldn't generate a warning message in Node-RED. I will fix this in the next version. I'll close the issue when I publish it.

Mastiffen commented 6 years ago

Is that possible? I have the Tellstick transmitting half a second after the RFX. Shouldn't that avoid this problem? Or is it sending another signal, and not the same signal?

maxwellhadley commented 6 years ago

Yes, it's a different message: the first one goes to unit code 08 and the next one goes to unit code 07:

Mastiffen commented 6 years ago

Aha, I see. Then it's the next room. Well, with all those signals flying through the air here I guess some of the temp sensors has to sync up. Thanks for the help!

maxwellhadley commented 6 years ago

Version 2.4.0 published to npm. Response code 1 is no longer treated as an error condition, and will not now give rise to a warning message