JaapBraam / LoRaWanGateway

A LoRaWan Gateway in LUA
MIT License

rx timeout #22

Closed joscha closed 7 years ago

joscha commented 7 years ago

The Gateway has been working great so far, but lately I get a lot of rx timeouts:

rx timeout  7   rssi    47
rx timeout  7   rssi    46
rxpk    018bbd005ccf7ff42f1c7c47    message {"rxpk":[{"rssi":-60,"stat":1,"modu":"LORA","rfch":1,"tmst":1194235805,"datr":"SF7BW125","lsnr":9,"time":"2017-03-19T06:24:50.564980Z","codr":"4/5","data":"QHkSASaAKwABjE34bZxxnOYQ2zYIWZaqGO5DaF9ckUFbZA==","freq":916.800,"chan":0,"size":34}]}  length  254
rx timeout  7   rssi    45
rx timeout  7   rssi    46
rx timeout  7   rssi    45
rx timeout  7   rssi    48
rx timeout  7   rssi    52
rx timeout  7   rssi    52
rx timeout  7   rssi    46
rx timeout  7   rssi    45
rx timeout  7   rssi    42
rx timeout  7   rssi    54
rx timeout  7   rssi    50
rx timeout  7   rssi    51
rx timeout  7   rssi    46
rx timeout  7   rssi    45
rx timeout  7   rssi    49
rx timeout  7   rssi    46
rx timeout  7   rssi    48
rx timeout  7   rssi    45
rx timeout  7   rssi    51
rx timeout  7   rssi    43
rx timeout  7   rssi    46
rx timeout  7   rssi    42
rx timeout  7   rssi    47
rx timeout  7   rssi    46
rx timeout  7   rssi    46
rx timeout  7   rssi    46

Messages are only very rarely transmitted. Is there something that might have changed on the TTN side? Or did I do something wrong when setting it up?

JaapBraam commented 7 years ago

rx timeouts are not influenced by TTN!

An rx timeout occurs when a detected signal does not result in the reception of a message. In this gateway, an attempt to receive a message is only made after a successful detection of a LoRa signal (CADDetected). So there is probably a LoRa node transmitting something with spreading factor 7 near your gateway (an RSSI of 46 indicates it is quite nearby).

Messages sent by another gateway will result in rx timeouts because they use the same preamble signals (so they are detected) but different message parameters, which ensure that other gateways do not actually receive them.

Messages sent by nodes that are on another (private) network also use the same preamble, but a different syncword, and therefore won't be received either.

Devices that just use LoRa modulation (e.g. wireless serial devices) can also fool the LoRa signal detection, causing rx timeouts.

Finally: I have seen that LoraWan nodes transmitting on another channel (the previous or next frequency) can cause rx timeouts, especially when they are close to the gateway. Normally those messages cannot be received, but I have even seen the gateway successfully receiving a message on channel 0 that was sent on channel 1 once or twice.

I also get a lot of rx timeouts on my multi SF single channel gateway, but (almost) all messages sent by my test nodes are successfully received.

joscha commented 7 years ago

Thanks for the explanation @JaapBraam - I tested a bit more and I think it depends on the message length. Is that possible? Is there any way I can increase the timeout in the Gateway to allow longer messages?

JaapBraam commented 7 years ago

SX1276's datasheet says that the CAD function can be used to detect a preamble. I've seen however that when the preamble is missed, the rest of the message will also trigger the CADDetect signal. So if the preamble is missed for whatever reason, the CADDetect will trigger during the rest of the message.

The CAD function uses about 2 symbols to detect a preamble and the RX function will timeout after about 8 symbols, so when sending large messages more RX timeouts will occur if the preamble is missed...

Increasing the timeout is not useful because the SX1276 will not time out as long as it is receiving a message... The timeout only occurs if the signal is lost or invalid.

joscha commented 7 years ago

I see, thank you! I will dig into why these timeouts happen then - I changed my payload from a custom serialization to nanopb (https://github.com/nanopb/nanopb) and the RX timeouts have occurred ever since. When I change back nothing but the payload, the RX timeouts disappear, and as the protobuf message was bigger, I assumed it was because of the length (21 before vs 38 bytes with proto). Maybe there is something in the byte sequence produced by nanopb that trips up the gateway? Will message here if I can find it.

JaapBraam commented 7 years ago

Maybe your node says it will send 38 bytes in the LoRaWan header, but actually sends less? Maybe caused by using a String type data buffer with 00 (string terminator) values in the data?

joscha commented 7 years ago

I seem to be able to transfer up to 22 bytes from this buffer:

uint8_t buffer[] = { 0x8, 0xD1, 0xD6, 0xC6, 0xC6, 0x5, 0x15, 0xB7, 0x9E, 0x7, 0xC2, 0x1D, 0x3F, 0x44, 0x17, 0x43, 0x28, 0xB6, 0x7, 0x30, 0xD1, 0x7, 0x38, 0xA0, 0x29, 0x40, 0x8A, 0x4B };

once I hit the 23rd byte there are only ever timeout errors. Is there any chance you can try transferring this buffer with your gateway @JaapBraam?

JaapBraam commented 7 years ago

I think you are right! Messages longer than 22 bytes are giving problems in my setup too...

I will investigate.

(My test node sends a 21-byte payload and works flawlessly, so I never noticed.)

JaapBraam commented 7 years ago

Thanx @joscha

Found and fixed...

Please test

joscha commented 7 years ago

Ah great, will test it later!

joscha commented 7 years ago

I have been able to successfully transfer a 28-byte payload a couple of times - still getting a lot of timeouts, however, but at least sometimes the transfer is working. I don't understand your fix, feels like 🌟 to me, haha, so I can't say if there still is a problem somewhere.