brocaar / chirpstack-network-server

ChirpStack Network Server is an open-source LoRaWAN network-server.
https://www.chirpstack.io
MIT License

Wrong RX2 response timestamp #393

Closed iwakidi closed 5 years ago

iwakidi commented 5 years ago

Hi,

While testing the RX2 implementation, we found a bug in the timestamp that is sent back to the gateway.

Test Setting

  1. We configured the network server in mode 0 (RX1 with fallback to RX2)
  2. We deliberately introduced more than 1 second of latency to trigger a downlink in the RX2 window

What we expected

When the latency exceeds 1 second (the default RX1 delay), we expect the timestamp of the downlink sent in the RX2 window to be:

RX1_timestamp + 1 second (equivalent to uplink_timestamp + 2 seconds)
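
The expected timing can be written down as a small calculation. This is only a sketch: the function name is made up for illustration, and it assumes the gateway timestamp is a 32-bit microsecond counter (as used by the packet-forwarder), so the additions wrap around.

```python
# Sketch: expected Class A downlink timestamps.
# Assumption: the gateway timestamp is a 32-bit microsecond counter.
COUNTER_MODULUS = 2**32  # the counter wraps after roughly 71.6 minutes


def rx_window_timestamps(uplink_timestamp_us, rx1_delay_s=1):
    """Return the (RX1, RX2) gateway timestamps in microseconds."""
    rx1 = (uplink_timestamp_us + rx1_delay_s * 1_000_000) % COUNTER_MODULUS
    rx2 = (rx1 + 1_000_000) % COUNTER_MODULUS  # RX2 opens 1 s after RX1
    return rx1, rx2


# Hypothetical uplink received at counter value 1000000.
print(rx_window_timestamps(1_000_000))  # (2000000, 3000000)
```

With the default RX1 delay of 1 second, RX2 lands at uplink_timestamp + 2 seconds, which is the value we expected to see in the second downlink.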

What happened

After looking at the payload sent to the gateway, we found that the timestamps used for RX1 and RX2 are identical: both use the RX1 value (uplink_timestamp + 1 second). That means Loraserver asks the gateway to send the RX2 message in the RX1 window, which the gateway will never do because it will always be too late.

Note

In mode 0, all the metadata of the RX2 downlink is correct (dataRate, frequency), which means RX2 is implemented correctly; only the timestamp is wrong.

When we force the network server into mode 2 (RX2 only), everything seems to work correctly.

What version are we using?

Loraserver: v2.8.1, App Server: v2.6.1, Gateway Bridge: v2.7.1

Imad from ITK.

brocaar commented 5 years ago

Please note that this feature makes use of the ACK / nACK feature of the packet-forwarder. LoRa Server is (currently) not aware of the latency between the gateway and the network.

How this works when set to RX1 with RX2 fallback:

RX1:

RX2:

In your case, did you see two (duplicated) downlinks with RX1 parameters? Or were you expecting to see a downlink with RX2 parameters immediately?

HobaiRiku commented 5 years ago

Oh, god, I was stuck reading the HandleResponse list in the code and couldn't figure out how ctx.DownlinkFrames[i] works for this feature. Thanks for this issue and the explanation by @brocaar, I finally figured it out.
btw, I don't understand the setting "deliberately introduced more than 1 second latency to trigger a downlink on an RX2 window". For Class A, loraserver always checks the queue for a downlink after an uplink, so I think we can't trigger a downlink on demand. Am I correct?

ghost commented 5 years ago

@HobaiRiku For test purposes (simulating a slow 3G connection, actually), we need to insert network latency at our test gateway. We achieved this by inserting a Linux PC with two network adapters between the gateway and our routers, and running the command tc qdisc add dev eno1 root netem delay <delay>ms on it.

As explained by brocaar, with a round-trip latency greater than one second, the gateway's packet forwarder will send a nACK for any RX1 downlink request, thus triggering loraserver to request an RX2 downlink.
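
A minimal sketch of that fallback, for illustration only. The class and method names are hypothetical and this is not the loraserver implementation; it only models "send the RX1 frame first, send the RX2 frame if the gateway answers with an error".

```python
from collections import deque


class DownlinkAttempts:
    """Candidate downlink frames for one uplink (hypothetical helper)."""

    def __init__(self, frames):
        self._frames = deque(frames)

    def first(self):
        """Frame to send right away (the RX1 candidate)."""
        return self._frames[0]

    def on_ack(self, error=""):
        """Handle a gateway ack; return the next frame to send, if any."""
        if self._frames:
            self._frames.popleft()  # drop the frame that was just answered
        if error and self._frames:
            return self._frames[0]  # nACK: fall back to the next window
        self._frames.clear()  # positive ack, or nothing left to try
        return None


# Hypothetical frames for an uplink received at gateway timestamp 0.
attempts = DownlinkAttempts([
    {"window": "RX1", "timestamp": 1_000_000},
    {"window": "RX2", "timestamp": 2_000_000},
])
print(attempts.first()["window"])             # RX1
print(attempts.on_ack("TOO_LATE")["window"])  # RX2
print(attempts.on_ack())                      # None
```

The point of the sketch is that the RX2 frame is only sent after the gateway has rejected the RX1 frame, so the RX2 request needs its own timestamp (RX1 + 1 second) to be schedulable.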

@brocaar Is the explanation above OK?

Looking at the packet forwarder's logs, we observed:

The error is in the timestamp requested by loraserver for RX2 in mode 0: it is the same as RX1 instead of RX1 + 1s. In mode 2, the timestamp is OK: TX + 2s. Everything else seems to work well.

Clément from ITK

HobaiRiku commented 5 years ago

@zclem I have looked into the source code:

https://github.com/brocaar/loraserver/blob/2d023e03c078de62a6deab2d20cf96caa117b55f/internal/downlink/data/data.go#L476-L488
For ctx.DownlinkFrames[1] (when mode=0), it runs into setTXInfoForRX2 and reaches the code above. It uses ctx.DeviceSession.RXDelay for the RX2 delay, which is initialized with conf.NetworkServer.NetworkSettings.RX1Delay (OTAA join) or deviceProfile.RXDelay1 (ABP). I then think ctx.DownlinkFrames[1] will be popped and sent when the nACK comes, using the same delay. I'm not sure about it, but is that the problem?

brocaar commented 5 years ago

> The error is in the timestamp requested by loraserver for RX2 in mode 0 : same as RX1 instead of RX1 + 1s.

I have tried to reproduce the issue (the easiest way to do this is to set the de-duplication delay in loraserver.toml to more than 1 second, so that the downlink will always fail), but everything looks fine:

1) Uplink (timestamp: 1369124172)

gateway/00800000a00016b6/rx {"rxInfo":{"mac":"00800000a00016b6","time":"2019-07-02T08:59:09.295012Z","timeSinceGPSEpoch":"346136h59m28.295s","timestamp":1369124172,"frequency":868300000,"channel":1,"rfChain":1,"crcStatus":1,"codeRate":"4/5","rssi":-35,"loRaSNR":6.8,"size":16,"dataRate":{"modulation":"LORA","spreadFactor":12,"bandwidth":125},"board":0,"antenna":0},"phyPayload":"QJRVBgCCBQADBwH9ejbVbA=="}

2) Downlink 1 (timestamp: 1370124172 = Uplink timestamp + 1000000)

gateway/00800000a00016b6/tx {"token":63253,"txInfo":{"mac":"00800000a00016b6","immediately":false,"timestamp":1370124172,"frequency":868300000,"power":14,"dataRate":{"modulation":"LORA","spreadFactor":12,"bandwidth":125},"codeRate":"4/5","iPol":true,"board":0,"antenna":0},"phyPayload":"YJRVBgCFBAADVQcAASgJjlk="}

3) Negative ACK:

gateway/00800000a00016b6/ack {"mac":"00800000a00016b6","token":63253,"error":"TOO_EARLY"}

4) Downlink 2 (timestamp 1371124172 = Uplink timestamp + 2000000)

gateway/00800000a00016b6/tx {"token":63253,"txInfo":{"mac":"00800000a00016b6","immediately":false,"timestamp":1371124172,"frequency":869525000,"power":14,"dataRate":{"modulation":"LORA","spreadFactor":12,"bandwidth":125},"codeRate":"4/5","iPol":true,"board":0,"antenna":0},"phyPayload":"YJRVBgCFBAADVQcAASgJjlk="}

LoRa Server: 2.8.2, LoRa App Server: 2.6.1, LoRa Gateway Bridge: 2.7.1
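
The offsets in the four messages above can be checked mechanically. The snippet below is a sketch that copies only the relevant fields from the logged payloads (the full messages carry more fields); the helper name is made up, and the modulo assumes a 32-bit microsecond counter.

```python
import json

# Only the fields needed for the check, taken from the messages above.
uplink = json.loads('{"rxInfo": {"timestamp": 1369124172}}')
downlink_1 = json.loads(
    '{"token": 63253, "txInfo": {"timestamp": 1370124172, "frequency": 868300000}}'
)
downlink_2 = json.loads(
    '{"token": 63253, "txInfo": {"timestamp": 1371124172, "frequency": 869525000}}'
)


def offset_us(downlink, uplink):
    """Downlink delay relative to the uplink, in microseconds."""
    delta = downlink["txInfo"]["timestamp"] - uplink["rxInfo"]["timestamp"]
    return delta % 2**32  # assumption: the counter is 32 bits wide


print(offset_us(downlink_1, uplink))  # 1000000
print(offset_us(downlink_2, uplink))  # 2000000
```

The second downlink also moves from 868300000 Hz to 869525000 Hz, so both the timestamp and the frequency change between the two attempts.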

brocaar commented 5 years ago

Btw, when a long delay between the gateway and the NS is common, I suggest setting the rx1_delay config value in loraserver.toml to e.g. 2 or more. RX1 will then occur at rx1_delay and RX2 at rx1_delay + 1 second, so the downlinks will be distributed across all the configured channels instead of just the RX2 frequency.
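
To illustrate the effect of a larger rx1_delay, here is a rough sketch. The function is hypothetical and simplified: it assumes a downlink can be scheduled as long as it reaches the gateway before the window opens, while a real gateway needs some extra margin.

```python
def first_reachable_window(latency_s, rx1_delay_s=1):
    """Return "RX1", "RX2" or None for a given latency in seconds.

    Simplification: no gateway scheduling margin is taken into account.
    """
    if latency_s < rx1_delay_s:
        return "RX1"
    if latency_s < rx1_delay_s + 1:
        return "RX2"
    return None


# Hypothetical 1.3 s latency between the uplink and the downlink request.
for delay in (1, 2):
    print(delay, first_reachable_window(1.3, rx1_delay_s=delay))
# 1 RX2
# 2 RX1
```

With rx1_delay set to 2, the same latency still fits in RX1, so the downlink goes out on the uplink channel's RX1 frequency rather than always on the RX2 frequency.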

brocaar commented 5 years ago

I'm closing this issue as I think the above comments explain the situation. Please let me know if this should be re-opened.