TheThingsNetwork / lorawan-stack

The Things Stack, an Open Source LoRaWAN Network Server
https://www.thethingsindustries.com/stack/
Apache License 2.0

behaviour of ADR #4610

Closed lanmarc77 closed 2 years ago

lanmarc77 commented 3 years ago

Summary

There might be multiple issues here. I deactivated ADR in MCCI LMIC 4.0, yet the current public community stack still sends an ADR frame down. Additionally, that ADR frame sets SF12, even for good RSSI/SNR values. LMIC does not yet guard against ADR frames from a LoRaWAN server that it does not expect, so it simply accepts them. I implemented a patch so that LMIC does not accept ADR frames when it does not expect them, and I can then see an ADR reject message in the console (https://github.com/mcci-catena/arduino-lmic/pull/786). The patch works and the node stays on SF7. After this, however, it is not possible to send any downlinks: whenever I schedule one, it is replaced by another frame carrying the RX settings. I tried disabling ADR via the CLI tool, but this does not change the described behaviour. I am out of options. I noticed this because I am switching nodes over to v3.

Steps to Reproduce

  1. Install MCCI LMIC 4.0 and use the default ttn-otaa example
  2. Deactivate ADR by calling LMIC_setAdrMode(0);
  3. Choose MAC version 1.0.3

What do you want to see instead?

ADR frames should not be sent down to nodes that do not have ADR enabled. SF12 should not be chosen for good RSSI/SNR values.

How do you propose to test this?

I can offer a test device running LMIC, which I could set up for a test network server.

KrishnaIyer commented 3 years ago

I'll try to reproduce this and get back.

lanmarc77 commented 3 years ago

I managed to get a working setup by experimenting with the CLI over the weekend. If I set the following values for a node running the patched MCCI LMIC stack, which rejects ADR frames:

ttn-lw-cli end-devices set %1 %2 --mac-settings.use-adr=false
ttn-lw-cli end-devices set %1 %2 --mac-state.desired-parameters.adr-data-rate-index=DATA_RATE_5
ttn-lw-cli end-devices set %1 %2 --mac-settings.rx1-delay=RX_DELAY_1
ttn-lw-cli end-devices set %1 %2 --mac-settings.desired-rx1-delay=RX_DELAY_1
ttn-lw-cli end-devices set %1 %2 --mac-state.desired-parameters.rx1-delay=RX_DELAY_1
ttn-lw-cli end-devices set %1 %2 --mac-settings.status-count-periodicity=0
ttn-lw-cli end-devices set %1 %2 --mac-settings.status-time-periodicity=0

then the nodes seem to behave as they did in TTN v2. I waited a night and it seems no additional packets are sent down. Nodes running the unpatched version with the same CLI settings still seem to receive additional packets. I can see the downlink counter go up, but I do not yet know what kind of packet it is. I assume these are ADR packets, since the ADR algorithm did not go into an error state, but this is pure speculation. The main issue still seems to be that ADR frames are sent to nodes even if they do not have ADR set.

lanmarc77 commented 3 years ago

I was able to capture one of the downlinks that is still being sent with the above settings applied, this time for a node that has not actively rejected the ADR request. Maybe it helps. [image: grafik]

lanmarc77 commented 3 years ago

Small update. I am now certain that the NS ignores the fact that the end device has not set the ADR bit. It sends a LinkADRReq anyway. This should not happen if I read the LoRaWAN 1.0.3 spec (my end device's version) correctly. If the end device NACKs all or any of the three requested settings, the NS simply ignores this and keeps sending LinkADRReqs. If the end device answers the LinkADRReq right away, we end up in a send/receive cycle that fills up the duty cycle of both the gateway and the end device. I can convince the NS to stop sending LinkADRReqs with the commands mentioned above, but this only seems to apply to the session, not to the end device itself. Once the end device rejoins, e.g. after an FCnt overflow, all the commands need to be set again. I hope you can confirm this behaviour.
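For readers following along, the "three requested settings" map to the three acknowledgement bits in the LinkADRAns status byte defined by LoRaWAN 1.0.x (bit 0: channel mask, bit 1: data rate, bit 2: power). The snippet below is only a minimal decoding sketch; the names `LinkADRAnsStatus`, `decode_link_adr_ans` and `fully_accepted` are made up for illustration and are not part of LMIC or The Things Stack.

```python
from typing import NamedTuple


class LinkADRAnsStatus(NamedTuple):
    """Hypothetical container for the three ACK bits of a LinkADRAns."""
    channel_mask_ack: bool
    data_rate_ack: bool
    power_ack: bool


def decode_link_adr_ans(status: int) -> LinkADRAnsStatus:
    """Split the one-byte LinkADRAns status into its ACK flags."""
    if not 0 <= status <= 0xFF:
        raise ValueError("status must fit in one byte")
    return LinkADRAnsStatus(
        channel_mask_ack=bool(status & 0x01),  # bit 0
        data_rate_ack=bool(status & 0x02),     # bit 1
        power_ack=bool(status & 0x04),         # bit 2
    )


def fully_accepted(status: int) -> bool:
    """True only if all three ACK bits are set."""
    return (status & 0x07) == 0x07
```

A status byte of `0x01`, for example, decodes to "channel mask accepted, data rate and power rejected", which is the kind of partial NACK described in this thread.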

adriansmares commented 3 years ago

Keep in mind that the end device rejects individual data rate indices / transmission powers / numbers of retransmissions. Within a single session, the Network Server will not attempt to send the same tuple again if the end device rejected the values, but it may attempt to send different tuples that the end device has not rejected. The end device cannot stop the algorithm unilaterally.
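To make the per-session behaviour described here concrete, the following is a toy model only. It is not the Network Server's actual implementation; the class `SessionAdrState`, its methods and the candidate ordering are all assumptions made for illustration.

```python
from typing import Iterable, Optional, Set, Tuple

# (data_rate_index, tx_power_index, nb_trans); an illustrative shape only.
AdrTuple = Tuple[int, int, int]


class SessionAdrState:
    """Hypothetical per-session memory of tuples the device rejected."""

    def __init__(self) -> None:
        self.rejected: Set[AdrTuple] = set()

    def record_rejection(self, params: AdrTuple) -> None:
        """Remember a tuple the end device NACKed in this session."""
        self.rejected.add(params)

    def next_proposal(self, candidates: Iterable[AdrTuple]) -> Optional[AdrTuple]:
        """Return the first candidate not yet rejected, or None if all were."""
        for candidate in candidates:
            if candidate not in self.rejected:
                return candidate
        return None

    def reset(self) -> None:
        """A new session (e.g. after a rejoin) forgets earlier rejections."""
        self.rejected.clear()
```

In this model, rejecting one tuple only removes that tuple from consideration; other tuples can still be proposed, which matches the point that the end device cannot stop the algorithm on its own.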

Also LinkADRReq may be used to set the channels of the end device.

I'm interested in the details of the uplinks themselves, i.e. why the ADR algorithm itself is not 'acting right'. If you could provide a series of uplinks (so basically ns.up.data.forward events or as.up.data.forward events) we could look into why the decision to move to SF12 was taken. It could be the case that the NS takes a bit of a margin of safety, and then later on, as more uplinks are available, decides to increase the data rate.

lanmarc77 commented 3 years ago

As usual, not much is as it seems at first sight. I could not reproduce the SF12 problem, but that was only one of the problems. I have now discovered that the cause might be the ADR setting in the stack. I originally wanted v3 to behave like v2 for my old MAC 1.0.3 nodes. Funny thing: if I set ADR to false I get a lot of ADR requests, and if I set it to true I get none. That seems a bit paradoxical to me. I used the stack settings from above, set ADR to false and then to true, and logged what came in via MQTT without changing the node's firmware. You can find the logs here for false: https://pastebin.com/BJbFmyR2 and here for true: https://pastebin.com/2W0Zt0d5 The node was configured to send roughly every 2 minutes and would only ACK the channel mask, but not the power or data rate, in an ADR request. I hope this can be verified in the logs. I suspect an issue somewhere in the stack's logic for when ADR is set to false/true.

lanmarc77 commented 2 years ago

Using the current CE stack version, it seems I can get more information for debugging. I discovered a paradoxical behaviour. If I have ADR disabled (under General Settings, Network layer, Advanced MAC settings), the stack does send LinkADRReq messages, starting after the initial join. The node sits very near the gateway, with SNR 10 and RSSI -48. The ADR message only contains the channel list, with no power or data rate setting. The ADR message is resent again and again, as the node only sends back true for the channel list.

If, on the other hand, ADR is enabled, the stack does not send any ADR message to the same node after the join.

If ADR is disabled and I switch it to enabled while the node stays within the same session, the sending of ADR messages stops, and it stays stopped even after switching ADR back to disabled, as long as the node stays within the same session.

I am no longer sure whether I have understood the logic of that ADR switch correctly.
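For anyone inspecting such downlinks, a LinkADRReq (CID 0x03) carries a 4-byte payload in LoRaWAN 1.0.x: DataRate_TXPower (1 byte), ChMask (2 bytes, little-endian) and Redundancy (1 byte). The decoder below is a minimal sketch with hypothetical names (`LinkADRReq`, `decode_link_adr_req`); it expects the payload without the CID byte and only exposes the raw fields. LoRaWAN 1.1 defines the value 0xF in the DataRate or TXPower nibble as "keep the current value"; whether that is what the frames in this report contained has not been verified here.

```python
import struct
from typing import NamedTuple


class LinkADRReq(NamedTuple):
    """Hypothetical container for the raw LinkADRReq fields."""
    data_rate: int     # upper nibble of DataRate_TXPower
    tx_power: int      # lower nibble of DataRate_TXPower
    ch_mask: int       # 16-bit channel mask
    ch_mask_cntl: int  # bits 6..4 of Redundancy
    nb_trans: int      # bits 3..0 of Redundancy


def decode_link_adr_req(payload: bytes) -> LinkADRReq:
    """Decode the 4-byte LinkADRReq payload (CID byte already stripped)."""
    if len(payload) != 4:
        raise ValueError("LinkADRReq payload must be exactly 4 bytes")
    # '<' selects little-endian with no padding: 1 + 2 + 1 bytes.
    dr_txpow, ch_mask, redundancy = struct.unpack("<BHB", payload)
    return LinkADRReq(
        data_rate=dr_txpow >> 4,
        tx_power=dr_txpow & 0x0F,
        ch_mask=ch_mask,
        ch_mask_cntl=(redundancy >> 4) & 0x07,
        nb_trans=redundancy & 0x0F,
    )
```

As a usage example, `decode_link_adr_req(bytes([0x50, 0xFF, 0x00, 0x01]))` yields data rate 5, TX power index 0, channel mask `0x00FF`, ChMaskCntl 0 and NbTrans 1.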

NicolasMrad commented 2 years ago

fixed by #5353