TheThingsNetwork / lorawan-stack

The Things Stack, an Open Source LoRaWAN Network Server
https://www.thethingsindustries.com/stack/
Apache License 2.0
936 stars 302 forks source link

Allow ADR to lower the datarate of the device #4092

Open i3mSuperMan opened 3 years ago

i3mSuperMan commented 3 years ago

Summary

ADR algorithm implemented in TTS V3 is only increasing the Data Rate when there is good network coverage and not decreasing it even when the network coverage gets worse. Ref: https://github.com/TheThingsIndustries/lorawan-stack-support/issues/379

Steps to Reproduce

...

What do you see now?

Currently, the Network Server doesn't allow the adr_data_rate_index to decrease between the current and the desired parameters. Ref: https://github.com/TheThingsNetwork/lorawan-stack/blob/c5d56b37ed0b8d3486e688a549a1b1386119b935/pkg/networkserver/mac/link_adr.go#L109-L114

The changes have been made to the custom ADR that allows lowering the adr_data_rate_index between the current and the desired parameters. Ref: https://github.com/TheThingsNetwork/lorawan-stack/pull/4065

What do you want to see instead?

ADR mechanism lowering the Data Rate of the device when the network coverage gets worse to improve the QoS.

Environment

TTS v3.12.1

How do you propose to implement this?

...

How do you propose to test this?

...

Can you do this yourself and submit a Pull Request?

No

CC: @johanstokking, @adriansmares, @rvolosatovs

rvolosatovs commented 3 years ago

I really wish I could write down a step-by-step "how to do this right", but I honestly have no idea. Consider consulting @htdvisser regarding this.

What I would try to do first, I guess, would be adding a conditional somewhere here https://github.com/TheThingsNetwork/lorawan-stack/blob/bc960f0a39e0b0934b96e6e0cb0ceb69572ab0f9/pkg/networkserver/mac/adr.go#L224-L240 before safetyMargin is added and if the margin indicates that the signal is not "good enough", then essentially do the reverse of what is happening now in relation to it.

Make sure that the NS does not end up in a loop this way endlessly trying to optimize the parameters without converging. Also check uplink flow and MAC handling for assumptions around data rate index only being increased by NS, since that could potentially result in nasty bugs.

adriansmares commented 3 years ago

cc @htdvisser @johanstokking for thoughts about this (as @htdvisser specified that he would rather avoid this in order to keep the negative feedback loop)

johanstokking commented 3 years ago

In Nicolas Sornin's original ADR algorithm proposal (2016) as well as Semtech's article on ADR, LinkADRReq only contains higher data rates and lower TX powers.

It is assumed that end devices increase TX power and decrease data rate step-by-step when missing a ADR confirmation after ADR_ACK_LIMIT + ADR_ACK_DELAY frames. So, the link will only be restored slowly when packets are already being lost.

The issue raised is that:

  1. Some end devices do not implement increasing TX power and decreasing data rate. Although it is assumed in the aforementioned articles, it is not part of the LoRaWAN specification (let alone certification). It is perfectly legal for end devices to assume that ADR is fully driven by the NS
  2. and/or in some scenarios it is not acceptable that substantial packet loss is needed for end devices to start restoring the link

Therefore, what we should be discussing is whether NS needs to decrease data rate actively. Nicolas' article is clear about this:

The algorithm can only actively increase the data rate. Trying to lower the data rate leads to constantly oscillating values.

Constantly oscillating values is a real problem. If we are sending devices to a lower data rate because the SNR too low, and then back to a higher data rate because the SNR is high enough, things become inefficient.

It seems like we can do a few things:

  1. Build in a solid margin between increasing and decreasing data rate; only if the RF quality is sufficiently low, instruct the end device to decrease data rate.
  2. Don't decrease data rate via ADR. Advise users to use a high SNR margin (per end device) so that data rate doesn't get increased too soon, and/or advise users to use a maximum data rate.

We probably should document this too.

Thoughts?

adriansmares commented 3 years ago

I'm more in favor of option two (don't decreate data rate via ADR). I think that the margin approach basically is self-defeating: when the margin is big enough to avoid oscillation, it probably will be too big to be useful (citation needed).

htdvisser commented 3 years ago

I still think that it's important to keep the negative feedback loop. I would much rather risk losing connection with a couple of devices because we don't send them to a lower data rate than risk sending too many devices to lower data rates, which could have serious consequences for the network overall.

We currently optimize using a target margin of 15dB above the demodulation floor. I think that's around 6 DR steps, so before a device actually loses connection, the radio conditions must have significantly changed.

Of course this also means that we have quite a bit of margin to lower the data rate before this happens, so I can understand the argument for doing that. If we do, it would definitely be crucial to use a margin to slow down the oscillation. I also think that we should then do it both ways, so in addition to the "target margin above demodulation floor" field that we already have in the EndDevice, we would add both a "minimum margin" and "maximum margin" (if unset, perhaps defaulting to something like 3dB below/above the target margin). Then, only if the measured margin goes below the minimum or above the maximum, we would optimize to the target margin. By tweaking these minimum/maximum margins we can further optimize the algorithm to more/less aggressively increase/decrease the data rate.

johanstokking commented 3 years ago

I think we should lower the data rate. RF conditions may vary significantly; it may drastically improve when peering gets enabled, for example. It is also implemented by other LNS so users and customers have some expectations here.

Indeed, I agree it should be configurable with sane defaults. Pushing this to 3.14.0.

nejraselimovic commented 2 years ago

@adriansmares are we still doing this?

adriansmares commented 2 years ago

@adriansmares are we still doing this?

Yes. I've familiarized myself with the algorithm and I think we can do this safely, albeit not by default: it would be something that customers opt in, in order to avoid stranding devices. We change the default MAC settings later to make it a default setting, once it has matured.

What I plan to do is to add the API in 3.14, and behavior in a subsequent patch. I think this is also wise as 3.14 is already a quite big release.