Duty cycle management: bucket instead of direct blocking

TMesot commented 2 years ago

Basic Station currently enforces duty cycle with a "direct" channel-blocking algorithm, which can be summarized as:

If a packet is scheduled on a channel, we check whether duty cycle (DC) budget is still available.
If so, we send the packet and block the channel for a period set by the regulatory limit (0.1%, 1%, or 10%).
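The two steps above can be sketched roughly as follows. This is a minimal illustration and not the code in s2e.c: the names `dc_channel_t`, `dc_try_send` and the `rate` field (the inverse of the duty cycle) are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef int64_t usec_t;  /* time in microseconds */

/* Hypothetical per-channel state for direct blocking. */
typedef struct {
    usec_t blocked_until;  /* no transmission allowed before this time */
    int    rate;           /* inverse duty cycle: 1000 = 0.1%, 100 = 1%, 10 = 10% */
} dc_channel_t;

/* Try to send at txtime; on success, block the channel for airtime * rate. */
static bool dc_try_send(dc_channel_t *ch, usec_t txtime, usec_t airtime) {
    assert(airtime > 0 && ch->rate > 0);
    if (txtime < ch->blocked_until)
        return false;                       /* channel still blocked: DC error */
    ch->blocked_until = txtime + airtime * ch->rate;
    return true;
}
```

Under these assumptions, a 1 s downlink on a 0.1% channel (`rate = 1000`) blocks that channel for 1000 s, roughly 16 minutes.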

The relevant source code is mainly:
https://github.com/lorabasics/basicstation/blob/ba4f85d80a438a5c2b659e568cd2d0f0de08e5a7/src/s2e.c#L368
https://github.com/lorabasics/basicstation/blob/ba4f85d80a438a5c2b659e568cd2d0f0de08e5a7/src/s2e.c#L477

This workflow is easy to understand but does not seem to scale well. We see many DC errors in production, even on a quiet, small network. When devices join the network, they generate a burst of traffic (join, ADR, first messages, ...) that is more likely to cause DC errors, since channels may be blocked for minutes.

We noticed that on the device side, the LoRa Basics Modem SDK uses an "hourly" bucket for each channel and frequency.

https://github.com/Lora-net/SWSD001/blob/master/lora_basics_modem/lora_basics_modem/smtc_modem_core/lr1mac/src/services/smtc_duty_cycle.c
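For comparison, here is a rough sketch of what an hourly bucket could look like. It is not taken from smtc_duty_cycle.c: the one-minute slot granularity and all names (`dc_bucket_t`, `dc_bucket_try_send`, ...) are assumptions chosen to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DC_SLOTS   60            /* one slot per minute: a one-hour window */
#define DC_SLOT_US 60000000LL    /* slot length in microseconds */

typedef int64_t usec_t;

typedef struct {
    usec_t  used[DC_SLOTS];  /* airtime consumed in each minute slot */
    int64_t last_slot;       /* absolute index of the newest slot */
    usec_t  budget;          /* airtime allowed per hour */
} dc_bucket_t;

static void dc_bucket_init(dc_bucket_t *b, int rate) {
    memset(b, 0, sizeof *b);
    b->budget = (DC_SLOTS * DC_SLOT_US) / rate;  /* rate 100 (1%): 36 s per hour */
}

/* Expire slots that have fallen out of the window (minute granularity). */
static void dc_bucket_advance(dc_bucket_t *b, usec_t now) {
    int64_t slot = now / DC_SLOT_US;
    int64_t steps = slot - b->last_slot;
    if (steps <= 0) return;
    if (steps > DC_SLOTS) steps = DC_SLOTS;
    for (int64_t i = 1; i <= steps; i++)
        b->used[(b->last_slot + i) % DC_SLOTS] = 0;
    b->last_slot = slot;
}

/* Send only if the airtime used in the window plus this packet fits the budget. */
static bool dc_bucket_try_send(dc_bucket_t *b, usec_t now, usec_t airtime) {
    assert(now >= 0 && airtime > 0);
    dc_bucket_advance(b, now);
    usec_t sum = 0;
    for (int i = 0; i < DC_SLOTS; i++) sum += b->used[i];
    if (sum + airtime > b->budget) return false;
    b->used[b->last_slot % DC_SLOTS] += airtime;
    return true;
}
```

With this scheme a burst of back-to-back downlinks is accepted until the hourly budget is spent, whereas direct blocking rejects the second packet of the burst.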

Is there any reason not to use bucket-based duty cycle management on the gateway, as on the device? Is this something planned in your roadmap?

I would also like to take this opportunity to ask again for the health status message (#46). Currently, when DC is exhausted and a gateway is blocking, this is only reported in the logs and not to the network server. It would be very helpful to have this information at the NS level.

beitler commented 1 year ago

Thank you @TMesot for submitting the issue. Indeed, this is an important topic. However, there is no right or wrong way of calculating the DC and both approaches will have pros and cons in different situations. While the windowed approach makes sense for the device, it might not be optimal for the gateway. The gateway does not control the communication pattern of devices and has to ensure fair spectrum access in the downlink direction for all devices. Some devices will have more 'bursty' communication while others will spread their communication more uniformly in time. In a windowed DC approach devices with 'bursty' communication have the potential to starve devices with uniform communication. Therefore, it seems the greedy DC blocking approach is preferable from a fairness perspective. What do you think?

The most important aspect, however, is probably to avoid mismatches in DC management between the LNS and the gateway, as described in https://github.com/TheThingsNetwork/lorawan-stack/issues/5844. This is something we need to sort out.

TMesot commented 1 year ago

Hello, thanks for your reply.

This is an interesting point of view. I had not thought about it this broadly, since we operate a private network, but it may make sense for a public network to have more "linear" blocking over time, as you don't really control the devices. It would be interesting to have TTN's point of view (@adriansmares), as they operate a community network and also use an hourly bucket in their scheduler.

For me, the main issue that leads to this situation is the lack of feedback when basicstation drops a message, which relates to #46. If this is solved, the LNS would be able to reschedule a dropped downlink on another gateway, even with the restrictive scheduler.

beitler commented 1 year ago

I think what we should do here is allow the nodc config to be set for prod builds as well. This would let the LNS fully control duty cycle management and keep the two DC management approaches from drifting out of sync.

adriansmares commented 1 year ago

We are currently using the Basic Station duty cycle management for Basic Station gateways, on top of our bucketing approach.

While allowing nodc in production would be useful, I suggest considering a feature flag that would signal that nodc can be used. Otherwise it is hard to distinguish gateways that still have to use the direct blocking limitations from those that do not. The only alternative is parsing the version numbers, and I've already seen in the wild station builds which have various increments like 2.8.6 or 2.6.6.

In my opinion, in an ideal world the LNS has the possibility to still use the direct blocking approach for pre-2.0.7 stations, while allowing newer releases to be fully LNS driven. In the absence of a feature flag, the LNS is basically forced to assume that the gateway will reject the nodc in a production build.
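As an illustration of the feature flag idea, the LNS-side decision could look roughly like this. The sketch assumes the station advertises its capabilities as a space-separated string; the `nodc` token and the names `has_feature` and `choose_dc_mode` are hypothetical and do not exist in Basic Station today.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if `name` appears as a whole token in a space-separated list. */
static bool has_feature(const char *features, const char *name) {
    size_t n = strlen(name);
    const char *p = features;
    while (*p) {
        while (*p == ' ') p++;                 /* skip separators */
        const char *end = p;
        while (*end && *end != ' ') end++;     /* find end of token */
        if (n > 0 && (size_t)(end - p) == n && strncmp(p, name, n) == 0)
            return true;
        p = end;
    }
    return false;
}

typedef enum {
    DC_MODE_STATION_BLOCKING,  /* LNS must also respect the station's direct blocking */
    DC_MODE_LNS_DRIVEN         /* LNS sends nodc and manages duty cycle itself */
} dc_mode_t;

/* Hypothetical policy: without the flag, assume nodc will be rejected. */
static dc_mode_t choose_dc_mode(const char *features) {
    return has_feature(features, "nodc") ? DC_MODE_LNS_DRIVEN
                                         : DC_MODE_STATION_BLOCKING;
}
```

Matching whole tokens rather than substrings avoids treating an unrelated feature name that merely contains "nodc" as support for it.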

beitler commented 1 year ago

Thanks for this feedback, @adriansmares.

> and I've already seen in the wild station builds which have various increments like 2.8.6 or 2.6.6.

This is concerning.

Duty cycle management: bucket instead of direct blocking #164