TheThingsNetwork / lorawan-stack

The Things Stack, an Open Source LoRaWAN Network Server
https://www.thethingsindustries.com/stack/
Apache License 2.0

Creating End Devices with non-default factory settings #1378

Closed htdvisser closed 4 years ago

htdvisser commented 5 years ago

Summary

We need to improve the Console support and documentation for creating end devices that have non-standard factory settings.

Why do we need this?

The Arduino library for The Things Uno pre-configures it with some settings for TTN when using ABP. However, the v3 NS requires extra settings to make such devices work.

What is already there? What do you see now?

The CLI allows users to specify fields that customize the "reset" in the NS. I already figured out that I need to set --mac-settings.factory-preset-frequencies to make The Things Uno work. I don't know what other things need to be set (@rvolosatovs?).

What is missing? What do you want to see?

  1. We need to document what fields should be specified to make The Things Uno work.
  2. We need to make sure those fields can be specified using the Console.

Environment

How do you propose to implement this?

@rvolosatovs please comment with the fields that need to be specified for The Things Uno. Then @bafonins (or someone else) can add those fields to the Console, after which we can update the documentation.

htdvisser commented 5 years ago

Background: I tried connecting The Things Uno using ABP, but ran into some problems:

  1. NS didn't accept uplink on "extra" channels (fixed with --mac-settings.factory-preset-frequencies, but shouldn't this just work for pre-1.1 devices?)
  2. Downlinks were not received
  3. NS wouldn't accept uplink retransmissions (no downlink, so no acks):

    DEBUG Skip handling MAC commands for uplink retransmission
    WARN Retransmission delay exceeds maximum, skip

rvolosatovs commented 5 years ago

What LoRaWAN MAC/PHY versions does The Things Uno use? Which parameters diverge from the LoRaWAN MAC/PHY specs?

htdvisser commented 5 years ago

  1. Not sure, I think it's "mostly 1.0.2", but that also depends on the RN module. We've had a few different versions.
  2. https://github.com/TheThingsNetwork/arduino-device-lib/blob/master/src/TheThingsNetwork.cpp#L628-L818

htdvisser commented 5 years ago

Console part blocked on #579

htdvisser commented 5 years ago

@johanstokking @htdvisser for discussion

johanstokking commented 5 years ago

The Things Uno is not special here. Also it's not about the hardware; it's the TTN Arduino library for the Microchip RN2xx3 module. This is just an example ABP device case.

ABP entails more than DevAddr and session keys. It's everything that would otherwise have been part of the join-accept; frequencies, RX1 delay and RX2 DR.

I think we all agree that it's always best to create a device right the first time, but we should consider "fixing" devices. I think we should make a crystal clear distinction between MAC settings and desired MAC state. The former is static information about the device and is used on state reset, while the latter results in MAC commands.

So, if you forgot to set the factory reset frequencies (MAC settings), you set those. Since this is not desired MAC state and hence doesn't result in generating MAC commands, we must reset the MAC state here. I think that is what users expect when they set stuff like this.

Same goes for RX1 delay and RX2 DR in MAC settings causing a MAC state reset (mac_settings.rx1_delay and mac_settings.rx2_data_rate_index) vs desired MAC state causing MAC commands (mac_settings.desired_rx1_delay and mac_settings.desired_rx2_data_rate_index).

Right?
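
The distinction above can be sketched in Python. This is a minimal illustration, not the Network Server's implementation: the class and function names are hypothetical, and only the field names rx1_delay and rx2_data_rate_index and the MAC command names come from the discussion and the LoRaWAN spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MacParameters:
    rx1_delay: int            # seconds
    rx2_data_rate_index: int


def reset_current_parameters(mac_settings: MacParameters) -> MacParameters:
    """MAC settings describe the device as shipped. A reset copies them
    into the current parameters without sending anything over the air."""
    return MacParameters(mac_settings.rx1_delay, mac_settings.rx2_data_rate_index)


def commands_for_desired_state(current: MacParameters, desired: MacParameters) -> list:
    """Desired MAC state is reached by sending one MAC command per difference."""
    commands = []
    if current.rx1_delay != desired.rx1_delay:
        commands.append("RxTimingSetupReq")
    if current.rx2_data_rate_index != desired.rx2_data_rate_index:
        commands.append("RxParamSetupReq")
    return commands
```

In this model, changing a MAC setting only has an effect after a reset, while changing the desired state produces downlink MAC commands on the next opportunity.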

rvolosatovs commented 5 years ago

I looked into the issue. I used The Things Node, flashed with TheThingsNode.basic modified for ABP in EU868. I discovered that the LoRaWAN MAC/PHY version is 1.0.1.

The create command I used:

    ttn-lw-cli device create test-app test-dev \
      --join-eui 4200000000000000 \
      --dev-eui 0004A30B001BC7AF \
      --supports-join false \
      --session.dev_addr 42FFFFFF \
      --session.keys.app_s_key.key 42000000000000000000000000000000 \
      --session.keys.f_nwk_s_int_key.key 42000000000000000000000000000042 \
      --lorawan_phy_version 1.0.1 \
      --lorawan-version 1.0.1 \
      --frequency_plan_id EU_863_870 \
      --mac-settings.factory-preset-frequencies 868100000,868300000,868500000,867100000,867300000,867500000,867700000,867900000 \
      --mac-settings.resets-f-cnt=true \
      --mac-settings.rx2-data-rate-index DATA_RATE_3

Unfortunately, the device only receives at most 1 downlink with the default NS config. That is due to RxTimingSetupReq being processed by the device, but NS not being acknowledged about it. There are 2 issues causing this (both on the Node side):

  1. RxTimingSetupAns should be "sticky" according to the spec and sent in FOpts of each uplink until a Class A downlink is received. That does not happen: the answer is only sent once. (Screenshot: 2019-09-24-22:58:22)

  2. NS schedules the following list of MAC commands:

    DEBUG Add MAC command to buffer                cid=CID_DEV_STATUS device_uid=test-app.test-dev mac_version=1.0.1 namespace=networkserver phy_version=1.0.1 started_at=2019-09-24 21:03:30.015269885 +0000 UTC
    DEBUG Add MAC command to buffer                cid=CID_NEW_CHANNEL device_uid=test-app.test-dev mac_version=1.0.1 namespace=networkserver phy_version=1.0.1 started_at=2019-09-24 21:03:30.015269885 +0000 UTC
    DEBUG Add MAC command to buffer                cid=CID_NEW_CHANNEL device_uid=test-app.test-dev mac_version=1.0.1 namespace=networkserver phy_version=1.0.1 started_at=2019-09-24 21:03:30.015269885 +0000 UTC
    DEBUG Add MAC command to buffer                cid=CID_NEW_CHANNEL device_uid=test-app.test-dev mac_version=1.0.1 namespace=networkserver phy_version=1.0.1 started_at=2019-09-24 21:03:30.015269885 +0000 UTC
    DEBUG Add MAC command to buffer                cid=CID_NEW_CHANNEL device_uid=test-app.test-dev mac_version=1.0.1 namespace=networkserver phy_version=1.0.1 started_at=2019-09-24 21:03:30.015269885 +0000 UTC
    DEBUG Add MAC command to buffer                cid=CID_NEW_CHANNEL device_uid=test-app.test-dev mac_version=1.0.1 namespace=networkserver phy_version=1.0.1 started_at=2019-09-24 21:03:30.015269885 +0000 UTC
    DEBUG Add MAC command to buffer                cid=CID_NEW_CHANNEL device_uid=test-app.test-dev mac_version=1.0.1 namespace=networkserver phy_version=1.0.1 started_at=2019-09-24 21:03:30.015269885 +0000 UTC
    DEBUG Add MAC command to buffer                cid=CID_NEW_CHANNEL device_uid=test-app.test-dev mac_version=1.0.1 namespace=networkserver phy_version=1.0.1 started_at=2019-09-24 21:03:30.015269885 +0000 UTC
    DEBUG Add MAC command to buffer                cid=CID_NEW_CHANNEL device_uid=test-app.test-dev mac_version=1.0.1 namespace=networkserver phy_version=1.0.1 started_at=2019-09-24 21:03:30.015269885 +0000 UTC
    DEBUG Add MAC command to buffer                cid=CID_RX_TIMING_SETUP device_uid=test-app.test-dev mac_version=1.0.1 namespace=networkserver phy_version=1.0.1 started_at=2019-09-24 21:03:30.015269885 +0000 UTC

However, the Node only replies with 4 bytes of payload - 6 255 7 7

I suppose we can hack this into working by setting the desired Rx1 delay to the PHY default value, so that no RxTimingSetupReq gets scheduled. I will do more debugging tomorrow.
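
For reference, the "sticky" rule from the first issue can be modelled with a short Python sketch. The class below is hypothetical and only illustrates the expected device behaviour as described in this comment; it is not code from the RN module or the Arduino library.

```python
class StickyAnswerDevice:
    """Hypothetical model of a device that follows the sticky-answer rule."""

    def __init__(self):
        self._sticky = []

    def on_class_a_downlink(self, mac_commands):
        # A Class A downlink ends the repetition of earlier sticky answers.
        self._sticky.clear()
        # A newly accepted RxTimingSetupReq starts a new sticky answer.
        if "RxTimingSetupReq" in mac_commands:
            self._sticky.append("RxTimingSetupAns")

    def next_uplink_fopts(self):
        # Sticky answers are repeated in every uplink, not sent only once.
        return list(self._sticky)
```

A device following this model keeps returning RxTimingSetupAns from next_uplink_fopts() until on_class_a_downlink() is called again, whereas the Node appears to send the answer a single time.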

johanstokking commented 5 years ago

@rvolosatovs: I discovered that the LoRaWAN MAC/PHY version is 1.0.1.

Hmm, how so? If it's what the RN module reports as its version number, then that's about the RN firmware, not the LoRaWAN version. I'm quite sure the RN module is certified for 1.0.2.

rvolosatovs commented 5 years ago

Okay, I will retry with 1.0.2, but the behavior is still clearly incorrect.

Regarding https://github.com/TheThingsNetwork/lorawan-stack/issues/1378#issuecomment-534743043: I think the simplest and best solution here is to just allow setting (some of) mac_state.current_parameters fields on the device. I don't think resetting the MAC state when one of the MAC settings changes is a good option. You can see why by looking at, for example, the issue I encountered at https://github.com/TheThingsNetwork/lorawan-stack/issues/1378#issuecomment-534748795. If the NS sends an RxTimingSetupReq and the device accepts it, then by resetting the MAC state we lose all downlink connectivity with the device, and the only way to fix that would be to manually adjust mac_state.current_parameters. Hence, I think the NS shouldn't do anything "extra" like this and should just assume that users know what they're doing and allow changing mac_state.current_parameters.

htdvisser commented 5 years ago

@johanstokking: we must reset the MAC state here. I think that is what users expect when they set stuff like this.

I'd prefer this to be explicit (perhaps with a dedicated RPC). Otherwise I think it's also fine to just expect the device to reset (if resets are allowed).


@rvolosatovs: I think the simplest and best solution here is to just allow setting (some of) mac_state.current_parameters fields on the device.

I think this could also be very useful for testing and recovering from other failure scenarios.

johanstokking commented 5 years ago

@rvolosatovs that's fine with me. But this should be clearly documented.


@rvolosatovs: That is due to RxTimingSetupReq being processed by the device, but NS not being acknowledged about it.

What does it do if you only send that MAC command, i.e. not the new channels?

Also, why are you sending the new channels at all? The idea is that these should be in sync because of the factory reset frequencies that it already has.

@rvolosatovs: However, the Node only replies with 4 bytes of payload - 6 255 7 7

What does this mean?

rvolosatovs commented 5 years ago

The NewChannelReq is sent to configure the data rate limits. factory_preset_frequencies only specifies the frequencies, not the data rates the device is allowed to use.

6 255 7 7 means DevStatusAns followed by a bogus NewChannelAns. The device is supposed to answer with 10 MAC commands, but only sends 1 valid one and 1 invalid one.

rvolosatovs commented 5 years ago

I was unable to debug the MAC operation via the Arduino library. It seems that all MAC logic is handled by the RN module, so the library does not have access to it.

johanstokking commented 5 years ago

Indeed, but can you send the MAC commands individually, or at least per type, to see what happens?

rvolosatovs commented 5 years ago

For some reason the Node does not answer NewChannelReqs that are ineffective (it looks like it does not answer ineffective MAC commands in general). The NewChannelReqs are sent because the NS does not know the configured data rate ranges for the channels. The issue is mitigated by simply using saner defaults for data rate ranges. See https://github.com/TheThingsNetwork/lorawan-stack/pull/1435
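
A rough Python sketch of why the assumed data rate range matters. Everything here is illustrative: the Channel type and both functions are hypothetical, the DR0 to DR5 range for the desired channels is an assumption about the EU868 plan, and the narrow DR0-only range is a made-up stand-in for an unsuitable default, not the value the NS actually used before the fix.

```python
from typing import NamedTuple


class Channel(NamedTuple):
    frequency: int  # Hz
    min_dr: int
    max_dr: int


# Frequencies passed via --mac-settings.factory-preset-frequencies above.
FACTORY_PRESET_FREQUENCIES = [
    868100000, 868300000, 868500000, 867100000,
    867300000, 867500000, 867700000, 867900000,
]


def assumed_channels(frequencies, min_dr, max_dr):
    """Factory presets only carry frequencies, so a DR range must be assumed."""
    return [Channel(f, min_dr, max_dr) for f in frequencies]


def new_channel_reqs(current, desired):
    """Return (index, channel) pairs for every channel that differs."""
    return [
        (index, want)
        for index, (have, want) in enumerate(zip(current, desired))
        if have != want
    ]
```

With a mismatched assumed range every one of the eight channels differs, which would be consistent with the eight CID_NEW_CHANNEL entries in the log earlier in this thread; with a matching range nothing needs to be sent.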

johanstokking commented 5 years ago

What is the status of this issue @rvolosatovs ? What should we discuss?

Reading the spec, NewChannelAns should be sent if the band supports it. The only case where it can be silently dropped is when the band doesn't support the MAC commands, like US and China.

rvolosatovs commented 5 years ago

Yes, unfortunately that does not happen. Since https://github.com/TheThingsNetwork/lorawan-stack/pull/1435 is merged, I suppose the only things left are making sure the required fields are available in the Console and adding documentation on how to make the Node and Uno work on v3 (see https://github.com/TheThingsNetwork/lorawan-stack/pull/1435). Then we can close the issue.

johanstokking commented 5 years ago

@kschiffer @bafonins is it clear to you which fields are needed for the Console?

If unsure, please loop in @rvolosatovs

rvolosatovs commented 4 years ago

Replaced by https://github.com/TheThingsNetwork/lorawan-stack/issues/2047