jgromes / RadioLib

Universal wireless communication library for embedded devices
https://jgromes.github.io/RadioLib/
MIT License

Seeed-Studio LoRa-E5-Mini LoRaWAN OTAA timeout #844

Closed davidfobar closed 1 year ago

davidfobar commented 1 year ago

Using the E5 dev board (https://www.seeedstudio.com/LoRa-E5-mini-STM32WLE5JC-p-4869.html), I can get chirpstack to receive a JoinRequest and issue a JoinAccept; however, RadioLib returns a timeout error (-6).

I don't know if I am setting the RFSwitches correctly:

////////////////////////////////////////////////////////////////////////////////////////////

#include <Arduino.h>
#include <RadioLib.h>

// no need to configure pins, signals are routed to the radio internally
STM32WLx radio = new STM32WLx_Module();

// create the node instance on the US-915 band
// using the radio module and the encryption key
// make sure you are using the correct band
// based on your geographical location!
LoRaWANNode node(&radio, &US915);

// using PA0 as a placeholder since the E5-mini does not have the PC3 connection
//ref: https://github.com/Seeed-Studio/LoRaWan-E5-Node
static const uint32_t rfswitch_pins[] = {PA0, PA4, PA5}; //TX received in chirpstack
//static const uint32_t rfswitch_pins[] = {PA0, PA5, PA4}; // no response
//static const uint32_t rfswitch_pins[] = {PA4, PA0, PA5}; //TX received in chirpstack
//static const uint32_t rfswitch_pins[] = {PA4, PA5, PA0}; // no response
//static const uint32_t rfswitch_pins[] = {PA5, PA0, PA4}; //TX received in chirpstack
//static const uint32_t rfswitch_pins[] = {PA5, PA4, PA0}; //TX received in chirpstack

static const Module::RfSwitchMode_t rfswitch_table[] = {
  {STM32WLx::MODE_IDLE,  {LOW,  LOW,  LOW}},
  {STM32WLx::MODE_RX,    {HIGH, HIGH, LOW}},
  {STM32WLx::MODE_TX_LP, {HIGH, HIGH, HIGH}},
  {STM32WLx::MODE_TX_HP, {HIGH, LOW,  HIGH}},
  END_OF_MODE_TABLE,
};

void setup() {
  Serial.begin(115200);

  // set RF switch control configuration
  // this has to be done prior to calling begin()
  radio.setRfSwitchTable(rfswitch_pins, rfswitch_table);

  int state = radio.begin();
  if(state == RADIOLIB_ERR_NONE) {
    Serial.println(F("success!"));
  } else {
    Serial.print(F("failed, code "));
    Serial.println(state);
    while(true);
  }

  // first we need to initialize the device storage
  // this will reset all persistently stored parameters
  // NOTE: This should only be done once prior to first joining a network!
  //       After wiping persistent storage, you will also have to reset
  //       the end device in TTN and perform the join procedure again!
  // node.wipe();

  // application identifier - pre-LoRaWAN 1.1.0, this was called appEUI
  // when adding new end device in TTN, you will have to enter this number
  // you can pick any number you want, but it has to be unique
  uint64_t joinEUI = 0x12AD1011B0C0FFEE;

  // device identifier - this number can be anything
  // when adding new end device in TTN, you can generate this number,
  // or you can set any value you want, provided it is also unique
  uint64_t devEUI = 0x70B3D57ED005E120;

  // select some encryption keys which will be used to secure the communication
  // there are two of them - network key and application key
  // because LoRaWAN uses AES-128, the key MUST be 16 bytes (or characters) long

  // network key is the ASCII string "topSecretKey1234"
  uint8_t nwkKey[] = { 0x74, 0x6F, 0x70, 0x53, 0x65, 0x63, 0x72, 0x65,
                       0x74, 0x4B, 0x65, 0x79, 0x31, 0x32, 0x33, 0x34 };
                       //746F705365637265744B657931323334

  // application key is the ASCII string "aDifferentKeyABC"
  uint8_t appKey[] = { 0x61, 0x44, 0x69, 0x66, 0x66, 0x65, 0x72, 0x65,
                       0x6E, 0x74, 0x4B, 0x65, 0x79, 0x41, 0x42, 0x43 };
                       //61446966666572656E744B6579414243

  // prior to LoRaWAN 1.1.0, only a single "nwkKey" is used
  // when connecting to LoRaWAN 1.0 network, "appKey" will be disregarded
  // and can be set to NULL

  // some frequency bands only use a subset of the available channels
  // you can set the starting channel and their number
  // for example, the following corresponds to US915 FSB2 in TTN
  /*
    node.startChannel = 8;
    node.numChannels = 8;
  */

  // now we can start the activation
  // this can take up to 20 seconds, and requires a LoRaWAN gateway in range
  Serial.print(F("[LoRaWAN] Attempting over-the-air activation ... "));
  state = node.beginOTAA(joinEUI, devEUI, nwkKey, appKey);
  if(state == RADIOLIB_ERR_NONE) {
    Serial.println(F("success!"));
  } else {
    Serial.print(F("failed, code "));
    Serial.println(state);
    while(true);
  }

///////////////////////////////////////////////////////////////////////////////////////////

The remainder of the STM32WLx_Transmit_Interrupt.ino code is unchanged.

Serial terminal output:

[LoRa-E5] Initializing ... success!
[LoRaWAN] Attempting over-the-air activation ... failed, code -6

///////////////////////////////////////////////////////////////////////////////////////////

Is this a RF Switch issue?

jgromes commented 1 year ago

> Is this a RF Switch issue?

Seems likely, considering that the E5 board only seems to have two RF switch control pins instead of the 3 used by Nucleo STM32WL. So it would suggest it does not have the high-power/low-power transmit modes of the original STM32WL (and therefore you would have to modify the rfswitch_table), but that's just a guess on my side, probably best to clarify with the manufacturer.

Another thing is that I haven't tested against chirpstack, just TTN. Should be noted that in TTN, for US-915 frequencies you have to select a subset of all the available channels to be used. Is that also the case in chirpstack?

Also, maybe try enabling debug mode to see if there is more information.

davidfobar commented 1 year ago

Here is a JoinRequest and JoinAccept from Chirpstack:

[
  {
    "rxInfo": [
      {
        "gatewayID": "LPfxEUEQAA4=",
        "time": "2023-10-11T17:45:24.901554010Z",
        "timeSinceGPSEpoch": null,
        "rssi": -113,
        "loRaSNR": -7.2,
        "channel": 6,
        "rfChain": 1,
        "board": 0,
        "antenna": 0,
        "location": { "latitude": 0, "longitude": 0, "altitude": 0, "source": "UNKNOWN", "accuracy": 0 },
        "fineTimestampType": "NONE",
        "context": "MFmmIA==",
        "uplinkID": "s+zHL1pyRG2krTT5F10EVg==",
        "crcStatus": "CRC_OK"
      }
    ],
    "txInfo": {
      "frequency": 905100000,
      "modulation": "LORA",
      "loRaModulationInfo": { "bandwidth": 125, "spreadingFactor": 10, "codeRate": "4/5", "polarizationInversion": false }
    },
    "phyPayload": {
      "mhdr": { "mType": "JoinRequest", "major": "LoRaWANR1" },
      "macPayload": { "joinEUI": "12ad1011b0c0ffee", "devEUI": "70b3d57ed005e120", "devNonce": 0 },
      "mic": "1fad4265"
    }
  },
  {
    "txInfo": {
      "frequency": 925100000,
      "power": 20,
      "modulation": "LORA",
      "loRaModulationInfo": { "bandwidth": 500, "spreadingFactor": 10, "codeRate": "4/5", "polarizationInversion": true },
      "board": 0,
      "antenna": 0,
      "timing": "DELAY",
      "delayTimingInfo": { "delay": "5s" },
      "context": "LpnAIQ=="
    },
    "phyPayload": {
      "mhdr": { "mType": "JoinAccept", "major": "LoRaWANR1" },
      "macPayload": { "bytes": "YxrZt0DgIXMeTdKMoMlTlxc0TYTD0Snbdb3E0w==" },
      "mic": "18242c24"
    }
  }
]

davidfobar commented 1 year ago

Also, from an example provided by STM32CubeIDE I was able to figure out the RFSwitchTable, but it didn't seem to help yet. I am still using PA0 as a placeholder since RadioLib expects the table to have 3 pins - changing RFSWITCH_MAX_PINS to 2 does not compile, there are parts of the library that still expect 3 values.

static const uint32_t rfswitch_pins[] = {PA4, PA5, PA0};

static const Module::RfSwitchMode_t rfswitch_table[] = {
  {STM32WLx::MODE_IDLE,  {LOW,  LOW,  LOW}},
  {STM32WLx::MODE_RX,    {HIGH, LOW,  LOW}},
  {STM32WLx::MODE_TX_LP, {HIGH, HIGH, LOW}},
  {STM32WLx::MODE_TX_HP, {LOW,  HIGH, LOW}},
  END_OF_MODE_TABLE,
};

jgromes commented 1 year ago

You're not meant to change the value of RFSWITCH_MAX_PINS. But there's nothing preventing you from using only two pins for the RF switch: RadioLib has a "not connected" macro, RADIOLIB_NC. That's actually used in the default RF switch table:

https://github.com/jgromes/RadioLib/blob/ddcce424c8d3fc8d54c00ec4cfabd5fd53e933a0/src/Module.cpp#L492-L506
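A two-pin configuration along those lines could look like the fragment below. It is only a sketch: it reuses the pin order and levels from the table posted earlier in this thread (taken from an STM32CubeIDE example) and swaps the PA0 placeholder for RADIOLIB_NC. Whether those levels match the actual E5-mini switch wiring is an assumption that should be checked against the manufacturer's documentation.

```cpp
#include <Arduino.h>
#include <RadioLib.h>

// no need to configure pins, signals are routed to the radio internally
STM32WLx radio = new STM32WLx_Module();

// only two control pins are used; the third slot is marked as not connected
static const uint32_t rfswitch_pins[] = {PA4, PA5, RADIOLIB_NC};

// levels copied from the table posted in this thread (unverified assumption);
// the value in the third column is ignored for a pin that is not connected
static const Module::RfSwitchMode_t rfswitch_table[] = {
  {STM32WLx::MODE_IDLE,  {LOW,  LOW,  LOW}},
  {STM32WLx::MODE_RX,    {HIGH, LOW,  LOW}},
  {STM32WLx::MODE_TX_LP, {HIGH, HIGH, LOW}},
  {STM32WLx::MODE_TX_HP, {LOW,  HIGH, LOW}},
  END_OF_MODE_TABLE,
};

void setup() {
  // this has to be done prior to calling begin()
  radio.setRfSwitchTable(rfswitch_pins, rfswitch_table);
}

void loop() {}
```

The table keeps three columns because the table type has a fixed width; only the pin list changes.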

Regarding the join request, I think this is the issue:

"txInfo": {
"frequency": 925100000,
"power": 20,
"modulation": "LORA",
"loRaModulationInfo": {
"bandwidth": 500,
"spreadingFactor": 10,
"codeRate": "4/5",
"polarizationInversion": true
},

If I'm reading that correctly, it seems like your gateway/application server has decided to send the join accept reply at 925.1 MHz, which is downlink channel 3. However, the node used uplink at 905.1 MHz, channel number 14. So the downlink channel should have been 14 % 8 = 6, not 3.
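The channel arithmetic above can be sketched as follows. The constants are the US915 plan from the LoRaWAN regional parameters (125 kHz uplink channels starting at 902.3 MHz in 200 kHz steps, 500 kHz downlink channels starting at 923.3 MHz in 600 kHz steps); the function names are made up for this illustration and are not part of RadioLib.

```cpp
#include <cstdint>

// US915 channel plan, in kHz
constexpr uint32_t UL_125K_BASE_KHZ = 902300;  // uplink channel 0
constexpr uint32_t UL_125K_STEP_KHZ = 200;
constexpr uint32_t DL_500K_BASE_KHZ = 923300;  // downlink channel 0
constexpr uint32_t DL_500K_STEP_KHZ = 600;
constexpr uint32_t NUM_DL_CHANNELS  = 8;

// index (0-63) of a 125 kHz uplink channel, given its center frequency
constexpr uint32_t uplinkChannel(uint32_t freqKHz) {
  return (freqKHz - UL_125K_BASE_KHZ) / UL_125K_STEP_KHZ;
}

// index (0-7) of a 500 kHz downlink channel, given its center frequency
constexpr uint32_t downlinkChannel(uint32_t freqKHz) {
  return (freqKHz - DL_500K_BASE_KHZ) / DL_500K_STEP_KHZ;
}

// RX1 downlink channel that belongs to a given uplink channel
constexpr uint32_t rx1Channel(uint32_t ulChannel) {
  return ulChannel % NUM_DL_CHANNELS;
}

// center frequency of a 500 kHz downlink channel (0-7)
constexpr uint32_t downlinkFreqKHz(uint32_t dlChannel) {
  return DL_500K_BASE_KHZ + dlChannel * DL_500K_STEP_KHZ;
}

// uplink at 905.1 MHz is channel 14, so the reply belongs on channel 6 (926.9 MHz)
static_assert(uplinkChannel(905100) == 14, "uplink channel");
static_assert(downlinkChannel(925100) == 3, "observed downlink channel");
static_assert(downlinkFreqKHz(rx1Channel(14)) == 926900, "expected downlink");
```

Working in integer kHz avoids floating-point rounding when converting between frequencies and channel indices.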

davidfobar commented 1 year ago

I enabled debug printing and see the following:

///////////////////////////////////////////////////////////////////////////

[SX1278] Initializing ...
GPIO pre-transfer timeout, is it connected?
GPIO pre-transfer timeout, is it connected?
GPIO pre-transfer timeout, is it connected?

RadioLib Debug Info
Version: 6.2.0.0
Platform: Arduino STM32 (official)
Compiled: Oct 13 2023 16:05:23

Found SX126x: RADIOLIB_SX126X_REG_VERSION_STRING:
0000320 53 58 31 32 36 31 20 54 4b 46 20 31 41 31 30 00 | SX1261 TKF 1A10.

M SX126x
success!
[LoRaWAN] Attempting over-the-air activation ... Channel frequency UL = MHz
Timeout in 556032 us

//////////////////////////////////////////////////////////////////////////////////////////////////

I cannot find where "GPIO pre-transfer timeout, is it connected?" is printed from, but it seems like that may be an initialization thing, since it eventually clears. What I also cannot figure out is why the channels and frequencies are not printing. My thought is that maybe the RX channel is not being set properly. Any thoughts on where I can add a debug statement to investigate?

jgromes commented 1 year ago

The GPIO timeouts are checked here:

https://github.com/jgromes/RadioLib/blob/ddcce424c8d3fc8d54c00ec4cfabd5fd53e933a0/src/Module.cpp#L311-L330

It's very strange, since that message usually means the user provided an incorrect BUSY pin number or there's a wiring issue. On STM32WL, this should be handled by the SX126x peripheral.

The missing frequency in debug is most likely caused by the STM32 Arduino core not supporting printf for floats (I got so used to ESP32 doing that I completely forgot other platforms might not be able to do so). Anyway I updated the debug printing to fix this, could you try again?

Another strange thing is that after the uplink, the debug should print downlink frequency, if that doesn't happen then the program never reached that point at all.

davidfobar commented 1 year ago

The printing of floats now works:

////////////////////////////////////////////////////////////////
Channel frequency UL = 904.500 MHz
Timeout in 556032 us
Channel frequency DL = 925.100 MHz
failed, code -6
////////////////////////////////////////////////////////////////

My only guess is that the IRQ is not being correctly implemented for this chip. Is that a HAL thing?

I added some timestamps (ms) and debug messages to the beginOTAA function; here is the result of a single attempt:

0: [LoRaWAN] Starting OTAA join procedure...
2: Setting uplink/downlink frequencies and datarates...
Channel frequency UL = 914.900 MHz
74: Configuring devNonce...
123: Building join request message...
128: Sending join request...
Timeout in 556032 us
510: Configuring Downlink channel...
Channel frequency DL = 927.500 MHz
521: Starting receive...
530: Waiting for join accept...
8522: Join accept timeout!
failed, code -6

//////////////////////////////////////////////////////////

The code for reference:

`int16_t LoRaWANNode::beginOTAA(uint64_t joinEUI, uint64_t devEUI, uint8_t* nwkKey, uint8_t* appKey, bool force) {
  // check if we actually need to send the join request
  Module* mod = this->phyLayer->getMod();
  if(!force && (mod->hal->getPersistentParameter(RADIOLIB_PERSISTENT_PARAM_LORAWAN_MAGIC_ID) == RADIOLIB_LORAWAN_MAGIC)) {
    // the device has joined already, we can just pull the data from persistent storage
    return(this->begin());
  }

  // start a timer to include with the debug print statements
  uint32_t debugstart = mod->hal->millis();

  // print the timestamp and the action
  RADIOLIB_DEBUG_PRINTLN("%lu: [LoRaWAN] Starting OTAA join procedure...", mod->hal->millis() - debugstart);
  // set the physical layer configuration
  int16_t state = this->setPhyProperties();
  RADIOLIB_ASSERT(state);

  RADIOLIB_DEBUG_PRINTLN("%lu: Setting uplink/downlink frequencies and datarates...", mod->hal->millis() - debugstart);
  // setup uplink/downlink frequencies and datarates
  state = this->setupChannels();
  RADIOLIB_ASSERT(state);

  RADIOLIB_DEBUG_PRINTLN("%lu: Configuring devNonce...", mod->hal->millis() - debugstart);
  // get dev nonce from persistent storage and increment it
  uint16_t devNonce = mod->hal->getPersistentParameter(RADIOLIB_PERSISTENT_PARAM_LORAWAN_DEV_NONCE_ID);
  mod->hal->setPersistentParameter(RADIOLIB_PERSISTENT_PARAM_LORAWAN_DEV_NONCE_ID, devNonce + 1);

  // build the join-request message
  uint8_t joinRequestMsg[RADIOLIB_LORAWAN_JOIN_REQUEST_LEN];

  RADIOLIB_DEBUG_PRINTLN("%lu: Building join request message...", mod->hal->millis() - debugstart);
  // set the packet fields
  joinRequestMsg[0] = RADIOLIB_LORAWAN_MHDR_MTYPE_JOIN_REQUEST | RADIOLIB_LORAWAN_MHDR_MAJOR_R1;
  LoRaWANNode::hton(&joinRequestMsg[RADIOLIB_LORAWAN_JOIN_REQUEST_JOIN_EUI_POS], joinEUI);
  LoRaWANNode::hton(&joinRequestMsg[RADIOLIB_LORAWAN_JOIN_REQUEST_DEV_EUI_POS], devEUI);
  LoRaWANNode::hton(&joinRequestMsg[RADIOLIB_LORAWAN_JOIN_REQUEST_DEV_NONCE_POS], devNonce);

  // add the authentication code
  uint32_t mic = this->generateMIC(joinRequestMsg, RADIOLIB_LORAWAN_JOIN_REQUEST_LEN - sizeof(uint32_t), nwkKey);
  LoRaWANNode::hton(&joinRequestMsg[RADIOLIB_LORAWAN_JOIN_REQUEST_LEN - sizeof(uint32_t)], mic);

  RADIOLIB_DEBUG_PRINTLN("%lu: Sending join request...", mod->hal->millis() - debugstart);
  // send it
  state = this->phyLayer->transmit(joinRequestMsg, RADIOLIB_LORAWAN_JOIN_REQUEST_LEN);
  RADIOLIB_ASSERT(state);

  RADIOLIB_DEBUG_PRINTLN("%lu: Configuring Downlink channel...", mod->hal->millis() - debugstart);
  // configure for downlink with default configuration
  state = this->configureChannel(RADIOLIB_LORAWAN_CHANNEL_DIR_DOWNLINK);
  RADIOLIB_ASSERT(state);

  // set the function that will be called when the reply is received
  this->phyLayer->setPacketReceivedAction(LoRaWANNodeOnDownlink);

  // downlink messages are sent with inverted IQ
  // TODO use downlink() for this
  if(!this->FSK) {
    state = this->phyLayer->invertIQ(true);
    RADIOLIB_ASSERT(state);
  }

  RADIOLIB_DEBUG_PRINTLN("%lu: Starting receive...", mod->hal->millis() - debugstart);
  // start receiving
  uint32_t start = mod->hal->millis();
  downlinkReceived = false;
  state = this->phyLayer->startReceive();
  RADIOLIB_ASSERT(state);

  RADIOLIB_DEBUG_PRINTLN("%lu: Waiting for join accept...", mod->hal->millis() - debugstart);
  // wait for the reply or timeout
  while(!downlinkReceived) {
    if(mod->hal->millis() - start >= RADIOLIB_LORAWAN_JOIN_ACCEPT_DELAY_2_MS + 2000) {
      RADIOLIB_DEBUG_PRINTLN("%lu: Join accept timeout!", mod->hal->millis() - debugstart);
      downlinkReceived = false;
      if(!this->FSK) {
        this->phyLayer->invertIQ(false);
      }
      return(RADIOLIB_ERR_RX_TIMEOUT);
    }
  }`

davidfobar commented 1 year ago

I found the issue!!!

// set the function that will be called when the reply is received
this->phyLayer->setPacketReceivedAction(LoRaWANNodeOnDownlink);
STM32WLx* mod2 = (STM32WLx*)this->phyLayer;
mod2->setDio1Action(LoRaWANNodeOnDownlink);

The phyLayer was calling the SX126x.h setDio1Action rather than the STM32WLx.h one.

jgromes commented 1 year ago

@davidfobar you're right, for STM32WLx the interrupt will indeed not work. The problem was that the methods like setPacketReceivedAction did not exist for STM32WL, so it defaulted to those in its superclass, which is SX126x. I pushed a fix addressing this, could you check it solved the issue?

davidfobar commented 1 year ago

I confirmed that the STM32WLx.h changes are good. I have a new issue:

Channel frequency DL = 925.700 MHz
downlinkMsg:
0000000 49 00 00 00 00 01 86 78 f3 00 01 00 00 00 00 1d | I......x........
0000010 40 86 78 f3 00 02 01 00 3a be 0a d4 e0 52 23 57 | @.x.....:....R#W
0000020 5e 5b b7 31 6d 8a 5d 30 1f 13 c2 ba 41 38 39 a2 | ^[.1m.]0....A89.
0000030 57 | W
MIC mismatch, expected 19776fd5, got 57a23938
failed, code -7

///////////////////////////////////////////////////////////////////////////////////////

From the Chirpstack side, this is what is being sent: [image]
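When reading the dump above, note that the "got" value matches the last four bytes of the buffer (38 39 a2 57) read least-significant byte first, which gives 0x57a23938. A minimal sketch of that extraction, using a hypothetical helper name that is not taken from the library:

```cpp
#include <cstddef>
#include <cstdint>

// Read the last four bytes of a frame as a little-endian 32-bit MIC.
// Assumes len >= 4.
uint32_t trailingMic(const uint8_t* frame, std::size_t len) {
  const uint8_t* p = frame + len - 4;
  return static_cast<uint32_t>(p[0])
       | (static_cast<uint32_t>(p[1]) << 8)
       | (static_cast<uint32_t>(p[2]) << 16)
       | (static_cast<uint32_t>(p[3]) << 24);
}
```

Comparing this value with the MIC field shown on the server side is a quick way to check whether the two ends disagree about the frame contents or only about the MIC calculation.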

gpabdo commented 1 year ago

How do you enable DEBUG logging?

davidfobar commented 1 year ago

@gpabdo Throw this near the top of BuildOpt.h

#define RADIOLIB_DEBUG

I've then added my own debug statements with RADIOLIB_DEBUG_PRINTLN("... %d", val);

gpabdo commented 1 year ago

> @gpabdo Throw this near the top of BuildOpt.h
>
> #define RADIOLIB_DEBUG
>
> I've then added my own debug statements with RADIOLIB_DEBUG_PRINTLN("... %d", val);

Thanks so much!

davidfobar commented 1 year ago

To help further my MIC mismatch issue, this is the key reported by both the server and end node: Key: 03 ba e1 46 1f cb b2 a1 57 97 c4 26 fb 51 5e 23

I don't understand why there is a downlink to begin with. Chirpstack is reporting the hello world message as unconfirmed, and I am not attempting to send anything either.

jgromes commented 1 year ago

@davidfobar when exactly is this issue appearing? Is it during the join procedure or afterwards?

The downlink shown in the screenshot seems to be empty apart from a RekeyInd command. But the format of that command is invalid, as it sets the minor version to 0, which is reserved by the LoRaWAN specification.

davidfobar commented 1 year ago

It is only after joining is complete. I have the end node looping, sending incremental hello world messages and then looking for a downlink message. I can get the uplink message out of chirpstack without issue. I am now trying to make sure that I can send messages back to the end node, but I saw this issue first.

jgromes commented 1 year ago

Which LoRaWAN version is chirpstack configured for? It's sending a RekeyInd MAC command, so I would guess it's 1.1, but you didn't specify it.

davidfobar commented 1 year ago

I have the device profile configured for 1.1. Other options are available, but of course this was the only configuration that would allow RadioLib to connect without a MIC issue using the end node example.

jgromes commented 1 year ago

> I have the device profile configured for 1.1

Then it is rather strange that chirpstack sends RekeyInd with revision set to 0. MIC calculation changed quite a lot across different LoRaWAN versions, so if there is some mismatch there it could cause this issue. I'm confident it's correct on RadioLib end, since I've tested extensively against TTN on v1.1.

> I don't understand why there is a downlink to begin with

It's just the MAC command being sent from server to the node without any user data, that's not uncommon.

davidfobar commented 1 year ago

Updating to chirpstack v4 resolved the RekeyInd MAC command issue. Thank you for adding support for the STM32WLE5 interrupts.

I will continue the discussion of bad downlink MICs in a new issue.