mcci-catena / arduino-lmic

LoraWAN-MAC-in-C library, adapted to run under the Arduino environment
https://forum.mcci.io/c/device-software/arduino-lmic/
MIT License
Lockup every 11h or fcnt 65 #642

Open jpmeijers opened 3 years ago

jpmeijers commented 3 years ago

This is related to https://github.com/mcci-catena/arduino-lmic/issues/547, but that issue used an older version of this library.

Current setup

Transmit function

void loop()
{
<snip>
      Serial.println("Should be every 10 minutes");
      tx_blocking(radioPacket, sizeof(radioPacket));
<snip>
}

void tx_blocking(uint8_t* radioPacket, uint8_t packetLength)
{
  do_send(&sendjob, radioPacket, packetLength);

  while( LMIC.opmode & OP_JOINING ) {
    os_runloop_once();
    delay(10);
  }
  while( LMIC.opmode & OP_TXDATA )
  {
    os_runloop_once();
    delay(10);
  }
  Serial.println("TX done");
}

void do_send(osjob_t* j, uint8_t* mydata, uint8_t packetLength){
    if (LMIC.opmode & OP_TXRXPEND) {
        Serial.println(F("OP_TXRXPEND, not sending"));
    } else {
        LMIC_setTxData2(1, mydata, packetLength, 0);
        Serial.println(F("Packet queued"));
    }
}

Current behaviour

The frame counter indicates a board restart.

Log output

<snip>
Loops = 4799
38592828ms

Loops = 4800
38600833ms
Should be every 10 minutes
-1765828511: Unknown event: 17
Packet queued
-1765756390: EV_TXCOMPLETE (includes waiting for RX windows)
TX done

Loops = 4801
38610471ms

<snip>

Loops = 4874
39194821ms

Loops = 4875
39202825ms
Should be every 10 minutes
-1726382203: Unknown event: 17
Packet queued

<SYSTEM STARTUP>
Packet queued
101312: EV_JOINING
136090: Unknown event: 17
459509: EV_JOINED
460633: Unknown event: 17
594642: EV_TXCOMPLETE (includes waiting for RX windows)
TX done
Loops = 1
17525ms

Loops = 2
25527ms
<snip>

Interpretation of logfile

The second-to-last TX is successful: I see the EV_TXCOMPLETE event and the TX done log message.

75 loops of 8 s sleep later (10 minutes), I see the message Packet queued. After this, the blocking TX function waits forever, until the WDT resets the device and the device starts up and joins again.

One strange thing I see is that Unknown event: 17 is printed every time I call LMIC_setTxData2().
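
Since the blocking wait only ends when the opmode flags clear, one way to avoid depending on the WDT is to bound the wait with a timeout. The following is a minimal sketch of that idea, not something taken from the library or from the sketch above: `wait_while` is a hypothetical helper, and its `busy`, `step`, and `now_ms` parameters are placeholders for whatever the application uses to check state, run the scheduler, and read a millisecond clock.

```cpp
#include <cstdint>

// Hypothetical helper (not part of arduino-lmic).
// Calls step() repeatedly while busy() returns true, and gives up once
// timeout_ms milliseconds have passed according to now_ms().
// Returns true if busy() cleared in time, false on timeout.
// The elapsed time is computed with unsigned 32-bit subtraction, so the
// check stays correct when the millisecond counter wraps around.
template <typename Busy, typename Step, typename Clock>
bool wait_while(Busy busy, Step step, Clock now_ms, uint32_t timeout_ms) {
    const uint32_t start = static_cast<uint32_t>(now_ms());
    while (busy()) {
        const uint32_t elapsed = static_cast<uint32_t>(now_ms() - start);
        if (elapsed >= timeout_ms) {
            return false;  // caller decides what to do: log, retry, or reset
        }
        step();
    }
    return true;
}
```

In the sketch above, the second loop of tx_blocking could then be written with `busy` testing `LMIC.opmode & OP_TXDATA`, `step` calling `os_runloop_once()` followed by `delay(10)`, and `now_ms` set to Arduino's `millis`. The timeout value and what to do when it expires are left open here; this only makes the stuck state visible instead of hanging until the watchdog fires.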

I have seen a similar issue with Basic Mac https://github.com/LacunaSpace/basicmac/issues/25

jpmeijers commented 3 years ago

Changes for next experiment:

    while(os_queryTimeCriticalJobs(ms2osticks(8000))) {
      Serial.println("Critical jobs");
      radio_process();
      delay(10);
    }

I'll report back after it has run for 12+ hours.

terrillmoore commented 3 years ago

Sorry you're having problems!

My suggestions.

Best regards, --Terry

jpmeijers commented 3 years ago

Thanks Terry, I'll try to follow your suggestions.

With the changes listed in https://github.com/mcci-catena/arduino-lmic/issues/642#issuecomment-743742205, the lockup no longer happens. An interesting result, though, is that fcnt = 65 was skipped, while I did receive 64 and 66. It could be that 64 was still queued when 65 was ready, and the new if check in do_send detected this.