mcci-catena / arduino-lmic

LoraWAN-MAC-in-C library, adapted to run under the Arduino environment
https://forum.mcci.io/c/device-software/arduino-lmic/
MIT License

downlink wrong bytes (seems a memory allocation problem) #866

Closed miqmago closed 2 years ago

miqmago commented 2 years ago

I'm using ChirpStack to enqueue downlinks to a TTGO LoRa device. I've tried 3 different RAK gateways (Raspberry Pi, RAK Edge Lite and RAK Edge Lite2) and different MCUs (all of them ttgo-lora32-v1).

With small payloads (1 byte) everything works: the packet is correctly received, decoded and verified.

When the payload is 2 bytes or larger, the device starts failing to receive the packets. I've tried different packet sizes. The packet that reaches the gateway seems OK; I've been reading the packets from the MQTT broker. I couldn't find a way to log the actual bytes that the gateway sends to the device, but all other communication (join request, join accept, uplink, etc.) is fine.

To me this looks like a memory allocation problem: the last bytes of LMIC.frame are wrong, usually starting from byte 27. In small packets this makes the MIC wrong, but in large packets the data is also wrong:

OK CS->GW: 60b11ad20195460003310700010c08cea0e0e1 (Data: 00, AA==)
   PACKET: 60b11ad20195460003310700010c08cea0e0e1
KO CS->GW: 609c0b470185030003320700010d4fce036060 47 (Data: 0000, AAA=)
   PACKET: 609c0b470185030003320700010d4fce036060 57
KO CS->GW: 609c0b470185040003320700010da02e8ea28c 8e (Data: 0001, AAE=)
   PACKET: 609c0b470185040003320700010da02e8ea28c 07
KO CS->GW: 609c0b470185020003310700010d80b84c0f13 8499 (Data: 040807, BAgH)
   PACKET: 609c0b470185020003310700010d80b84c0f13 8d9b
OK CS->GW: 60b11ad20185490003310700010d907a33128863ba9c (Data: 04080700, BAgHAA==)
   PACKET: 60b11ad20185490003310700010d907a33128863ba9c
KO CS->GW: 60b11ad201854b0003320700010d512fdb4be743 d10f (Data: 04080700, BAgHAA==)
   PACKET: 60b11ad201854b0003320700010d512fdb4be743 400f
KO CS->GW: 60b11ad20185470003320700010d3730b7cddf e9c83c54 (Data: 0408070000, BAgHAAA=)
   PACKET: 60b11ad20185470003320700010d3730b7cddf 69c83cc4
KO CS->GW: 60b11ad20185480003320700010dda968eb891 117106260dbe20f8cb70 (Data: 7b226869223a747275657d, eyJoaSI6dHJ1ZX0=)
   PACKET: 60b11ad20185480003320700010dda968eb892 517106760d1e60f8cb23 (Data: 7B226869217A74722565DD)
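For reference when reading the hex dumps above, the frames can be split into their LoRaWAN 1.0.x fields (MHDR, DevAddr, FCtrl, FCnt, FOpts, FPort, FRMPayload, MIC). The following is a minimal sketch and not part of LMIC: `ParsedFrame`, `hexToBytes` and `parseDataFrame` are hypothetical names introduced for illustration. It assumes a data frame with a 4-byte MIC and does no decryption or MIC verification.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Fields of a LoRaWAN 1.0.x data frame (PHYPayload), still encrypted.
// Hypothetical helper type, not an LMIC structure.
struct ParsedFrame {
    bool ok = false;               // false if the frame is too short to parse
    uint8_t mhdr = 0;              // 0x60 = unconfirmed data down
    uint32_t devAddr = 0;          // little-endian on the wire
    uint8_t fctrl = 0;             // low nibble = FOptsLen
    uint16_t fcnt = 0;             // little-endian on the wire
    std::vector<uint8_t> fopts;    // piggybacked MAC commands (0..15 bytes)
    int fport = -1;                // -1 when the frame carries no FPort
    std::vector<uint8_t> payload;  // encrypted FRMPayload
    std::vector<uint8_t> mic;      // last 4 bytes of the frame
};

// Convert a hex string such as "60b11a..." to bytes; non-hex characters are skipped.
std::vector<uint8_t> hexToBytes(const std::string &hex) {
    std::vector<uint8_t> out;
    int nibbles = 0;
    uint8_t cur = 0;
    for (char c : hex) {
        int v;
        if (c >= '0' && c <= '9') v = c - '0';
        else if (c >= 'a' && c <= 'f') v = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') v = c - 'A' + 10;
        else continue;  // ignore spaces and other separators
        cur = static_cast<uint8_t>((cur << 4) | v);
        if (++nibbles == 2) { out.push_back(cur); nibbles = 0; cur = 0; }
    }
    return out;
}

ParsedFrame parseDataFrame(const std::vector<uint8_t> &f) {
    ParsedFrame p;
    // MHDR(1) + DevAddr(4) + FCtrl(1) + FCnt(2) + MIC(4) = 12 bytes minimum
    if (f.size() < 12) return p;
    p.mhdr = f[0];
    p.devAddr = static_cast<uint32_t>(f[1]) | (static_cast<uint32_t>(f[2]) << 8) |
                (static_cast<uint32_t>(f[3]) << 16) | (static_cast<uint32_t>(f[4]) << 24);
    p.fctrl = f[5];
    p.fcnt = static_cast<uint16_t>(f[6] | (f[7] << 8));
    const size_t foptsLen = p.fctrl & 0x0F;
    const size_t micPos = f.size() - 4;
    size_t pos = 8;
    if (pos + foptsLen > micPos) return p;  // FOptsLen does not fit in the frame
    p.fopts.assign(f.begin() + pos, f.begin() + pos + foptsLen);
    pos += foptsLen;
    if (pos < micPos) {  // FPort and FRMPayload are optional
        p.fport = f[pos++];
        p.payload.assign(f.begin() + pos, f.begin() + micPos);
    }
    p.mic.assign(f.begin() + micPos, f.end());
    p.ok = true;
    return p;
}
```

Applied to the 22-byte OK frame for Data 04080700, this gives FCnt 0x49, five FOpts bytes starting with 0x03 (the LinkADRReq command identifier in LoRaWAN 1.0.x), FPort 13, a 4-byte encrypted payload and the MIC 8863ba9c. Comparing these boundaries with the corrupted bytes shows whether the damage falls in the payload, the MIC, or both.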

I've tried different distances; most of the trials were at 0.5 m with no obstacles in the line of sight.

204610: RXMODE_SINGLE, freq=868500000, SF=12, BW=125, CR=4/5, IH=0
radio_irq_handler_v2: LoRa: 64

Environment

To Reproduce

device configuration .pio/libdeps/latest-dev/MCCI LoRaWAN LMIC library/project_config/lmic_project_config.h:

// project-specific definitions
#define CFG_eu868 1
#define CFG_sx1276_radio 1

#define LMIC_MAX_FRAME_LENGTH 32 // I've tried also with different sizes: 64, 96, 128, 255

#define LMIC_DEBUG_LEVEL 2
#define LMIC_ENABLE_event_logging 1        /* PARAM */

The power supply of the device is completely cut off on each cycle, as it has an external power management system. The keys are stored in EEPROM on join success and restored from it on each cycle. I've also tried to set ADR to false and force SF7, but it always ends up at SF12:

RTC_DATA_ATTR bool adrMode = false; // true;
RTC_DATA_ATTR uint8_t dataRate = DR_SF7; // DR_SF12;
RTC_DATA_ATTR s1_t txPower = 14; 

void lora_loadKeys() {
    uint8_t timestampArr[4];
    Memory::readSlot(MS_LAST_JOIN_TS, timestampArr);
    int32_t lastJoinTs = byteArrLEToNumber<int32_t>(timestampArr);
    int32_t currentTs = getTimestamp(); // from external RTC
    long now = millis() / 1000;

    // abs(currentTs - now) <= 60                       => the battery has just been connected
    // lastJoinTs >= 0xfffffff0                         => lastJoin has never been saved
    // (currentTs - lastJoinTs) >= LORA_JOIN_EXPIRE_S   => the session has expired
    if (abs(currentTs - now) > 60L && lastJoinTs < 0xfffffff0 && abs(currentTs - lastJoinTs) < LORA_JOIN_EXPIRE_S) {
        // Load params from flash
        uint8_t netid[4];
        uint8_t devAddr[4];
        uint8_t nwkSKey[16];
        uint8_t appSKey[16];
        // uint8_t fUp[2];
        // uint8_t fDown[2];
        Memory::readSlot(MS_NET_ID, netid);
        Memory::readSlot(MS_DEV_ADR, devAddr);
        Memory::readSlot(MS_NW_SKEY, nwkSKey);
        Memory::readSlot(MS_APP_SKEY, appSKey);
        // Memory::readSlot(MS_FU_COUNT, fUp);
        // Memory::readSlot(MS_FD_COUNT, fDown);
        // TODO: we need memory rotation so we don't crush EEPROM: https://github.com/xoseperez/eeprom_rotate
        // LMIC.seqnoUp = Memory::byteArrLEToNumber<uint16_t>(fUp);
        // LMIC.seqnoDn = Memory::byteArrLEToNumber<uint16_t>(fDown);
        LMIC_setSession(byteArrLEToNumber<u4_t>(netid), byteArrLEToNumber<devaddr_t>(devAddr), nwkSKey, appSKey);

        LMIC_setLinkCheckMode(false);
    } // else expired session: join again
}

esp_err_t lora_begin() {
    // setup LMIC stack
    os_init_ex(&myPinmap);  // initialize lmic run-time environment

    // register a callback for downlink messages and lmic events.
    // We aren't trying to write reentrant code, so pUserData is NULL.
    // LMIC_reset() doesn't affect callbacks, so we can do this first.
    LMIC_registerRxMessageCb((lmic_rxmessage_cb_t *)&lora_onRxCompleted, NULL);
    LMIC_registerEventCb((lmic_event_cb_t *)&lora_onEvent, NULL);

    // clear pending TX.
    LMIC_clrTxData();
    // Reset the MAC state. Session and pending data transfers will be discarded.
    LMIC_reset();

    // This tells LMIC to make the receive windows bigger, in case your clock is
    // faster or slower. This causes the transceiver to be earlier switched on,
    // so consuming more power. You may sharpen (reduce) CLOCK_ERROR_PERCENTAGE
    // in src/lmic_config.h if you are limited on battery.
    // TODO: make this configurable per downlink and per AP
#ifdef CLOCK_ERROR_PROCENTAGE
    LMIC_setClockError(CLOCK_ERROR_PROCENTAGE * MAX_CLOCK_ERROR / 1000);
#endif

    if (!LMIC.devaddr && isLoraFlashSession()) {
        lora_loadKeys();
    }

    // start lmic loop task
    isLoraInitialized = true;
    xTaskCreatePinnedToCore((TaskFunction_t)&lora_lmictask,  // task function
                            "lmictask",                      // name of task
                            4096,                            // stack size of task
                            (void *)1,                       // parameter of the task
                            2,                               // priority of the task
                            &lmicTask,                       // task handle
                            1);                              // CPU core

    // start lora send task
    xTaskCreatePinnedToCore((TaskFunction_t)&lora_send,  // task function
                            "lorasendtask",              // name of task
                            3072,                        // stack size of task
                            (void *)1,                   // parameter of the task
                            1,                           // priority of the task
                            &lorasendTask,               // task handle
                            1);                          // CPU core

    return ESP_OK;
}

/**
 * Only called after EV_JOINED
 */
void lora_setupForNetwork() {
#if CFG_LMIC_EU_like
    // Enable link check validation
    LMIC_setLinkCheckMode(true);
#endif
    // set data rate adaptation according to saved setting
    LMIC_setAdrMode(adrMode);
    // Set data rate and transmit power for uplink to stored device values if no ADR (note: txpow seems to be ignored by the library)
    if (!adrMode) {
        LMIC_setDrTxpow(dataRate, txPower);
    }

    // Set max clock error
    // https://github.com/matthijskooijman/arduino-lmic#problems-with-downlink-and-otaa
    LMIC_setClockError(MAX_CLOCK_ERROR * 1 / 100);
}

wolfpcgn commented 2 years ago

A distance of 0.5 m between node and gateway is much too close. The receiver of the node or the gateway may be overloaded. Try 10 m, or put a wall in between. If that doesn't work, look for the memory overflow.
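To put rough numbers on the distance argument, the free-space path loss formula gives an estimate of the signal level at the receiver. This is only a sketch under stated assumptions: the 14 dBm transmit power in the usage note is an assumed value (the gateway's actual downlink power is not given in this thread), antenna gains and cable losses are ignored, and at 0.5 m the antennas are barely more than one wavelength apart, so the far-field formula is approximate at best. `fsplDb` and `rxPowerDbm` are hypothetical helper names.

```cpp
#include <cmath>

// Free-space path loss in dB for a distance in metres and a frequency in Hz:
// FSPL = 20 * log10(4 * pi * d * f / c)
double fsplDb(double distanceM, double freqHz) {
    const double kPi = 3.14159265358979323846;
    const double kSpeedOfLight = 299792458.0;  // m/s
    return 20.0 * std::log10(4.0 * kPi * distanceM * freqHz / kSpeedOfLight);
}

// Rough received power in dBm, ignoring antenna gains and cable losses.
double rxPowerDbm(double txPowerDbm, double distanceM, double freqHz) {
    return txPowerDbm - fsplDb(distanceM, freqHz);
}
```

With an assumed 14 dBm transmitter at 868.5 MHz, `rxPowerDbm(14.0, 0.5, 868.5e6)` is roughly -11 dBm, while at 10 m it is roughly -37 dBm, about 26 dB lower. Whether the first value is enough to overload a particular receiver depends on the radio and is not established here.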

wolfpcgn commented 2 years ago

" I've also tried to set adr to false and force SF 7 but it always ends up with SF 12.:"

I observed this behaviour too with my T-Beams. I can't explain why this happens. There seems to be a MAC command from TTS that switches the node to SF12.

miqmago commented 2 years ago

Thanks @wolfpcgn! I've run some more experiments at different distances and the result is the same: payloads arrive with some wrong bytes (1 to 5 wrong bytes) at random positions. I've also tried different MCUs; some of them always fail and some of them never fail.

I'm starting to think it could be a hardware issue with some boards. To rule out a memory overflow, any idea how I could check for it? Right now I'm printing the received frame inside decodeFrame(), where I've placed this debug printing code:

    LMIC_DEBUG_PRINTF("Frame: ");
    for (i = 0; i < LMIC.dataLen; i += 1) {
        LMIC_DEBUG_PRINTF("%02x", LMIC.frame[i]);
    }
    LMIC_DEBUG_PRINTF("\n");
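One way to separate a RAM overwrite from corruption that happens before the bytes reach LMIC.frame is to checksum the frame at two points: once as early as possible after it has been read from the radio, and again where it is printed in decodeFrame(). If both checksums match, nothing modified the buffer in between. The following is a minimal sketch, assuming you add the two calls yourself: `FrameSnapshot`, `takeSnapshot` and `snapshotMatches` are hypothetical helpers, not LMIC APIs, and the CRC variant (CRC-16/CCITT-FALSE) is an arbitrary choice.

```cpp
#include <cstddef>
#include <cstdint>

// CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF, no reflection.
uint16_t crc16Ccitt(const uint8_t *data, size_t len) {
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; ++i) {
        crc = static_cast<uint16_t>(crc ^ (static_cast<uint16_t>(data[i]) << 8));
        for (int bit = 0; bit < 8; ++bit) {
            crc = (crc & 0x8000) ? static_cast<uint16_t>((crc << 1) ^ 0x1021)
                                 : static_cast<uint16_t>(crc << 1);
        }
    }
    return crc;
}

// Hypothetical record of what the frame looked like at the first checkpoint.
struct FrameSnapshot {
    size_t len;
    uint16_t crc;
};

// Call right after the frame bytes have been copied out of the radio.
FrameSnapshot takeSnapshot(const uint8_t *frame, size_t len) {
    return FrameSnapshot{len, crc16Ccitt(frame, len)};
}

// Call later (for example next to the debug print) to see whether the
// buffer still holds exactly the same bytes.
bool snapshotMatches(const FrameSnapshot &s, const uint8_t *frame, size_t len) {
    return s.len == len && s.crc == crc16Ccitt(frame, len);
}
```

If the snapshot still matches in decodeFrame() but the bytes differ from what the network server sent, the corruption happened before the first checkpoint (over the air, in the radio, or on the SPI bus) rather than through a later overwrite in RAM.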

I've always been sending the same data, 0408070000:

~3m
                                     v v  v
60371b6e0185040003310700010de7d4b0feceacc2fd64
60371b6e0185040003310700010de7d4b0fecda0c2ed64 YDcbbgGFBAADMQcAAQ3n1LD+zaDC7WQ=

~10m (1 floor)
                                       vv
60371b6e0195070003320700010da49d7c151ffe6f68cf
60371b6e0195070003320700010da49d7c151ff44f68cf YDcbbgGVBwADMgcAAQ2knXwVH/RPaM8=

                                      v
60371b6e0195080003310700010d55000a0c16ab606fa4
60371b6e0195080003310700010d55000a0c162b606fa4 YDcbbgGVCAADMQcAAQ1VAAoMFitgb6Q=

~15m (2 floors)
                                    v    v
60371b6e0195090003310700010d1bd27ee641e2782492
60371b6e0195090003310700010d1bd27ee611e2702492 YDcbbgGVCQADMQcAAQ0b0n7mEeJwJJI=

                             v      v vv    v
60371b6e01950a0003310700010d49d6763d335eefb14e
60371b6e01950a0003310700010d45d2763d63ddefb17e YDcbbgGVCgADMQcAAQ1F0nY9Y93vsX4=

~20-30m (outside building)
                                     vvv    v
60371b6e01950b0003300700010d8e261edb97fcad180c
60371b6e01950b0003300700010d8e261edb94b0ad189c YDcbbgGVCwADMAcAAQ2OJh7blLCtGJw=

                                    vvv  v
60371b6e01950c0003300700010d680e127e45dc0a4384
60371b6e01950c0003300700010d680e127e065c084384 YDcbbgGVDAADMAcAAQ1oDhJ+BlwIQ4Q=

                                    v v
60371b6e01950d0003300700010dc23b2b97c14d87991c
60371b6e01950d0003300700010dc23b2b97a1cd87991c YDcbbgGVDQADMAcAAQ3COyuXoc2HmRw=

0.5m Different antenna
                                      vv v
60371b6e01950e0003320700010dc6f649e54ef16b8fa8
60371b6e01950e0003320700010dc6f649e54ed4618fa8 YDcbbgGVDgADMgcAAQ3G9knlTtRhj6g=
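The hand-placed `v` markers above can also be produced mechanically by XOR-ing the two frames of each pair, which additionally shows how many bits flipped in each byte. A minimal sketch, with hypothetical names (`ByteDiff`, `parseHex`, `diffFrames`) and assuming both frames are given as hex strings of the same length; if the lengths differ, only the common prefix is compared.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One differing byte between two frames.
struct ByteDiff {
    size_t index;     // byte offset within the frame
    uint8_t a;        // byte in the first frame
    uint8_t b;        // byte in the second frame
    int flippedBits;  // number of set bits in (a XOR b)
};

// Convert a hex string to bytes; non-hex characters are skipped.
std::vector<uint8_t> parseHex(const std::string &hex) {
    std::vector<uint8_t> out;
    int nibbles = 0;
    uint8_t cur = 0;
    for (char c : hex) {
        int v;
        if (c >= '0' && c <= '9') v = c - '0';
        else if (c >= 'a' && c <= 'f') v = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') v = c - 'A' + 10;
        else continue;
        cur = static_cast<uint8_t>((cur << 4) | v);
        if (++nibbles == 2) { out.push_back(cur); nibbles = 0; cur = 0; }
    }
    return out;
}

// Return every position where the two frames differ.
std::vector<ByteDiff> diffFrames(const std::string &hexA, const std::string &hexB) {
    const std::vector<uint8_t> a = parseHex(hexA);
    const std::vector<uint8_t> b = parseHex(hexB);
    const size_t n = a.size() < b.size() ? a.size() : b.size();
    std::vector<ByteDiff> diffs;
    for (size_t i = 0; i < n; ++i) {
        uint8_t x = static_cast<uint8_t>(a[i] ^ b[i]);
        if (x == 0) continue;
        int bits = 0;
        for (; x != 0; x = static_cast<uint8_t>(x >> 1)) bits += x & 1;
        diffs.push_back(ByteDiff{i, a[i], b[i], bits});
    }
    return diffs;
}
```

For the first pair (~3 m), this reports differences at byte offsets 18, 19 and 21, with 2, 2 and 1 flipped bits respectively. Collecting these offsets and bit counts over many frames makes it easier to see whether the errors cluster at fixed positions or are spread randomly.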