TheThingsIndustries / generic-node-se

Generic Node Sensor Edition
https://www.genericnode.com

Fix freertos_lorawan downlink #187

Open elsalahy opened 3 years ago

elsalahy commented 3 years ago

Summary:

The downlink handling of the freertos_lorawan app is broken.

Steps to Reproduce:

  1. Flash freertos_lorawan app
  2. Send a downlink after OTAA
  3. Observe logs and node output

What do you see now?

Received downlinks are not passed correctly from the MAC layer to the application layer.

What do you want to see instead?

A functional application that is able to receive downlinks

How do you propose to implement this?

Environment:

FreeRTOS

What can you do yourself and what do you need help with?

ALL

elsalahy commented 3 years ago

@marnixcro I think this issue is a good start for you on the FreeRTOS side, can you undertake this one?

elsalahy commented 3 years ago

This is a medium priority

mcserved commented 3 years ago

The issue seems to be a bit easier than queue allocation. I was able to get downlinks working by simply multiplying the wait time of the downlink task receive window (and by having a breakpoint during debugging, which lengthened the downlink time). I'll try to find out whether improper timing was indeed the cause and, if so, why it occurs. For reference, the dirty fix:

BaseType_t LoRaWAN_Receive( LoRaWANMessage_t * pMessage,
                            uint32_t timeoutMS )
{
    TickType_t ticksToWait;

    if( timeoutMS > 0 )
    {
        ticksToWait = pdMS_TO_TICKS( timeoutMS );
    }
    else
    {
        ticksToWait = 1;
    }
    /* Dirty fix: double the wait so the task blocks longer on the downlink queue. */
    ticksToWait *= 2;

    return xQueueReceive( xDownlinkQueue, pMessage, ticksToWait );
}
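To make the effect of the doubling concrete, here is a minimal sketch of how the millisecond timeout turns into a tick count. It uses the same truncating integer division as the default FreeRTOS `pdMS_TO_TICKS` macro, but `EXAMPLE_TICK_RATE_HZ` and both `example_` functions are hypothetical names for illustration; the project's real tick rate is whatever `configTICK_RATE_HZ` is set to.

```c
#include <stdint.h>

/* Hypothetical tick rate, for illustration only. The real value comes
 * from configTICK_RATE_HZ in the project's FreeRTOSConfig.h. */
#define EXAMPLE_TICK_RATE_HZ 100U

/* Truncating ms-to-ticks conversion, mirroring the integer division used by
 * the default pdMS_TO_TICKS macro (64-bit intermediate to avoid overflow). */
static uint32_t example_ms_to_ticks(uint32_t ms)
{
    return (uint32_t)(((uint64_t)ms * EXAMPLE_TICK_RATE_HZ) / 1000U);
}

/* Mirrors the timeout selection in LoRaWAN_Receive above: a zero timeout
 * becomes one tick, and the dirty fix doubles the result. */
static uint32_t example_receive_ticks(uint32_t timeout_ms, int apply_dirty_fix)
{
    uint32_t ticks = (timeout_ms > 0U) ? example_ms_to_ticks(timeout_ms) : 1U;

    if (apply_dirty_fix)
    {
        ticks *= 2U;
    }
    return ticks;
}
```

With the assumed 100 Hz tick, a timeout shorter than one tick period (10 ms) truncates to zero ticks, and doubling zero is still zero, so the doubling only lengthens the wait when the converted timeout is at least one tick.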

Unrelated but may be good to know: downlinks can be retrieved during the join procedure, and the power consumption shows that the receive windows were opened (the device briefly consumes more power twice after transmitting). So the downlink path does seem to 'run', even if it didn't function properly.

elsalahy commented 3 years ago

@marnixcro great finding, feel free to open a PR with the fix and glad it's a simple fix

elsalahy commented 3 years ago

What is the status on this?

mcserved commented 3 years ago

> What is the status on this?

The dirty fix seems to have fixed it differently than I initially thought. At first I thought that editing this line would be sufficient: https://github.com/TheThingsIndustries/generic-node-se/blob/0f195aca7273a3f53c929ed4cd37668815adade1/Software/app/freertos_lorawan/conf/lorawan_conf.h#L55 But that did not solve the issue. I debugged it further and it seems that it always jumps into RX timeout (instead of RX done or RX error), so I need to look into why the other events are only triggered occasionally (it seems that even the regular configuration will sometimes work normally for some time).
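One way to quantify "always jumps into RX timeout" is to tally which receive outcome fires over a number of uplinks. The sketch below is only an illustration: every name in it (`ExampleRxOutcome_t`, `ExampleRxTally_t`, and the `example_` functions) is hypothetical and not part of the application or the LoRaWAN stack, and it assumes the record function would be called from wherever the stack reports RX done, RX timeout, and RX error.

```c
#include <stdint.h>

/* Hypothetical outcome labels for the three receive events named above. */
typedef enum
{
    EXAMPLE_RX_DONE = 0,
    EXAMPLE_RX_TIMEOUT,
    EXAMPLE_RX_ERROR,
    EXAMPLE_RX_OUTCOME_COUNT
} ExampleRxOutcome_t;

/* One counter per outcome. */
typedef struct
{
    uint32_t counts[EXAMPLE_RX_OUTCOME_COUNT];
} ExampleRxTally_t;

/* Record one receive outcome; out-of-range values are ignored. */
static void example_tally_record(ExampleRxTally_t *tally, ExampleRxOutcome_t outcome)
{
    if (tally != 0 && (unsigned)outcome < (unsigned)EXAMPLE_RX_OUTCOME_COUNT)
    {
        tally->counts[outcome]++;
    }
}

/* Returns 1 if at least one window was seen and every window timed out. */
static int example_tally_only_timeouts(const ExampleRxTally_t *tally)
{
    return tally->counts[EXAMPLE_RX_TIMEOUT] > 0U
           && tally->counts[EXAMPLE_RX_DONE] == 0U
           && tally->counts[EXAMPLE_RX_ERROR] == 0U;
}
```

Comparing the tallies of the regular configuration and the dirty fix over the same number of uplinks would show how often the regular configuration actually reaches RX done.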

mcserved commented 3 years ago

The RX windows always time out. We should test whether the freertos_lorawan app opens them at the correct time; this can be compared against the timing of the basic_lorawan application to see if the downlink receive time is off.

elsalahy commented 3 years ago

@marnixcro yes indeed, as we discussed, we can adjust the window offsets and timings to ensure the FreeRTOS layer downlink handling is correct.

We need:

  1. One power analysis snapshot of a bare-metal uplink and downlink
  2. One power analysis snapshot of a freertos uplink and downlink (one that misses a downlink would be preferable)

We can align these snapshots and see if we can find the root cause of the issue.

mcserved commented 3 years ago
I made a graph showing the downlink retrieval times (no downlink was set, but the two retrieval times were added):

| basic_lorawan | freertos_lorawan | Delta  |
|---------------|------------------|--------|
| 5s14ms        | 5s129ms          | +115ms |
| 218ms         | 271ms            | +53ms  |
| 781ms         | 754ms            | -27ms  |
| 218ms         | 271ms            | +53ms  |

Both measurements were done with ADR off and at DR_0.
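The deltas in the measurements above can be reproduced with a few lines of arithmetic. The sketch below also computes a running offset, which rests on an assumption the measurements do not state: that the four rows are consecutive phases of one cycle (wait before the first window, first window, gap, second window). All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One measured phase, in milliseconds, for each application. */
typedef struct
{
    int32_t basic_ms;
    int32_t freertos_ms;
} ExamplePhase_t;

/* The four measurement rows, converted to milliseconds (5s14ms = 5014 ms). */
static const ExamplePhase_t example_phases[] = {
    { 5014, 5129 },
    {  218,  271 },
    {  781,  754 },
    {  218,  271 },
};

#define EXAMPLE_PHASE_COUNT (sizeof(example_phases) / sizeof(example_phases[0]))

/* Delta for one row: freertos_lorawan minus basic_lorawan. */
static int32_t example_delta_ms(const ExamplePhase_t *phase)
{
    return phase->freertos_ms - phase->basic_ms;
}

/* Sum of the deltas up to and including last_phase. Under the
 * consecutive-phase assumption, this is how far the freertos_lorawan
 * timeline lags basic_lorawan at the end of that phase. */
static int32_t example_cumulative_offset_ms(size_t last_phase)
{
    int32_t offset = 0;

    for (size_t i = 0; i <= last_phase && i < EXAMPLE_PHASE_COUNT; i++)
    {
        offset += example_delta_ms(&example_phases[i]);
    }
    return offset;
}
```

If the consecutive-phase assumption holds, the running offsets are 115 ms, 168 ms, 141 ms and 194 ms, so each window in the freertos_lorawan app would both start and end later than in basic_lorawan.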

This picture also shows that the downlink retrieval windows of the two applications do overlap briefly, but the freertos application's window starts and ends later. [image]

mcserved commented 3 years ago

Blocked by #203

elsalahy commented 3 years ago

Moved this to Q3

NicolasMrad commented 2 years ago

is this still relevant?