NordicPlayground / nRF51-ble-bcast-mesh

Other
324 stars 121 forks source link

Increasing CACHE_TASK_FIFO_SIZE > 15 causes fault #74

Closed victorpasse closed 8 years ago

victorpasse commented 8 years ago

I am bursting multiple commands at once and therefore need to increase the chache.

stack trace is 12 app_error_handler() main.c:122 0x000193e2
11 mesh_packet_ref_count_inc() mesh_packet.c:101 0x0002374c 10 tc_tx() transport_control.c:302 0x000249d4
9 transmit_all_instances() version_handler.c:360 0x0002534e 8 async_event_execute() event_handler.c:62 0x00023366
7 event_fifo_pop() event_handler.c:85 0x000233e4
6 QDEC_IRQHandler() event_handler.c:106 0x000233e4

g_packet_refs[index] == 0x00 is true

Other buffer sizes

/* @brief Default value for the number of handle cache entries /

ifndef RBC_MESH_HANDLE_CACHE_ENTRIES

#define RBC_MESH_HANDLE_CACHE_ENTRIES           (256)

endif

/* @brief Default value for the number of data cache entries /

ifndef RBC_MESH_DATA_CACHE_ENTRIES

#define RBC_MESH_DATA_CACHE_ENTRIES             (40)

endif

/* @brief Length of app-event FIFO. Must be power of two. /

ifndef RBC_MESH_APP_EVENT_QUEUE_LENGTH

#define RBC_MESH_APP_EVENT_QUEUE_LENGTH         (16)

endif

/* @brief Length of low level radio event FIFO. Must be power of two. /

ifndef RBC_MESH_RADIO_QUEUE_LENGTH

#define RBC_MESH_RADIO_QUEUE_LENGTH             (16)

endif

/* @brief Length of internal async-event FIFO. Must be power of two. /

ifndef RBC_MESH_INTERNAL_EVENT_QUEUE_LENGTH

#define RBC_MESH_INTERNAL_EVENT_QUEUE_LENGTH    (16
trond-snekvik commented 8 years ago

Thanks for the detailed trace.

I think this is down to a bug in version handler. In short, the 0.8.1 version of the mesh introduced "cache-tasks" for the version handler - basically serializing cache access. A cache task is enqueued in the cache_task_push() function. We've identified a potential hole in the usage of this function: If the cache task is successfully pushed into its fifo (L491), but the async-task isn't (L502), we return an error code. Some of the calling functions treat this error as if the packet is lost, although this isn't true - it's queued into the cache-tasks fifo! This causes a 0-ref packet to exist in the cache-task-fifo, and we get this error.

So, the fix: Replace the cache_task_push function with this:

static uint32_t cache_task_push(cache_task_t* p_task)
{
    uint32_t error_code = fifo_push(&m_task_fifo, p_task);
    if (error_code != NRF_SUCCESS)
    {
        return error_code;
    }

    async_event_t evt;
    evt.type = EVENT_TYPE_GENERIC;
    evt.callback.generic = handle_task_queue;
    event_handler_push(&evt); /* don't care about the return code */

    return NRF_SUCCESS;
}

Please let me know if this fixes your issue :)

victorpasse commented 8 years ago

This solved the problem, thanks. Now I have CACHE_TASK_FIFO_SIZE at 64 with no problem.

olsky commented 8 years ago

:+1: