FreeRTOS / FreeRTOS-Kernel

FreeRTOS kernel files only, submoduled into https://github.com/FreeRTOS/FreeRTOS and various other repos.
https://www.FreeRTOS.org
MIT License
2.74k stars 1.12k forks source link

[BUG] Wrong critical section handling before the scheduler is started #254

Closed stefano-zanotti closed 3 years ago

stefano-zanotti commented 3 years ago

Describe the bug Before the scheduler has started, any call to FreeRTOS functions that require a critical section will mess up the interrupt enabling status. This includes calls made by vTaskStartScheduler() itself, e.g. to create the idle task.

Target

Host

To Reproduce

Additional context In my init code, I first create a task (and possibly queues, mutexes, etc.), then start the scheduler. The problem is that

I'm using FreeRTOS Kernel V10.3.1, with the GCC ARM_CM7 port (though at a first glance it seems that FreeRTOS Kernel V10.4.3 behaves the same). This is where uxCriticalNesting is defined: https://github.com/FreeRTOS/FreeRTOS-Kernel/blob/main/portable/GCC/ARM_CM7/r0p1/port.c#L140 static UBaseType_t uxCriticalNesting = 0xaaaaaaaa;

These are the definitions of (the port-specific versions of) taskENTER_CRITICAL() and taskEXIT_CRITICAL() https://github.com/FreeRTOS/FreeRTOS-Kernel/blob/main/portable/GCC/ARM_CM7/r0p1/port.c#L396 https://github.com/FreeRTOS/FreeRTOS-Kernel/blob/main/portable/GCC/ARM_CM7/r0p1/port.c#L413 void vPortEnterCritical( void ) { portDISABLE_INTERRUPTS(); uxCriticalNesting++; ... } void vPortExitCritical( void ) { configASSERT( uxCriticalNesting ); uxCriticalNesting--;

    if( uxCriticalNesting == 0 )
    {
        portENABLE_INTERRUPTS();
    }
}

Note that "disabling interrupts" doesn't mean globally masking them (using the "cpsie i" and "cpsid i" ARM instructions); rather, it means altering BASEPRI (which only masks interrupts with priority no higher than that of FreeRTOS). See here: https://github.com/FreeRTOS/FreeRTOS-Kernel/blob/main/portable/GCC/ARM_CM7/r0p1/portmacro.h#L100

Here is a summary of what is happening. I'm listing function calls and variable values.

// static init: uxCriticalNesting = 0xaaaaaaaa, ints enabled xTaskCreateStatic() prvAddNewTaskToReadyList() taskENTER_CRITICAL() // uxCriticalNesting = 0xaaaaaaab, ints disabled taskEXIT_CRITICAL() // uxCriticalNesting = 0xaaaaaaaa, ints disabled vTaskStartScheduler() xTaskCreateStatic() // for the idle task prvAddNewTaskToReadyList() taskENTER_CRITICAL() // uxCriticalNesting = 0xaaaaaaab, ints disabled taskEXIT_CRITICAL() // uxCriticalNesting = 0xaaaaaaaa, ints disabled xTimerCreateTimerTask() prvCheckForValidListAndQueue() taskENTER_CRITICAL() // uxCriticalNesting = 0xaaaaaaab, ints disabled taskEXIT_CRITICAL() // uxCriticalNesting = 0xaaaaaaaa, ints disabled prvInitialiseNewQueue() xQueueGenericReset() taskENTER_CRITICAL() // uxCriticalNesting = 0xaaaaaaab, ints disabled taskEXIT_CRITICAL() // uxCriticalNesting = 0xaaaaaaaa, ints disabled xTaskCreateStatic() prvAddNewTaskToReadyList() taskENTER_CRITICAL() // uxCriticalNesting = 0xaaaaaaab, ints disabled taskEXIT_CRITICAL() // uxCriticalNesting = 0xaaaaaaaa, ints disabled // from now on, the behavior seems correct; it actually manages to fix the problem caused by the previous calls portDISABLE_INTERRUPTS(); // this just disables interrupts, without changing uxCriticalNesting; it's done on purpose by vTaskStartScheduler(), see the comment for https://github.com/FreeRTOS/FreeRTOS-Kernel/blob/main/tasks.c#L2059 xPortStartScheduler() uxCriticalNesting=0 // https://github.com/FreeRTOS/FreeRTOS-Kernel/blob/main/portable/GCC/ARM_CM7/r0p1/port.c#L363 prvPortStartFirstTask() cpsie i svc 0 vPortSVCHandler() msr basepri, r0 // this re-enables interrupts; now we are in a clean working condition: interrupts enabled, uxCriticalNesting = 0 bx r14 // now the first task is actually started

This summary also shows that the problem is not in my creating a task beforehand; it exists even within vTaskStartScheduler(), with the creation of the idle task and the timer task. However, the problem is worse if I create objects beforehand, since interrupts remain disabled for longer.

A straightforward and very lightweight solution would be to just initialize uxCriticalNesting to 0, statically. Is there any reason to have it initialized to 0xaaaaaaaa? It seems that no function ever checks for this "uninitialized" value, and debuggers should have other better ways to detect whether the scheduler has been inited/started yet.

As an aside: is there any reason why there are these assertions in port.c? configASSERT( uxCriticalNesting == 1000UL ); configASSERT( uxCriticalNesting == ~0UL ); rather than a simple: configASSERT( 0 ); test_freertos.zip

RichardBarry commented 3 years ago

Thanks for taking the time to provide this report. In this case however the behaviour you describe is intentional and applies to all FreeRTOS port. In the early days of FreeRTOS one of the most common issues users had was installing an interrupt prior to starting the scheduler, and then for that interrupt to execute and try to use the kernel before the kernel had started. Hence, for many years now, any time you start using the kernel's API interrupts are deliberately kept masked to ensure the code can execute all the way to starting the scheduler without the above happening. As you say, interrupts are re-enabled when the scheduler starts.

We always recommend discussing topics in the support forum and then opening a bug report if the result of the discussion is that a bug exists.

stefano-zanotti commented 3 years ago

Thanks for your explanation. However, I still don't understand it. Isn't it still a problem if an interrupt tries to call FreeRTOS before it has started? The behavior of uxCriticalNesting initialized to 0xaaaaaaaa, is such that the interrupt would call FreeRTOS, interrupts would get disabled, the call would be executed, and then interrupts would be left disabled. The call itself is not prevented.

The solution would be not enabling the interrupts beforehand, or having the interrupts themselves check if the scheduler is running (all of which is the application's duty, not the rtos'). How is servicing one call and then disabling interrupts helping with the problem?

Sorry for posting here instead of the forum; now I know.

RichardBarry commented 3 years ago

The sequence would be like this: