Closed escherstair closed 1 year ago
Hi,
I briefly looked at the example structure and HAL_IncTick is called in startup_stm32h743xx.s where default weak reference to SysTick_Handler was modified and renamed to _SysTick_Handler. If you updated startup files then your updated file does not contain this modification anymore and this could be a reason that HAL_IncTick is never called.
Hi @VladimirUmek I'm not sure I understood what you explained. I see that HAL_IncTick is called in startup_stm32h743xx.s inside _SysTick_Handler. But _SysTick_Handler is never called when I enabled FreeRTOS, and the SysTick interrupt stays in pending state (never serviced).
In my previous response I was just searching for a reason that HAL_IncTick is never called. And if you would update startup_stm32h743xx.s then this could be one option - because with the update, modification in startup that calls HAL_IncTick would be gone. But as you said, you still have this code and SysTick interrupt is apparently never serviced. Is SysTick interrupt enabled? What is its priority? Another option would be that interrupts are masked, check BASEPRI value.
Unfortunately I cannot test the example because I don't have any appropriate board, but I'll investigate further and check what could go wrong.
Thanks to your suggestion I get some additional info. SysTick is enabled and has priority set to 15 But BASEPRI is not what is expected. It's set to 0x50 (which is the value used for the OS critical regions) somewhere in the scatterload process, before FreeRTOS scheduler has been configured. I'm going to investigate, but feel free to share your thoughts.
Hi @VladimirUmek
I was able to see the reason why this happens, and I think this confirmed my doubts on clib_arm.c implementation.
BASEPRI is set to 0x50 during __rt_entry()
which calls _init_alloc
that calls _mutex_initialize
(see below call stack)
It calls explicitly xSemaphoreCreateMutexStatic()
but FreeRTOS has not been created yet.
Following the call stack, xQueueGenericReset()
is called
https://github.com/ARM-software/CMSIS-FreeRTOS/blob/3da36094e21d6fed61719ecf9bb3530a11fd2bb0/Source/queue.c#L266-L267
and it calls
https://github.com/ARM-software/CMSIS-FreeRTOS/blob/3da36094e21d6fed61719ecf9bb3530a11fd2bb0/Source/queue.c#L279
setting BASEPRI to 0x50
The problem is that when
https://github.com/ARM-software/CMSIS-FreeRTOS/blob/3da36094e21d6fed61719ecf9bb3530a11fd2bb0/Source/queue.c#L318
is called, this function is mapped to vPortExitCritical()
https://github.com/ARM-software/CMSIS-FreeRTOS/blob/3da36094e21d6fed61719ecf9bb3530a11fd2bb0/Source/portable/GCC/ARM_CM7/r0p1/port.c#L415-L424
But variable uxCriticalNesting
has not been set yet and it has the default value of
https://github.com/ARM-software/CMSIS-FreeRTOS/blob/3da36094e21d6fed61719ecf9bb3530a11fd2bb0/Source/portable/GCC/ARM_CM7/r0p1/port.c#L142
The reason is because it's initilized by
https://github.com/ARM-software/CMSIS-FreeRTOS/blob/3da36094e21d6fed61719ecf9bb3530a11fd2bb0/Source/portable/GCC/ARM_CM7/r0p1/port.c#L365
when FreeRTOS scheduler is started.
Do you agree to my analysis? Do you think that a fix to clib_arm.c is necessary? Maybe the implementation for RTOS2::RTX5 can be used as a guideline (rtx_lib.c)?
Many thanks again for the detailed analysis! This helps a lot and I understand the problem now.
I don't think this is the problem of implementation in clib_arm.c but I would say that it is the problem of port (initialization) implementation in FreeRTOS: Before you start the kernel you can create threads and other objects like mutexes, semaphores etc. If you would have an application that only contains
int main(void) {
xSemaphoreCreateMutexStatic(...);
xTaskCreate(...);
vTaskStartScheduler();
}
the situation after the call of xSemaphoreCreateMutexStatic would be exactly the same as in your application:
vPortEnterCritical
disables interrupts and uxCriticalNesting
prevents vPortExitCritical
to re-enable them. And function xPortStartScheduler
is the only point where uxCriticalNesting
is set to zero.
This means that any application that creates RTOS objects before the kernel is started has its interrupts blocked (up to BASEPRI).
In case of our application, pre-main calls xSemaphoreCreateMutexStatic
thus disables interrupts and later on hangs in a call to HAL_GetTick
because SysTick handler cannot execute.
You can take a look at examples (Blinky is fine) in Keil.STM32H7xx_DFP for one possible solution to this problem (re-implement HAL_GetTick
and enable it for the case when the kernel is not yet started).
Thanks for your help. Let me summarize with different words what I understood, so that you can confirm if I understood...
HAL_Delay
called before the kernel is started, would hang forever (becasue SysTick handler cannot execute)HAL_GetTick
so that id "emulates" the job of SysTick handler while the kernel has not been started yetAm I right? If this si the case, I think that the best long-term solution would be fixing the FreeRTOS porting. Do you agree? Who is in charge for this porting? CMSIS or FreeRTOS? So that I know where to open an issue
As I said, I think that such behavior is a design choice to keep the kernel state machine consistent until the kernel is started. FreeRTOS maintains the ports but I don't think this is a bug although in some cases it may represent a problem for application designer...
You are correct that the FreeRTOS behaviour is intended. Every port behaves this way, and has done from very early versions of FreeRTOS where a primary cause of support requests were interrupts starting to use the FreeRTOS API before the kernel had been started - a particularly difficult issue to debug. See point 4 "Interrupts are not executing" on this page. A simple work around is to manually clear the primask register when necessary - but then you may need additional code to prevent the FreeRTOS API trying to context switch in the systick before the kernel is running - which will be inefficient as the additional code will run every time the systick executes.
Hello @RichardBarry, first of all thank you very much for the detailed information. The behavior is quite odd, and so it's not easy understanding what happens.
I'm not sure I got your point: as far as I understand, you say that FreeRTOS needs that interrupts don't use the FreeRTOS API before the kernel has been started. Am I right?
If this is the case, I understand that a "quick-and-dirty" workaround has been taken from very early versions of FreeRTOS: disabling interrupts at all until the kernel is started. It seems to me quite a "brutal" wrkaround (even if it's stated at point 4 here):
If a FreeRTOS API function is called before the scheduler has been started then interrupts will deliberately be left disabled, and not re-enable again until the first task starts to execute. This is done to protect the system from crashes caused by interrupts attempting to use FreeRTOS API functions during system initialisation, before the scheduler has been started, and while the scheduler may be in an inconsistent state.
It seems that a normal function (not an interrupt) could use FreeRTOS API functions during system initialisation.
And this is what happens inside __scatterload
Can you confirm that this usage is ok for FreeRTOS?
Then you write:
but then you may need additional code to prevent the FreeRTOS API trying to context switch in the systick before the kernel is running - which will be inefficient as the additional code will run every time the systick executes.
This seems to me exactly what CMSIS-FreeRTOS adoption layer does here
https://github.com/ARM-software/CMSIS-FreeRTOS/blob/3da36094e21d6fed61719ecf9bb3530a11fd2bb0/CMSIS/RTOS2/FreeRTOS/Source/cmsis_os2.c#L160-L164
It seems that CMSIS-FreeRTOS protects from the risk of "interrupts attempting to use FreeRTOS API functions during system initialisation" inside SysTick_Handler()
and FreeRTOS itself protects with a heavy decision of disabling the interrupts.
And so I get both the downsides:
HAL_Delay()
not working)SysTick_Handler()
Are you sure is not possible handling this risk in a different way?
It seems to me quite a "brutal" wrkaround
Maybe - but very effective. It has probably saved many weeks of frustration for FreeRTOS users that otherwise would experience crashes caused outside of their view of the program execution - which can be tricky to track down.
This is rarely a problem for users who create all their RTOS objects then start the scheduler. Another common usage model is to create a task that performs the initialisation, then spawns the other tasks, so perform initialisation with the scheduler already running.
It seems that a normal function (not an interrupt) could use FreeRTOS API functions during system initialisation. And this is what happens inside __scatterload Can you confirm that this usage is ok for FreeRTOS?
That is not a problem - although I would question why interrupts are used in __scatterload().
This seems to me exactly what CMSIS-FreeRTOS adoption layer does here
Yes it looks like that is the case. I also wrote:
which will be inefficient as the additional code will run every time the systick executes
So in the code you posted every tick interrupt has the overhead of an extra function call and an extra test. Looks like fixing the symptom rather than the cause.
Are you sure is not possible handling this risk in a different way?
In FreeRTOS we try and avoid additional code that checks the system state every time (for example, see https://freertos.org/FAQ_API.html#IQRAPI - there are trade offs made for every decision) - we would welcome any ideas for preventing interrupts using the API before the scheduler is running without needing any additional code that will then also be present for the lifetime of the application.
Thank you a lot @RichardBarry Now the whole situation is much more clear.
I fully agree to the fact that every decision requires some trade offs.
Based on what I learned from this lesson, I've been thinking on if/how something could be improved.
What happens with actual implementation of __scatterload
in the case when microlib is not used (it leaves interrupts globally disabled after having created mutextes to protect libc streams) is that FreeRTOS users experience failures caused outside of their view of the program execution - and this is tricky to track down.
Based on your experience, do you (@VladimirUmek and @RichardBarry) think that __scatterload
could/should re-enable interrupts as its last instruction?
In this way the FreeRTOS user doesn't find an unexpected situation as the starting point, and he's responsible to write code using FreeRTOS as designed.
I'm not sure who is responsible to evaluate this decision. I think someone in ARM.
The only idea to improve FreeRTOS side (if possible) is to inform the caller about the "side-effect" of the function call. I think this is a general best-practise. I mean, if a FreeRTOS function doesn't re-enable interrupts because the scheduler has not been started yet, I think this is a side-effect of the call. And as every side effect, the caller should be somehow informed about it. Is there a "standard" way to inform FreeRTOS callers about side-effects? A way already used, I mean. @RichardBarry feel free to share your ideas on thsi suggestion.
I don't think there is right or wrong answer in this case.
One scenario that could happen in mentioned FTP demo application:
xSemaphoreCreateMutexStatic
interrupts are re-enabled.HAL_Init
configures SysTick and starts it._SysTick_Handler
executes, calls HAL_IncTick
and calls SysTick_Handler
.if (xTaskGetSchedulerState() != taskSCHEDULER_NOT_STARTED)
check is removed from line 160, cmsis_os2.c.xPortSysTickHandler
is executedNow, application didn't start the kernel, but the kernel tick is already started and is incrementing. And we can start discussing why the kernel tick is incrementing if the kernel is not even started. Who started SysTick? Who should start SysTick? Why is SysTick used by both STM32Cube HAL and FreeRTOS? Etc, etc...
There is A LOT of existing code that is wrong. That's why interrupts are disabled after RTOS objects get created. That's why there is a check if (xTaskGetSchedulerState() != taskSCHEDULER_NOT_STARTED)
in cmsis_os2.c.
We improve documentation but many developers don't read it. We add additional checking in the code and developers complain about overhead.
And, in mentioned application, which function hangs because HAL_IncTick
is never called? Is it MX_SDMMC1_SD_Init
? If it is, well, this function is not even necessary to be called in order for application to work...
Trade offs...
It is an ugly situation, I agree and can document it, but from my perspective, this is integration issue in application.
Not directly related: I understand why C startup code enables interrupts before calling main(), but I don't understand why that happens inside __scatterload(). As far as I know __scatterload() is responsible for initialising memory.
Yes and the only thing that happens inside __scatterload
is memory initialization.
After memory regions are initialized, control is passed to __rt_entry where standard C library is initialized. C library uses mutexes to protect its internal variables and streams (default heap management + stdio streams). And this is where xSemaphoreCreateMutexStatic
is called. That's it. And soon after that we end up in main
.
Thank you @VladimirUmek. You explained much better than me what happens during C-lib "initialization".
Now everything is clear, but what confuses me at the beginning was that:
main
starts with interrupts enabledmain
starts with interrupts disabled
FreeRTOS is used in both cases.
Now I understand why and I know how to write my code so that it works in both cases (and even if I switch to another CMSIS::RTOs2 implementation, I hope)I agree that reaching main
once with interrupts disabled and once with interrupts enabled is not a good solution. Therefore, I will look if clib_arm.c could re-enable interrupts as Richard suggested. If there is a clean way to add this and there will be no side effects, interrupt will be re-enabled.
We will document this behavior also in CMSIS-FreeRTOS documentation.
I apologize for asking as not part of the original conversation but I have had the same issue, in general. I have no problem with the reasoning of the start up procedure. It was just annoying having to trace down where there might be a an unstable irq or mutex state, especially when running into stack overflows/corruption. This is an old suggestion but I was reminded of it when I was going though some ddk function calls back in the Window 98 days where most functions had blank define annotations for function arguments. As a simple example
#define __AQUIRED
#define __RELEASED
__USED void _mutex_acquire (__RELEASED mutex *m);
__USED void _mutex_release (__AQUIRED mutex *m);
I hate to say, having the tags _HEAP_ONLY and _MUST_FREE has saved me many times in smaller projects where I had to do simple one off changes. With the almost mandatory need for IDE's to support IntelliSense, doing something like a full Doxygen implementation seems overkill. The point of the mark is not to notify somone who knows the deeper implementation, rather, somone having to do a quick stack trace with nothing other than a FUNCTION string.
This is just an unsolicited comment, its just nice to come across an interesting conversation. Especially after 4 hours of debugging a stack corruption error to find the pointer to the heap was swaped.
Sources and documentation were updated, hence closing the issue. Please reopen in case anything needs further attention.
I found a problem while playing a little bit with FTP Server example from Oryx-Embedded on STM32H743I-EVAL board. This example works out-of-the-box using RTOS2 interface, implemented through RTX5. Since the example needs File System from Keil.MDK-Middleware, MicroLIB cannot be used.
If I change RTOS2 implementation from RTX5 to FreeRTOS (downloaded through its own CMSIS pack), the example doesn't work anymore. The most visible issue is that
void HAL_IncTick(void)
is never called and the interrupt stays in pending state forever. Has someone an idea on what could be the reason?For this reason I investigated deeper inside the sources to find the differences between RTX5 and FreeRTOS implementations behind RTOS2 interface.
I found some strange things inside clib_arm.c (used by CMSIS-FreeRTOS) compared to rtx_lib.c (used by RTX5). I summarize what I see:
Point 1 This comes from
rtx_lib.c
https://github.com/ARM-software/CMSIS_5/blob/b0c7f1933926c805292b0368eb19e55b6646d18b/CMSIS/RTOS2/RTX/Source/rtx_lib.c#L623-L630 where I seeosKernelInitialize()
is called. This comes fromclib_arm.c
https://github.com/ARM-software/CMSIS-FreeRTOS/blob/3da36094e21d6fed61719ecf9bb3530a11fd2bb0/CMSIS/RTOS2/FreeRTOS/Source/ARM/clib_arm.c#L45-L54 where I see that nothing is node (apart from Event Recording, which is NOT defined in the FTP Server example).Point 2 This comes from
rtx_lib.c
https://github.com/ARM-software/CMSIS_5/blob/b0c7f1933926c805292b0368eb19e55b6646d18b/CMSIS/RTOS2/RTX/Source/rtx_lib.c#L690-L693 and thedefine
is different from what I see fromclib_arm.c
https://github.com/ARM-software/CMSIS-FreeRTOS/blob/3da36094e21d6fed61719ecf9bb3530a11fd2bb0/CMSIS/RTOS2/FreeRTOS/Source/ARM/clib_arm.c#L71 sinceRTX_NO_MULTITHREAD_CLIB
is missing from this one.Point 3 This comes from
rtx_lib.c
https://github.com/ARM-software/CMSIS_5/blob/b0c7f1933926c805292b0368eb19e55b6646d18b/CMSIS/RTOS2/RTX/Source/rtx_lib.c#L701-L707 where I can see that the variables are placed in abss.os.libspace
region. This comes fromclib_arm.c
https://github.com/ARM-software/CMSIS-FreeRTOS/blob/3da36094e21d6fed61719ecf9bb3530a11fd2bb0/CMSIS/RTOS2/FreeRTOS/Source/ARM/clib_arm.c#L75-L79 where no explicit placement is done.Point 4 Functions from
rtx_lib.c
https://github.com/ARM-software/CMSIS_5/blob/b0c7f1933926c805292b0368eb19e55b6646d18b/CMSIS/RTOS2/RTX/Source/rtx_lib.c#L710-L794 call RTOS2 functions (osKernelGetState()
,osThreadGetId()
, ...) On the other side, functions fromclib_arm.c
https://github.com/ARM-software/CMSIS-FreeRTOS/blob/3da36094e21d6fed61719ecf9bb3530a11fd2bb0/CMSIS/RTOS2/FreeRTOS/Source/ARM/clib_arm.c#L81-L198 call a lot of native FreeRTOS functions (xTaskGetSchedulerState()
,xTaskGetCurrentTaskHandle()
, ...) and I don't understand if the do exactly what is expected from them.Does clib_arm.c needs an upgrade to be aligned with rtx_lib.c?