FreeRTOS / Lab-Project-FreeRTOS-POSIX

This repository contains FreeRTOS+POSIX source code, which could be consumed as a submodule.
MIT License
105 stars 62 forks source link

timer_settime vs timer_delete concurrency issue #16

Closed fernandodabadia closed 2 years ago

fernandodabadia commented 2 years ago

Hi,

I found a concurrency issue in the implementation of the POSIX timer functions.

I was investigating a scenario where a Hard Fault occurred in the microcontroller due to the stack being corrupted. After a few days of analysis, I managed to get the exact scenario. A timer was being enabled after the timer delete command had been executed.

I confirmed that the application is performing the correct sequence call: timer_create  a timer_settime  timer_delete. In this specific case, the settime and delete were happening in sequence, with no interval, due to a specific product behavior. The task that calls the create, settime, and delete functions have a higher priority than the FreeRTOS timer task (prvTimerTask). For this scenario, the timer start and stop requests are in the prvTimerTask queue without being executed while the timer_delete function is already finished. The problem occurs because the timer_delete frees the memory allocated to the timer as soon as the function finishes and the condition for it to finish is that the timer is stopped. Note that as timer start and stop requests are still queued to be handled by prvTimerTask, so when timer_delete queries whether the timer is stopped, it gets a yes response and then deallocates the resources. After that, when prvTimerTask runs it handles the items in the queue and then starts a timer that no longer exists (memory has been deallocated) and memory corruption occurs.

In my analysis, the fix is simple, it should wait at the end of the timer_settime() function until the timer has already been created before returning from the function (as is verified in timer_delete if the timer is stopped before deallocating resources). Do you see any negative impact from this approach?

muneebahmed10 commented 2 years ago

Hi @fernandodabadia,

It sounds like you want timer_settime to wait until the timer has started again before returning, so that if delete is called afterwards then the timer won't be in the stopped state. This sounds reasonable to me, so I created PR #17 to implement this. Can you pull that PR and confirm that it resolves your issue?

Additionally, what is your motivation for using the POSIX wrapper instead of native FreeRTOS APIs? Using the native APIs would likely give you more control in your application, as this wrapper only implements a small subset of the POSIX API.

Thanks

fernandodabadia commented 2 years ago

Hi @muneebahmed10,

For me, the modifications of PR https://github.com/FreeRTOS/Lab-Project-FreeRTOS-POSIX/pull/17 are ok. I had already made these modifications in my local repo and run some tests, even before your answer, and the problem here was solved. The main purpose of the questioning was to see if there could be any negative impact that I might not be seeing. Thanks for the feedback, I'm pretty confident with the solution. I already approved the PR, should I do any other procedure?

Regarding the use of the POSIX wrapper, it's because in the product I'm working on it was need to change from another RTOS to FreeRTOS. The another RTOS was POSIX and using the FreeRTOS POSIX wrapper made porting the code easier. Another point that was also very important for us is the issue of portability and code reuse that POSIX makes possible. For this project, while the codebase using FreeRTOS was not ready, some modules were developed in linux and then ported to the product when the board is ready with FreeRTOS running.

muneebahmed10 commented 2 years ago

Hi @fernandodabadia,

Thanks for checking, we've gone ahead and merged #17. Since the POSIX wrapper only implements a subset of POSIX and using the native APIs results in less indirection, we wanted to ask to make sure that the wrapper fits your use case and provides a better developer experience.

As we've merged the PR, I'm going to go ahead and close this issue. Please open a new issue or PR if you have any further issue reports.

Thank you