eclipse-threadx / threadx

Eclipse ThreadX is an advanced real-time operating system (RTOS) designed specifically for deeply embedded applications.
https://github.com/eclipse-threadx/rtos-docs/blob/main/rtos-docs/threadx/index.md
MIT License

tx_thread_wait_abort() API wipes a semaphore in an exceptionally rare case #221

Closed by t-yabui 1 year ago

t-yabui commented 1 year ago

Describe the bug

tx_thread_wait_abort() wipes a semaphore in an exceptionally rare case.

  1. Thread_A is waiting on a semaphore in tx_semaphore_get().
  2. Thread_B puts the semaphore with tx_semaphore_put() or tx_semaphore_ceiling_put().
  3. At that same moment, an interrupt handler (such as a H/W timer) calls tx_thread_wait_abort() to release Thread_A.

As a result:

Even though the put was performed, in an exceptionally rare case the semaphore ends up empty. In the normal case the put is added to the semaphore, which is the behavior I expect.

To Reproduce

Reproduction is very difficult because the timing window is extremely narrow. However, if you insert a delay loop at the following location in the code, it becomes easy to reproduce.

https://github.com/azure-rtos/threadx/blob/4e62226eeaf870827facbeef40bd1767db5cf9f0/common/src/tx_semaphore_put.c#L203-L206

as shown below:

        TX_RESTORE

        for(volatile int dummy=0 ; dummy < 100000 ; dummy++);

        /* Resume thread.  */
        _tx_thread_system_resume(thread_ptr);

If tx_thread_wait_abort() wakes up Thread_A after TX_RESTORE, could it be that tx_semaphore_put() no longer wakes up Thread_A with TX_SUCCESS?
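The suspected window can be sketched with a toy model in plain C. This is not ThreadX code: every name below (toy_sem, toy_put_locked, toy_wait_abort, and so on) is hypothetical. The model assumes that the put hands the token directly to a waiting thread and marks it successful before interrupts are restored, and that the abort overwrites that status for any thread that still looks suspended. Under those assumptions, an abort that lands between the two halves of the put leaves Thread_A without a success status while the count stays at 0.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: every name here is hypothetical and nothing below is
   taken from the ThreadX sources. */
enum toy_status { TOY_PENDING, TOY_SUCCESS, TOY_WAIT_ABORTED };

struct toy_thread { int suspended; enum toy_status status; };
struct toy_sem    { int count; struct toy_thread *waiter; };

/* Step 1: Thread_A blocks on an empty semaphore. */
static void toy_get_block(struct toy_sem *s, struct toy_thread *t)
{
    assert(s != NULL && t != NULL && s->count == 0);
    t->suspended = 1;
    t->status = TOY_PENDING;
    s->waiter = t;
}

/* First half of the put, modelled as running with interrupts disabled.
   A waiting thread receives the token directly, so count stays at 0.
   Returns the thread that still has to be resumed, or NULL. */
static struct toy_thread *toy_put_locked(struct toy_sem *s)
{
    struct toy_thread *t = s->waiter;
    if (t == NULL) {
        s->count++;          /* nobody waiting: the token is stored */
        return NULL;
    }
    s->waiter = NULL;
    t->status = TOY_SUCCESS; /* assumed: success is recorded before resume */
    return t;
}

/* Second half of the put, modelled as running after interrupts are restored. */
static void toy_put_resume(struct toy_thread *t)
{
    if (t != NULL) {
        t->suspended = 0;
    }
}

/* Abort without a guard (assumed behavior): any thread that still looks
   suspended is released as aborted, overwriting an earlier success. */
static void toy_wait_abort(struct toy_sem *s, struct toy_thread *t)
{
    if (!t->suspended) {
        return;
    }
    if (s->waiter == t) {
        s->waiter = NULL;
    }
    t->status = TOY_WAIT_ABORTED;
    t->suspended = 0;
}

/* Runs steps 1 to 3. Returns 1 if the token was lost, meaning Thread_A
   did not get success and the semaphore count is still 0. */
static int toy_token_lost(int abort_in_window)
{
    struct toy_sem s = { 0, NULL };
    struct toy_thread a = { 0, TOY_PENDING };
    struct toy_thread *pending;

    toy_get_block(&s, &a);
    if (!abort_in_window) {
        toy_wait_abort(&s, &a);   /* normal case: abort lands before the put */
    }
    pending = toy_put_locked(&s);
    if (abort_in_window) {
        toy_wait_abort(&s, &a);   /* rare case: abort lands inside the window */
    }
    toy_put_resume(pending);

    return (a.status != TOY_SUCCESS) && (s.count == 0);
}
```

In this model, toy_token_lost(0) reports no loss because the put finds no waiter and increments the count, while toy_token_lost(1) reports a lost token. The delay loop shown above widens the real counterpart of the gap between toy_put_locked() and toy_put_resume().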

goldscott commented 1 year ago

Hi @t-yabui I am able to re-create what you are observing. We are working on a fix.

goldscott commented 1 year ago

Hi @t-yabui - we have a fix. The code will be pushed to GitHub sometime in the next month. If you would like it immediately, see the attached file: tx_thread_wait_abort.txt. Please let us know if this works for you.

t-yabui commented 1 year ago

@goldscott Thank you for the modification. I tested the attached code. With it, tx_semaphore_get() and the other APIs (*) return the expected results. I think this problem has been solved.

(*) I tested tx_event_flags_get(), tx_byte_allocate(), and tx_block_allocate().

TiejunMS commented 1 year ago

Fixed in 6.2.1 release.