ciaa / firmware_v1

CIAA firmware
http://www.proyecto-ciaa.com.ar

ciaa-nxp, edu-ciaa-nxp SetEvent From ISR2 + WaitEvent from FULL preemptive task stops the system #457

Closed mabeett closed 1 year ago

mabeett commented 7 years ago

If the UART ISR2 calls SetEvent on a full preemptive task, the system crashes after some handler invocations. This bug could be the cause of #456 and #455, or at least one of the causes, since ciaaSerialDevices_txConfirmation() and ciaaSerialDevices_rxIndication() call SetEvent() and are themselves called from an ISR.

To trigger the bug in an isolated application, use the attached code. It is a test project based on the rtos_example example. Build the project and send a large non-stop string to the USB UART, for example by sending a large file to /dev/ttyUSB1 with cat on Linux, or by using a GUI such as gtkterm or HyperTerminal.
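As an alternative to cat or a terminal GUI, a small host-side helper can produce the non-stop stream. The following is only a sketch under assumptions: `send_pattern` and its test pattern are hypothetical names invented for illustration, the descriptor is assumed to be already open, and the serial port is assumed to be already configured (baud rate, raw mode) by other means.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper, not part of the firmware or the attached project.
 * Writes `total` bytes of a repeating printable pattern to an already-open
 * file descriptor. Returns the number of bytes written, or -1 on error. */
static long send_pattern(int fd, size_t total)
{
    static const char pattern[] = "0123456789ABCDEF\r\n";
    const size_t plen = sizeof pattern - 1;   /* exclude the trailing NUL */
    size_t sent = 0;

    assert(fd >= 0);

    while (sent < total) {
        size_t offset = sent % plen;          /* resume inside the pattern */
        size_t chunk = plen - offset;
        if (chunk > total - sent) {
            chunk = total - sent;             /* do not overshoot `total` */
        }
        ssize_t n = write(fd, pattern + offset, chunk);
        if (n < 0) {
            if (errno == EINTR) {
                continue;                     /* interrupted: retry */
            }
            return -1;
        }
        if (n == 0) {
            break;                            /* nothing accepted: stop */
        }
        sent += (size_t)n;
    }
    return (long)sent;
}
```

A caller would open the device mentioned above (for instance `open("/dev/ttyUSB1", O_WRONLY | O_NOCTTY)` from `<fcntl.h>`) and call `send_pattern` with a large `total`, possibly in a loop, while the board runs the test project.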

blinky_uart_waitevent.zip

mabeett commented 7 years ago

Related to #458. The solution would be to apply what is stated in that issue to this one.

mabeett commented 7 years ago

A few days ago @gmuro posted a patch on the mailing list.

The patch is this:

modules/rtos$ git diff -- src/Schedule.c
diff --git a/src/Schedule.c b/src/Schedule.c
index 7b01dc6..a9e154a 100644
--- a/src/Schedule.c
+++ b/src/Schedule.c
@@ -159,14 +159,13 @@ extern StatusType Schedule
          /* set actual context task */
          SetActualContext(CONTEXT_TASK);

-         IntSecure_End();
-
 #if (HOOK_PRETASKHOOK == OSEK_ENABLE)
          PreTaskHook();
 #endif /* #if (HOOK_PRETASKHOOK == OSEK_ENABLE) */

          /* jmp tp the next task */
          JmpTask(nextTask);
+         IntSecure_End();
       }
       else
       {
@@ -197,8 +196,6 @@ extern StatusType Schedule
             /* set actual context task */
             SetActualContext(CONTEXT_TASK);

-            IntSecure_End();
-
 #if (HOOK_PRETASKHOOK == OSEK_ENABLE)
             PreTaskHook();
 #endif /* #if (HOOK_PRETASKHOOK == OSEK_ENABLE) */
@@ -206,6 +203,7 @@ extern StatusType Schedule
             /* \req OSEK_SYS_3.4.1.3 its context is saved */
             /* \req OSEK_SYS_3.4.1.4 and the higher-priority task is executed */
             CallTask(actualTask, nextTask);
+            IntSecure_End();
          }
          else
          {

I tested the attached example test program with the patch on an edu-ciaa-nxp v1.1 board and there were no crashes. I also tested the blinking_echo example, with no crashes.

Please remember to send data via RS-232/FTDI-UART in order to trigger the crash.

Warning: applying this patch will break the x86 port.
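The reordering in the patch can be pictured with a toy model. This is a sketch under stated assumptions, not the firmware's scheduler: every name below (`model_state`, `maybe_run_isr`, `switch_is_safe`) is hypothetical, and the model simply assumes that the scheduler updates its bookkeeping before the context switch and that an ISR running between those two steps sees inconsistent state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names come from the firmware sources. */
typedef enum { TASK_A = 0, TASK_B = 1 } task_id;

typedef struct {
    task_id bookkeeping_running; /* task the scheduler believes is running */
    task_id cpu_context;         /* task whose context is really live */
    bool    irq_masked;          /* true while inside the critical section */
    bool    irq_pending;         /* a UART interrupt is waiting to be served */
    bool    isr_saw_mismatch;    /* set if the ISR ran on inconsistent state */
} model_state;

/* Serve a pending interrupt if interrupts are currently enabled. */
static void maybe_run_isr(model_state *s)
{
    assert(s != NULL);
    if (s->irq_pending && !s->irq_masked) {
        s->irq_pending = false;
        /* An ISR2 calling SetEvent() would consult scheduler state here. */
        if (s->bookkeeping_running != s->cpu_context) {
            s->isr_saw_mismatch = true;
        }
    }
}

/* Model a switch from TASK_A to TASK_B with an interrupt already pending.
 * unmask_before_switch == true  : ordering before the patch
 * unmask_before_switch == false : ordering after the patch
 * Returns true if the ISR only ever ran on consistent state. */
static bool switch_is_safe(bool unmask_before_switch)
{
    model_state s = { TASK_A, TASK_A, true, true, false };

    s.bookkeeping_running = TASK_B;   /* scheduler updates its records */

    if (unmask_before_switch) {
        s.irq_masked = false;         /* stands in for IntSecure_End() */
        maybe_run_isr(&s);            /* the gap: ISR fires before the switch */
        s.cpu_context = TASK_B;       /* stands in for JmpTask()/CallTask() */
    } else {
        s.cpu_context = TASK_B;       /* stands in for JmpTask()/CallTask() */
        s.irq_masked = false;         /* stands in for IntSecure_End() */
        maybe_run_isr(&s);            /* ISR fires on consistent state */
    }
    return !s.isr_saw_mismatch;
}
```

In this model `switch_is_safe(true)` reports a mismatch and `switch_is_safe(false)` does not, which matches the direction of the patch. The model says nothing about how each port implements the switch itself.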

mabeett commented 6 years ago

TODO: test glpuga branch features/cortexM4contextswitching

mabeett commented 6 years ago

> TODO: test glpuga branch features/cortexM4contextswitching

Unfortunately the bug related with two UARTs receiving information persists =(

glpuga commented 6 years ago

> Unfortunately the bug related with two UARTs receiving information persists =(

Ok, noted.

While working on this I didn't find any possible scenario under which more than a single UART interrupting the program would cause a problem different from the ones already caused in the single-UART case by the atomicity gap before JmpTask() / CallTask(). So for now I'm betting that this is a different bug, not related to #458.

Just to confirm, did you try the test program with a single UART receiving information? Did that perform as expected?

mabeett commented 6 years ago

> Just to confirm, did you try the test program with a single UART receiving information? Did that perform as expected?

Five days ago (prompted by your question) I started testing:

a) The test program with no UART activity: OK, passed (more than 72 hours).
b) The test program receiving UART data on the RS-232 connector: OK, passed (approximately 24 hours).
c) The test program receiving UART data via the debug port: OK, passed (12 hours and counting).

> While working on this I didn't find any possible scenario under which more than a single UART interrupting the program would cause a problem different...

Note that this test application uses brute force to provoke the unusual event that breaks the system. Unfortunately, because the condition is unusual, the bug is harder to observe.