ciaa / firmware_v1

Firmware de la CIAA
http://www.proyecto-ciaa.com.ar

getNextTask() in Schedule_Int returns invalid task IDs on some compilers #446

Closed glpuga closed 8 years ago

glpuga commented 8 years ago

While debugging the LEON 3 port of the firmware, I noticed that when running the example rtos_example (attached to this issue), Schedule_Int calls JmpTask() with an invalid task ID as its argument.

I looked for the cause to see whether it was a bug in my code, but I now think I have traced the root of the issue to the fact that the variable ReadyVar[] seems to be used but never initialized anywhere in the code. Because of this, getNextTask() in some cases returns a nonexistent task ID when called from Schedule_Int, when it should be returning INVALID_TASK instead.

Not all platforms are equally affected. At least the ia64 port seems to work just fine because, even though the table is never initialized, all of its elements are indeed 0 when the program starts running. Since this issue has not been noticed before, I guess that something similar happens with the Cortex ports as well.

The compiler I'm using is a gcc cross compiler for SPARC architectures.

I attached both the example that I was running and a screen capture of two debugging sessions of that example program. The one on the left is using the ciaa_sim_ia64 port, while the one on the right is the LEON 3 port that I am still working on.

It can be seen that while in the ia64 session getNextTask() returns 255 (INVALID_TASK), the same code compiled with the LEON compiler returns 187, causing Schedule to call JmpTask(). The contents of ReadyVar are visible at the bottom for both platforms.

dualdebug

rtos_example.zip

mcerdeiro commented 8 years ago

I have never heard of a port for LEON 3, so I have to suppose that it may be incomplete.

Anyway, I guess the problem is that the variable ReadyVar is not zero. In C90, global variables without an explicit initializer shall be set to 0 by default, so ReadyVar shall by default be an array of queues which are all empty. In your LEON 3 port this seems not to be the case. This may happen when the startup code tries to optimize the startup time and does not set all global variables to zero. That is common behaviour in embedded ECUs, but it does not conform to C.

Please do the following: set a breakpoint in main, reset the board, and when main is reached check the content of ReadyVar. If it is not all 0 values, something is wrong during the initialization. :(

glpuga commented 8 years ago

Did as you said and indeed I found that the compiler was not to blame for the problem. When execution is stopped exactly at the start of main() the structure is initialized to 0.

Kept digging and found that there was a memory corruption problem very early in the initialization code for my port: a buffer overrun in an interrupt handling routine that was causing parts of adjacent data structures to be overwritten.

My fault, I'm sorry.

As far as I'm concerned, this issue is ready to be closed.

mcerdeiro commented 8 years ago

Good to know that you found the solution to the problem. :)

On 11.10.2016 7:43 PM, "Gerardo Puga" notifications@github.com wrote:

Closed #446 https://github.com/ciaa/Firmware/issues/446.
