IvanZuy / freertos_c28x

FreeRTOS port for TI C2000's C28x based microcontrollers
MIT License
71 stars 32 forks source link

Add support for enabling critical interrupts that do not use FreeRTOS API #8

Closed JF-soft closed 4 years ago

JF-soft commented 4 years ago

Hi Ivan, I tried to change the things you told me in the previous pull request. There are 3 separate commits, the first adds the interrupt feature, the second updates examples to get rid of Timer 2 setup code (as now is implemented in port.c) and in the final one the change of the name of porASM.asm to portasm.asm was made. I checked CCS project .dotfiles and it seems that porASM.asm is not used anywhere, but you can skip the last commit if needed.

Regards, Jonatan

IvanZuy commented 4 years ago

Hi Jonatan,

Thank you for converting PR into separate commits. I'll start reviewing it but it will take some time. I will have to refresh my memory about how it works and find my old TMS320 test setup :-)

Best regards, Ivan

IvanZuy commented 4 years ago

Hi Jonatan,

It seems like you removed porASM.asm but didn't add portasm.asm. In your master branch I don't see any *.asm files.

I also didn't understand the part of your comment where you're saying "I checked CCS project .dotfiles and it seems that porASM.asm is not used anywhere". How is that possible if porASM.asm was the main file where context switch is happening. I tried to compile blinky_tms320f28069 example from your master branch and got compiler error: gmake: *** No rule to make target 'C:/Ivan/ti_rtos/freertos_c28x_jfsoft/FreeRTOS/portable/CCS/c28x/porASM.asm', needed by 'FreeRTOS/port/porASM.obj'. Stop.

Maybe it's not in the .dotfiles(I'm not familiar with CCS) but somehow CCS knows where to look for porASM.asm. After renaming it has to be told that now it should look for portasm.asm.

Regards, Ivan

JF-soft commented 4 years ago

ok, let me check using CSS in my setup (I have installed newer versions of the tools in my machine), In the mean time, you can test up to the previous commit, since the change in the file name is in the last commit. If it is problematic we can left it out.

JF-soft commented 4 years ago

I checked again the file .project and in that file is the information for inclusion of porASM.asm into the project. What do I have to do to exclude the last commit from the pull request?

JF-soft commented 4 years ago

As a side note, the configuration used in projects is windows specific (it uses \ instead of / for path locations) and the location of some products is hardcoded in project options, for instance controlSUITE folder which is not used anymore by TI (they call it C2000Ware now). I can try to change some of those problems in a consistent way to make it compile smoothly (I'm using linux as development environment by the way), but it will take time and testing. I think we will be better off fixing the typo for now. Also, I only have a TMS320F28335 board to test for now.

IvanZuy commented 4 years ago

I think it will be easier to exclude last commit and I will rename file with adjusting projects settings by myself. As far as I understand you can create a branch based on commit 953c448 and make a PR from that branch.

JF-soft commented 4 years ago

I think I could do it, I did a git reset --hard HEAD^ and git push origin -f to force the delete of the commit, now I can't see the last commit in this pull request. So I think it would be good now. Sorry, and let me know if it is working for you.

JF-soft commented 4 years ago

don't worry about the file name, it was a nice thing to do but it is not critical, I don't want to create problems on your side. All projects should be updated to use latest TI tools, and it will be a nightmare, haha... I was able to compile the example you are using but it seems that there is an error about the ramfuncs zone on my side, TI changed some time ago the way they handle the ramfuncs zone, now they call it .TI.ramfunc and provide support for copying code into that zone automatically (support for this is in rts2800_fpu32.lib or rts2800.lib). Current linker script does not use this new name, so it throws an error. I don't know what is the best option to handle this.

IvanZuy commented 4 years ago

Great, I also see two commits now.

JF-soft commented 4 years ago

I think the most easy way is to locate context switch code in .text section for now.

IvanZuy commented 4 years ago

I tried blink examples for both F28069 and F28377 CPUs. If nesting is active both CPUs get stuck after 2 - 10 minutes. If I disable nesting they both work fine. When system is stuck it stays in the OS idle task. I believe timer interrupt is not working anymore but I didn't have a chance to look into it yet.

JF-soft commented 4 years ago

Let me check in my board, I was having that kind of problem in my board after a few seconds at first, I solved it placing portTICK_ISR in RAM and thought it was related to timing (I was using an eCAP isr that consumed in a delay loop most of the available cpu time), when it happened the cpu got stuck in the idle task with all interrupts disabled but the high priority one. I think the examples you mentioned are running from RAM so it must be something else. Also it happened to me when running from flash and there was too little spare time to run the OS. I think that you do not have any interrupt on group 4 in those examples, so setting that IER bit should do nothing, but you never know...

JF-soft commented 4 years ago

As stated here: https://processors.wiki.ti.com/index.php/Interrupt_Nesting_on_C28x the cpu automatically saves IER before entering interrupts, I think this problem can be related to entering critical sections and using the nesting counter (vPortEnterCritical). I will try to run similar examples to yours and try to find a solution. Thanks for your time and feedback.

JF-soft commented 4 years ago

Hi Ivan, I worked a little more in my attempt to support non kernel aware critical interrupts, I have something that works in preemptive and non preemptive mode, does not enables interrupts during context switch (I thought this could be possible but somehow it can't be done), and tried to keep it compatible with your examples when using portCRITICAL_INTERRUPT_MASK equal to zero, please tell me if it is working for you and I will work in bringing up an example on how to use this feature. I have something to ask you, you are using INTR INT14 for implementing portYIELD and OR IFR,#0x2000 for implementing portYIELD_FROM_ISR(x), I run into trouble when trying to assert a Timer 2 interrupt inside an interrupt that does not enable INT14 in IER, i.e. after using portYIELD_FROM_ISR(x) the flag gets set in IFR but the interrupt is not launched at the end of the currently running interrupt handler (in fact the flag is set and nothing happens when in Idle Task), and when I used that instruction with INT14 enabled, the interrupt handler was fired right away after using the instruction (which makes no sense as there is no difference in this behavior with the INTR INT14 instruction). As I understand, FreeRTOS does not expect the interrupt to suddenly finish when calling portYIELD_FROM_ISR(x), but at the end of that interrupt and of any other potentially nested interrupt. Did you have this kind of problems in your setup? I don't know, maybe is a bug on the TMS320F28335 I'm using...

JF-soft commented 4 years ago

Maybe this is the reason why uC-OS3, sysbios and others use RTOSINT for context switching, which has priority 4...

JF-soft commented 4 years ago

Hi Ivan, sorry for writing again so soon, but I think I finally found the problem. INTR INT14 instruction is a very complicated instruction to use, first, INT14 bit in IFR register is not cleared automatically when launching interrupt routine, second when using it inside of portYIELD() in a critical section, the status of IER was changed to critical only, and as you copied IER from one task to another inside portTICK_ISR this state of IER propagated to other tasks that were not in a critical section. As you only used INTM for disabling interrupts and IER bits did not change ever inside of any task, this was not a problem, but I think this is a violation of task separate and private state anyways. I removed that code, and initialized IER in task stack at task creation time, making sure INT14 was enabled. I got rid of the uggly hack for checking IFR and launching INT14 by hand. I think the code should be working now (at least it works for me, haha).

IvanZuy commented 4 years ago

Hi Jonatan,

I'm glad that you could find an solution. I'm going to try it tonight. Meantime could you please reorganize the commits. Ideally, I'd like to see two commits only, one for the nested interrupts feature and another one for updated examples to remove Timer 2 setup code. If you think there are something else what would be beneficial to have isolated in the separate commit, I'm ok with that.

Could you also make new feature disable by default. I'd like people to start from regular version first and only activate advanced features if they understand what they're doing. We can add description for this feature into read.me file.

JF-soft commented 4 years ago

Ok, no problem, I will revert multiple commits and prepare something that plays nice with the integration of the new feature into your repository, up to now I was only worried about getting the code working. I already reverted the critical interrupt mask to 0, I also think it is not a good idea to enable this feature as default, mainly because it depends on the application (in my opinion most of the applications should use it however, this DSP/MCU is good at implementing control loops, disabling critical interrupts globally makes this DSP loose its main functionality). I can't make up my mind yet on were to put that mask, ideally I would like to have a configuration in FreeRTOSConfig.h, but this don't play nice with including C header files in assembly code. Please note that IER initialization at task creation point is valid for the default version when disabling global interrupts. Please test the code and allow me a little more time to look at anything else we can improve. As a side note, I think that making static inline the functions for enabling and disabling nesting in kernel aware interrupts is a better option than implementing those functions in assembler (for instance in the case interrupts are placed into ramfuncs section, those functions would end up in that section too), so I will change that too. Regarding the README, I will write something for that too, I think it will be good to add it into the commit that adds the feature.

On a personal side, first I would like to thank you for your contribution of the port, and second I would like to tell you that some of the changes are inspired in the discussion from this thread: https://e2e.ti.com/support/microcontrollers/c2000/f/171/t/543745?FreeRTOS-port-for-C2000-TMS320F28xxx , the idea of using Timer 2 register to avoid global variables is from Mitja Nemec for instance. A rare coincidence is that Fernando Gatto is a friend of mine (in fact I know your repository thanks to him), I worked with him on the TMS320F28035 he was asking at that time, for implementing a power converter in 2018 and 2019, I was helping him with the control loop.

IvanZuy commented 4 years ago

I did some tests tonight. The first result is that it triggers stack overflow hook in the "Red" task after ~1minute of blinking. What I see in the debugger is that stack usage is stable for about 1 minute at the level of 0x57 bytes but when I hook fires I see another 22 bytes occupied. Weird thing is that those additional 22 bytes have a tskSTACK_FILL_BYTE bytes in the middle. That usually indicates stack corruption. I'm seeing this issue on TMS320F28069 dev kit. Task stack size is 128 bytes. If I increase the stack size to 256, I see the same behavior (unexpected usage increase) but hook doesn't trigger. After another 10 minutes system was still working and stack size was stable.

Regarding to your questions in the previous comments.

  1. If I correctly understood, You have INT14 disabled in IER then your execute portYIELD_FROM_ISR(x) which sets INT14 flag in IFR. When interrupt completed you don;t see INT14. I think that is correct behavior. Bits in IFR doesn't mean much without corresponding bit set in IER. Sorry if I miss understood your question.

  2. About yielding from critical section and transferring IER state between the tasks. I did that intentionally. I wanted to have a interrupts config system wide but not task specific. Initial implementation was saving IER state in the task context(similar to what your latest code is doing) but I remember having issues with that. I don't completely understand what you mean under " using portYIELD() in a critical section". Is that something like: enterCritical() ..... do something portYIELD() exitCritical() Looks very unusual :-) Maybe this is another miss understanding from my side.

After finishing the port, I tried to add similar feature, but was facing a lot of issues. Huge problem was that it's very hard to test so I gave up. Hopefully you have enough time and courage to complete it. :-)

Maybe it would make sense to move in small steps. For example as a first step we can add ability to keep not FreeRTOS interrupts active in the critical session. I believe this feature alone will be a big improvement in terms of ability to use this port for controls.

JF-soft commented 4 years ago

Ok, first thing we should do is disable back nesting in INT14, and test that condition. Correct behavior for scheduler would be launching ISR after finishing current and potentially nested interrupts when yielding from kernel aware ISR (https://www.freertos.org/FreeRTOS_Support_Forum_Archive/April_2018/freertos_portYIELD_FROM_ISR_question_1a9d6ee2j.html). I think the correct behavior is running the scheduler right at the end of all kernel aware interrupts, I mean the lowest priority kernel aware interrupt that got fired in the first place taking into account nesting.

Maybe we should check, but I'm pretty sure that enter_critical() portYIELD() exit_critical() happens in freertos sources, in task management code, that's the reason why an IER value set to critical propagated from one task to another. I thought this was intentional, because you used INTR instruction as portYIELD, which does not care if INTM is cleared or IER bits are set, it fires the interrupt right away. In this case, nothing should happen because OS would have to return to the point of portYIELD at some future point in time.

About global config of IER, no other OS keeps IER setting globally (take a look at uC-OS3 for instance: https://github.com/SiliconLabs/uC-OS3/blob/master/Ports/C28x/Generic/CCS/os_cpu_a.asm , and this other https://github.com/jdoe95/mrtos-portable-c28/blob/master/rtos_portable.asm we could check sysbios but it is very complex to read and follow the sources). I think the purpose of a mutithreading OS is to trick the thread or task to believe they own the hardware for themselves alone, sharing IER among threads would violate this assumption.

It seems I need to get better at testing, haha...

JF-soft commented 4 years ago

"Bits in IFR doesn't mean much without corresponding bit set in IER. Sorry if I miss understood your question." If an IFR bit is set, the dsp should trigger an interrupt when enabling INTM and IER bits (think of it as a series of switches), if this is not true, you would lose interrupts fired when servicing an interrupt. Instead of nesting, the processor has to do some form of tail chaining (I think in this case the save/restore of context in between is done anyways, so instead of tail chain is more like consecutive interrupts)

JF-soft commented 4 years ago

Take a look at tasks.c, function xTaskNotifyWait line 4288 issues taskENTER_CRITICAL(), then in line 4310 it issues portYIELD_WHITIN_API() which is defined as portYIELD() in FreeRTOS.h, so a yield inside a critical section is a common operation, at least for the kernel. Also read the comment:

/ All ports are written to allow a yield in a critical section (some will yield immediately, others wait until the critical section exits) - but it is not something that application code should ever do. /

Maybe we need to change portYIELD() to use OR IFR,#0x2000 instead of INTR INT14 (again, this instruction does not wait to INTM to be cleared or IER bit to be set to service the interrupt), as it will wait to exit the critical section for yielding inside kernel code (if TI docs are correct, they say for instance in spru430f.pdf that INTR instruction should clear IFR bit, but my experience is that the bit gets set for moments without being cleared in portTICK_ISR, it can't be a OR IFR operation as it is used at 100ms inside an interrupt for giving up the semaphore, and if a set a breakpoint in the instruction AND IFR inside portTICK_ISR it gets there just once in a while. Not clearing IFR register should be the behavior of the TRAP instruction instead)

JF-soft commented 4 years ago

FYI today I tested the code for stack overflow. As you say, 128 words are not enough and I get the a breakpoint hit at vApplicationStackOverflowHook. Changing the stack size to 192 works (be careful, the compiler reserves complete RAM pages for arrays, I tried 144 bytes and I was seeing a lot of memory positions with garbage between the end of one stack array and the next stack initial address, besides the 0x00A5 bytes FreeRTOS writes for memory checking). When using whole pages, I can see 0x00A5 up to the next array.

I removed the code for enabling nesting inside context switching just in case the high priority interrupt arrives in a read/modify/write operation. I also changed INTR INT14 instruction for OR IFR,#0x2000 as portYIELD(), and I'm still getting the IFR bit set inside the portTICK_ISR() (I think TI docs are correct, the problem possibly is that Timer 2 interrupt arrives when servicing a software interrupt from a portYIELD() or a portYIELD_FROM_ISR(x), and the processor enters portTICK_ISR() again just after getting out, in any case clearing IFR register is making the OS work, EDIT: if this is true, it would be better not clearing IFR bit as I would loose a timer tick, though this condition is very rare).

To speed up memory corruption (if any), I'm running an eCAP interrupt at 50us, using a DELAY_US(40) instruction inside of it, I run also blinky code for toggling a Red led every 50ms using a semaphore and inside the interrupt that gives the semaphore I use a DELAY_US(750) instruction to simulate some more work, I enable eCAP critical interrupt nesting here. Also I'm running the Blue led task that waits for 10ms to toggle the blue led. So far, I can't see memory corruption on blue and idle task stacks, the only problem I'm seeing is that in red task stack there are two memory words set as zero in a zone beyond a 0x00A5 memory position. In my app I have the following addresses for stack start address: red->0xC280, blue->0xC100, idle->0xC1C0. For the red task, the highest valid memory position is 0xC2EC, the positions were 0x0000 is found are 0xC2ED and 0xC2F1. This should be an indication of memory corruption, but, the stack growth is bounded, it never gets bigger than this (I ran the example for hours). The funny thing is that just to check, I removed my code and placed the original code that is in your repo and I'm seeing the exact same behavior (i.e, memory addresses are other, but there are two words set as zero at the same offset from the last 0x00A5 word).Those same two memory positions get set to 0x0000, so I think this behavior must be related somehow to the task that uses a semaphore yielded from an interrupt.

I haven't pushed these changes yet, as I'm doing some more tests, but the changes are not that big, removed code for nesting in portTICK_ISR and change INTR INT14 for OR IFR,#0x2000. After pushing the code Ivan, I would like you to perform some testing on your end just to make sure it is working.

After this, maybe I will try to test using RTOSINT as context switch interrupt and Timer2 as System Tick, but this change may take more time, and maybe is not that important for your repo. Please let me know what you think and thanks for your patience and your help.

Regards, Jonatan.

JF-soft commented 4 years ago

One additional thought from my testing today, even though FreeRTOS flags stack overflow, if I check memory contents I can't see overflow, in fact, if it were happening, running the program without the check would end up in program corruption and it would stop blinking at some point and that didn't happen to me. I think there's something weird going on here... I tested using 128 words stack for tasks and 256 words for idle task (yes I remembered changing config in vApplicationGetIdleTaskMemory), when hitting the breakpoint at vApplicationStackOverflowHook, I looked into the memory and there was available memory yet... I found that arrays are stored by the compiler using a 64 word boundary, and just in case, I placed stack memory for tasks into a separate section (in L6 RAM, which is empty besides stack variables) using the linker command file and the attribute (section("...")) compiler directive.

JF-soft commented 4 years ago

I think I found it, FreeRTOS defines this constant:

define tskSTACK_FILL_BYTE ( 0xa5U )

but what I'm seeing write into memory is 0x00A5, so there is a problem in how the processor handles data, writing 0xA5 into memory (as the minimum data size handled by the DSP is 16-bit uint8_t is treated as uint16_t, we should check memset() implementation) will end up in writing 0x00A5, so there is a bug in FreeRTOS checking algorithm for this platform that renders it unreliable, that's why increasing task stack size makes the OS not going to vApplicationStackOverflowHook. Please be careful when taking conclusions about stack overflow in c28x platform!... In summary, you don't see garbage in memory browser if you allocate arrays on a 64 word boundary, and FreeRTOS is writing 0x00A5 instead of 0xA5 into unused memory positions, also the checking algorithm is waiting for 8-bit data while c28x sends 16-bit data to this algorithm.

I think that besides memory positions set to 0x0000 with 0x00A5 words in between (this can be happening if using stack allocated variables in interrupts for instance, as it is common that isr generated code increments SP with 32-bit alignment for allocation of static variables, if one of those is 16-bit part of a 32-bit memory position is not written and left intact), the stack is bounded and not overflowing, so my implementation to allow critical interrupts seems to be working.

I also took out IFR bit checking to avoid loosing timer ticks, I will prepare the my repo for transfering the code to you.

EDIT: I was wrong about FreeRTOS bug, "The stack overflow hook function is called should any of these 16 bytes not remain at their initial value." (https://www.freertos.org/Stacks-and-stack-overflow-checking.html) I saw that worst case leds task used 113 or 115 words, setting task stack to 128 means there will be less than 16 words available so that's why stack overflow hook is reached. I will investigate later why stack usage increases. One thing I don't like from current implementation is that interrupt stack is shared with currently running task stack, this is no problem with blinky examples as most of the time you end up in idle thread, and interrupt processing does not enable nesting, taking a very short time, but for more complex applications, this can be a problem (when designing task stack, you would have to take into account the worst case of interrupt nesting in each task stack size, which is a waste of memory). This hardware has no MSP/PSP feature as cortex-m devices, so having a separate stack for interrupts would require to write some kind of generic interrupt handler in assembly to change stack pointer location in memory before giving control to a regular C function to handle the interrupt (I think this is the idea of the sysbios interrupt dispatcher)