EngineeringSpirit / FreeLwIP-Nios-II

FreeRTOS with LwIP integration in the Nios II EDS
19 stars 28 forks source link

Ignored interrupt context #6

Closed mluthi closed 10 years ago

mluthi commented 10 years ago

According to Altera's HAL API Reference (http://www.altera.com/literature/hb/nios2/n2sw_nii52010.pdf), the alt_irq_disable_all function returns a status/context value that should be used when making the corresponding alt_irq_enable_all call. The code of the current FreeRTOS port for Nios ignores this detail and always uses 0x1 as parameter for alt_irq_enable_all (see code excerpt from port.c below):

define portDISABLE_INTERRUPTS() alt_irq_disable_all();

define portENABLE_INTERRUPTS() alt_irq_enable_all( 0x01 );

I've known about this for quite a while, but we never experienced a problem that could have been attributed to this "hack". Last week we ran into an interrupt related problem and that's when I started to dig deeper on this issue. As one result of my investigations I did find out that there are occasions where alt_irq_disable_all gets called when interrupts are already turned off. In this case calling alt_irq_enable_all with 0x1 as parameter reenables interrupts earlier than expected which I presume will lead to problems sooner or later. A quick workaround did not fix my problem though and it later turned out that the reason for our problemes was to be found somewhere else. Nevertheless I'd like to implement a correct solution now (so I won't have to worry anymore about this issue in the future) and therefore I'm interested in your thoughts: 1) do you agree that there's a potential issue with the current code? 2) have you already thought about a potential bug fix?

Kind Regards, Matthias

EngineeringSpirit commented 10 years ago

Yes this is a bug, I have also known about this since the start, but never found any problems with it. For reentrent / recursive alt_irq_disable_all() calling you could use portENTER_CRITICAL() and portEXIT_CRITICAL() It would, I think also be possible to make fixed_alt_irq_disable_all and fixed_alt_irq_enable_all counting to they must be called an even times to before the interrupts are back getting enabled.

We could solve it by something like this:

diff --git a/nios2_freertos_port/port.c b/nios2_freertos_port/port.c
index f06262c..311bea5 100755
--- a/nios2_freertos_port/port.c
+++ b/nios2_freertos_port/port.c
@@ -81,6 +81,19 @@ static inline void prvReadGp( unsigned long *ulValue )
 }
 /*-----------------------------------------------------------*/

+static alt_irq_context lastContext;
+
+void fixed_alt_irq_disable_all()
+{
+       lastContext |= alt_irq_disable_all();
+}
+
+void fixed_alt_irq_enable_all()
+{
+       alt_irq_enable_all(lastContext);
+       lastContext = 0;
+}
+
 /*
  * See header file for description.
  */
diff --git a/nios2_freertos_port/portmacro.h b/nios2_freertos_port/portmacro.h
index 546dd8e..b853e34 100755
--- a/nios2_freertos_port/portmacro.h
+++ b/nios2_freertos_port/portmacro.h
@@ -110,8 +110,11 @@ __asm__( "\n\t.globl       save_context" );
 extern void vTaskEnterCritical( void );
 extern void vTaskExitCritical( void );

-#define portDISABLE_INTERRUPTS()       alt_irq_disable_all()
-#define portENABLE_INTERRUPTS()                alt_irq_enable_all( 0x01 );
+extern void fixed_alt_irq_disable_all( void );
+extern void fixed_alt_irq_enable_all( void );
+
+#define portDISABLE_INTERRUPTS()       fixed_alt_irq_disable_all()
+#define portENABLE_INTERRUPTS()                fixed_alt_irq_enable_all()
 #define portENTER_CRITICAL()        if (xTaskGetSchedulerState()) vTaskEnterCritical()
 #define portEXIT_CRITICAL()         if (xTaskGetSchedulerState()) vTaskExitCritical()
 /*-----------------------------------------------------------*/
mluthi commented 10 years ago

I'm not sure if this approach works as expected: in the "normal" case, lastContext will be different from 0 (typically 0x1) after the first call to alt_irq_disable_all (since interrupts were active prior to the call), i.e. any subsequent calls to alt_irq_disable_all will leave lastContext at this value and therefore the first call to alt_irq_enable_all will reenable interrupts even if there have been calls to alt_irq_disable_all while interrupts were already turned off

To me it looks like the only true solution will be to implement a small stack, which tracks the context values for each call. Something along the following lines:

define c_nMaxNumContext 100

static alt_irq_context arrContext[c_nMaxNumContext]; static int currentContextIndex = 0;

void fixed_alt_irq_disable_all() { arrContext[currentContextIndex] = alt_irq_disable_all(); ++currentContextIndex; // Must be done AFTER IRQs are disabled! }

void fixed_alt_irq_enable_all() { alt_irq_enable_all(arrContext[--currentContextIndex]); }

Your thoughts on this?

EngineeringSpirit commented 10 years ago

I'm not quite fond of the stack solution. Wouldn't it be possible to do it like this?

static volatile alt_irq_context lastContext;
static volatile int irqDisableCount;

void fixed_alt_irq_disable_all()
{
    alt_irq_context ctxt = alt_irq_disable_all();
    lastContext |= ctxt;
    ++irqDisableCount;
}

void fixed_alt_irq_enable_all()
{
    if (--irqDisableCount == 0)
    {
        alt_irq_enable_all(lastContext);
        lastContext = 0;
    }
}

This because I expect that between fixed_alt_irq_disable_all() and another call to fixed_alt_irq_disable_all() the irq_context actually won't change and stay 0. Aslong as the interrupts are getting back enabled again after the last call to fixed_alt_irq_enable_all()

It would also be possible to add a BSP option which let you choose between a stack or the last implementation. But I expect that the stack at position > 0 will always contain 0 values.

mluthi commented 10 years ago

Nice idea you've come up with since it avoids all possible problems with a potential stack overflow. The only caveat I see is the fact that Altera does not really tell what the return type 'alt_irq_context' represents. If it is indeed guaranteed to be just '1' or '0' then your solution should work fine. I'll open a service request with Altera and try to get more details on the nature of 'alt_irq_context'.

mluthi commented 10 years ago

Got Altera's answer in record time:

Description of alt_irq_disable_all(): • This routine inhibits all interrupts by negating the status register PIE bit.

Contents return: • It returns the previous contents of the CPU status register (IRQ context) which can be used to restore the status register PIE bit to its state before this routine was called.

You may find the information above in \nios2eds\components\altera_nios2\HAL\inc\sys\alt_irq.h.

Looking at the source code of 'alt_irq_enable_all' reveals that the exact behavior of the enable function depends on certain CPU configuration parameters and therefore your suggestion with the BSP option might be the way to go.

EngineeringSpirit commented 10 years ago

This fix doesn't work, when using a USART as STDOUT, STDERR. This is because the somewhere in the printf path the interrupts get disabled. Then the USART driver sends out the data under interrupt and waits till the data has been send.

We can OR the last context, but we can't make it reentrend. If you want reentrend critical sections I advice using vTaskEnterCritical and vTaskExitCritical.