Samraksh / eMote

eMote OS -- Multiple Ports (using .NET MF v4.3)
0 stars 0 forks source link

Create test for interrupt processing delay #68

Closed Nathan-Stohs closed 7 years ago

Nathan-Stohs commented 10 years ago

Need a formal test (to be included in test suite) for minimum interrupt processing time. Hardware interrupt preferred, but software triggered interrupt might be OK.

Suggested implementation (just a suggestion, many many ways to do this): Two GPIOs shorted together (via wire). GPIO A is output mode, GPIO B is input. Create interrupt for GPIO B that un-sets GPIO A. Set GPIO A in software. Measure time A is high, preferably with logic analyzer but using internal Timers would also be OK.

Many many other ways to do it... could also for example bake something into the UART interrupt and watch when bytes come in.

We need this test because I suspect we have a problem with our interrupts being too heavy.

This issue up for grabs. Add yourself if you want to take it on but only when you plan to start it.

MichaelAtSamraksh commented 10 years ago

Also whoever takes this should check if measuring interrupts could be related to Mukundan's real-time extension work in measuring jitter and delays. Lovely data plots.

Nathan-Stohs commented 10 years ago

Performed interrupt timing measurements around the UART. MF performance was bad although I'm still debating if this is the smoking gun I was looking for.

I wrote a native C app as reference. I ran at 8 MHz and 48 MHz. Data is on the chart. My reference app had a similar UART handler but obviously did not include the queue/dequeue work and other MF processing.

I then did a similar measurement with MF. Long story short, in MF at 48 MHz we spend almost as much time in the interrupt handler as my reference handler did at 8 MHz. The top line number is 28 microseconds (1344 clocks) spent in the handler per byte. The second chart has a breakdown of time spent.

Running MF at 8 MHz., we spent a crushing 127 microseconds per TX byte in the interrupt handler. RX byte interrupts were completely broken and failed my test entirely (which gives me some promising follow-up debugging). I haven't been able to find any figures on what sort of service latencies are needed to handle a given UART speed but I have no problem believing this is not working at 115200 baud. Also consider that in the worst case there are other interrupts (such as timer) adding to the load which could increase latency further. The wire protocol does not seem robust, I wouldn't be surprised if one dropped byte is all it took to bring it all down.

Deeper results analysis:

The bottom portion labeled UART Handler is the time actually spent doing the basics and so is grossly similar to my reference. Curiously it is 33% larger but this can sort of be hand-waved away for the moment.

The middle blue portion is the time spent in the MF PAL layer doing the byte queue/dequeue operations. Yes, this is done inside the interrupt (MF's design, not ours). This is especially high for byte RX events. These are queue/dequeue operations for garbage bytes, potentially this delay could be higher in the event that real processing is done inside the interrupt. So this is also somewhat troubling.

Finally there is a 7.08 microsecond penalty that is endemic to the interrupt driver. Although this is not huge, this is probably the worst because this is a consistent across-the-board cost for every single interrupt every time. It is almost pure fat.

Remember that the measurements above were taken at 48 MHz... they are bad enough and our goal is to get 8 MHz working well.

Conclusions: The PAL processing inside of the interrupt handlers is a problem. The HAL interrupt driver has a serious inefficiency with global impact. Find out why RX interrupts stop working entirely in 8 MHz mode (but TX is OK).

Mukundan, what is the impact vis-a-vis your Real-Time work? Is there any value to re-running any of your old tests?

int1 int2 int3

Nathan-Stohs commented 10 years ago

Expanded my testing a little to look at all interrupts this morning. This is the last I’ll be doing dotNOW stuff for the week.

In MF At 8 MHz Over 10.432 seconds I measured 56708 interrupts (one every 184 microseconds) for a total of 86.3% of all CPU time spent in interrupt handlers. The mote was totally idle, no C# loaded. At 48 MHz over the same period, I got 158885 interrupts and the interrupt duty-cycle was again 86.3%. I loaded up the test LCD application at 48 MHz and yet again got 86.3%. Keep in mind that with our current interrupt driver, interrupts are expensive, driver overhead is 42 microseconds at 8 MHz.

I would infer that 86.3% represents some sort of maximum produced by the MF architecture (e.g. the remainder of the time is switching between interrupts or interrupts disabled). This is a pretty major bug.

Possibilities: 1) A low-priority interrupt is not getting cleared and we are stuck in it. 2) A set of interrupts repeat and take long enough to execute that they are continuously run in a loop.

The usual suspect would be something involving the timer but it isn't necessarily so. The direct debug would be to figure out which handlers are coming up. Unfortunately I don't know an elegant way to do such a trace but it should be no problem with a little code instrumentation and grunt work.

At a minimum this means the mote would never sleep. I'm assuming there are also performance implications and this is deeply linked to our other current dotNOW issues, but it could be that this is a low-priority interrupt that somehow isn't bothering normal operation.

Nathan-Stohs commented 10 years ago

Added simple profiling code. The badboy was TIM2 and Tasklets. I'm assuming that TIM2 is the core issue.

Chris recently put in some fixes as part of the C# COM update. This has helped somewhat but this is still broken.

MichaelAtSamraksh commented 9 years ago

Can anybody point to a summary of how we already profile CPU time in different CPU modes?

Nathan-Stohs commented 9 years ago

Not a pointer, but I can pass by value.

A formal performance profiling system, to be used with the testrig, probably should be in the works.

MichaelAtSamraksh commented 9 years ago

This came up in the call yesterday. to re-iterate: bonus for:

example of a long stack (probably not the deepest):

Thread [1] (Suspended: Signal 'SIGINT' received. Description: Interrupt.)
    25 SPI_I2S_GetFlagStatus() stm32f10x_spi.c:773 0x08028c2e
    24 CPU_SPI_ReadWriteByte() netmf_spi.cpp:302 0x0802cba8
    23 RF231Radio::WriteRegister() RF231.cpp:927 0x08027456
    22 RF231Radio::ClearChannelAssesment() RF231.cpp:1160 0x08027c32
    21 csmaMAC:: SendToRadio() csmaMAC.cpp:311 0x08025604
    20 SendFirstPacketToRadio() csmaMAC.cpp:30 0x0802578c
    19 csmaMAC:: Send() csmaMAC.cpp:245 0x0802578c
    18 csmaMAC:: SendHello() csmaMAC.cpp:53 0x080257ea
    17 VirtualTimerCallback() VirtualTimer.cpp:316 0x0802f090
    16 ISR_TIM1() netmf_advancedtimer.cpp:306 0x08026216
    15 ISR_TIM1() netmf_advancedtimer.cpp:297 0x08026216
    14 HAL_CALLBACK::Execute() AsyncProcCalls_decl.h:40 0x0802d532
    13 TIM1_CC_IRQHandler() STM32__AITC.cpp:1040 0x0802deb2
    12 <signal handler called>()  0xfffffff9
    11 SmartPtr_IRQ:: Probe() SmartPtr_thumb2.cpp:66 0x0802845e
    10 Events_WaitForEvents() events.cpp:170 0x08025ce4
    9 CLR_RT_ExecutionEngine::WaitSystemEvents() Execution.cpp:3823 0x0803044c
    8 CLR_RT_ExecutionEngine:: DebuggerLoop() Execution.cpp:3127 0x0803205c
    7 CLR_RT_ExecutionEngine::WaitForDebugger() Execution.cpp:616 0x0803209c
    6 CLR_DBG_Debugger:: Debugger_Discovery() Debugger.cpp:87 0x08020b0c
    5 Initialize() CLRStartup.cpp:79 0x080203be
    4 ClrStartup() CLRStartup.cpp:686 0x080203be
    3 ApplicationEntryPoint() tinyclr.cpp:19 0x080202e6
    2 BootEntry() tinyhal.cpp:624 0x0802a0d8
    1 EntryPoint() FirstEntry.s:71 0x08020206
Nathan-Stohs commented 7 years ago

The original issue, to create a test for interrupt processing delay, was satisfied a long time ago by the Benchmark TestSuite I wrote. So I'm going to put this one to bed.

KennethAtSamraksh commented 7 years ago

So what is the number as of 9/28/16? That is what is the interrupt processing delay in physical units. When was the last time the test was run?

Kenneth W. Parker, Ph.D.

From: Nathan Stohs [mailto:notifications@github.com] Sent: Tuesday, September 27, 2016 8:22 PM To: Samraksh/MF MF@noreply.github.com Subject: Re: [Samraksh/MF] Create test for interrupt processing delay (#68)

The original issue, to create a test for interrupt processing delay, was satisfied a long time ago by the Benchmark TestSuite I wrote. So I'm going to put this one to bed.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHubhttps://github.com/Samraksh/MF/issues/68#issuecomment-250037258, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AA-4xAxWXJwVr5pkECUAx36wwTNsCsZyks5qubMdgaJpZM4BfEJI.