Closed NirSonnenschein closed 6 years ago
@NirSonnenschein somewhere in the code, osErrorParameter is captured, but it is not clear where it comes from. Were you able to at least find out which function is causing this error parameter? From the code snippet you shared above, I would guess connect(), or is anything missing there?
@kjbracey-arm @SeppoTakalo
I assume this is a debug profile build? The message comes from EvrRtxThreadError, which is a hook to catch RTX errors.
If you could stick a breakpoint on that to get a stack backtrace, it would help - I can see about a dozen places that could possibly call it with (NULL, osErrorParameter), and it's not obvious what the culprit would be. Most of them are simply calling functions with a NULL thread pointer, plus a couple of others.
This is during the connect call, right?
Hi @0xc0170 and @kjbracey-arm, just to clarify, the error almost definitely happens in the connect call. We have a print before and after the call (in cases of success or failure), and when this happens we don't see any of the prints after connect.
As general background, this happens occasionally on our nightly tests (e.g. the nightly from two nights ago had it, but last night's didn't). The particular configuration which failed in this case was mbedOS compiled with armcc in debug mode. This issue doesn't seem to reproduce cleanly when testing locally. I'll try this again today; if I'm able to reproduce locally I can try to use a breakpoint. I can also provide the bin / elf for the image in question if that will help.
It seems moderately likely it might be the consequence of connection failure - some sort of teardown when giving up not going cleanly. Maybe you could encourage it by persuading connect failure - yank the cable at the crucial moment...
Hi @kjbracey-arm, thanks for the tip. I'll try this if I'm not able to reproduce the issue locally by normal means.
I've tried reproducing locally (including disconnecting the Ethernet cable during testing) and so far I have not been able to reproduce the issue. It seems to be more readily reproducible in the Jenkins test environment. When disconnecting the cable during the connect step, the tests halt for a while (presumably waiting for DHCP to complete) and then fail (no crash observed).
Any chance it's this bug? https://github.com/ARMmbed/mbed-os/pull/5587
Can't immediately see why we'd hit it, but it is the same error printout.
small update, happens on gcc arm also (caught in debug): new interface created Thread 0x0 error -4: Parameter error
Yes, the issue seems to reproduce more easily in the Jenkins lab environment (it happens there pretty often, but I was not able to reproduce it on the local network).
I'm having a similar / possibly the same issue during network stack init with mbed commit 4d81eadb2 using gcc-arm on the EFR32FG12_BRD4254A target.
Mentioned in #5579 and manually applied the patch from #5587 with no effect.
Error occurs at rtos/TARGET_CORTEX/rtx5/RTX/Source/rtx_thread.c:1349 in uint32_t svcRtxThreadFlagsSet (osThreadId_t thread_id, uint32_t flags)
   1346  // Check parameters
   1347  if ((thread == NULL) || (thread->id != osRtxIdThread) ||
   1348      (flags & ~((1U << osRtxThreadFlagsLimit) - 1U))) {
B+>1349    EvrRtxThreadError(thread, osErrorParameter);
   1350    return ((uint32_t)osErrorParameter);
   1351  }
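For reference, the third condition on line 1348 is only a range check on the flag bits, so a call with flags=512 passes it and the NULL thread pointer is the part that fails. A minimal sketch of that arithmetic, assuming osRtxThreadFlagsLimit is 31 (which is what I believe RTX5 uses); FLAGS_LIMIT and flags_out_of_range are names made up for this example, not RTX identifiers:

```c
#include <stdint.h>

/* Assumption: mirrors osRtxThreadFlagsLimit in RTX5 (believed to be 31). */
#define FLAGS_LIMIT 31U

/* Hypothetical helper: returns 1 if any bit at or above FLAGS_LIMIT is set,
 * i.e. the same mask test as line 1348 of the snippet above. */
static int flags_out_of_range(uint32_t flags)
{
    /* (1U << 31) - 1U == 0x7FFFFFFF, so the inverted mask is 0x80000000. */
    return (flags & ~((1U << FLAGS_LIMIT) - 1U)) != 0U;
}
```

With that limit, flags_out_of_range(512U) is 0 and only a value with the top bit set, such as 0x80000000U, is rejected by the mask.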
Serial output:
[INFO][brro]: PANID: 691
[INFO][brro]: NET_IPV6_BOOTSTRAP_AUTONOMOUS
[WARN][brro]: Security NOT enabled
[DBG ][core]: NS Root task Init
[DBG ][sck ]: Socket Tasklet Generated
[sck ]: Socket Task
Thread 0x0 error -4: Parameter error
Backtrace:
Breakpoint 3, svcRtxThreadFlagsSet (thread_id=0x0 <osRegisterForOsEvents>, flags=512) at ./mbed-os/rtos/TARGET_CORTEX/rtx5/RTX/Source/rtx_thread.c:1349
(gdb) bt
#0 svcRtxThreadFlagsSet (thread_id=0x0 <osRegisterForOsEvents>, flags=512) at ./mbed-os/rtos/TARGET_CORTEX/rtx5/RTX/Source/rtx_thread.c:1349
#1 0x0004f324 in SVC_Handler () at irq_cm4f.S:59
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) p thread
$9 = (osRtxThread_t *) 0x0 <osRegisterForOsEvents>
(gdb) p *thread
$10 = {id = 0 '\000', state = 0 '\000', flags = 4 '\004', attr = 32 ' ',
name = 0x53f41 <Reset_Handler> "H\200G\006I\aJ\aK\232B\276\277Q\370\004\vB\370\004\v\370\347\254\367", <incomplete sequence \371\135\100\005>, thread_next = 0x53f6d <WTIMER1_IRQHandler>,
thread_prev = 0x58c09 <HardFault_Handler()>, delay_next = 0x53f6d <WTIMER1_IRQHandler>, delay_prev = 0x53f6d <WTIMER1_IRQHandler>, thread_join = 0x53f6d <WTIMER1_IRQHandler>,
delay = 343917, priority = 109 'm', priority_base = 63 '?', stack_frame = 5 '\005', flags_options = 0 '\000', wait_flags = 343917, thread_flags = 343917,
mutex_list = 0x4f311 <SVC_Handler>, stack_mem = 0x53f6d <WTIMER1_IRQHandler>, stack_size = 343917, sp = 324519, thread_addr = 324535, tz_memory = 343917,
context = 0x5eb55 <FRC_PRI_IRQHandler>}
(gdb)
It appears something is prompting an SVC call with an invalid thread ID, but I'm not sure how, and I haven't worked out how to catch it prior to execution yet.
Ta for the info!
That was enough to pin it down. (Despite the annoyance that debuggers keep failing to get through exception stack frames.)
It's an ordering error in the K64F driver - it's installing its interrupt handler in low_level_init via ENET_SetCallback(&g_handle, ethernet_callback, netif);
ethernet_callback calls osThreadFlagsSet(k64f_enetdata.thread). k64f_enetdata.thread isn't initialised until later, so there's a brief window where a receive interrupt can happen and ethernet_callback will use a null thread ID.
This is not terribly harmful, but the "trap errors" thing in the debug build intercepts it, reasonably enough.
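To make the window concrete, here is a rough host-side model of it, not the driver code. Everything in it is a stand-in: flags_set plays the role of osThreadFlagsSet, rx_thread plays the role of k64f_enetdata.thread, and the callbacks are simplified to take no arguments. It assumes the error comes back as (uint32_t)osErrorParameter, i.e. -4.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the RTOS types and call; not the real API. */
typedef void *thread_id_t;
#define ERR_PARAMETER 0xFFFFFFFCU   /* (uint32_t)-4, as in the error printout */

static uint32_t flags_set(thread_id_t thread, uint32_t flags)
{
    if (thread == NULL) {
        return ERR_PARAMETER;       /* a debug build would also trap here */
    }
    return flags;                   /* simplified: just echo the flags */
}

static thread_id_t rx_thread = NULL;    /* not yet initialised */
static unsigned dropped_signals = 0;

/* Unguarded callback: signals whatever the handle currently holds. */
static uint32_t callback_unguarded(void)
{
    return flags_set(rx_thread, 1U);
}

/* Guarded callback: skips the signal while the thread does not exist yet. */
static uint32_t callback_guarded(void)
{
    if (rx_thread == NULL) {
        dropped_signals++;
        return 0U;
    }
    return flags_set(rx_thread, 1U);
}

/* Models the later thread creation that closes the window. */
static int dummy_thread;
static void start_rx_thread(void)
{
    rx_thread = &dummy_thread;
}
```

Calling callback_unguarded() before start_rx_thread() returns the parameter error, while callback_guarded() just counts a dropped signal; after start_rx_thread() both behave the same.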
Possible fixes:
- … low_level_init … errors)
- Delay ENET_SetCallback until after thread init (means you might process packets received during init much later - effectively existing behaviour)
- Have ethernet_callback check for thread id being NULL (same effect as previous)
Which Ethernet driver? The LwIP one or the Nanostack one, or both?
That later debug print looks like border router so I'm assuming it is Nanostack's driver or both.
Hang on, your #5579 is actually about a Nanostack issue. Not K64F at all. Oh well, you've helped solve this issue.
So it seems that both pieces of code probably have the same flaw - calling osThreadFlagsSet before the thread is ready. Not identified the path to it with Nanostack yet.
Yep, yep, different cause but suspect it's the same flaw. I can open another issue if you'd like?
Looks like in NanostackRfPhyEfr32.cpp callbacks are enabled at NanostackRfPhyEfr32.cpp#L374 and the thread isn't started until NanostackRfPhyEfr32.cpp#L468. I'll have a shot at reordering it and see if that helps.
I wonder what changed that this is now a runtime error / how many other things it is likely to affect.
This is only a runtime error with the RTX error trapping on, which is only in debug builds since 5.6 I think, unless that's changed. More people testing debug builds now?
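In builds without the trap, the failure is still visible to the caller if it looks at the return value. As far as I know, the CMSIS-RTOS2 flag functions signal an error by setting the top bit of the result (osFlagsError, 0x80000000), and osErrorParameter (-4) comes back as 0xFFFFFFFC. A minimal sketch of such a check, with the constants copied locally so it builds without the RTOS headers; the FLAGS_ERROR names and both helpers are made up for this example:

```c
#include <stdint.h>

/* Local copies of the values I believe cmsis_os2.h uses for
 * osFlagsError and osFlagsErrorParameter (assumption, check your headers). */
#define FLAGS_ERROR            0x80000000U
#define FLAGS_ERROR_PARAMETER  0xFFFFFFFCU

/* Hypothetical helper: 1 if a flags call reported any error. */
static int flags_call_failed(uint32_t ret)
{
    return (ret & FLAGS_ERROR) != 0U;
}

/* Hypothetical helper: 1 if the error was specifically a parameter error. */
static int flags_call_bad_parameter(uint32_t ret)
{
    return ret == FLAGS_ERROR_PARAMETER;
}
```

A caller would pass the result of the flags call to flags_call_failed() and log or assert on it, which gives some visibility even when the RTX error hook is compiled out.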
The silent / nearly impossible to debug runtime error handling in release builds cost me almost a month of head bashing before I worked out #5155, I wouldn't be surprised at all if / hope that is the case.
Description
We see an error when connecting the Ethernet adaptor on K64F running MbedOS compiled under ARMCC. This doesn't always happen, but occurs quite often. When it occurs we see the following print: Thread 00000000 error -4: Parameter error
Bug
Target: K64F
Toolchain: ARM (mostly on armcc)
Toolchain version:
mbed-cli version: 5.6 and 5.7
mbed-os sha: (git log -n1 --oneline)
DAPLink version:
Expected behavior: our code creates an Ethernet interface object and calls the connect function. The connect should either succeed or return an error.
Actual behavior: the test hangs and we see the following print: Thread 00000000 error -4: Parameter error
Steps to reproduce: we run the following code as part of the network initialization for the K64F