Open MrOneTwo opened 2 years ago
Trying to debug this.
I have added CONFIG_SHELL_BACKEND_SERIAL=n since I don't need it.
I see that the problem stems from subsys/shell/shell.c:shell_vfprintf, which does:
void shell_vfprintf(const struct shell *sh, enum shell_vt100_color color,
		    const char *fmt, va_list args)
{
	__ASSERT_NO_MSG(sh);
	__ASSERT(!k_is_in_isr(), "Thread context required.");
	__ASSERT_NO_MSG(sh->ctx);
	__ASSERT_NO_MSG(z_flag_cmd_ctx_get(sh) ||
			(k_current_get() != sh->ctx->tid));
	__ASSERT_NO_MSG(sh->fprintf_ctx);
	__ASSERT_NO_MSG(fmt);

	/* Sending a message to a non-active shell leads to a dead lock. */
	if (state_get(sh) != SHELL_STATE_ACTIVE) {
		z_flag_print_noinit_set(sh, true);
		return;
	}
	...
}
The sh->ctx->state has a value of 255 and not SHELL_STATE_ACTIVE (2). That means it executes z_flag_print_noinit_set(sh, true) to set the sh->ctx->print_noinit flag. That call ends up in atomic_or. The atomic_or argument, atomic_t *target, has the address 0x28a85. The ldrex op tries to read from that address into the r0 register. That's where my firmware crashes!
0x25d66 <atomic_or> mov r3, r0
0x25d68 <atomic_or+2> dmb ish
0x25d6c <atomic_or+6> ldrex r0, [r3] .................. CRASH!
0x25d70 <atomic_or+10> orr.w r2, r0, r1
0x25d74 <atomic_or+14> strex r12, r2, [r3]
0x25d78 <atomic_or+18> cmp.w r12, #0
0x25d7c <atomic_or+22> bne.n 0x25d6c <atomic_or+6>
0x25d7e <atomic_or+24> dmb ish
0x25d82 <atomic_or+28> bx lr
0x25d84 <atomic_and> mov r3, r0
0x25d86 <atomic_and+2> dmb ish
0x25d8a <atomic_and+6> ldrex r0, [r3]
0x25d8e <atomic_and+10> and.w r2, r0, r1
0x25d92 <atomic_and+14> strex r12, r2, [r3]
0x25d96 <atomic_and+18> cmp.w r12, #0
0x25d9a <atomic_and+22> bne.n 0x25d8a <atomic_and+6>
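For readers less used to ARM assembly: the ldrex/strex pair in the listing above is a load-exclusive/store-exclusive retry loop. A rough portable equivalent, written with C11 atomics rather than Zephyr's own atomic API, might look like the sketch below. The name sketch_atomic_or is made up for illustration, and the NULL check is an addition that the real routine does not perform. It also shows that the loop itself is harmless; the fault depends entirely on the address passed in as target.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/*
 * Illustrative only: OR `value` into *target and return the previous value.
 * sketch_atomic_or is a hypothetical name, not a Zephyr function.
 */
static long sketch_atomic_or(atomic_long *target, long value)
{
	/* The disassembled routine has no such check; it trusts its caller. */
	assert(target != NULL);

	/* Roughly the ldrex step: read the current value. */
	long old = atomic_load(target);

	/*
	 * Roughly the strex + bne step: try to store old | value and retry
	 * if another writer got in between. On failure, `old` is refreshed
	 * with the value currently stored in *target.
	 */
	while (!atomic_compare_exchange_weak(target, &old, old | value)) {
	}

	return old;
}
```

With a valid target the loop completes after one or a few iterations. With a target derived from a bogus struct shell pointer, the very first load is what faults, which matches the crash at the ldrex instruction.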
In cmd_bonds, printing to the shell is done with the sh pointer, which points to shell_rtt (and this printing works), but in bond_info the shell is:
(gdb) p shell
$24 = (const struct shell *) 0x0 <_vector_table>
???
ctx_shell in shell/bt.c is invalid because cmd_init never runs! Running bt init manually actually runs that function, which binds ctx_shell.
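The failure mode can be reproduced outside Zephyr with a few stand-in types. The sketch below is purely illustrative: struct shell here is a one-field stub, and mock_cmd_init and mock_bond_info are hypothetical names that only imitate the roles of cmd_init and bond_info. It shows a callback that refuses to print while the global is still unbound, which is one possible defensive pattern and not necessarily what any actual fix does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stub; the real struct shell has many more fields. */
struct shell {
	const char *name;
};

/* Plays the role of ctx_shell in shell/bt.c: NULL until "bt init" runs. */
static const struct shell *ctx_shell;

/* Example backend instance, named after the RTT shell from the report. */
static const struct shell mock_rtt = { "shell_rtt" };

/* Plays the role of cmd_init(): the only place that binds ctx_shell. */
static void mock_cmd_init(const struct shell *sh)
{
	assert(sh != NULL);
	ctx_shell = sh;
}

/*
 * Plays the role of bond_info(). Returns 0 if the line was printed and
 * -1 if it was skipped because ctx_shell was never bound.
 */
static int mock_bond_info(const char *addr)
{
	if (ctx_shell == NULL) {
		return -1; /* Without this guard the print would use a NULL shell. */
	}

	printf("[%s] Remote Identity: %s\n", ctx_shell->name, addr);
	return 0;
}
```

Calling mock_bond_info before mock_cmd_init(&mock_rtt) returns -1 instead of dereferencing a NULL shell; after the init call it prints normally.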
I've created a PR with a simple solution https://github.com/zephyrproject-rtos/zephyr/pull/41844
I downgraded this to an Enhancement/Feature request, since the current Bluetooth shell has been intentionally designed with the assumption that the Bluetooth stack is always initialised through the shell. Please also see my comments in the linked PR.
I also just hit this (again).
the current Bluetooth shell has been intentionally designed with the assumption that the Bluetooth stack is always initialised through the shell
Not disagreeing that that's how it is, but it's a really weird design, right? Shell applications are first and foremost debugging tools; if I'm debugging some Bluetooth issue I'm not gonna start rewriting my application to no longer initialize the Bluetooth stack by itself. bt is the only shell command I'm aware of that works like this.
I downgraded this to an Enhancement/Feature request, since the current Bluetooth shell has been intentionally designed with the assumption that the Bluetooth stack is always initialised through the shell. Please also see my comments in the linked PR.
Just had a quick look at this.
Does it even make sense that bt_foreach_bond calls the callback before BT has been enabled?
The main issue here (in the shell) is that callbacks always assume that ctx_shell has been set, which clearly is not the case.
Rather than trying to fix all the different places where ctx_shell can be incorrectly used, we should really just get rid of it as requested by https://github.com/zephyrproject-rtos/zephyr/issues/70945
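One way to drop a global like ctx_shell is to hand the shell pointer to the callback through the user_data argument that the iterator already forwards. The sketch below assumes nothing about how the linked issue will actually be resolved. All types are stubs, and mock_foreach_bond, sketch_bond_info, sketch_cmd_bonds and struct bond_print_ctx are hypothetical names that only imitate the shape of bt_foreach_bond and its callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stubs standing in for struct shell and struct bt_bond_info. */
struct shell {
	const char *name;
};

struct bond {
	const char *addr;
};

typedef void (*bond_cb_t)(const struct bond *info, void *user_data);

/* Example data for trying the sketch out. */
static const struct shell mock_rtt = { "shell_rtt" };
static const struct bond mock_bonds[] = {
	{ "AA:BB:CC:DD:EE:01" },
	{ "AA:BB:CC:DD:EE:02" },
};

/* Iterator in the style of bt_foreach_bond(): forwards user_data untouched. */
static void mock_foreach_bond(const struct bond *bonds, size_t n,
			      bond_cb_t cb, void *user_data)
{
	for (size_t i = 0; i < n; i++) {
		cb(&bonds[i], user_data);
	}
}

/* Everything the callback needs travels with the call; no global involved. */
struct bond_print_ctx {
	const struct shell *sh;
	int count;
};

static void sketch_bond_info(const struct bond *info, void *user_data)
{
	struct bond_print_ctx *ctx = user_data;

	printf("[%s] Remote Identity: %s\n", ctx->sh->name, info->addr);
	ctx->count++;
}

/* Command handler: it already receives a valid sh from the shell core. */
static int sketch_cmd_bonds(const struct shell *sh,
			    const struct bond *bonds, size_t n)
{
	assert(sh != NULL);

	struct bond_print_ctx ctx = { .sh = sh, .count = 0 };

	mock_foreach_bond(bonds, n, sketch_bond_info, &ctx);
	return ctx.count;
}
```

Because the handler passes along the same sh it was invoked with, the callback prints to whichever backend ran the command, regardless of whether an init command was ever executed.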
main ref in west.yml) - c51aa88046ca5c9d687e3e6802b32ddd57dd6de4
My project is mostly the same as peripheral_hids (using the Just Works pairing method by not running bt_conn_auth_cb_register). The problem is that:
nrfjprog -e
Then I run the following commands in the RTT shell:
Running addr2line -e zephyr/zephyr.elf 0x00026076 puts me here: /home/mc/gits/magknob-zephyr-ble/zephyr/include/sys/atomic_builtin.h:243
Assume I also paired the board with an iPhone. Then running bt bonds crashes the board also. The address points me to /home/mc/gits/magknob-zephyr-ble/zephyr/include/sys/atomic_builtin.h:243 again.
again.The crash doesn't happen if I comment out the
shell_print
in this function (which gets run bycmd_bonds
):In general I can pair with iPhone for the first time (when prompted for pairing) but after disconnecting and trying to reconnect I get:
That's for Just Works. When using setup reconnecting works fine. I need to pair without any pin though. This might be a separate issue but maybe connected...?