Open doodek opened 2 weeks ago
possibly related #49066
What you are getting doesn't make sense. That function is called either from https://github.com/zephyrproject-rtos/zephyr/blob/4c9b30836770cba93a0212bc3fae5de20ba6e596/subsys/mgmt/mcumgr/smp_shell.c#L146 or from https://github.com/zephyrproject-rtos/zephyr/blob/4c9b30836770cba93a0212bc3fae5de20ba6e596/subsys/mgmt/mcumgr/smp_uart.c#L41, both of which should be word-aligned. It then directly uses the buf object, which, as can be seen from https://github.com/zephyrproject-rtos/zephyr/blob/main/include/zephyr/drivers/console/uart_mcumgr.h#L26, is also word-aligned since the void * object will be 4 bytes. I will try this on an nRF51 board later to check, but I'm pretty sure I've tested that recently over UART and it worked fine.
If you could put a breakpoint on those calls and see what address the buf and buf->data fields have it would be helpful
Cannot reproduce on a Cortex-M0; the address is word-aligned:
Breakpoint 1, mcumgr_serial_process_frag (rx_ctxt=rx_ctxt@entry=0x200019a8 <smp_shell_rx_ctxt>, frag=0x20005270 <net_buf_data_smp_shell_rx_pool> "\006\tABUKAAALAAkBAKFkYXJndoFjbG9sUQ8=\nnVwdGltZemC\n\377\263\177\001`\367-\001\004\337\357\023\004\337\377", frag_len=35) at /tmp/aa/zephyr/subsys/mgmt/mcumgr/transport/src/serial_util.c:77
77 if (rx_ctxt->nb == NULL) {
(gdb) p/x frag
$1 = 0x20005270
If the pointer is not always guaranteed to be correctly aligned, Zephyr already has a helper function, sys_get_be16(), that you should probably use.
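As a rough illustration of why such a helper avoids the fault: a byte-wise read never issues a 16-bit load, so the address does not need to be aligned. The sketch below is standalone C, not Zephyr code. `read_be16()` is a hypothetical stand-in that mirrors what sys_get_be16() from `<zephyr/sys/byteorder.h>` does.

```c
#include <stdint.h>

/* Hypothetical stand-in for Zephyr's sys_get_be16(): assemble a big-endian
 * 16-bit value from two single-byte loads, so `src` may have any alignment.
 */
static inline uint16_t read_be16(const uint8_t *src)
{
	return (uint16_t)(((uint16_t)src[0] << 8) | src[1]);
}
```

With the real helper, the flagged line in mcumgr_serial_process_frag() would presumably become `op = sys_get_be16(frag);`, which yields the same value as the cast-based read but without any alignment requirement.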
Network buffers should be properly aligned though? https://docs.zephyrproject.org/latest/kconfig.html#CONFIG_NET_BUF_ALIGNMENT
Alignment restriction for network buffers. This is useful for some hardware IP with DMA that requires the buffers to be aligned to a certain byte boundary, or dealing with cache line restrictions. Default value of 0 means the alignment will be the size of a void pointer, any other value will force the alignment of a net buffer in bytes.
That just guarantees you that the beginning of the net_buf payload buffer is aligned. Depending on what else is in the buffer or if there's e.g. some reserved headroom, buf->data can point at any position within the payload buffer. I'm not familiar with this specific subsystem or its use of buffers, but in general if you're parsing binary data received from a remote device, making alignment assumptions is a recipe for fragile code.
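To make the headroom point concrete, here is a minimal standalone sketch. `struct demo_buf` and its helpers are hypothetical simplifications for illustration, not the real net_buf API: the storage is aligned, yet the data pointer stops being 16-bit aligned as soon as an odd amount of headroom is reserved.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a net_buf: `storage` is the aligned
 * payload buffer, `data` is where the current payload starts inside it.
 */
struct demo_buf {
	_Alignas(4) uint8_t storage[16];
	uint8_t *data;
};

/* Mimics reserving headroom: the payload start moves forward by `headroom`. */
static void demo_buf_reserve(struct demo_buf *buf, size_t headroom)
{
	buf->data = buf->storage + headroom;
}

/* True if `ptr` could be dereferenced as a uint16_t without misalignment. */
static bool is_u16_aligned(const void *ptr)
{
	return ((uintptr_t)ptr % _Alignof(uint16_t)) == 0;
}
```

With zero headroom `is_u16_aligned(buf.data)` holds; after `demo_buf_reserve(&buf, 1)` it no longer does, even though the storage itself has not moved.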
This is using the start of the buffer; the first byte is what is being looked at here. The flow is: shell -> network buffer -> buffering until the full-length payload is received -> base64 decode of the data into a new network buffer -> check op-code (first byte). The crash happens at that last step on the OP's device; there is no crash on mine.
Code:
void smp_shell_process(struct smp_shell_data *data)
{
	struct net_buf *buf;
	struct net_buf *nb;

	while (true) {
		buf = net_buf_get(&data->buf_ready, K_NO_WAIT);
		if (!buf) {
			break;
		}

		nb = mcumgr_serial_process_frag(&smp_shell_rx_ctxt,
						buf->data,
						buf->len);
		if (nb != NULL) {
			zephyr_smp_rx_req(&smp_shell_transport, nb);
		}
...

struct net_buf *mcumgr_serial_process_frag(
	struct mcumgr_serial_rx_ctxt *rx_ctxt,
	const uint8_t *frag, int frag_len)
{
...
	op = sys_be16_to_cpu(*(uint16_t *)frag); /* <--- here */
@nordicjm thanks, I can see that. An equally important place to consider is this: https://github.com/zephyrproject-rtos/zephyr/blob/99e6280d7e22552de9a94992b626acdcbde00fee/subsys/mgmt/mcumgr/transport/src/smp_shell.c#L149-L167
There we can see that between allocation and inserting the buffer into a fifo all the code is doing is net_buf_add_u8(), which will not modify the buf->data pointer, i.e. it should still point at the beginning of the payload buffer.
Either way, the code looks IMO fragile since mcumgr_serial_process_frag() receives a uint8_t * and not a uint16_t *, i.e. at the very least I'd place an assert in there to indicate the assumptions it makes about the alignment of the pointer.
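A minimal sketch of what such an assertion could look like, in standalone C. The function names are hypothetical, and standard `assert()` is used only to keep the example self-contained; in Zephyr code `__ASSERT()` from `<zephyr/sys/__assert.h>` would probably be the natural choice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if `ptr` satisfies the alignment requirement of uint16_t. */
static bool ptr_is_u16_aligned(const void *ptr)
{
	return ((uintptr_t)ptr % _Alignof(uint16_t)) == 0;
}

/* Hypothetical parser entry point: it takes a byte pointer but reads the
 * first two bytes through a uint16_t cast, so it states that assumption
 * explicitly instead of relying on the caller to know about it.
 */
static uint16_t read_op_host_order(const uint8_t *frag)
{
	assert(ptr_is_u16_aligned(frag));
	return *(const uint16_t *)frag;
}
```

The assertion only documents and checks the assumption in debug builds; it does not make the read safe for unaligned input, which is what a byte-wise helper would do.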
Btw, the references in this report seem to be against some older version of Zephyr, since the c-files in question are found in a different location in the current main branch. Maybe that's one potential reason for the inability to reproduce the issue?
Edit: the report does say "Zephyr OS 3.7.0"
Describe the bug
Hard fault arising from unaligned access at mgmt/mcumgr/serial_util.c:87.
The workaround below, which ensures aligned access, has fixed this for me.

Workaround
Replace
op = sys_be16_to_cpu(*(uint16_t *)frag);
at the line specified, with the following:

Reproduce
Honestly, no idea how to make this misaligned manually. Possibly try to use the MCUmgr update utility over shell over UART, and then inject an unaligned pointer in the frag argument with a debugger?

Expected behavior
No hard faults.

Impact
Showstopper: can't deploy a device without a working update mechanism.

Logs and console output
Suspicious mcumgr comm dump
Stack trace:
ESF:

Environment (please complete the following information):

Additional context
MCUmgr over Shell over UART, 9600 baud rate. prj.conf extract: