zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.45k stars 6.4k forks source link

CONFIG_NET_NATIVE_OFFLOADED_SOCKETS fails after DTLS handshake #75909

Closed brandon-exact closed 1 month ago

brandon-exact commented 1 month ago

Describe the bug

I am trying to get CONFIG_NET_NATIVE_OFFLOADED_SOCKETS to work with lwm2m and I have noticed that I am able to do the handshake successfully then it seems to get stuck in a loop.

To Reproduce

I modified the lwm2m_client sample to use the offloaded sockets API as shown. I also am using the leshan public server. I added the device to it using the default credentials in the sample.

brandon@brandon:~/fork/zephyr/samples/net/lwm2m_client (main)$ git diff
diff --git a/samples/net/lwm2m_client/boards/native_sim.conf b/samples/net/lwm2m_client/boards/native_sim.conf
index 3e21f837f7c..45c5ead6689 100644
--- a/samples/net/lwm2m_client/boards/native_sim.conf
+++ b/samples/net/lwm2m_client/boards/native_sim.conf
@@ -1,5 +1,18 @@
 CONFIG_DNS_RESOLVER=y
-CONFIG_DNS_SERVER_IP_ADDRESSES=y
-CONFIG_DNS_SERVER1="192.0.2.2"
 CONFIG_LWM2M_DNS_SUPPORT=y
-CONFIG_NET_CONFIG_MY_IPV4_GW="192.0.2.2"
+CONFIG_ETH_NATIVE_POSIX=n
+CONFIG_NET_DRIVERS=y
+CONFIG_NET_SOCKETS=y
+CONFIG_NET_SOCKETS_OFFLOAD=y
+CONFIG_NET_NATIVE_OFFLOADED_SOCKETS=y
+CONFIG_HEAP_MEM_POOL_SIZE=1024
+
+# IPv6 DAD requires lower level network interface access, below exposed socket-level access
+CONFIG_NET_IPV6_DAD=n
+
+
+CONFIG_DEBUG=y
+
+CONFIG_MBEDTLS_DEBUG=y
+CONFIG_MBEDTLS_MEMORY_DEBUG=y
+CONFIG_MBEDTLS_LOG_LEVEL_DBG=y
\ No newline at end of file
diff --git a/samples/net/lwm2m_client/prj.conf b/samples/net/lwm2m_client/prj.conf
index 0911fdf34e1..7c3425405d6 100644
--- a/samples/net/lwm2m_client/prj.conf
+++ b/samples/net/lwm2m_client/prj.conf
@@ -39,3 +39,6 @@ CONFIG_NET_CONFIG_MY_IPV4_GW="192.0.2.2"
 # Longer endpoint name might be returned in a registration reply
 CONFIG_COAP_EXTENDED_OPTIONS_LEN=y
 CONFIG_COAP_EXTENDED_OPTIONS_LEN_VALUE=40
+
+
+CONFIG_LWM2M_APP_SERVER="coaps://leshan.eclipseprojects.io:5684"
cd zephyr/samples/net/lwm2m_client/
west build -b native_sim -- -DEXTRA_CONF_FILE="overlay-dtls.conf"

Logs and console output

It is able to resolve the domain and perform the handshake. image It seems that a mbedtls_ssl_read() is called in a infinite loop. I copied the last few steps of the handshale...

00:00:00.930,003] <wrn> mbedtls: ssl_tls.c:4525: client state: MBEDTLS_SSL_SERVER_FINISHED
[00:00:00.930,003] <wrn> mbedtls: ssl_tls.c:8645: => parse finished
[00:00:00.930,003] <wrn> mbedtls: ssl_tls.c:8364: => calc finished tls
[00:00:00.930,003] <dbg> mbedtls: zephyr_mbedtls_debug: ssl_tls.c:8381: dumping 'finished output' (32 bytes)
[00:00:00.930,003] <dbg> mbedtls: zephyr_mbedtls_debug: ssl_tls.c:8381: 0000:  a7 05 47 34 84 1a 4d 78 5e 0e 95 b7 09 e0 fe e2  ..G4..Mx^.......
[00:00:00.930,003] <dbg> mbedtls: zephyr_mbedtls_debug: ssl_tls.c:8381: 0010:  30 e3 e1 a6 fe 4d ef 1d 74 21 87 71 ef 6a cf c4  0....M..t!.q.j..
[00:00:00.930,003] <inf> mbedtls: ssl_tls.c:8391: dumping 'calc finished result' (12 bytes)
[00:00:00.930,003] <inf> mbedtls: ssl_tls.c:8391: 0000:  e5 b4 83 5f 9d 51 b3 61 fe c2 eb c6              ..._.Q.a....
[00:00:00.930,003] <wrn> mbedtls: ssl_tls.c:8395: <= calc finished
[00:00:00.930,003] <wrn> mbedtls: ssl_msg.c:4189: => read record
[00:00:00.930,003] <wrn> mbedtls: ssl_msg.c:2155: => fetch input
[00:00:00.930,003] <wrn> mbedtls: ssl_msg.c:2190: next record in same datagram, offset: 14
[00:00:00.930,003] <wrn> mbedtls: ssl_msg.c:2201: in_left: 53, nb_want: 13
[00:00:00.930,003] <wrn> mbedtls: ssl_msg.c:2209: <= fetch input
[00:00:00.930,003] <dbg> mbedtls: zephyr_mbedtls_debug: ssl_msg.c:3855: dumping 'input record header' (13 bytes)
[00:00:00.930,003] <dbg> mbedtls: zephyr_mbedtls_debug: ssl_msg.c:3855: 0000:  16 fe fd 00 01 00 00 00 00 00 00 00 28           ............(
[00:00:00.930,003] <inf> mbedtls: ssl_msg.c:3857: input record: msgtype = 22, version = [0x303], msglen = 40
[00:00:00.930,003] <dbg> mbedtls: zephyr_mbedtls_debug: ssl_msg.c:3964: dumping 'input record from network' (53 bytes)
[00:00:00.930,003] <dbg> mbedtls: zephyr_mbedtls_debug: ssl_msg.c:3964: 0000:  16 fe fd 00 01 00 00 00 00 00 00 00 28 00 01 00  ............(...
[00:00:00.930,003] <dbg> mbedtls: zephyr_mbedtls_debug: ssl_msg.c:3964: 0010:  00 00 00 00 00 f6 80 5a 97 c7 5c 4f 23 62 9a cf  .......Z..\O#b..
[00:00:00.930,003] <dbg> mbedtls: zephyr_mbedtls_debug: ssl_msg.c:3964: 0020:  ca 4e 81 31 4d f6 f8 f5 08 11 8e 8f 61 41 1b 4c  .N.1M.......aA.L
[00:00:00.930,003] <dbg> mbedtls: zephyr_mbedtls_debug: ssl_msg.c:3964: 0030:  92 51 77 2b b9                                   .Qw+.
[00:00:00.930,003] <wrn> mbedtls: ssl_msg.c:1510: => decrypt buf
[00:00:00.930,003] <dbg> mbedtls: zephyr_mbedtls_debug: ssl_msg.c:1607: dumping 'additional data used for AEAD' (13 bytes)
[00:00:00.930,003] <dbg> mbedtls: zephyr_mbedtls_debug: ssl_msg.c:1607: 0000:  00 01 00 00 00 00 00 00 16 fe fd 00 18           .............
[00:00:00.930,003] <dbg> mbedtls: zephyr_mbedtls_debug: ssl_msg.c:1616: dumping 'IV used' (12 bytes)
[00:00:00.930,003] <dbg> mbedtls: zephyr_mbedtls_debug: ssl_msg.c:1616: 0000:  60 de 1e e5 00 01 00 00 00 00 00 00              `...........
[00:00:00.930,003] <dbg> mbedtls: zephyr_mbedtls_debug: ssl_msg.c:1617: dumping 'TAG used' (8 bytes)
[00:00:00.930,003] <dbg> mbedtls: zephyr_mbedtls_debug: ssl_msg.c:1617: 0000:  41 1b 4c 92 51 77 2b b9                          A.L.Qw+.
[00:00:00.930,003] <dbg> mbedtls: zephyr_mbedtls_debug: ssl_msg.c:1992: dumping 'raw buffer after decryption' (24 bytes)
[00:00:00.930,003] <dbg> mbedtls: zephyr_mbedtls_debug: ssl_msg.c:1992: 0000:  14 00 00 0c 00 03 00 00 00 00 00 0c e5 b4 83 5f  ..............._
[00:00:00.930,003] <dbg> mbedtls: zephyr_mbedtls_debug: ssl_msg.c:1992: 0010:  9d 51 b3 61 fe c2 eb c6                          .Q.a....
[00:00:00.930,003] <wrn> mbedtls: ssl_msg.c:2121: <= decrypt buf
[00:00:00.930,003] <dbg> mbedtls: zephyr_mbedtls_debug: ssl_msg.c:4052: dumping 'input payload after decrypt' (24 bytes)
[00:00:00.930,003] <dbg> mbedtls: zephyr_mbedtls_debug: ssl_msg.c:4052: 0000:  14 00 00 0c 00 03 00 00 00 00 00 0c e5 b4 83 5f  ..............._
[00:00:00.930,003] <dbg> mbedtls: zephyr_mbedtls_debug: ssl_msg.c:4052: 0010:  9d 51 b3 61 fe c2 eb c6                          .Q.a....
[00:00:00.930,003] <inf> mbedtls: ssl_msg.c:3231: handshake message: msglen = 24, type = 20, hslen = 24
[00:00:00.930,003] <wrn> mbedtls: ssl_msg.c:4261: <= read record
[00:00:00.930,003] <inf> mbedtls: ssl_msg.c:0291: set_timer to 0 ms
[00:00:00.930,003] <wrn> mbedtls: ssl_tls.c:8715: <= parse finished
[00:00:00.930,003] <wrn> mbedtls: ssl_msg.c:2353: => flush output
[00:00:00.930,003] <wrn> mbedtls: ssl_msg.c:2362: <= flush output
[00:00:00.930,003] <wrn> mbedtls: ssl_tls.c:4525: client state: MBEDTLS_SSL_FLUSH_BUFFERS
[00:00:00.930,003] <wrn> mbedtls: ssl_tls12_client.c:3585: handshake: done
[00:00:00.930,003] <wrn> mbedtls: ssl_msg.c:2353: => flush output
[00:00:00.930,003] <wrn> mbedtls: ssl_msg.c:2362: <= flush output
[00:00:00.930,003] <wrn> mbedtls: ssl_tls.c:4525: client state: MBEDTLS_SSL_HANDSHAKE_WRAPUP
[00:00:00.930,003] <inf> mbedtls: ssl_tls.c:8468: => handshake wrapup
[00:00:00.930,003] <inf> mbedtls: ssl_tls.c:8442: => handshake wrapup: final free
[00:00:00.930,003] <inf> mbedtls: ssl_tls.c:8461: <= handshake wrapup: final free
[00:00:00.930,003] <inf> mbedtls: ssl_tls.c:8522: <= handshake wrapup
[00:00:00.930,003] <wrn> mbedtls: ssl_tls.c:4617: <= handshake
[00:00:00.930,003] <inf> net_lwm2m_engine: Connected, sock id 1
[00:00:00.930,003] <dbg> net_lwm2m_registry: lwm2m_engine_get: path:1/0/1/0, level 3, buf:0xe70fc1dc, buflen:4
[00:00:00.930,003] <dbg> net_lwm2m_rd_client: sm_send_registration: registration sent [23.97.187.154]
[00:00:00.930,004] <wrn> mbedtls: ssl_msg.c:5782: => read
[00:00:00.930,004] <wrn> mbedtls: ssl_msg.c:2353: => flush output
[00:00:00.930,004] <wrn> mbedtls: ssl_msg.c:2362: <= flush output
[00:00:00.930,004] <inf> mbedtls: ssl_msg.c:0291: set_timer to 0 ms
[00:00:00.930,004] <wrn> mbedtls: ssl_msg.c:4189: => read record
[00:00:00.930,004] <wrn> mbedtls: ssl_msg.c:2155: => fetch input
[00:00:00.930,004] <wrn> mbedtls: ssl_msg.c:2201: in_left: 0, nb_want: 13
[00:00:00.930,004] <inf> mbedtls: ssl_msg.c:2240: f_recv_timeout: 0 ms
[00:00:00.930,004] <wrn> mbedtls: ssl_msg.c:5782: => read
[00:00:00.930,004] <wrn> mbedtls: ssl_msg.c:2353: => flush output
[00:00:00.930,004] <wrn> mbedtls: ssl_msg.c:2362: <= flush output
[00:00:00.930,004] <inf> mbedtls: ssl_msg.c:0291: set_timer to 0 ms
[00:00:00.930,004] <wrn> mbedtls: ssl_msg.c:4189: => read record
[00:00:00.930,004] <wrn> mbedtls: ssl_msg.c:2155: => fetch input
[00:00:00.930,004] <wrn> mbedtls: ssl_msg.c:2201: in_left: 0, nb_want: 13
[00:00:00.930,004] <inf> mbedtls: ssl_msg.c:2240: f_recv_timeout: 0 ms

... goes on forever ...

this is the call stack

Thread 10 "zephyr.exe" hit Breakpoint 1, mbedtls_ssl_read (ssl=0x80bf364 <tls_contexts+228>, buf=0x0, len=0) at /home/brandon/fork/modules/crypto/mbedtls/library/ssl_msg.c:5782
5782        MBEDTLS_SSL_DEBUG_MSG(2, ("=> read"));
(gdb) where
#0  mbedtls_ssl_read (ssl=0x80bf364 <tls_contexts+228>, buf=0x0, len=0) at /home/brandon/fork/modules/crypto/mbedtls/library/ssl_msg.c:5782
#1  0x0805daa3 in ztls_socket_data_check (ctx=ctx@entry=0x80bf280 <tls_contexts>) at /home/brandon/fork/zephyr/subsys/net/lib/sockets/sockets_tls.c:3078
#2  0x0805dc2f in ztls_poll_update_pollin (fd=<optimized out>, ctx=ctx@entry=0x80bf280 <tls_contexts>, pfd=pfd@entry=0x80c1090 <sock_fds>)
    at /home/brandon/fork/zephyr/subsys/net/lib/sockets/sockets_tls.c:3157
#3  0x0805e16c in ztls_poll_update_ctx (ctx=ctx@entry=0x80bf280 <tls_contexts>, pfd=0x80c1090 <sock_fds>, pev=0xf40fc214)
    at /home/brandon/fork/zephyr/subsys/net/lib/sockets/sockets_tls.c:3242
#4  0x0805e2f2 in tls_sock_ioctl_vmeth (obj=0x80bf280 <tls_contexts>, request=259, args=<optimized out>) at /home/brandon/fork/zephyr/subsys/net/lib/sockets/sockets_tls.c:3726
#5  0x0804d90e in zvfs_fdtable_call_ioctl (vtable=0x80b4fa0 <tls_sock_fd_op_vtable>, obj=0x80bf280 <tls_contexts>, request=259) at /home/brandon/fork/zephyr/include/zephyr/sys/fdtable.h:189
#6  0x0805b465 in zsock_poll_internal (fds=0x80c1090 <sock_fds>, nfds=3, timeout=...) at /home/brandon/fork/zephyr/subsys/net/lib/sockets/sockets.c:2442
#7  0x0805b57b in z_impl_zsock_poll (fds=0x80c1090 <sock_fds>, nfds=3, poll_timeout=410) at /home/brandon/fork/zephyr/subsys/net/lib/sockets/sockets.c:2489
#8  0x0807326a in zsock_poll (timeout=<optimized out>, nfds=3, fds=0x80c1090 <sock_fds>)
    at /home/brandon/fork/zephyr/samples/net/lwm2m_client/build/zephyr/include/generated/zephyr/syscalls/socket.h:416
#9  socket_loop (p1=0x0, p2=0x0, p3=0x0) at /home/brandon/fork/zephyr/subsys/net/lib/lwm2m/lwm2m_engine.c:830
#10 0x0804d790 in z_thread_entry (entry=0x80730e6 <socket_loop>, p1=0x0, p2=0x0, p3=0x0) at /home/brandon/fork/zephyr/lib/os/thread_entry.c:48
#11 0x080579e7 in posix_arch_thread_entry (pa_thread_status=0x80cf7bc <engine_thread_stack+2536>) at /home/brandon/fork/zephyr/arch/posix/core/thread.c:61
#12 0x080a5812 in nct_thread_starter (arg_el=0x80d2590) at /home/brandon/fork/zephyr/scripts/native_simulator//common/src/nct.c:364
#13 0xf7cfc157 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
#14 0xf7d90e08 in clone3 () at ../sysdeps/unix/sysv/linux/i386/clone3.S:111

Environment (please complete the following information):

Additional context

It connects to my prod server fine without DTLS.

rlubos commented 1 month ago

@brandon-exact Please check https://github.com/zephyrproject-rtos/zephyr/pull/75931

brandon-exact commented 1 month ago

that fixed it!

rlubos commented 1 month ago

@brandon-exact Let's keep the issue open until the fix is merged.

brandon-exact commented 1 month ago

Thanks!