zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0

esp_at: large DTLS handshake messages fail reassembly in passive mode (`AT+CIPRECVTYPE=1`) #77993

Open hasheddan opened 2 months ago

hasheddan commented 2 months ago

Describe the bug

When connecting to a DTLS server through the esp_at WiFi driver with CONFIG_WIFI_ESP_AT_PASSIVE_MODE=y, mbedtls frequently fails to complete the handshake if a reassembled DTLS handshake message is larger than 2048 bytes. In that case, the portion of the UDP datagram carrying the second fragment that reaches mbedtls is often shorter than the DTLS record length advertised in its header, so mbedtls discards the record with a message like the following:

[00:06:33.144,592] <err> mbedtls: WEST_TOPDIR/modules/crypto/mbedtls/library/ssl_msg.c:3895: Datagram of length 743 too small to contain record of advertised length 849.

This occurs because CIPRECVDATA_MAX_LEN is hard-coded to ESP_MTU, which is in turn hard-coded to 2048. If the fragments together exceed 2048 bytes, mbedtls cannot read all of them in a single operation.

Consider the following sequence of datagrams from the server.

The first datagram, containing the Server Hello, has length 80.

The second, containing the first fragment of the Certificate, has length 1225.

Because CIPRECVDATA_MAX_LEN is 2048, only 743 bytes of room remain (2048 - 80 - 1225 = 743), but the first DTLS message in the next datagram (the second fragment of the Certificate) indicates in its header that it alone is 849 bytes (836 payload bytes + 13 header bytes).
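The arithmetic above can be reproduced with a small standalone sketch. It is not code from mbedtls or the esp_at driver: the helper names (`dtls_record_advertised_len`, `read_budget_remaining`, `record_fits`) are hypothetical, and the only facts it relies on are the 13-byte DTLS 1.2 record header (with the big-endian payload length in its last two bytes) and the sizes quoted in this report.

```c
#include <stddef.h>
#include <stdint.h>

/* DTLS 1.2 record header: type(1) + version(2) + epoch(2) + seq(6) + length(2). */
#define DTLS_RECORD_HEADER_LEN 13u

/*
 * Total size (header + payload) that a DTLS record header advertises.
 * Returns 0 if fewer than 13 bytes are available to hold the header.
 */
static size_t dtls_record_advertised_len(const uint8_t *buf, size_t available)
{
	if (available < DTLS_RECORD_HEADER_LEN) {
		return 0;
	}

	/* Bytes 11..12 hold the payload length, big-endian. */
	size_t payload = ((size_t)buf[11] << 8) | (size_t)buf[12];

	return DTLS_RECORD_HEADER_LEN + payload;
}

/* Bytes left in one read capped at max_len after earlier data was counted. */
static size_t read_budget_remaining(size_t max_len, const size_t *consumed, size_t count)
{
	size_t used = 0;

	for (size_t i = 0; i < count; i++) {
		used += consumed[i];
	}

	return used >= max_len ? 0 : max_len - used;
}

/* 1 if the record advertised at buf fits in the available bytes, else 0. */
static int record_fits(const uint8_t *buf, size_t available)
{
	size_t need = dtls_record_advertised_len(buf, available);

	return need != 0 && need <= available;
}
```

With a header advertising an 836-byte payload and the 80- and 1225-byte datagrams already counted, a 2048-byte cap leaves 743 bytes and `record_fits` reports a failure, which matches the mbedtls log message. With a 4096-byte cap the same record fits.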

To Reproduce

With CONFIG_WIFI_ESP_AT_PASSIVE_MODE=y set on the client, connect to a DTLS server that fragments records during the handshake.

The following config will assist in observing the behavior via logging:

CONFIG_WIFI_LOG_LEVEL_DBG=n
CONFIG_NET_SOCKETS_LOG_LEVEL_DBG=y
CONFIG_MBEDTLS_LOG_LEVEL_DBG=y
CONFIG_MBEDTLS_DEBUG=y

Expected behavior

Manually increasing CIPRECVDATA_MAX_LEN allows the handshake to complete successfully (I set it to 4096), so one option would be to expose it as a user-configurable value. Alternatively, higher levels of the stack may be able to ensure that multiple records in a single read are identified and handled.
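One way the first option might look is sketched below. This is only an illustration: `CONFIG_WIFI_ESP_AT_CIPRECVDATA_MAX_LEN` is a hypothetical Kconfig symbol that does not exist in the driver as described in this issue, and `ciprecvdata_request_len` is a hypothetical helper that assumes the driver clamps each passive-mode read to the cap.

```c
#include <stddef.h>

/* Value stated in this issue: ESP_MTU is hard-coded to 2048. */
#define ESP_MTU 2048

/*
 * Hypothetical Kconfig symbol (an assumption, not an existing option).
 * When it is not defined, fall back to the current hard-coded behavior.
 */
#ifndef CONFIG_WIFI_ESP_AT_CIPRECVDATA_MAX_LEN
#define CONFIG_WIFI_ESP_AT_CIPRECVDATA_MAX_LEN ESP_MTU
#endif

#define CIPRECVDATA_MAX_LEN CONFIG_WIFI_ESP_AT_CIPRECVDATA_MAX_LEN

/*
 * Hypothetical helper: number of bytes to request in one passive-mode
 * read, i.e. the pending byte count clamped to the configured cap.
 */
static size_t ciprecvdata_request_len(size_t pending)
{
	return pending < CIPRECVDATA_MAX_LEN ? pending : CIPRECVDATA_MAX_LEN;
}
```

With the default, the 2154 bytes pending in the example above (80 + 1225 + 849) would be clamped to 2048. Defining the hypothetical symbol as 4096 would let all three be read in one operation, at the cost of a larger receive buffer.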

Impact

The esp_at driver cannot be used in passive mode with servers whose fragmented DTLS handshake messages add up to more than 2048 bytes.

Logs and console output

Shown above.


hasheddan commented 2 months ago

Although a handshake can still be completed in active mode, in Zephyr v3.7.0 using active mode precludes Zephyr's DNS resolution, because the resolver uses recvfrom, which is not implemented for active mode in that release. However, @mniestroj has implemented it for active mode on main: https://github.com/zephyrproject-rtos/zephyr/commit/460b111fb42bd45a25fffc5c1752b9032d881293

github-actions[bot] commented 23 minutes ago

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed; otherwise this issue will automatically be closed in 14 days. Note that you can always re-open a closed issue at any time.