espressif / esp-idf

Espressif IoT Development Framework. Official development framework for Espressif SoCs.
Apache License 2.0

examples/mesh/ip_internal_network: creation of corrupt mesh_netif_driver (IDFGH-13311) #14236

Open vjgriswold opened 1 month ago

vjgriswold commented 1 month ago


General issue report

In mesh_netif.c, the data structure mesh_netif_driver attempts to represent the ESP-WIFI-MESH driver control block. In fact, the declaration of mesh_netif_driver is in fatal conflict with the declaration of wifi_netif_driver in wifi_netif.c. Specifically, the fields wifi_if and sta_mac_addr occupy the same memory location in the control block.

Because ESP-WIFI-MESH 'netif' structures are often passed through esp_wifi APIs, these conflicting field declarations result in the first 4 octets of the ESP32's MAC address being used as the wifi_if index, leading to disastrous runtime results (which have been observed during testing of the ip_internal_network example).
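The overlap described above can be illustrated with a small host-side sketch. The types below are simplified stand-ins written for this illustration (every `demo_` name is hypothetical); they are not the actual ESP-IDF declarations, and they assume only what the report states: both structures begin with the same base and are followed by wifi_if in one case and sta_mac_addr in the other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the common driver base (not the real ESP-IDF type). */
typedef struct {
    int (*post_attach)(void *netif, void *handle);
    void *netif;
} demo_driver_base_t;

/* Hypothetical stand-in for the WiFi interface index. */
typedef enum { DEMO_IF_STA = 0, DEMO_IF_AP = 1 } demo_wifi_interface_t;

/* Layout assumed for the WiFi driver control block. */
struct demo_wifi_netif_driver {
    demo_driver_base_t base;
    demo_wifi_interface_t wifi_if;
};

/* Layout assumed for the example's mesh driver control block. */
struct demo_mesh_netif_driver {
    demo_driver_base_t base;
    uint8_t sta_mac_addr[6];
};

/* Both fields start immediately after the base, so they share one offset. */
static_assert(offsetof(struct demo_wifi_netif_driver, wifi_if) ==
              offsetof(struct demo_mesh_netif_driver, sta_mac_addr),
              "wifi_if and sta_mac_addr overlap");

/* Returns the 4 bytes that code expecting the WiFi layout would find
 * at the wifi_if offset when it is handed a mesh control block. */
static uint32_t demo_read_wifi_if(const struct demo_mesh_netif_driver *mesh)
{
    uint32_t raw;
    memcpy(&raw,
           (const uint8_t *)mesh + offsetof(struct demo_wifi_netif_driver, wifi_if),
           sizeof raw);
    return raw;
}
```

With any realistic MAC address stored in sta_mac_addr, the value returned by `demo_read_wifi_if()` is neither of the two valid interface indices, which is consistent with the faults described in this report.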

While it would at first appear that adding wifi_if to the mesh_netif_driver structure would resolve the conflict, the result is still not correct. In fact, there is no public declaration of the actual ESP-WIFI-MESH driver control block, so any creation or manipulation of the structure within mesh_netif.c cannot be assured to be conflict-free with other fields within the actual structure.

For WiFi purposes, not only is the full declaration of the driver control block visible, but the function esp_wifi_create_if_driver() is available to reliably create and initialize the structure. No such public declarations or utility functions appear available to safely create an ESP-WIFI-MESH driver control block.

zhangyanjiaoesp commented 1 month ago

@vjgriswold Thanks for the report, we will check it ASAP.

zhangyanjiaoesp commented 3 weeks ago

@vjgriswold In the mesh_netif.c file, although the driver parameter passed to the esp_netif_attach() function is of type mesh_netif_driver_t, only the base part of mesh_netif_driver_t is read by esp_netif_attach(). Therefore, the sta_mac_addr field does not affect the wifi_if field.

In the ip_internal_network example, the purpose of mesh_netif.c is to redefine the transmit function of the netif driver so that the nodes can exchange data using the low-level mesh send/receive API. However, in reality the ESP-WIFI-MESH software stack is built atop the Wi-Fi driver, so there is no need to make mesh_netif.c public in the way wifi_netif.c is. https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/network/esp-wifi-mesh.html#esp-wifi-mesh-programming-model

vjgriswold commented 2 weeks ago

The problem is not with esp_netif_attach(), which knows nothing about WiFi STA vs AP configuration, but with all the other functions which do distinguish between STA vs AP and use the wifi_if field to make the distinction. These functions include: wifi_transmit(), wifi_transmit_wrap(), esp_wifi_create_if_driver(), esp_wifi_destroy_if_driver(), esp_wifi_get_if_mac(), esp_wifi_is_if_ready_when_started(), and esp_wifi_register_if_rxb().

Each of these functions is subject to invalid memory access faults when the field within the device control block is used for any other purpose besides wifi_if. This has been observed numerous times during operation of the ESP-WIFI-MESH sample program.

Even if the Mesh device control block is absolutely identical to the WiFi device control block, the example code must still be adjusted so that sta_mac_addr does not occupy the same memory location as wifi_if.
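One possible shape for such an adjustment is sketched below. It holds only under the assumption stated above, namely that the mesh control block is identical to the WiFi one; the real ESP-WIFI-MESH control block is not publicly declared, so this is not a confirmed fix. All `sketch_` names are hypothetical stand-ins, not ESP-IDF declarations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the common driver base (not the real ESP-IDF type). */
typedef struct {
    int (*post_attach)(void *netif, void *handle);
    void *netif;
} sketch_driver_base_t;

/* Hypothetical stand-in for the WiFi interface index. */
typedef enum { SKETCH_IF_STA = 0, SKETCH_IF_AP = 1 } sketch_wifi_interface_t;

/* Layout assumed for the WiFi driver control block. */
struct sketch_wifi_netif_driver {
    sketch_driver_base_t base;
    sketch_wifi_interface_t wifi_if;
};

/* Adjusted mesh layout: wifi_if keeps its slot, the MAC address follows it. */
struct sketch_mesh_netif_driver {
    sketch_driver_base_t base;
    sketch_wifi_interface_t wifi_if;
    uint8_t sta_mac_addr[6];
};

/* wifi_if sits at the same offset in both layouts. */
static_assert(offsetof(struct sketch_mesh_netif_driver, wifi_if) ==
              offsetof(struct sketch_wifi_netif_driver, wifi_if),
              "wifi_if must line up with the WiFi layout");

/* sta_mac_addr starts only after wifi_if ends. */
static_assert(offsetof(struct sketch_mesh_netif_driver, sta_mac_addr) >=
              offsetof(struct sketch_mesh_netif_driver, wifi_if) +
              sizeof(sketch_wifi_interface_t),
              "sta_mac_addr must not overlap wifi_if");

/* Builds a zeroed control block with a valid interface index and a MAC copy. */
static struct sketch_mesh_netif_driver
sketch_make_mesh_driver(sketch_wifi_interface_t ifx, const uint8_t mac[6])
{
    struct sketch_mesh_netif_driver d;
    memset(&d, 0, sizeof d);
    d.wifi_if = ifx;
    memcpy(d.sta_mac_addr, mac, 6);
    return d;
}
```

The compile-time checks make the intent explicit: if either layout changes so that the fields overlap again, the build fails instead of the fault surfacing at runtime.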