espressif / esp-idf

Espressif IoT Development Framework. Official development framework for Espressif SoCs.
Apache License 2.0
13.29k stars 7.2k forks source link

esp_netif_dhcpc_start_api returns no errors, but stuck in ESP_NETIF_DHCP_INIT state. (IDFGH-12833) #13800

Open karlp opened 3 months ago

karlp commented 3 months ago

Answers checklist.

IDF version.

5.1.2 (but same code in master and 5.2)

Espressif SoC revision.

ESP32

Operating System used.

Linux

How did you build your project?

Command line with idf.py

If you are using Windows, please specify command line type.

None

Development Kit.

Custom

Power Supply used.

External 5V

What is the expected behavior?

After creating a (ethernet) netif, I optionally apply DHCP client settings. Depending on how quickly the link comes up, the DHCP settings are not always applied. The esp_netif_dhcpc_start call returns ESP_OK, but it prints a debug log to "re-init" Here: https://github.com/espressif/esp-idf/blob/master/components/esp_netif/lwip/esp_netif_lwip.c#L1531-L1533

Unless I explicitly re-invoke dhcp start, my netif says permanently in the ESP_NETIF_DHCP_INIT state. I can externally plug/unplug the ethernet, and see that the stack recognises link up/down events, but DHCP will never request an address.

To summarize, expected behaviour is that

        ESP_ERROR_CHECK(esp_netif_dhcpc_start(netif));
        ESP_LOGI(TAG, "Applying DHCP");

Should either start DHCP client, or return an error. Currently, it returns a misleading "ESP_OK", with only a debug print.

What is the actual behavior?

Actual behaviour is that no error code is returned, but the DHCP client is not started.

Steps to reproduce.

The netif is created using code like this, which disables the automatic dhcp, because we may (depending on config) wish to use static IPs instead:

    esp_netif_config_t cfg = ESP_NETIF_DEFAULT_ETH();
    // now, these configs are const, so you can't customize them after creation. for... reasons..
    esp_netif_inherent_config_t _inherent_config;
    memcpy(&_inherent_config, cfg.base, sizeof(esp_netif_inherent_config_t));
    cfg.base = &_inherent_config;
    // Turn off DHCP, we can start the link, but we're not _doing_ anything until we know what we want to do.
    _inherent_config.flags &= ~(ESP_NETIF_IPV4_ONLY_FLAGS(ESP_NETIF_DHCP_CLIENT));

    esp_netif_t *netif = esp_netif_new(&cfg);
      // more code to glue in phy, attach, and start...

Afterwards, we use this code for turning on DHCP:

        ESP_ERROR_CHECK(esp_netif_dhcpc_start(netif));
        ESP_LOGI(TAG, "Applying DHCP");

We see this state of the netif not being up happen when we are removing and re-creating the netif around power down states. (We remove and destroy the netif to clear PM locks, as we are also powering off the PHY entirely)

Debug Logs.

(I edited my local esp-idf to make the two places that both log "dhcp client re init" distinguishable...)

log 1

Debug log of creating and "applying" DHCP ``` I (10:54:12.733) task_network: <>> D (10:54:12.735) esp_netif_objects: esp_netif_add_to_list 0x3ffd3d40 D (10:54:12.736) esp_netif_objects: esp_netif_add_to_list netif added successfully (total netifs: 1) D (10:54:12.738) esp_netif_lwip: check: remote, if=0x0 fn=0x400fed98 D (10:54:12.742) esp_netif_lwip: call api in lwip: ret=0x0, give sem D (10:54:12.749) eth_phy_802_3: Found PHY address: 0 D (10:54:12.769) esp_eth: new ethernet driver @0x3ffd5074 I (10:54:12.770) esp_eth.netif.netif_glue: 0c:8b:95:49:ef:3b D (10:54:12.771) esp_netif_lwip: check: remote, if=0x3ffd3d40 fn=0x400feda8 D (10:54:12.775) esp_netif_lwip: call api in lwip: ret=0x0, give sem I (10:54:12.776) esp_eth.netif.netif_glue: ethernet attached to netif D (10:54:12.778) event: running post ETH_EVENT:0 with handler 0x400dc8c0 and context 0x3ffd37d0 on loop 0x3ffcae3c D (10:54:12.779) esp_netif_lwip: check: remote, if=0x3ffd3d40 fn=0x400ff8bc D (10:54:12.780) event: running post ETH_EVENT:0 with handler 0x40102880 and context 0x3ffdffdc on loop 0x3ffcae3c D (10:54:12.785) esp_netif_lwip: esp_netif_set_hostname_api esp_netif:0x3ffd3d40 hostname MAREL-M1200-dd40f4 D (10:54:12.786) esp_eth.netif.netif_glue: eth_action_start: 0x3ffd3534, 0x3f41c9f8, 0, 0x3ffdfff4, 0x3ffd5074 D (10:54:12.788) esp_netif_lwip: call api in lwip: ret=0x0, give sem D (10:54:12.791) esp_netif_handlers: esp_netif action has started with netif0x3ffd3d40 from event_id=0 D (10:54:12.794) esp_netif_lwip: set dns invalid dns D (10:54:12.796) esp_netif_lwip: check: remote, if=0x3ffd3d40 fn=0x400ff400 D (10:54:12.798) esp_netif_lwip: check: remote, if=0x3ffd3d40 fn=0x400ff78c D (10:54:12.802) esp_netif_lwip: esp_netif_start_api 0x3ffd3d40 D (10:54:12.805) esp_netif_lwip: esp_netif_get_hostname esp_netif:0x3ffd3d40 D (10:54:12.807) esp_netif_lwip: check: local, if=0x3ffd3d40 fn=0x401000f8 D (10:54:12.811) esp_netif_lwip: esp_netif_update_default_netif_lwip 0x3ffd3d40 D (10:54:12.811) esp_netif_lwip: call api in lwip: ret=0x0, give sem D (10:54:12.814) event: running post ETH_EVENT:0 with handler 0x400e025c and context 0x3ffd4c1c on loop 0x3ffcae3c I (10:54:12.814) esp_netif_lwip: esp_netif_dhcpc_start_api esp_netif:0x3ffd3d40 I (10:54:12.815) net_connect: Ethernet event 0 I (10:54:12.818) esp_netif_lwip: dhcp client re init1 <<<<<<<<<<< here is the indication that it has "failed" D (10:54:12.822) esp_netif_lwip: call api in lwip: ret=0x0, give sem I (10:54:12.824) net_connect: Applying DHCP <<<<< my code after getting ESP_OK from dhcp_start... I (10:54:12.827) K: dhcpc: netif: 0x3ffd3d40 (DOWN) flags: 18 init I (10:54:13.818) K: dhcpc: netif: 0x3ffd3d40 (DOWN) flags: 18 init <<< my code polling the state of dhcp... D (10:54:14.780) event: running post ETH_EVENT:2 with handler 0x400dc8c0 and context 0x3ffd37d0 on loop 0x3ffcae3c D (10:54:14.781) event: running post ETH_EVENT:2 with handler 0x40102808 and context 0x3ffdf7f8 on loop 0x3ffcae3c D (10:54:14.785) esp_eth.netif.netif_glue: eth_action_connected: 0x3ffd3534, 0x3f41c9f8, 2, 0x3ffdfff4, 0x3ffd5074 D (10:54:14.786) esp_netif_handlers: esp_netif action connected with netif0x3ffd3d40 from event_id=2 D (10:54:14.789) esp_netif_lwip: check: remote, if=0x3ffd3d40 fn=0x400ff938 D (10:54:14.793) esp_netif_lwip: esp_netif_up_api esp_netif:0x3ffd3d40 D (10:54:14.794) esp_netif_lwip: check: local, if=0x3ffd3d40 fn=0x401000f8 D (10:54:14.798) esp_netif_lwip: esp_netif_update_default_netif_lwip 0x3ffd3d40 D (10:54:14.799) esp_netif_lwip: call api in lwip: ret=0x0, give sem D (10:54:14.801) event: running post ETH_EVENT:2 with handler 0x400e025c and context 0x3ffd4c1c on loop 0x3ffcae3c I (10:54:14.803) net_connect: Ethernet event 2 I (10:54:14.807) net_connect: Ethernet Link Up <<<< my code indicating that link and phy level are working fine... I (10:54:14.818) K: dhcpc: netif: 0x3ffd3d40 (UP) flags: 18 init D (10:54:15.081) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (10:54:15.106) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (10:54:15.128) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (10:54:15.406) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (10:54:15.706) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 I (10:54:15.817) K: dhcpc: netif: 0x3ffd3d40 (UP) flags: 18 init D (10:54:15.839) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (10:54:16.006) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 I (10:54:16.816) K: dhcpc: netif: 0x3ffd3d40 (UP) flags: 18 init D (10:54:17.091) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (10:54:17.106) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (10:54:17.139) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (10:54:17.583) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (10:54:17.723) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (10:54:17.724) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (10:54:17.771) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (10:54:17.772) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 I (10:54:17.815) K: dhcpc: netif: 0x3ffd3d40 (UP) flags: 18 init D (10:54:18.206) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 I (10:54:18.815) K: dhcpc: netif: 0x3ffd3d40 (UP) flags: 18 init I (10:54:19.814) K: dhcpc: netif: 0x3ffd3d40 (UP) flags: 18 init I (10:54:20.813) K: dhcpc: netif: 0x3ffd3d40 (UP) flags: 18 init <<<< continues forever, link up link down below>>>> ```

Log section 2

This snippet shows that link up/down has no impact on this after the fact, the dhcp client is simply not started.

Log of link up/down ``` I (11:01:27.438) K: dhcpc: netif: 0x3ffd3d40 (UP) flags: 18 init <<< my code checking state of dhcp D (11:01:27.955) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 I (11:01:28.437) K: dhcpc: netif: 0x3ffd3d40 (UP) flags: 18 init D (11:01:28.769) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (11:01:28.770) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (11:01:28.821) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (11:01:28.821) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 I (11:01:29.436) K: dhcpc: netif: 0x3ffd3d40 (UP) flags: 18 init D (11:01:29.953) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 I (11:01:30.435) K: dhcpc: netif: 0x3ffd3d40 (UP) flags: 18 init D (11:01:30.779) event: running post ETH_EVENT:3 with handler 0x400d90c0 and context 0x3ffdb378 on loop 0x3ffcae3c D (11:01:30.896) event: running post ETH_EVENT:3 with handler 0x400dc8c0 and context 0x3ffd37d0 on loop 0x3ffcae3c D (11:01:30.896) event: running post ETH_EVENT:3 with handler 0x401027cc and context 0x3ffdf844 on loop 0x3ffcae3c D (11:01:30.898) esp_eth.netif.netif_glue: eth_action_disconnected: 0x3ffd3534, 0x3f41c9f8, 3, 0x3ffdfff4, 0x3ffd5074 D (11:01:30.901) esp_netif_handlers: esp_netif action disconnected with netif0x3ffd3d40 from event_id=3 D (11:01:30.904) esp_netif_lwip: check: remote, if=0x3ffd3d40 fn=0x400ffad8 I (11:01:30.908) esp_netif_lwip: esp_netif_down_api esp_netif:0x3ffd3d40 D (11:01:30.910) esp_netif_lwip: check: local, if=0x3ffd3d40 fn=0x401000f8 D (11:01:30.913) esp_netif_lwip: esp_netif_update_default_netif_lwip 0x3ffd3d40 D (11:01:30.914) esp_netif_lwip: call api in lwip: ret=0x0, give sem D (11:01:30.917) event: running post ETH_EVENT:3 with handler 0x400e025c and context 0x3ffd4c1c on loop 0x3ffcae3c I (11:01:30.919) net_connect: Ethernet event 3 I (11:01:30.921) net_connect: Link down <<< my code link handler print D (11:01:30.923) event: no handlers have been registered for event ESP_HTTP_SERVER_EVENT:8 posted to loop 0x3ffcae3c D (11:01:30.924) event: no handlers have been registered for event ESP_HTTP_SERVER_EVENT:1 posted to loop 0x3ffcae3c I (11:01:31.435) K: dhcpc: netif: 0x3ffd3d40 (DOWN) flags: 18 init I (11:01:32.434) K: dhcpc: netif: 0x3ffd3d40 (DOWN) flags: 18 init I (11:01:33.433) K: dhcpc: netif: 0x3ffd3d40 (DOWN) flags: 18 init I (11:01:34.433) K: dhcpc: netif: 0x3ffd3d40 (DOWN) flags: 18 init I (11:01:35.432) K: dhcpc: netif: 0x3ffd3d40 (DOWN) flags: 18 init I (11:01:36.432) K: dhcpc: netif: 0x3ffd3d40 (DOWN) flags: 18 init I (11:01:37.431) K: dhcpc: netif: 0x3ffd3d40 (DOWN) flags: 18 init I (11:01:38.431) K: dhcpc: netif: 0x3ffd3d40 (DOWN) flags: 18 init D (11:01:38.780) event: running post ETH_EVENT:2 with handler 0x400dc8c0 and context 0x3ffd37d0 on loop 0x3ffcae3c D (11:01:38.781) event: running post ETH_EVENT:2 with handler 0x40102808 and context 0x3ffdf7f8 on loop 0x3ffcae3c D (11:01:38.783) esp_eth.netif.netif_glue: eth_action_connected: 0x3ffd3534, 0x3f41c9f8, 2, 0x3ffdfff4, 0x3ffd5074 D (11:01:38.786) esp_netif_handlers: esp_netif action connected with netif0x3ffd3d40 from event_id=2 D (11:01:38.788) esp_netif_lwip: check: remote, if=0x3ffd3d40 fn=0x400ff938 D (11:01:38.793) esp_netif_lwip: esp_netif_up_api esp_netif:0x3ffd3d40 D (11:01:38.794) esp_netif_lwip: check: local, if=0x3ffd3d40 fn=0x401000f8 D (11:01:38.799) esp_netif_lwip: esp_netif_update_default_netif_lwip 0x3ffd3d40 D (11:01:38.800) esp_netif_lwip: call api in lwip: ret=0x0, give sem D (11:01:38.802) event: running post ETH_EVENT:2 with handler 0x400e025c and context 0x3ffd4c1c on loop 0x3ffcae3c I (11:01:38.804) net_connect: Ethernet event 2 <<< my code link handler print I (11:01:38.807) net_connect: Ethernet Link Up D (11:01:39.006) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (11:01:39.306) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 I (11:01:39.430) K: dhcpc: netif: 0x3ffd3d40 (UP) flags: 18 init D (11:01:39.606) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (11:01:39.906) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (11:01:39.954) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 I (11:01:40.429) K: dhcpc: netif: 0x3ffd3d40 (UP) flags: 18 init D (11:01:40.771) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (11:01:40.771) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (11:01:40.821) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (11:01:40.822) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (11:01:41.006) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 I (11:01:41.428) K: dhcpc: netif: 0x3ffd3d40 (UP) flags: 18 init D (11:01:41.771) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (11:01:41.772) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (11:01:41.821) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (11:01:41.822) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (11:01:41.955) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 D (11:01:42.106) esp_netif_lwip: esp_netif_get_ip_info esp_netif:0x3ffd3d40 I (11:01:42.427) K: dhcpc: netif: 0x3ffd3d40 (UP) flags: 18 init I (11:01:43.427) K: dhcpc: netif: 0x3ffd3d40 (UP) flags: 18 init ```

More Information.

No response

karlp commented 3 months ago

My Workaround is to simply poll esp_netif_dhcpc_get_status and reapply IP settings (either static or DHCP) as long as the dhcpc status is INIT. For static IP configuration, the state needs to be in STOPPED, and for DHCP it needs to be in STARTED.

I should also note that when using static IPs, the DHCP client still remains stuck in INIT, but this causes no harm in that case.