esphome / issues

Issue Tracker for ESPHome
https://esphome.io/
292 stars 36 forks source link

ESP8266: Wifi connectivity stuck trying to `wifi_apply_hostname` #3402

Open rafalyesware opened 2 years ago

rafalyesware commented 2 years ago

The problem

I have a couple of cheap Wemos D1 mini clones that frequently get in a state which they can't recover from -- they error out trying set the hostname (via a DHCP renew, if I'm reading the code correctly) over and over again and then give up and eventually time out, then re-scan and fall back into the same pattern. Eventually, they may recover but I haven't found the clue to when and how.

Here's the loop of log messages I see when this happens:

[I][wifi:248]: WiFi Connecting to '<Wifi SSID>'...
[W][wifi_esp8266:212]: wifi_apply_hostname_(esphome-tester): lwIP error -16 on interface st (index 0)
[W][wifi:536]: Error while connecting to network.
[I][wifi:248]: WiFi Connecting to '<Wifi SSID>'...
[W][wifi_esp8266:212]: wifi_apply_hostname_(esphome-tester): lwIP error -16 on interface st (index 0)
[W][wifi:530]: Timeout while connecting to WiFi.
[D][wifi:373]: Starting scan...
...

The DHCP server is a DD-WRT router using dnsmasq for DHCP & DNS. Here's a capture of a snippet of the traffic for the ESP's MAC address at the router when it's in this state:

20:34:24.654157 es:pd:ev:ma:ca:dr > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 350: (tos 0x0, ttl 255, id 32, offset 0, flags [none], proto UDP (17), length 336)
    0.0.0.0.68 > 255.255.255.255.67: [udp sum ok] BOOTP/DHCP, Request from es:pd:ev:ma:ca:dr, length 308, xid 0x6a049745, Flags [none] (0x0000)
          Client-Ethernet-Address es:pd:ev:ma:ca:dr
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Discover
            MSZ Option 57, length 2: 1500
            Parameter-Request Option 55, length 5:
              Subnet-Mask, Default-Gateway, BR, Domain-Name-Server
              NTP
            END Option 255, length 0
            PAD Option 0, length 0, occurs 53
20:34:24.655317 ro:ut:er:ma:ca:dr > es:pd:ev:ma:ca:dr, ethertype IPv4 (0x0800), length 342: (tos 0xc0, ttl 64, id 32750, offset 0, flags [none], proto UDP (17), length 328)
    192.168.1.1.67 > 192.168.1.111.68: [bad udp cksum 0x8506 -> 0x0b74!] BOOTP/DHCP, Reply, length 300, xid 0x6a049745, Flags [none] (0x0000)
          Your-IP 192.168.1.111
          Server-IP 192.168.1.1
          Client-Ethernet-Address es:pd:ev:ma:ca:dr
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Offer
            Server-ID Option 54, length 4: 192.168.1.1
            Lease-Time Option 51, length 4: 86400
            RN Option 58, length 4: 43200
            RB Option 59, length 4: 75600
            Subnet-Mask Option 1, length 4: 255.255.255.0
            BR Option 28, length 4: 192.168.1.255
            Domain-Name-Server Option 6, length 4: 192.168.1.1
            Default-Gateway Option 3, length 4: 192.168.1.1
            END Option 255, length 0
            PAD Option 0, length 0, occurs 8
20:34:29.662624 ro:ut:er:ma:ca:dr > es:pd:ev:ma:ca:dr, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.111 tell 192.168.1.1, length 28
20:34:30.662091 ro:ut:er:ma:ca:dr > es:pd:ev:ma:ca:dr, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.111 tell 192.168.1.1, length 28
20:34:31.661562 ro:ut:er:ma:ca:dr > es:pd:ev:ma:ca:dr, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.111 tell 192.168.1.1, length 28
20:34:32.678522 es:pd:ev:ma:ca:dr > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 350: (tos 0x0, ttl 255, id 33, offset 0, flags [none], proto UDP (17), length 336)
    0.0.0.0.68 > 255.255.255.255.67: [udp sum ok] BOOTP/DHCP, Request from es:pd:ev:ma:ca:dr, length 308, xid 0x6a049745, Flags [none] (0x0000)
          Client-Ethernet-Address es:pd:ev:ma:ca:dr
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Discover
            MSZ Option 57, length 2: 1500
            Parameter-Request Option 55, length 5:
              Subnet-Mask, Default-Gateway, BR, Domain-Name-Server
              NTP
            END Option 255, length 0
            PAD Option 0, length 0, occurs 53
20:34:32.679674 ro:ut:er:ma:ca:dr > es:pd:ev:ma:ca:dr, ethertype IPv4 (0x0800), length 342: (tos 0xc0, ttl 64, id 33432, offset 0, flags [none], proto UDP (17), length 328)
    192.168.1.1.67 > 192.168.1.111.68: [bad udp cksum 0x8506 -> 0x0b74!] BOOTP/DHCP, Reply, length 300, xid 0x6a049745, Flags [none] (0x0000)
          Your-IP 192.168.1.111
          Server-IP 192.168.1.1
          Client-Ethernet-Address es:pd:ev:ma:ca:dr
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Offer
            Server-ID Option 54, length 4: 192.168.1.1
            Lease-Time Option 51, length 4: 86400
            RN Option 58, length 4: 43200
            RB Option 59, length 4: 75600
            Subnet-Mask Option 1, length 4: 255.255.255.0
            BR Option 28, length 4: 192.168.1.255
            Domain-Name-Server Option 6, length 4: 192.168.1.1
            Default-Gateway Option 3, length 4: 192.168.1.1
            END Option 255, length 0
            PAD Option 0, length 0, occurs 8
20:34:37.678360 ro:ut:er:ma:ca:dr > es:pd:ev:ma:ca:dr, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.111 tell 192.168.1.1, length 28
20:34:38.677828 ro:ut:er:ma:ca:dr > es:pd:ev:ma:ca:dr, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.111 tell 192.168.1.1, length 28
20:34:39.677295 ro:ut:er:ma:ca:dr > es:pd:ev:ma:ca:dr, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.111 tell 192.168.1.1, length 28
20:34:48.676292 es:pd:ev:ma:ca:dr > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 350: (tos 0x0, ttl 255, id 34, offset 0, flags [none], proto UDP (17), length 336)
    0.0.0.0.68 > 255.255.255.255.67: [udp sum ok] BOOTP/DHCP, Request from es:pd:ev:ma:ca:dr, length 308, xid 0x6a049745, Flags [none] (0x0000)
          Client-Ethernet-Address es:pd:ev:ma:ca:dr
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Discover
            MSZ Option 57, length 2: 1500
            Parameter-Request Option 55, length 5:
              Subnet-Mask, Default-Gateway, BR, Domain-Name-Server
              NTP
            END Option 255, length 0
            PAD Option 0, length 0, occurs 53
20:34:48.677416 ro:ut:er:ma:ca:dr > es:pd:ev:ma:ca:dr, ethertype IPv4 (0x0800), length 342: (tos 0xc0, ttl 64, id 35019, offset 0, flags [none], proto UDP (17), length 328)
    192.168.1.1.67 > 192.168.1.111.68: [bad udp cksum 0x8506 -> 0x0b74!] BOOTP/DHCP, Reply, length 300, xid 0x6a049745, Flags [none] (0x0000)
          Your-IP 192.168.1.111
          Server-IP 192.168.1.1
          Client-Ethernet-Address es:pd:ev:ma:ca:dr
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Offer
            Server-ID Option 54, length 4: 192.168.1.1
            Lease-Time Option 51, length 4: 86400
            RN Option 58, length 4: 43200
            RB Option 59, length 4: 75600
            Subnet-Mask Option 1, length 4: 255.255.255.0
            BR Option 28, length 4: 192.168.1.255
            Domain-Name-Server Option 6, length 4: 192.168.1.1
            Default-Gateway Option 3, length 4: 192.168.1.1
            END Option 255, length 0
            PAD Option 0, length 0, occurs 8
20:34:53.679850 ro:ut:er:ma:ca:dr > es:pd:ev:ma:ca:dr, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.111 tell 192.168.1.1, length 28
20:34:54.679318 ro:ut:er:ma:ca:dr > es:pd:ev:ma:ca:dr, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.111 tell 192.168.1.1, length 28

The TL;DR appears to be that when it's in that state it does a DHCP DISCOVER, gets a DHCP OFFER back and then nada.

When it works, it looks like this:

21:09:50.866246 ro:ut:er:ma:ca:dr > es:pd:ev:ma:ca:dr, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.111 tell 192.168.1.1, length 28
21:09:51.865716 ro:ut:er:ma:ca:dr > es:pd:ev:ma:ca:dr, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.111 tell 192.168.1.1, length 28
21:09:52.865179 ro:ut:er:ma:ca:dr > es:pd:ev:ma:ca:dr, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.111 tell 192.168.1.1, length 28
21:09:57.277085 es:pd:ev:ma:ca:dr > ff:ff:ff:ff:ff:ff, 802.3, length 60: LLC, dsap Null (0x00) Individual, ssap Null (0x00) Response, ctrl 0xaf: Unnumbered, xid
, Flags [Response], length 46: 01 00
21:09:57.299160 es:pd:ev:ma:ca:dr > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 350: (tos 0x0, ttl 255, id 51, offset 0, flags [none], proto UDP (17), length 336)
    0.0.0.0.68 > 255.255.255.255.67: [udp sum ok] BOOTP/DHCP, Request from es:pd:ev:ma:ca:dr, length 308, xid 0x543a1415, Flags [none] (0x0000)
          Client-Ethernet-Address es:pd:ev:ma:ca:dr
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Discover
            MSZ Option 57, length 2: 1500
            Parameter-Request Option 55, length 5:
              Subnet-Mask, Default-Gateway, BR, Domain-Name-Server
              NTP
            END Option 255, length 0
            PAD Option 0, length 0, occurs 53
21:09:57.300220 ro:ut:er:ma:ca:dr > es:pd:ev:ma:ca:dr, ethertype IPv4 (0x0800), length 342: (tos 0xc0, ttl 64, id 41908, offset 0, flags [none], proto UDP (17), length 328)
    192.168.1.1.67 > 192.168.1.111.68: [bad udp cksum 0x8506 -> 0xa46e!] BOOTP/DHCP, Reply, length 300, xid 0x543a1415, Flags [none] (0x0000)
          Your-IP 192.168.1.111
          Server-IP 192.168.1.1
          Client-Ethernet-Address es:pd:ev:ma:ca:dr
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Offer
            Server-ID Option 54, length 4: 192.168.1.1
            Lease-Time Option 51, length 4: 86400
            RN Option 58, length 4: 43200
            RB Option 59, length 4: 75600
            Subnet-Mask Option 1, length 4: 255.255.255.0
            BR Option 28, length 4: 192.168.1.255
            Domain-Name-Server Option 6, length 4: 192.168.1.1
            Default-Gateway Option 3, length 4: 192.168.1.1
            END Option 255, length 0
            PAD Option 0, length 0, occurs 8
21:09:57.307992 es:pd:ev:ma:ca:dr > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 350: (tos 0x0, ttl 255, id 52, offset 0, flags [none], proto UDP (17), length 336)
    0.0.0.0.68 > 255.255.255.255.67: [udp sum ok] BOOTP/DHCP, Request from es:pd:ev:ma:ca:dr, length 308, xid 0x543a1415, Flags [none] (0x0000)
          Client-Ethernet-Address es:pd:ev:ma:ca:dr
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: Request
            MSZ Option 57, length 2: 1500
            Requested-IP Option 50, length 4: 192.168.1.111
            Server-ID Option 54, length 4: 192.168.1.1
            Parameter-Request Option 55, length 5:
              Subnet-Mask, Default-Gateway, BR, Domain-Name-Server
              NTP
            Hostname Option 12, length 14: "esphome-tester"
            END Option 255, length 0
            PAD Option 0, length 0, occurs 25
21:09:57.399332 ro:ut:er:ma:ca:dr > es:pd:ev:ma:ca:dr, ethertype IPv4 (0x0800), length 342: (tos 0xc0, ttl 64, id 41913, offset 0, flags [none], proto UDP (17), length 328)
    192.168.1.1.67 > 192.168.1.111.68: [bad udp cksum 0x8506 -> 0xa16e!] BOOTP/DHCP, Reply, length 300, xid 0x543a1415, Flags [none] (0x0000)
          Your-IP 192.168.1.111
          Server-IP 192.168.1.1
          Client-Ethernet-Address es:pd:ev:ma:ca:dr
          Vendor-rfc1048 Extensions
            Magic Cookie 0x63825363
            DHCP-Message Option 53, length 1: ACK
            Server-ID Option 54, length 4: 192.168.1.1
            Lease-Time Option 51, length 4: 86400
            RN Option 58, length 4: 43200
            RB Option 59, length 4: 75600
            Subnet-Mask Option 1, length 4: 255.255.255.0
            BR Option 28, length 4: 192.168.1.255
            Domain-Name-Server Option 6, length 4: 192.168.1.1
            Default-Gateway Option 3, length 4: 192.168.1.1
            END Option 255, length 0
            PAD Option 0, length 0, occurs 8
21:09:57.402875 es:pd:ev:ma:ca:dr > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: Ethernet (len 6), IPv4 (len 4), Request who-has 192.168.1.111 tell 192.168.1.111, length 46
21:09:57.432083 es:pd:ev:ma:ca:dr > 01:00:5e:00:00:fb, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 1, id 53, offset 0, flags [none], proto IGMP (2), length 32, options (RA))
    192.168.1.111 > 224.0.0.251: igmp v2 report 224.0.0.251
21:09:57.477028 es:pd:ev:ma:ca:dr > 01:00:5e:00:00:fb, ethertype IPv4 (0x0800), length 60: (tos 0x0, ttl 1, id 54, offset 0, flags [none], proto IGMP (2), length 32, options (RA))
    192.168.1.111 > 224.0.0.251: igmp v2 report 224.0.0.251
21:09:57.623403 es:pd:ev:ma:ca:dr > 01:00:5e:00:00:fb, ethertype IPv4 (0x0800), length 116: (tos 0x0, ttl 255, id 55, offset 0, flags [none], proto UDP (17), length 102)
    192.168.1.111.5353 > 224.0.0.251.5353: [udp sum ok] 0 [1n] ANY (QU)? esphome-tester.local. ns: esphome-tester.local. [2m] A 192.168.1.111 (74)
21:09:57.677271 es:pd:ev:ma:ca:dr > 01:00:5e:00:00:fb, ethertype IPv4 (0x0800), length 304: (tos 0x0, ttl 255, id 56, offset 0, flags [none], proto UDP (17), length 290)
    192.168.1.111.5353 > 224.0.0.251.5353: [udp sum ok] 0 [2n] [2au] ANY (QU)? e
sphome-tester._esphomelib._tcp.local. ns: _esphomelib._tcp.local. [1h15m] PTR esphome-tester._esphomelib._tcp.local., esphome-tester._esphomelib._tcp.local. [1h15m] SRV esphome-tester.local.:6053 0 0 ar: esphome-tester._esphomelib._tcp.local. [1h15m] TXT "board=esp01_1m" "platform=ESP8266" "mac=<espmacaddr>" "version=2022.6.2", esphome-tester.local. [2m] A 192.168.1.111 (262)

IOW, now after the OFFER, the ESP sends a REQUEST with the hostname and the router sends back an ACK. Behavior matches the point in the networking code where the hostname is set, but I have no idea how to track down what the error is in the ESP / Arduino Wifi / Networking code since it appears that dhcp_renew which is actually causing the error is part of the (closed source?) Espressif SDK?

Which version of ESPHome has the issue?

2022.6.2

What type of installation are you using?

Home Assistant Add-on

Which version of Home Assistant has the issue?

2022.6.7 (core) 2022.06.0 (supervisor)

What platform are you using?

ESP8266

Board

Wemos D1 clone (https://www.amazon.com/dp/B07V84VWSM?_encoding=UTF8&psc=1&ref_=cm_sw_r_cp_ud_dp_84K8CM5B88XMAZWKPY2V)

Component causing the issue

Wifi

Example YAML snippet

substitutions:
  name: "esphome-tester"
  friendly_name: ESPHome Tester

esphome:
  name: ${name}

esp8266:
  board: esp01_1m

# Enable logging
logger:
  # ESP8266 only - disable serial port logging, as the HeatPump component
  # needs the sole hardware UART on the ESP8266
  baud_rate: 0

# Enable Home Assistant API
api:

ota:

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  #manual_ip:
  #  static_ip: 192.168.1.111
  #  gateway: 192.168.1.1
  #  subnet: 255.255.255.0
  #  dns1: 192.168.1.1

  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
    ssid: "${friendly_name} Fallback"
    password: !secret fallback_password

captive_portal:

# Enable Web server.
web_server:
  port: 80

  # Sync time with Home Assistant.
time:
  - platform: homeassistant
    id: homeassistant_time

# Text sensors with general information.
text_sensor:
  # Expose ESPHome version as sensor.
  - platform: version
    name: ${name} ESPHome Version
  # Expose WiFi information as sensors.
  - platform: wifi_info
    ip_address:
      name: ${name} IP
    ssid:
      name: ${name} SSID
    bssid:
      name: ${name} BSSID

# Sensors with general information.
sensor:
  # Uptime sensor.
  - platform: uptime
    name: ${name} Uptime

  # WiFi Signal sensor.
  - platform: wifi_signal
    name: ${name} WiFi Signal
    update_interval: 60s

external_components:
  - source: github://geoffdavis/esphome-mitsubishiheatpump

climate:
  - platform: mitsubishi_heatpump
    name: "${friendly_name}"

I believe I saw this even before installing any of the additional components / sensors / fameworks as well, will grab a clean device to test with a minimal config.


### Anything in the logs that might be useful for us?

```txt
See logs / tcpdump output in description.

Additional information

No response

rafalyesware commented 2 years ago

Reviewing the ticket, it definitely happened on a totally barebones ESPHome config (aka w/out even the time sync, nor any sensors / components / frameworks), since I still had serial logging enabled at that point, which I disabled when I configured the device for https://github.com/geoffdavis/esphome-mitsubishiheatpump

damianmoore commented 2 years ago

I'm getting this error too with an energy meter (pulse counting) built using a Wemos D1 connected to a DD-WRT Netgear WNDR3700v4. Worked fine for a few days but after that I'm lucky if I get energy data for a couple of hours each day. Using esphome 2022.6.2. I may try to upgrade my router or connect to a different access point. My Sonoff S26 smart plug is also having connection issues since I upgraded esphome but was fine with a version from I guess about a year ago.

martinecker commented 2 years ago

Running into this same issue. It just started today on a Wemos D1 hooked up to a Vista alarm system that ran fine for the last maybe four or five weeks. I didn't change anything in my setup (wifi, router, or esphome version/firmware). I think the Wemos D1 got bricked somehow. Going to try with another one.

Here's my esphome Wemos D1 log:

[09:26:02][I][wifi:248]: WiFi Connecting to '<ssid>'...
[09:26:02][W][wifi_esp8266:212]: wifi_apply_hostname_(esphome-vista-alarm): lwIP error -16 on interface st (index 0)
[09:26:02][W][wifi:536]: Error while connecting to network.
[09:26:02][I][wifi:248]: WiFi Connecting to '<ssid>...
[09:26:02][W][wifi_esp8266:212]: wifi_apply_hostname_(esphome-vista-alarm): lwIP error -16 on interface st (index 0)

Glancing at the lwIP code https://github.com/espressif/esp-lwip/blob/2.1.2-esp/src/include/lwip/err.h it appears that -16 is ERR_ARG=illegal argument. Hmm.

rafalyesware commented 2 years ago

FWIW, I just gave my ESP32 devices static IP addresses and things look 10x better now. All my issues were due to something timing out / erroring / being unhappy during the DHCP transaction(s).

On Tue, Sep 13, 2022 at 12:25 PM Martin Ecker @.***> wrote:

Running into this same issue. It just started today on a Wemos D1 hooked up to a Vista alarm system that ran fine for the last maybe four or five weeks. I didn't change anything in my setup (wifi, router, or esphome version/firmware). I think the Wemos D1 got bricked somehow. Going to try with another one.

— Reply to this email directly, view it on GitHub https://github.com/esphome/issues/issues/3402#issuecomment-1245652429, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABAM3FUIZGSORPMMNNETU73V6CTGHANCNFSM5ZYQOAUQ . You are receiving this because you authored the thread.Message ID: @.***>

-- RAFAL BONI | Engineering Team Lead e: @.***

Meet me here https://meet.yesware.com/me/rafalyesware

martinecker commented 2 years ago

Ah, thanks for that tip. I assigned a static IP in my router to the Wemos D1 and indeed it was immediately able to reconnect.

pumpagubben commented 2 years ago

Have same issues with my NodeMCU v3 of unknown origin with a ASUS AX58U router. Static ip via DHCP gave same error messages as original post. Then put the IP static in in Esphome config at connection got stable right away.

d-d commented 2 years ago

Had the same issue on a D1 uisng ESPHome version 2022.10.2. Fix was to disable "Wi-Fi Agile Multiband" based on this comment. Problem solved. https://github.com/esp8266/Arduino/issues/8299#issuecomment-966775021

bartowl commented 2 years ago

I have the same issue, it happened on a Sonoff Basic R1 (pretty old module with a lot of flashes over time). I just experimented moving away from tasmota to esphome and everything worked fine with 2022.8.0 firmware. I have updated to 2022.10.2 and it was still ok. Then I have added publish_initial_state: true on a binary sensor with multiple on_multi_click timings and it started. Got back to 2022.8.0, reverted publish_initial_state to false but this did not change anything. a boot looks more or like this:

[I][logger:243]: Log initialized
[C][status_led:014]: Setting up Status LED...
[C][ota:461]: There have been 0 suspected unsuccessful boot attempts.
[I][app:029]: Running through setup()...
[C][switch.gpio:011]: Setting up GPIO Switch 'Relay'...
[D][switch:017]: 'Relay' Turning OFF.
[D][switch:037]: 'Relay': Sending state OFF
[D][switch:017]: 'Relay' Turning OFF.
[D][binary_sensor:034]: 'Button1': Sending initial state OFF
[D][text_sensor:067]: 'Esphome Version': Sending state '2022.8.0 Nov 16 2022, 11:59:17'
[D][text_sensor:067]: 'Wifi MAC Address': Sending state '5C:CF:7F:xx:xx:xx'
[C][wifi:037]: Setting up WiFi...
[D][wifi:373]: Starting scan...
[D][sensor:126]: 'Uptime': Sending state 0.14600 s with 0 decimals of accuracy
[D][sensor:126]: 'Wifi Signal': Sending state 100.00000 % with 0 decimals of accuracy
[D][binary_sensor:034]: 'Switch1': Sending initial state ON
[D][wifi:388]: Found networks:
[I][wifi:431]: - 'Xxxxxx' (50:C7:BF:xx:xx:xx ▂▄▆█
[D][wifi:433]:     Channel: 11
[D][wifi:434]:     RSSI: -79 dB
[D][wifi:436]: - 'Zzzz' (12:C7:BF:xx:xx:xx) ▂▄▆█
[D][wifi:436]: - 'Yyyy' (02:C7:BF:xx:xx:xx) ▂▄▆█
[D][wifi:436]: - 'Yyyy' (02:C7:BF:xx:xx:xx) ▂▄▆█
[D][wifi:436]: - 'Zzzz' (12:C7:BF:xx:xx:xx) ▂▄▆█
[I][wifi:248]: WiFi Connecting to 'Xxxx'...
[W][wifi:530]: Timeout while connecting to WiFi.
[I][wifi:248]: WiFi Connecting to 'Xxxx'...
[W][wifi_esp8266:212]: wifi_apply_hostname_(xxxx): lwIP error -16 on interface st (index 0)
[W][wifi_esp8266:482]: Event: Disconnected ssid='Xxxx' bssid=50:C7:BF:xx:xx:xx reason='Association Leave'
[W][wifi:536]: Error while connecting to network.
[W][wifi:572]: Restarting WiFi adapter...
[D][sensor:126]: 'Wifi Signal': Sending state 100.00000 % with 0 decimals of accuracy
[D][wifi:373]: Starting scan...
[D][wifi:388]: Found networks:
[I][wifi:431]: - 'Xxxx' (50:C7:BF:xx:xx:xx) ▂▄▆█
[D][wifi:433]:     Channel: 11
[D][wifi:434]:     RSSI: -85 dB
[D][wifi:436]: - 'Yyyy' (02:C7:BF:xx:xx:xx) ▂▄▆█
[D][wifi:436]: - 'Zzzz' (12:C7:BF:xx:xx:xx) ▂▄▆█
[I][wifi:248]: WiFi Connecting to 'Xxxx'...
[W][wifi_esp8266:212]: wifi_apply_hostname_(xxxx): lwIP error -16 on interface st (index 0)
[W][wifi:530]: Timeout while connecting to WiFi.
[I][wifi:248]: WiFi Connecting to 'Xxxx'...
[W][wifi_esp8266:212]: wifi_apply_hostname_(xxxx): lwIP error -16 on interface st (index 0)
[W][wifi_esp8266:482]: Event: Disconnected ssid='Xxxx' bssid=50:C7:BF:xx:xx:xx reason='Association Leave'
[W][wifi:536]: Error while connecting to network.
[W][wifi:572]: Restarting WiFi adapter...

Other reasons that show are all possible, Token expired, Association Expired, Auth Expired and so on. It just looks like if the wifi chip got somehow damaged.

What is however interesting - after flashing back tasmota 9.5 it was able to connect and work just fine?!? I flashed the same binary to a Sonoff R2 laying around and it connected without problems as well. Seems some weird combination of chip and firmware/esphome verison...

What I also observed - during some boots, scan reports wery poor RSSI (-85) and for some other boots a bit better (-79)... This does not seem to be signal related though...

Edit: after trying to revert back to tasmota i have found out, that after longer running it also looses wifi and gets in 'trying to connect' loop, so it seems that those few additional flash operations might have either weared out the flash chip or caused some other kind of damage as it does not work back with tasmota. I will have to replace the module entirely :/

tetele commented 2 years ago

Same issue with multiple ESP8266 devices (NodeMCU clones, D1 Mini clones), with or without any functionality attached. Haven't had any issues with ESP32 devices.

All of them are complaining about not being able to apply hostname

[13:21:06][W][wifi_esp8266:212]: wifi_apply_hostname_(central-heating-actuator): lwIP error -16 on interface st (index 0)
[13:21:06][W][wifi:549]: Error while connecting to network.
[13:21:06][I][wifi:257]: WiFi Connecting to 'xxxxx'...
[13:21:06][W][wifi_esp8266:212]: wifi_apply_hostname_(central-heating-actuator): lwIP error -16 on interface st (index 0)

What's noticeable is that if I leave them powered on for long enough, sometimes they do connect to WiFi.

The WiFi is running on a pair of Asus XT8 units, the DHCP server is powered by PiHole v5.19.2.

SigSegFault commented 1 year ago

I'm having an exactly the same issue.

It seems unrelated to ESPHome at all. I've seen this problem with a brand new NodeMCU-v3 dev board with a factory firmware. Luckily I have two routers at home and the issue only exists with my Archer AX73. Everything works fine and blazingly fast with a good old TL-WR841N. I'm suspecting that it's some protocol issue. Probably with 802.11n that the good old TL-WR841N doesn't even support.

bhouszka commented 1 year ago

Hi, I have been struggling with this issue for a week now and I finally managed to solve it. Based on @SigSegFault 's comment I understood this is most probably a router issue. So I called Asus to help me fix it (I have Asus RT-AX55). After some experimenting, we (the support guy and I) figured out that:

  1. Smart connect in WiFi settings has to be disabled
  2. There has to be a different SSID set for each of the bands
  3. My Wemos D1 mini has to be configured to connect to 2.4 GHz SSID.

And it works like a charm. Now the question is could something be done in ESPHome to make ESP8266 support Smart Connect as it not cool to have two SSIDs :P

tetele commented 1 year ago

I have an Asus XT8 (RT-AX95Q), but I have different SSIDs for each band and each guest network. I was connecting to the 2.4GHz SSID and still had the issue.

bhouszka commented 1 year ago

Sorry to hear that, Something I was told to check (I did not need to though) is to set the the Wireless mode for 2.4Ghz to "Legacy". Maybe this will help?

leslieb256 commented 1 year ago

I got my devices working when I switched my Asus router to Wireless Mode = Legacy. Both my esphome device and my nest protects stopped being able access my 2.4Ghz network. I found this site that might have the explanation about what is happening: https://olimex.wordpress.com/2021/12/10/avoid-wifi-channel-12-13-14-when-working-with-esp-devices/

insanity67 commented 1 year ago

I'm almost a little embarrassed, but for me this behavior occurred because my DHCP range had no more free addresses...

gothma commented 6 months ago

The workaround to set the manual_ip in the wifi module also fixed the issue for me. Flashing the firmware to a brand new esp32 did not fix it.