esphome / issues

Issue Tracker for ESPHome
https://esphome.io/

ESP8266 Various Connection Issues #455

OttoWinter closed this issue 5 years ago

OttoWinter commented 5 years ago

Operating environment/Installation (Hass.io/Docker/pip/etc.): Any
ESP (ESP32/ESP8266, Board/Sonoff): ESP8266
Affected component:

The ESP8266 SDK has a problem. For unknown reasons, some ESP8266s decide to no longer connect to any WiFi network after various actions (most commonly uploading a new ESPHome version).

Error logs often look like this (with varying reasons set):

[01:42:41][W][wifi_esp8266:354]: Event: Disconnected ssid='XYZ' bssid=XY:ZA:BC:DE:F0 reason='Association Leave'
[01:42:41][W][wifi:400]: Error while connecting to network.
[01:42:41][W][wifi:431]: Restarting WiFi adapter...
[01:42:46][I][wifi:164]: WiFi Connecting to 'XYZ'...
[11:34:23][W][wifi_esp8266:354]: Event: Disconnected ssid='XYZ' bssid=XY:ZA:BC:DE:F0 reason='Beacon Timeout'
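
For anyone collecting logs for this thread, a small script can tally which disconnect reasons show up. This is only a sketch: it assumes the "Event: Disconnected" line format shown above, and the names DISCONNECT_RE, tally_disconnect_reasons and SAMPLE_LOG are made up for illustration.

```python
import re
from collections import Counter

# Assumption: disconnect events look like the log lines quoted in this thread.
DISCONNECT_RE = re.compile(
    r"Event: Disconnected ssid='(?P<ssid>[^']*)' "
    r"bssid=(?P<bssid>\S+) reason='(?P<reason>[^']*)'"
)


def tally_disconnect_reasons(log_text):
    """Count how often each disconnect reason occurs in a log dump."""
    reasons = Counter()
    for line in log_text.splitlines():
        match = DISCONNECT_RE.search(line)
        if match:
            reasons[match.group("reason")] += 1
    return reasons


# Sample taken from the log excerpt above (placeholders left as posted).
SAMPLE_LOG = """\
[01:42:41][W][wifi_esp8266:354]: Event: Disconnected ssid='XYZ' bssid=XY:ZA:BC:DE:F0 reason='Association Leave'
[01:42:41][W][wifi:400]: Error while connecting to network.
[11:34:23][W][wifi_esp8266:354]: Event: Disconnected ssid='XYZ' bssid=XY:ZA:BC:DE:F0 reason='Beacon Timeout'
"""

if __name__ == "__main__":
    for reason, count in tally_disconnect_reasons(SAMPLE_LOG).most_common():
        print(f"{reason}: {count}")
```

Posting the tally together with the device type and power source might make it easier to compare reports.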

After a lot of debugging, I've determined it must be an issue in the ESP SDK provided by Espressif or in the interaction with that SDK through the Arduino core. Another theory is that it's somehow connected to the ESP8266 toolchain.

Common "fixes":

Especially the first of these "fixes" shows that it must be an issue very deep somewhere - almost as if something like the size of the generated binary somehow triggers this problem.

This thread will serve to unify the efforts and collect all bug reports in one place - and if someone with more knowledge of the ESP SDK/toolchain can help out, that's also appreciated.

Previous threads: #432, #187, #163, #162, #152, #119, #82, #69, #438, #409, #286, #220, #206, #130, #43, #27, #1

Legoracers commented 5 years ago

I have the same problem with 2 Sonoff Basics. Adding the web_server component to the .yaml file resolved the issue.

olealm commented 5 years ago

Latest "development" for me was trying to flash 5 ESP-01S boards (v1.13.2). Same 'No network found!' problem as my Sonoff Basics. After commenting out (only) the manual IP related lines, thus using DHCP instead, it works:

wifi:
  ssid: "myssid"
  password: "myssidpassword"
  #manual_ip:
  #  static_ip: 10.x.x.92
  #  subnet: 255.255.255.0
  #  dns1: 10.x.x.1
  #  #dns2: 8.8.8.8
  #  gateway: 10.x.x.1

brandond commented 5 years ago

Would it be worthwhile to have folks share binaries of builds that won't connect to WiFi? Could be something weird like link order or alignment that would show up with a large enough dataset.

jcollie commented 5 years ago

Would it be worthwhile to have folks share binaries of builds that won't connect to WiFi? Could be something weird like link order or alignment that would show up with a large enough dataset.

That would be problematic since WiFi credentials, OTA passwords, API passwords, MQTT passwords and possibly other credentials are embedded in the binaries.

olealm commented 5 years ago

...have folks share binaries of builds that won't connect to WiFi?

How easy would it be to extract your ssid and pwd from a compiled binary? Could be changed I guess, to something with same number of letters (if size of the generated binary could be a trigger).

jcollie commented 5 years ago

How easy would it be to extract your ssid and pwd from a compiled binary?

Trivial. On Linux the strings command will extract it. I'm sure similar utilities exist for Windows or Macs.
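
To illustrate the point, here is a rough Python equivalent of what strings does. It is only a sketch: the function name printable_strings and the sample bytes are made up for illustration, and a real firmware image will contain many more strings than this.

```python
import re


def printable_strings(data, min_len=4):
    """Return runs of at least min_len printable ASCII bytes, similar to `strings`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [chunk.decode("ascii") for chunk in pattern.findall(data)]


# Hypothetical bytes standing in for a firmware image with embedded credentials.
FAKE_FIRMWARE = (
    b"\x00\xe9\x01" + b"myssid" + b"\x00\xff" + b"myssidpassword" + b"\x00"
)

if __name__ == "__main__":
    for text in printable_strings(FAKE_FIRMWARE):
        print(text)
```

Anything stored as plain text in the image shows up this way, which is why sharing unmodified binaries is risky.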

Could be changed I guess, to something with same number of letters (if size of the generated binary could be a trigger).

When dealing with heisenbugs it's probably best not to. There's probably some utility out there that will print out the structure of the resulting binary which should be good enough to start with.
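
As a starting point, something like the following could print the segment layout without exposing the contents. It is a sketch under assumptions: it expects the classic ESP8266 image layout as documented for esptool (magic byte 0xE9, segment count, flash mode, flash size/frequency byte, 32-bit entry point, then a 32-bit load address and 32-bit length per segment). A firmware.bin built with the Arduino core may begin with a bootloader image, in which case this would only describe that first image. The names describe_image and SAMPLE_IMAGE are hypothetical.

```python
import struct

ESP_IMAGE_MAGIC = 0xE9  # first byte of an ESP8266 firmware image (assumed v1 layout)


def describe_image(data):
    """Return the entry point and the (load address, size) pair of each segment."""
    magic, segment_count, flash_mode, size_freq, entry = struct.unpack_from(
        "<BBBBI", data, 0
    )
    if magic != ESP_IMAGE_MAGIC:
        raise ValueError(f"unexpected magic byte 0x{magic:02x}")
    segments = []
    offset = 8  # the common header is 8 bytes long
    for _ in range(segment_count):
        load_addr, size = struct.unpack_from("<II", data, offset)
        segments.append((load_addr, size))
        offset += 8 + size  # skip the segment header and its payload
    return {
        "entry": entry,
        "flash_mode": flash_mode,
        "size_freq": size_freq,
        "segments": segments,
    }


# Synthetic two-segment image, built only for demonstration.
SAMPLE_IMAGE = (
    struct.pack("<BBBBI", ESP_IMAGE_MAGIC, 2, 0, 0, 0x40100000)
    + struct.pack("<II", 0x40100000, 4) + b"\x00" * 4
    + struct.pack("<II", 0x3FFE8000, 8) + b"\x00" * 8
)

if __name__ == "__main__":
    info = describe_image(SAMPLE_IMAGE)
    print(f"entry point: 0x{info['entry']:08x}")
    for load_addr, size in info["segments"]:
        print(f"segment at 0x{load_addr:08x}, {size} bytes")
```

Only addresses and sizes are printed, so the output could be shared without leaking credentials.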

kenmaples commented 5 years ago

Adding a second - platform: sntp on_time: section to my config causes my device (Sonoff Basic) to fail to authenticate to the network:

[W][wifi:400]: Errgr while connecting to network.
[W][wifi:431]: Restarting WiFi adapter...
[W][wifi_esp8266:354]: Event: Disconnected ssid='IoT' bssid=xx:xxxx:xx:xx:xx reason='Authentication Failed'
[00]H[00][00][00][00]ú[00][00][00][10]ÿ[I][wifi:164]: WiFi Connecting to 'Network'...

I was also able to generate [D][wifi:289]: No network found! while trying to add this configuration.

What is strange, though, is that when I disconnect 110 V AC and power the unit via 3.3 V, it connects to the network again running the same 'bad' configuration.

offlinehoster commented 5 years ago

Hi,

I had strange connection issues in the last few days with a WeMos NodeMCU v3.

After reading the "other" issue (https://github.com/esphome/issues/issues/187) I remembered that I had similar issues with this board on the "Feinstaubsensor aka luftdaten.info" project.

A simple

esptool.py -p /dev/ttyUSB3 erase_flash

did the trick.

After that I could re-flash my Wemos with the esphome tool, and it connected to the network again.

OttoWinter commented 5 years ago

Would it be worthwhile to have folks share binaries of builds that won't connect to WiFi? Could be something weird like link order or alignment that would show up with a large enough dataset.

@brandond In principle that could work. However,

@olealm @jcollie I've hidden your comments on strings utility from the thread. I don't have anything against the comment itself, but this thread could get quite long so I want to keep it focused.

I think there are two things we need to do:

What is strange, though, is that when I disconnect 110 V AC and power the unit via 3.3 V, it connects to the network again running the same 'bad' configuration.

@kenmaples Thanks for the comment! Hmm yes power use might also be a contributing factor here.

@offlinehoster I don't think that's the same issue - I've seen people with this issue re-flash the same binary with esphome-flasher (which does erase the flash as well) and see the same problem. But it might be worth a try!

definitio commented 5 years ago

I have the same problem with 3 ESP8266s. Changing fast_connect to false worked for 2 devices, but today I noticed that the other ESP8266 stops connecting to Wi-Fi after losing the connection; rebooting and flashing with fast_connect: true don't fix that.

Just tried different arduino_versions: 2.5.2 and 2.4.2 with fast_connect: true - nothing changes (bootloop); 2.5.2 and 2.4.2 with fast_connect: false - works until reboot, then bootloop; 2.3.0 - doesn't compile:

In file included from src/esphome/components/wifi/wifi_component.cpp:1:0:
src/esphome/components/wifi/wifi_component.h:194:35: error: 'System_Event_t' has not been declared
static void wifi_event_callback(System_Event_t *event);
^
In file included from src/esphome/components/wifi/wifi_component_esp8266.cpp:1:0:
src/esphome/components/wifi/wifi_component.h:194:35: error: 'System_Event_t' has not been declared
static void wifi_event_callback(System_Event_t *event);
^
In file included from src/esphome/components/wifi/wifi_component_esp32.cpp:1:0:
src/esphome/components/wifi/wifi_component.h:194:35: error: 'System_Event_t' has not been declared
static void wifi_event_callback(System_Event_t *event);
^
*** [.pioenvs/lroom_aquarium/src/esphome/components/wifi/wifi_component_esp32.cpp.o] Error 1
*** [.pioenvs/lroom_aquarium/src/esphome/components/wifi/wifi_component.cpp.o] Error 1
src/esphome/components/wifi/wifi_component_esp8266.cpp: In member function 'bool esphome::wifi::WiFiComponent::wifi_sta_connect_(esphome::wifi::WiFiAP)':
src/esphome/components/wifi/wifi_component_esp8266.cpp:179:10: error: 'struct station_config' has no member named 'threshold'
conf.threshold.authmode = AUTH_OPEN;
^
src/esphome/components/wifi/wifi_component_esp8266.cpp:181:10: error: 'struct station_config' has no member named 'threshold'
conf.threshold.authmode = AUTH_WPA_PSK;
^
src/esphome/components/wifi/wifi_component_esp8266.cpp:183:8: error: 'struct station_config' has no member named 'threshold'
conf.threshold.rssi = -127;
^
src/esphome/components/wifi/wifi_component_esp8266.cpp: At global scope:
src/esphome/components/wifi/wifi_component_esp8266.cpp:332:6: error: prototype for 'void esphome::wifi::WiFiComponent::wifi_event_callback(System_Event_t*)' does not match any in class 'esphome::wifi::WiFiComponent'
void WiFiComponent::wifi_event_callback(System_Event_t *event) {
^
In file included from src/esphome/components/wifi/wifi_component_esp8266.cpp:1:0:
src/esphome/components/wifi/wifi_component.h:194:15: error: candidate is: static void esphome::wifi::WiFiComponent::wifi_event_callback(int*)
static void wifi_event_callback(System_Event_t *event);
^
src/esphome/components/wifi/wifi_component_esp8266.cpp: In member function 'void esphome::wifi::WiFiComponent::wifi_register_callbacks_()':
src/esphome/components/wifi/wifi_component_esp8266.cpp:413:111: error: invalid conversion from 'void (*)(int*)' to 'wifi_event_handler_cb_t {aka void (*)(_esp_event*)}' [-fpermissive]
void WiFiComponent::wifi_register_callbacks_() { wifi_set_event_handler_cb(&WiFiComponent::wifi_event_callback); }
^
In file included from src/esphome/components/wifi/wifi_component_esp8266.cpp:5:0:
/home/m/.platformio/packages/framework-arduinoespressif8266@1.20300.1/tools/sdk/include/user_interface.h:467:6: error:   initializing argument 1 of 'void wifi_set_event_handler_cb(wifi_event_handler_cb_t)' [-fpermissive]
void wifi_set_event_handler_cb(wifi_event_handler_cb_t cb);
^
src/esphome/components/wifi/wifi_component_esp8266.cpp: In member function 'bool esphome::wifi::WiFiComponent::wifi_scan_start_()':
src/esphome/components/wifi/wifi_component_esp8266.cpp:450:10: error: 'struct scan_config' has no member named 'scan_type'
config.scan_type = WIFI_SCAN_TYPE_ACTIVE;
^
src/esphome/components/wifi/wifi_component_esp8266.cpp:450:22: error: 'WIFI_SCAN_TYPE_ACTIVE' was not declared in this scope
config.scan_type = WIFI_SCAN_TYPE_ACTIVE;
^
src/esphome/components/wifi/wifi_component_esp8266.cpp:452:12: error: 'struct scan_config' has no member named 'scan_time'
config.scan_time.active.min = 100;
^
src/esphome/components/wifi/wifi_component_esp8266.cpp:453:12: error: 'struct scan_config' has no member named 'scan_time'
config.scan_time.active.max = 200;
^
src/esphome/components/wifi/wifi_component_esp8266.cpp:455:12: error: 'struct scan_config' has no member named 'scan_time'
config.scan_time.active.min = 400;
^
src/esphome/components/wifi/wifi_component_esp8266.cpp:456:12: error: 'struct scan_config' has no member named 'scan_time'
config.scan_time.active.max = 500;
^
*** [.pioenvs/lroom_aquarium/src/esphome/components/wifi/wifi_component_esp8266.cpp.o] Error 1

Removing random things from config and changing log level doesn't help.

Everything works again after flash erasing: esptool -p /dev/ttyUSB0 erase_flash.

OttoWinter commented 5 years ago

Everything works again after flash erasing: esptool -p /dev/ttyUSB0 erase_flash.

@definitio @offlinehoster I rechecked the code and it looks like esptool.py only erases the "write regions" (whatever that means).

@definitio Could you explain a bit more what you mean by everything works? Do all setups (with/without fast_connect) suddenly work? Also what was on the device before you flashed ESPHome (and how did you flash ESPHome)?

I'm gonna do some reading, but it might have something to do with the rf_cal sections. Kinda would make sense too.

(Same goes for other people, if you're experiencing the issue, check if erasing flash with esptool works)

definitio commented 5 years ago

Yes, it works with and without fast_connect. I used Tasmota on this device 1 year ago; it was flashed with esphome config.yaml run.

OttoWinter commented 5 years ago

@definitio Ok, I've been looking at some other discussions around the web, and I think I'm seeing something that could be improved (though I won't say it's the problem yet). Could you share whether your WiFi settings changed since you had Tasmota flashed on the device (meaning, do you have a different SSID/PSK since then)? It might be that some data is incorrectly restored by the ESP SDK.

definitio commented 5 years ago

No, I haven't changed the SSID and PSK for 2-3 years. I also tested the current configuration on another ESP8266 that has only ever been used with esphome - got the same issue, and erasing fixed it too.

OttoWinter commented 5 years ago

I've created https://github.com/esphome/esphome/pull/648 to test some things out.

I want to go about this methodically, so if someone has these WiFi problems, please try one of these and report your results (also if it doesn't work):

  1. Add the following to your config
esphome:
  on_boot:
  - lambda: 'ESP.eraseConfig();'

(if this works, it could be put at the end of the OTA process)

  2. Install the test PR here (https://github.com/esphome/esphome/pull/648) and see if that makes a difference

  3. Check the propositions in https://github.com/esphome/issues/issues/455#issuecomment-503483659 - namely, does flashing the exact same binary to another device not exhibiting the problem suddenly make it get the bug too?

  4. Erase the chip with esptool.py (please, not too many people test this. If this does solve the issue then great - but ideally a fix would be in the code)

offlinehoster commented 5 years ago

I just flashed some esp8266...and now I see the issue

[17:57:30]scandone
[17:57:30][D][wifi:286]: Found networks:
[17:57:30][D][wifi:288]:   No network found!
[17:57:35][D][wifi:271]: Starting scan...
[17:57:41]scandone
[17:57:41][D][wifi:286]: Found networks:
[17:57:41][D][wifi:288]:   No network found!
[17:57:46][D][wifi:271]: Starting scan...
[17:57:52]scandone
[17:57:52][D][wifi:286]: Found networks:
[17:57:52][D][wifi:288]:   No network found!
[17:57:57][D][wifi:271]: Starting scan...
[17:58:02]scandone
[17:58:02][D][wifi:286]: Found networks:
[17:58:02][D][wifi:288]:   No network found!
[17:58:07][D][wifi:271]: Starting scan...
[17:58:13]scandone
[17:58:13][D][wifi:286]: Found networks:
[17:58:13][D][wifi:288]:   No network found!
[17:58:18][D][wifi:271]: Starting scan...
[17:58:24]scandone
[17:58:24][D][wifi:286]: Found networks:
[17:58:24][D][wifi:288]:   No network found!

Will try to use your test branch.

brandond commented 5 years ago

I recently came across a batch of smart plugs that wouldn't connect with fast_connect: true. If this sounds like the same issue I'd be glad to test on them.

kenmaples commented 5 years ago

I added the on_boot with no success. I manually replaced the files from the PR without success. I have been testing with two units and the same binary, though not today. The same build was causing an issue on both. esptool erase_flash did not improve anything.

Powered via 110v:

[D][wifi:272]: Starting scan...
[D][wifi:287]: Found networks:
[D][wifi:289]:   No network found!
[D][wifi:287]: Found networks:
[D][wifi:289]:   No network found!

Powered via 3.3v

[D][wifi:287]: Found networks:
[D][wifi:326]: - 'REMOVED' (xx:xx:xx:xx:xx:xx) ▂▄▆█
[D][wifi:326]: - 'REMOVED' (xx:xx:xx:xx:xx:xx) ▂▄▆█
[W][wifi:331]: No matching network found!
[D][wifi:272]: Starting scan...
[D][wifi:287]: Found networks:
[I][wifi:322]: - 'REMOVED' (xx:xx:xx:xx:xx:xx) ▂▄▆█
[D][wifi:323]:     Channel: 6
[D][wifi:324]:     RSSI: -29 dB
[D][wifi:326]: - 'REMOVED' (xx:xx:xx:xx:xx:xx) ▂▄▆█
[D][wifi:326]: - 'REMOVED' (xx:xx:xx:xx:xx:xx) ▂▄▆█
[I][wifi:164]: WiFi Connecting to 'REMOVED'...
[I][wifi:380]: WiFi connected!
[C][wifi:254]:   SSID: 'REMOVED'
[C][wifi:255]:   IP Address:

I get a lot of 'feedback' when connecting to the TX/RX pins while running on 110v. Not sure if that is normal, or just me doing it wrong. Actual output:

[D][wifi:289]:   No network found!                                             `▒▒b▒▒H▒▒▒@▒▒▒▒▒▒@▒▒\▒t▒▒▒▒▒▒H▒D▒▒ ▒@l (▒ ▒▒▒▒▒▒L▒▒▒▒▒▒▒▒▒▒▒▒▒PuTT@▒D▒H▒▒j▒▒▒▒▒@▒▒▒▒▒H▒▒▒▒H▒▒▒▒▒▒▒▒( ▒▒▒`▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒HI▒ ▒▒▒d@▒H▒▒▒▒d`▒▒▒▒ ▒▒(@V▒▒▒▒D ▒`▒▒▒▒D▒@▒@▒▒@[D][wifi:272]: Starting scan...
@▒▒▒▒ ▒▒▒▒▒▒ ▒`▒▒▒▒▒▒▒▒▒▒▒▒p▒▒▒H▒▒▒▒0▒▒▒▒▒▒▒▒▒▒▒▒▒h▒▒▒▒▒▒▒▒▒▒▒▒▒H▒▒^H ▒▒▒▒h▒▒▒`t▒▒@▒▒▒BD@▒ `▒`▒▒▒ ▒▒▒▒▒dH▒▒▒▒▒▒▒@▒▒▒▒▒▒▒▒▒▒`▒▒▒ ▒▒▒l▒▒▒▒l▒ ▒`▒▒H▒ H▒▒▒▒▒▒`▒▒▒▒▒▒▒▒▒▒▒bH▒▒▒▒▒t▒▒▒▒▒▒▒▒▒▒▒▒▒▒H@▒H▒▒▒▒▒▒▒▒▒x▒▒▒▒▒▒▒[D][wifi:287]: Found networks:
[D][wmfi:289]:   No network found!

Commenting out a few values in my config:

time:
 - platform: sntp
   on_time:
     - seconds: 0
       minutes: 30
       hours: 3
       #days_of_week: MON-FRI
       then:
         - switch.turn_on: relay
         - delay: 40min
         - switch.turn_off: relay
#- platform: sntp
#   on_time:
     - seconds: 0
#       minutes: 0
       hours: 2
       then:
#         - switch.turn_off: relay

gets me to a partial network scan. The error is now:


[W][wifi_esp8266:354]: Event: Disconnected ssid='IoT' bssid=00:00:00:00:00:00 reason='AP Not Found'
[W][wifi:400]: Error while connecting to network.
[W][wifi:431]: Restarting WiFi adapter...

and changing it to

time:
 - platform: sntp
   on_time:
     - seconds: 0
       minutes: 30
       hours: 3
       #days_of_week: MON-FRI
       then:
         - switch.turn_on: relay
         - delay: 40min
         - switch.turn_off: relay
#- platform: sntp
#   on_time:
#     - seconds: 0
#       minutes: 0
#       hours: 2
#       then:
#         - switch.turn_off: relay

gets me a functioning device again.

Anonym-tsk commented 5 years ago

I tried ESP.eraseConfig(), I tried esptool.py erase_flash, I tried power_save_mode, I tried OTA from ESPurna and a clean flash. Only removing binary_sensor helped.

OttoWinter commented 5 years ago

@Anonym-tsk Thanks! Good to know erasing flash is not the solution to all problems :(

@kenmaples Same for you, thanks for trying out some solutions - this error is just so strange :(

@brandond Might be interesting, yes - if you could post some very verbose logs somewhere (better in a pastebin) that would be awesome. Although I don't think fast_connect is necessarily the issue we're seeing here.

I'm mainly looking for ways to consistently recreate the issue on my side now (haven't been able to do so yet). If someone can share a full YAML file in a pastebin that exhibits the behavior and say what device is doing so that would be great. Of course don't include your wifi credentials, so try setting up a WiFi hotspot with your phone (or whatever else) with test credentials that you can share.

hvddrift commented 5 years ago

Could the WiFi setup be a factor? For example, I have the issue stated above: every time I upgrade OTA, my Sonoff Basics won't connect. I have to disconnect them and physically flash them. I am watching this for a fix, but could there be another common factor, such as the WiFi system?

For example, I am on a UniFi system.

kenmaples commented 5 years ago

Here is my test YAML - https://pastebin.com/cJ6yJu0Y

I am working with a Sonoff Basic R2 v1.0. Again, this config runs when powered by 3.3v, just not 110v.

I am also running unifi, so I grabbed an old Asus router to test on and I see the same results.

OttoWinter commented 5 years ago

@hvddrift Yes, the form I'm preparing will also include that as a question 👍

@kenmaples Tried to test that config with a NodeMCU (the only ESP8266 I have around right now) and was not able to reproduce. Will try with a Sonoff when I get access to one again.

wischwien commented 5 years ago

Hi Mr. Winter, I have 2 unused Sonoffs (RF, TH16) and live in Vienna. If it helps, I can lend them to you for testing.

w2cker commented 5 years ago

When I reflash a problematic Sonoff with Tasmota, the Tasmota settings are still on the Sonoff, even after flashing esphome multiple times. Can this be part of the connection problem?

ASchneiderBR commented 5 years ago

Hello everyone.

Here are two projects that were sent OTA to Sonoff Basics previously flashed with Tasmota. About 4 devices did not reconnect to the network after flashing the new ESPHome code:

https://pastebin.com/d8jffYvz https://pastebin.com/Q2mhfast

Using substitutions.

Thanks!

qoobaa commented 5 years ago

I get "AP Not Found" until I remove binary_sensor - this is the only way to make Sonoff Basic work correctly on 1.13.6.

lord-carlos commented 5 years ago

On Sonoff Basic it works for me with logging disabled or set to ERROR. With the default level or WARN it does not connect any more. Kinda wish there was a warning in the Sonoff Basic wiki article 🗯💢

Remlas commented 5 years ago

I can confirm that it's related somehow to binary_sensor. I'm on Sonoff Basic. When I removed binary_sensor name and set id I had "AP Not Found" (when it was on list, lol) and bssid was 00:...:00.

Gonna check changing logging level

EDIT: I confirm, changing the log level allows me to get the wifi connection working, and now I can remove the binary_sensor name and set its id (to make it internal).

My config:

esphome:
  name: sonoff_fan
  platform: ESP8266
  board: esp01_1m

wifi:
  networks:
  - ssid: "ap1"
    password: "pwd12345678"
  - ssid: "ap2"
    password: "pwd12345678"
  manual_ip:
    static_ip: 192.168.1.133
    gateway: 192.168.1.1
    subnet: 255.255.255.0

# Enable logging
logger:
  level: ERROR

# Enable Home Assistant API
api:
  password: "a1-LANGUID-thing8-mashes-a-journey"

ota:
  password: "a1-LANGUID-thing8-mashes-a-journey"

binary_sensor:
  - platform: gpio
    id: "button_1"
    pin:
      number: GPIO0
      mode: INPUT_PULLUP
      inverted: True
    on_press:
      - fan.toggle: "fan_1"

output:
  - platform: gpio
    pin: GPIO12
    id: power_relay
  - platform: gpio
    pin: GPIO14
    id: speed_relay

status_led:
  pin:
    number: GPIO13
    inverted: yes

fan:
  - platform: binary
    name: "Wiatrak"
    id: "fan_1"
    output: power_relay

OttoWinter commented 5 years ago

I can confirm that it's related somehow to binary_sensor. I'm on Sonoff Basic.

See the initial post - it's not really directly related to binary sensor. It just happens to cause issues.

When I removed binary_sensor name and set id I had "AP Not Found" (when it was on list, lol) and bssid was 00:...:00.

That AP not found is generated when the ESP SDK core doesn't find the network anymore - basically two scans always take place for connection (because the ESP SDK doesn't allow for low enough access):

  1. One for the scan list - this is what you see. Based on this the CONNECT call is started in the ESP SDK
  2. The CONNECT then does its own scan to find more info about the network that's required to connect (via a probe request). If that fails, you see that message.

mr-sneezy commented 5 years ago

Otto, I just hit this bug with a previously stable Bruh SensorNode when I updated it tonight from 1.12.2 to 1.13.6, and then fixed it (so far) by adding your suggested code below, sort of. When I tried to back up and replicate the fix to be certain, I found it was also affected by my adding the Status LED component at the same time as your suggestion. Adding only one or the other didn't fix it, but in my case having BOTH does. What I'm thinking is that it's not only removing code that fixes it; ADDING new code also does in some cases.

BTW it looks like the same wifi 'error 201' log errors as I got a couple of months ago with the ESPhome Servo component, and the wifi looping it seemed to cause me back then. This SensorNode is also using the same batch of later version clone NodeMCU boards I have with ESP8266EX on them. I'm a little suspicious on the chip itself (maybe a fab silkscreening defect)...

"Add the following to your config esphome: on_boot:

glmnet commented 5 years ago

I’m sorry to pollute this thread, but there should be some common thing going on here. Did all these sensors run another firmware (Tasmota, ESPEasy) before running esphome? Is some common hardware affected (e.g. Sonoff, Shelly, etc.)? Wouldn’t it make sense to build a table of cases?

ASchneiderBR commented 5 years ago

I’m sorry to pollute this thread, but there should be some common thing going on here. Did all these sensors run another firmware (Tasmota, ESPEasy) before running esphome? Is some common hardware affected (e.g. Sonoff, Shelly, etc.)? Wouldn’t it make sense to build a table of cases?

Hello sir.

All of my "problematic" devices were Sonoff Basics running Tasmota. Many devices that were changed from Tasmota to EspHome (about 15), a mix of D1 Minis, NodeMcu v2, Sonoff Dual, Sonoff 4CH and ESP32, had no problems at all. Most of the Sonoff Basics (25 of 37) I was able to change from Tasmota to EspHome with no problems. All of them were flashed OTA using the firmware update function in Tasmota.

The odd part is that most devices that had problems reconnecting to the network were using the exact same project with different friendly names via substitutions; some reconnected and others did not.

Thanks.

Anonym-tsk commented 5 years ago

I changed 4 of 5 Sonoff Basic devices from ESPurna to ESPHome without problems. Only one device had this problem.

Anonym-tsk commented 5 years ago

My Sonoff with problems looks like the one in this photo

But in tasmota wiki another board is described.

tomlut commented 5 years ago

#394 is an example of this as well.

The fix was to remove random things.

lord-carlos commented 5 years ago

Noticed something else. I used a config that worked fine on my other Sonoff devices to flash a Sonoff Basic that had ESPEasy running. It would then connect to the wifi, but only for a couple of seconds at a time. I had ping running. At first I thought it was another bug, because it actually connected to the wifi. But after removing the binary sensor and logging, the wifi was flawless.

glmnet commented 5 years ago

Is there a way to dump firmware and compare?


On 6 Jul 2019, at 05:47, bruxy70 notifications@github.com wrote:

I have this issue on one Sonoff Basic. Interesting that when I flash via serial, it works fine. When I upload the same code with no changes OTA, it does not connect. Then I flash again through serial, works again, OTA - stops working. Could it be something with the OTA process?


ng-galien commented 5 years ago

Same problems with Sonoff RF R2 Power V1.0 and standard config (as in the docs, Toggle relay with the button) after hours.

Removing logs and input sensor on the button did solve the problem.

Erd86 commented 5 years ago

Same issue here with a Sonoff Basic and DS18B20 sensors (loss of connection, and the button does not work anymore). I'm using the same Sonoff Basic as in Anonym-tsk's photo.

marrobHD commented 5 years ago

Same with my esphome pool temp sensor. After restarting hassio it won't reconnect to WiFi.

jplitza commented 5 years ago

I'm having the same problem on 2 totally different devices: A Sonoff S20, and an ESP8266EX "Witty Cloud Board". They stopped working this morning when I restarted Home Assistant. Status LED is sometimes blinking slowly (warning), sometimes blinking quickly (error).

I only debugged the latter until now, and with level: VERY_VERBOSE it seems that the wifi connection actually is established, but DHCP doesn't work:

[20:07:26]connected with <ESSID>, channel 10
[20:07:26]dhcp client start...
[20:07:26][V][wifi_esp8266:345]: Event: Connected ssid='<ESSID>' bssid=00:11:22:33:44:55 channel=10
[20:07:26]wifi evt: 0
[20:07:36]pm open,type:1 0
[20:07:54][W][wifi:394]: Timeout while connecting to WiFi.

Setting a manual IP address then gives the "Beacon timeout" also mentioned above, but in the meantime, it's pingable!

[20:52:01]pm open,type:1 0
[20:52:08]bcn_timout,ap_probe_send_start
[20:52:11]ap_probe_send over, rest wifi status to disassoc
[20:52:11]state: 5 -> 0 (1)
[20:52:11]rm 0
[20:52:11]pm close 7

Up until now, nothing got the connection stable again. I tried loglevel warning, power_save_mode: light and fast_connect: True. However, WPA2 authentication seems to work, as a wrong PSK leads to the expected error messages "Auth failed" (or something like that). The Sonoff S20 of course has a binary_input, but the other one only has a custom made (multi-)sensor component, nothing more.

May the length of the WPA-PSK be relevant? Because mine's relatively long with 26 characters. APs are an AVM Fritz!Box and a TP-Link with OpenWrt.

Update: Actually, I got it working again! With manual_ip, power_save_mode: light, fast_connect: True and all the components I had previously.

Update 2: Nope, only works for a minute or so. Then it hangs.

brandond commented 5 years ago

Just out of curiosity, what firmware are people coming from before putting esphome on these devices? I've got some dev boards that just came with blink code, but most of my 20-odd devices came with some sort of Tuya-based firmware on them, and I've not had any problem with them other than some devices not liking fast_connect: true.

All kinda makes me suspect there's some rfcal or efuse stuff going on...

jplitza commented 5 years ago

@brandond My Sonoff S20 was on original firmware before, my other ESP8266 had some custom firmware on it that read a temperature sensor.

ng-galien commented 5 years ago

I had tested 3 WiFi routers:

Apple Time Capsule: no way to disable 5 GHz, random disconnects.

ISP router (both 5 GHz and 2.4 GHz too): continuously disconnecting, reason beacon timeout.

Netgear R6250, 2.4 GHz only: seems to be ok.

I use 9 different Sonoff Basics, 7 on the router and 2 on a TP-Link AP500 in extender mode.

I will post the logs with the ISP router later

wibbly commented 5 years ago

I've just run into this on Wemos D1 minis.

They were fresh out of their anti-static bags (nothing preinstalled). I set up a few doing different jobs.

The bug first appeared when I added the 'switch:' config to one of the boards with a relay shield attached. It ran perfectly fine, connected to the network, etc ... until I added the relay (switch) config. After reading back through the messages here, I added "level: VERBOSE" to the logging config and it was fine again.

I didn't have any trouble with the other one using a NeoPixel LED shield, or the one with a button, or the one with a DHT11 sensor. All of these were independent Wemos boards and only the relay one caused a problem.

(I'm running the ESPHome beta as a Hass.io add-on.)

alfredopironti commented 5 years ago

Hi All,

I believe I've also been plagued with this problem, and have done some experiments, which I'm sharing here hoping they'll help.

I have two Sonoff Basic R2 devices (like in this picture). They were both previously flashed with TASMOTA, which may reinforce the information provided here, albeit it now seems the issue also affects pristine devices.

I'm using the most recent esphome, platformio and platformio-platforms in Linux.

In fact, on both devices I upgraded via OTA the very same config they had before, just going from ESPHome 1.12.2 to ESPHome 1.13.6. One change in the config was the use of substitutions and file inclusion; however, running esphome <filename.yaml> config for the configs with and without the substitutions returns binary-equal contents, and since substitutions are a pre-processing step, I believe this shouldn't have an effect.

After OTA upgrade, both devices wouldn't connect to Wi-Fi any longer. Connecting to the UART and reading the logs, I noticed the error reason was "AP not found", and one interesting item I noticed was that the bssid=00:00:00:00:00, that is, all zeros, rather than some specific bssid.

Re-flashing from UART didn't solve the problem. Changing the log level from default (DEBUG) to INFO solved the issue, and both devices can now connect.

Please note the devices share the exact same configuration, except for two strings, which identify the node name and the switch name. Such strings differ by about 5 characters; so, if the executable size has an effect on the matter, either 5 bytes didn't trigger the issue in my case, or constant strings may be stored in a region that doesn't generate the problem.

I'm afraid doing more tests may be problematic, as these devices are installed in very remote places and connecting the UART required some effort. However, if I can provide further information (e.g. the exact config files, software versions etc.) please let me know.

Thanks @OttoWinter for the hard work on this.

AnthonyKNorman commented 5 years ago

I am unable to perform OTA with ESP-01.

INFO Reading configuration...
INFO Generating C++ source...
INFO Compiling app...
INFO Running:  platformio run -d /config/esphome/esp_01_2
Processing esp_01_2 (framework: arduino; platform: espressif8266@1.8.0; board: esp01_1m)
--------------------------------------------------------------------------------
Verbose mode can be enabled via `-v, --verbose` option
CONFIGURATION: https://docs.platformio.org/page/boards/espressif8266/esp01_1m.html
PLATFORM: Espressif 8266 > Espressif Generic ESP8266 ESP-01 1M
HARDWARE: ESP8266 80MHz 80KB RAM (1MB Flash)
Library Dependency Finder -> http://bit.ly/configure-pio-ldf
LDF MODES: FINDER(chain) COMPATIBILITY(soft)
Collected 27 compatible libraries
Scanning dependencies...
Dependency Graph
|-- <ESP8266WiFi> 1.0
|-- <ESP8266mDNS>
|   |-- <ESP8266WiFi> 1.0
|-- <ESPAsyncTCP> 1.2.0
|   |-- <ESP8266WiFi> 1.0
Retrieving maximum program size /data/esp_01_2/.pioenvs/esp_01_2/firmware.elf
Checking size /data/esp_01_2/.pioenvs/esp_01_2/firmware.elf
Memory Usage -> http://bit.ly/pio-memory-usage
DATA:    [====      ]  39.0% (used 31984 bytes from 81920 bytes)
PROGRAM: [===       ]  30.9% (used 316328 bytes from 1023984 bytes)
========================= [SUCCESS] Took 7.18 seconds =========================
INFO Successfully compiled program.
INFO Resolving IP address of esp_01_2.local
INFO  -> 192.168.1.103
INFO Uploading /data/esp_01_2/.pioenvs/esp_01_2/firmware.bin (320480 bytes)
Uploading: [==                                                          ] 3% 
ERROR Error sending data: [Errno 32] Broken pipe

Here are the logs from the UART

[D][ota:072]: Starting OTA Update from 192.168.1.89...
[D][ota:243]: OTA in progress: 0.2%
[W][ota:233]: Error writing binary data to flash: 0 != 536!
[W][ota:276]: Update end failed! Error: ERROR[2]: Flash Erase Failed

[I][ota:046]: Boot seems successful, resetting boot loop counter.

The YAML is just a plain vanilla one

esphome:
  name: esp_01_2
  platform: ESP8266
  board: esp01_1m

wifi:
  ssid: "XXXXXXXXX"
  password: "XXXXXXXXX"

# Enable logging
logger:

# Enable Home Assistant API
api:
  password: "XXXXXXX"

ota:
  password: "XXXXXXXX"

I have tried this with two devices and received identical results

brandond commented 5 years ago

@AnthonyKNorman please open a new issue, this is specifically for issues with esp8266 devices not finding/connecting to wifi networks.

whazor commented 5 years ago

My issue was also that my two ESP8266 (NodeMCU) boards failed to connect to WiFi.

After reflashing I noticed that in the list of WiFi networks, I could see each of my access points twice: one MAC address with no SSID and one with an SSID. Apparently, the 5 GHz WiFi adapter has a different MAC address than the 2.4 GHz WiFi adapter. Also, as I have multiple access points, there is a chance of connecting to the wrong one.

To pin the ESP to a certain frequency band and access point, I filled in the bssid via:

wifi:
  networks:
  - ssid: "NetworkName"
    password: "pass"
    bssid: xx:xx:xx:xx:xx:xx

This seems to work so far; will test further.
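
For picking a bssid out of a scan log, something like the following could list which BSSIDs were seen per SSID. It is a sketch only: it assumes scan entries shaped like the "- 'SSID' (BSSID)" lines in the logs earlier in this thread, and the function name, SSIDs and MAC addresses in the sample are made up.

```python
import re
from collections import defaultdict

# Assumption: scan entries look like  - 'SSID' (AA:BB:CC:DD:EE:FF)  in the log.
SCAN_RE = re.compile(
    r"- '(?P<ssid>[^']*)' \((?P<bssid>(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2})\)"
)


def bssids_by_ssid(log_text):
    """Group the BSSIDs found in a scan log by SSID (hidden SSIDs map to '')."""
    found = defaultdict(list)
    for line in log_text.splitlines():
        match = SCAN_RE.search(line)
        if match is None:
            continue
        ssid, bssid = match.group("ssid"), match.group("bssid")
        if bssid not in found[ssid]:
            found[ssid].append(bssid)
    return dict(found)


# Hypothetical scan output (signal bars omitted): two APs share a name, one is hidden.
SAMPLE_SCAN = """\
[D][wifi:287]: Found networks:
[D][wifi:326]: - 'NetworkName' (AA:BB:CC:00:00:01)
[D][wifi:326]: - '' (AA:BB:CC:00:00:02)
[D][wifi:326]: - 'NetworkName' (AA:BB:CC:00:00:03)
"""

if __name__ == "__main__":
    for ssid, bssids in bssids_by_ssid(SAMPLE_SCAN).items():
        print(repr(ssid), bssids)
```

An SSID that maps to more than one BSSID is a candidate for pinning with the bssid option as in the config above.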