arendst / Tasmota

Alternative firmware for ESP8266 and ESP32 based devices with easy configuration using webUI, OTA updates, automation using timers or rules, expandability and entirely local control over MQTT, HTTP, Serial or KNX. Full documentation at
https://tasmota.github.io/docs
GNU General Public License v3.0

Unstable ping with Unifi access point #3780

Closed fotiDim closed 5 years ago

fotiDim commented 5 years ago

The ping is highly unstable for some reason on the 2 S20s that I tried. I am using the latest sonoff.bin, downloaded today from GitHub releases.

64 bytes from 192.168.0.203: icmp_seq=976 ttl=128 time=615.050 ms
64 bytes from 192.168.0.203: icmp_seq=977 ttl=128 time=107.754 ms
64 bytes from 192.168.0.203: icmp_seq=978 ttl=128 time=17.868 ms
64 bytes from 192.168.0.203: icmp_seq=979 ttl=128 time=29.991 ms
Request timeout for icmp_seq 980
Request timeout for icmp_seq 981
Request timeout for icmp_seq 982
Request timeout for icmp_seq 983
Request timeout for icmp_seq 984
Request timeout for icmp_seq 985
Request timeout for icmp_seq 986
Request timeout for icmp_seq 987
Request timeout for icmp_seq 988
Request timeout for icmp_seq 989
Request timeout for icmp_seq 990
64 bytes from 192.168.0.203: icmp_seq=991 ttl=128 time=194.596 ms
64 bytes from 192.168.0.203: icmp_seq=992 ttl=128 time=101.530 ms
64 bytes from 192.168.0.203: icmp_seq=993 ttl=128 time=115.332 ms
64 bytes from 192.168.0.203: icmp_seq=994 ttl=128 time=48.569 ms
Request timeout for icmp_seq 995
64 bytes from 192.168.0.203: icmp_seq=996 ttl=128 time=21.333 ms
64 bytes from 192.168.0.203: icmp_seq=997 ttl=128 time=18.896 ms
64 bytes from 192.168.0.203: icmp_seq=998 ttl=128 time=337.202 ms
Request timeout for icmp_seq 999
64 bytes from 192.168.0.203: icmp_seq=1000 ttl=128 time=76.171 ms
64 bytes from 192.168.0.203: icmp_seq=1001 ttl=128 time=59.231 ms
64 bytes from 192.168.0.203: icmp_seq=1002 ttl=128 time=21.367 ms
64 bytes from 192.168.0.203: icmp_seq=1003 ttl=128 time=54.954 ms
64 bytes from 192.168.0.203: icmp_seq=1004 ttl=128 time=87.545 ms
64 bytes from 192.168.0.203: icmp_seq=1005 ttl=128 time=199.862 ms
64 bytes from 192.168.0.203: icmp_seq=1006 ttl=128 time=121.973 ms
64 bytes from 192.168.0.203: icmp_seq=1007 ttl=128 time=84.987 ms
64 bytes from 192.168.0.203: icmp_seq=1008 ttl=128 time=60.911 ms
64 bytes from 192.168.0.203: icmp_seq=1009 ttl=128 time=74.111 ms
Request timeout for icmp_seq 1010
64 bytes from 192.168.0.203: icmp_seq=1011 ttl=128 time=157.315 ms
64 bytes from 192.168.0.203: icmp_seq=1012 ttl=128 time=142.097 ms
64 bytes from 192.168.0.203: icmp_seq=1013 ttl=128 time=41.044 ms
64 bytes from 192.168.0.203: icmp_seq=1014 ttl=128 time=53.302 ms
64 bytes from 192.168.0.203: icmp_seq=1015 ttl=128 time=214.672 ms
64 bytes from 192.168.0.203: icmp_seq=1016 ttl=128 time=325.445 ms
Request timeout for icmp_seq 1017
Request timeout for icmp_seq 1018
Request timeout for icmp_seq 1019
Request timeout for icmp_seq 1020
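
For anyone who wants to put numbers on a capture like the one above, here is a minimal Python sketch (not part of the original report) that counts replies and timeouts in ping output of this format. The function name summarize_ping and the short sample are illustrative assumptions only.

```python
import re

# Patterns for the two line types that appear in the ping capture above.
REPLY = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=([\d.]+) ms")
TIMEOUT = re.compile(r"Request timeout for icmp_seq (\d+)")


def summarize_ping(text):
    """Count replies and timeouts and report loss and latency extremes."""
    times = []
    timeouts = 0
    for line in text.splitlines():
        match = REPLY.search(line)
        if match:
            times.append(float(match.group(2)))
        elif TIMEOUT.search(line):
            timeouts += 1
    total = len(times) + timeouts
    return {
        "replies": len(times),
        "timeouts": timeouts,
        "loss_pct": 100.0 * timeouts / total if total else 0.0,
        "min_ms": min(times) if times else None,
        "max_ms": max(times) if times else None,
    }


# Four lines copied from the capture above, used as a tiny sample.
sample = """\
64 bytes from 192.168.0.203: icmp_seq=978 ttl=128 time=17.868 ms
64 bytes from 192.168.0.203: icmp_seq=979 ttl=128 time=29.991 ms
Request timeout for icmp_seq 980
Request timeout for icmp_seq 981
"""
summary = summarize_ping(sample)
print(summary)
```

Feeding it the full capture instead of the sample would give the overall loss percentage for the whole run.
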
ascillato commented 5 years ago

Hi,

Using your status 0:

{"Status":{"Module":8,"FriendlyName":["Toaster"],"Topic":"sonoff","ButtonTopic":"0","Power":0,"PowerOnState":3,"LedState":1,"SaveData":1,"SaveState":1,"ButtonRetain":0,"PowerRetain":0},"StatusPRM":{"Baudrate":115200,"GroupTopic":"sonoffs","OtaUrl":"http://sonoff.maddox.co.uk/tasmota/sonoff.bin","RestartReason":"Software/System restart","Uptime":"0T00:38:31","StartupUTC":"2018-09-12T18:15:57","Sleep":0,"BootCount":14,"SaveCount":32,"SaveAddress":"F8000"},"StatusFWR":{"Version":"6.2.1","BuildDateTime":"2018-09-09T16:50:26","Boot":31,"Core":"2_3_0","SDK":"1.5.3(aec24ac9)"},"StatusLOG":{"SerialLog":2,"WebLog":2,"SysLog":0,"LogHost":"","LogPort":514,"SSId":["myssid","myssid"],"TelePeriod":300,"SetOption":["00008001","55A18000","00000000"]},"StatusMEM":{"ProgramSize":471,"Free":532,"Heap":11,"ProgramFlashSize":1024,"FlashSize":1024,"FlashMode":3,"Features":["00000809","0FDAE794","000003A0","23B617CE","00000000"]},"StatusNET":{"Hostname":"toaster","IPAddress":"192.168.0.203","Gateway":"192.168.0.1","Subnetmask":"255.255.255.0","DNSServer":"192.168.0.1","Mac":"5C:CF:7F:7F:EA:10","Webserver":2,"WifiConfig":5},"StatusTIM":{"UTC":"Wed Sep 12 18:54:28 2018","Local":"Wed Sep 12 19:54:28 2018","StartDST":"Sun Mar 25 02:00:00 2018","EndDST":"Sun Oct 28 03:00:00 2018","Timezone":1,"Sunrise":"06:22","Sunset":"19:10"},"StatusSNS":{"Time":"2018-09-12T19:54:28"},"StatusSTS":{"Time":"2018-09-12T19:54:28","Uptime":"0T00:38:31","Vcc":3.153,"POWER":"OFF","Wifi":{"AP":1,"SSId":"myssid","RSSI":78,"APMac":"F0:9F:C2:F4:99:E2"}}}

in decode-status.py, it outputs:

*** decode-status.py v20180730 by Theo Arends ***
Decoding information for device Toaster from status report taken at 2018-09-12T19:54:28

Options
   0 (ON ) Save power state and use after restart
   1 (OFF) Restrict button actions to single, double and hold
   2 (OFF) Show value units in JSON messages
   3 (OFF) MQTT enabled
   4 (OFF) Respond as Command topic instead of RESULT
   5 (OFF) MQTT retain on Power
   6 (OFF) MQTT retain on Button
   7 (OFF) MQTT retain on Switch
   8 (OFF) Convert temperature to Fahrenheit
   9 (OFF) MQTT retain on Sensor
  10 (OFF) MQTT retained LWT to OFFLINE when topic changes
  11 (OFF) Swap Single and Double press Button
  12 (OFF) Do not use flash page rotate
  13 (OFF) Button single press only
  14 (OFF) Power interlock mode
  15 (ON ) Do not allow PWM control
  16 (OFF) Reverse clock
  17 (OFF) Allow entry of decimal color values
  18 (OFF) CO2 color to light signal
  19 (OFF) HASS discovery
  20 (OFF) Do not control Power with Dimmer
  21 (OFF) Energy monitoring while powered off
  22 (OFF) MQTT serial
  23 (OFF) MQTT serial binary
  24 (OFF) Rules once mode until 5.14.0b
  25 (OFF) KNX enabled
  26 (OFF) Use Power device index on single relay devices
  27 (OFF) KNX enhancement
  28 (OFF) RF receive decimal
  29 (OFF) IR receive decimal
  30 (OFF) Enforce HASS light group
  31 (OFF) Do not show Wifi and Mqtt state using Led
  50 (OFF) Timers enabled
  51 (OFF) Generic ESP8285 GPIO enabled
  52 (OFF) Add UTC time offset to JSON message

Features
  Language LCID = 2057
  MQTT_HOST_DISCOVERY
  MQTT_PUBSUBCLIENT
  USE_ADC_VCC
  USE_ARILUX_RF
  USE_BH1750
  USE_BMP
  USE_DHT
  USE_DISCOVERY
  USE_DISPLAY_LCD
  USE_DISPLAY_MATRIX
  USE_DISPLAY_MODES1TO5
  USE_DISPLAY_SSD1306
  USE_DOMOTICZ
  USE_DS18x20
  USE_EMULATION
  USE_ENERGY_SENSOR
  USE_HOME_ASSISTANT
  USE_HTU
  USE_I2C
  USE_IR_RECEIVE
  USE_IR_REMOTE
  USE_LM75AD
  USE_MHZ19
  USE_NOVA_SDS
  USE_PMS5003
  USE_PZEM004T
  USE_RULES
  USE_SENSEAIR
  USE_SERIAL_BRIDGE
  USE_SGP30
  USE_SHT
  USE_SHT3X
  USE_SR04
  USE_SUNRISE
  USE_TIMERS
  USE_TIMERS_WEB
  USE_WEBSERVER
  USE_WS2812
  WEBSERVER_ADVERTISE

So, everything in your config seems ok.

Maybe you are having wifi issues in your network? Too many wifi devices? Or wifi channels that overlap with other wifi networks?

To rule out problems in your firmware, you can update to the latest development version if you want. On my devices at home the ping is stable.

fotiDim commented 5 years ago

@ascillato2 I am pinging other devices at the same time and their ping is stable. I tried the dev version you suggested, using file upload, and the problem remains. Is there any sleep setting that I need to change?

ascillato2 commented 5 years ago

If your status 0 is correct, you have sleep set to 0. So, there isn't anything else.
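
For reference, this is roughly how the sleep value can be read out of a status 0 report. It is a small Python sketch and not anything from Tasmota itself; the function name sleep_and_rssi is made up, and the sample is a trimmed copy of the JSON posted above with only the fields that matter here.

```python
import json


def sleep_and_rssi(status_json):
    """Pull the Sleep setting and the Wifi RSSI out of a status 0 report."""
    status = json.loads(status_json)
    sleep = status["StatusPRM"]["Sleep"]
    rssi = status["StatusSTS"]["Wifi"]["RSSI"]
    return sleep, rssi


# Trimmed copy of the status 0 JSON posted earlier in this issue.
sample = (
    '{"StatusPRM":{"Sleep":0},'
    '"StatusSTS":{"Wifi":{"AP":1,"SSId":"myssid","RSSI":78}}}'
)
result = sleep_and_rssi(sample)
print(result)  # (0, 78)
```
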

For the ping I just use the ping command from a Windows console.

fotiDim commented 5 years ago
ascillato commented 5 years ago

Please use the command weblog 4 in the Tasmota console to see whether a disconnection or anything else is being reported there.

andrethomas commented 5 years ago

The RSSI of the device looks good, so it's well within wifi range. The cause is more likely your wifi setup, or other interference issues, than the device/firmware.

My most remote device has an RSSI of 66 (lower than yours), and these are my ping results:

Pinging 192.168.42.6 with 32 bytes of data:
Reply from 192.168.42.6: bytes=32 time=1ms TTL=128
Reply from 192.168.42.6: bytes=32 time=1ms TTL=128
Reply from 192.168.42.6: bytes=32 time=1ms TTL=128
Reply from 192.168.42.6: bytes=32 time=1ms TTL=128

The device is running the prebuilt development binary.

fotiDim commented 5 years ago

weblog 4 gives me:

21:22:13 CMD: weblog 4
21:22:13 SRC: WebConsole from 192.168.0.249
21:22:13 RSL: Received Topic /weblog, Data Size 1, Data 4
21:22:13 RSL: Group 0, Index 1, Command WEBLOG, Data 4
21:22:13 RSL: RESULT = {"WebLog":4}

My wifi is a set of Ubiquiti UniFi access points. All my other devices work flawlessly and have stable ping. Any thoughts on where I could look in my access point setup?

ascillato commented 5 years ago

Mmm, there were some issues with access points before. Maybe that is the problem. The wifi libraries for ESP8266 devices have some issues with some access points. Please try to connect your Sonoff directly to your main router and ping it again.

fotiDim commented 5 years ago

There is no wireless router in my case. Only those access points: https://www.ubnt.com/unifi/unifi-ap-ac-lite

ascillato commented 5 years ago

See for example this comment in another issue: https://github.com/arendst/Sonoff-Tasmota/issues/3262#issuecomment-419724799

ascillato commented 5 years ago

There is another thing to try. Recompile Tasmota using esp lib core v2.4.2 instead of the v2.3.0 that you have now. Maybe this helps with your connection stability with your access points.

fotiDim commented 5 years ago

I can try that. Thanks for the tip. Is there any way that I can disable mDNS without recompiling?

ascillato commented 5 years ago

Sorry, no. You need to recompile.

Frogmore42 commented 5 years ago

I would try turning off Hue emulation. It has caused problems on some people's networks.

fotiDim commented 5 years ago

@Frogmore42 I tried that and it did absolutely nothing. I guess recompiling is my only option.

Frogmore42 commented 5 years ago

Did you update the FW on the access points? I would do that first. If that doesn't help, try core 2.4.2. I am using it with LWIP 2 high bandwidth.

fotiDim commented 5 years ago

Access points are updated, yes. I took the S20 to the office today, and with the WiFi here it works fine. So clearly there is an incompatibility between Unifi access points and Sonoff, or more specifically Tasmota.

I will try to downgrade and also to recompile a custom version today.

@Frogmore42 where can I find information about LWIP 2 high bandwidth?

d-a-v commented 5 years ago

Not wanting to step on any toes: the ping problem is well known on the esp8266/arduino core side. The current easiest solution is to use WiFi.setSleepMode(WIFI_NONE_SLEEP). The downside is higher current consumption. This may be solved in a better way when the Arduino folks update to Espressif's V3 SDK.

About LWIP2, it is with the 2.4.2 arduino core (variant: v2 Higher Bandwidth).

(sidenote: I ordered my first sonoff modules, and I love the KNX functionality)

ascillato commented 5 years ago

So clearly there is an incompatibility between Unifi access points and Sonoff

I'm sorry to say that you are right. There are several issues that name Unifi access points. Other brands work fine. I don't know what is different about this one in particular. Maybe you can play with its internal parameters.

ascillato2 commented 5 years ago

Closing issue as the problem is outside Tasmota software. (Unifi Access Points and ESP8266 core libraries).

If you find any workaround to this, please share it. Thanks.

fotiDim commented 5 years ago

Some findings:

This looks more like sleep than the ping output I posted initially.

Frogmore42 commented 5 years ago

None of that changes what has been said before. This is very likely a low-level issue with the SDK code from Espressif or with the Arduino core that wraps it. Core 2.4.2 includes the option for lwip2, which is what Espressif has said they will be supporting moving forward. Ping is typically handled at very low levels in the code. The only way Tasmota can impact it is by taking too long to return control to the core software.

Any system that uses a network needs to handle the kinds of network behavior you are experiencing. Are you having a real issue, or only when you try and ping for an extended period of time?

Most people would not notice the type of behavior you are seeing. I have a NodeMCU that probably has a bad 3v3 regulator on it. If it were the only device I had, I would be blaming Tasmota for its issues. I only noticed it when I looked at its uptime reports and saw it was restarting regularly, sometimes in less than a minute, sometimes not for several hours.

If you only have one type of device that is giving you issues, replace the device. It is probably bad. I don't believe I have ever pinged one of my devices for any length of time, but I would not be surprised if it didn't respond sometimes. If I were really interested in figuring it out, I would write a very simple Arduino program that does nothing but respond to pings and put that on the problematic device. If it worked fine, I would then make the program more complicated to see where it fails. But I am not interested enough to do that, since all of my devices are working fine in my environment, or I understand why they are not.

fotiDim commented 5 years ago

@Frogmore42 I tried 2.4.2 but it didn't fix my issue. I also tried disabling sleep and I was still getting timeouts. And yes, it is a real issue. The web interface was not working and Home Assistant also lost connectivity. I tried multiple S20s. All have the same issue.

I have now tried reverting 20a5395 as suggested here. Things are more stable but still not perfect. I have ~5% packet loss. I am continuing to investigate.

Frogmore42 commented 5 years ago

Core 2.4.2 can be used with three varieties of lwip. Which did you try?

Have you tried a simple WiFi example from the esp8266 Arduino repo? Does it have the same issue with ping?

fotiDim commented 5 years ago

@Frogmore42 I did a thorough test. I set up 3 x S20s, all with 2.4.2 and each one with a different lwip setting. All other settings were left at their defaults. All the S20s were plugged into the same power source and had direct line of sight to the AP. I left simultaneous pings running for a couple of hours. Results:

LWIP_HIGHER_BANDWIDTH: 16680 packets transmitted, 16610 packets received, 0.4% packet loss
LWIP2_LOW_MEMORY: 16677 packets transmitted, 16341 packets received, 2.0% packet loss
LWIP2_HIGHER_BANDWIDTH: 16681 packets transmitted, 16296 packets received, 2.3% packet loss

I also tried an Arduino Wifi example but it didn't work for me. Not sure if it's worth spending more time on it.

It seems LWIP_HIGHER_BANDWIDTH is more stable.
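
The loss percentages can be rechecked from the transmitted/received counts with a few lines of Python. This is just a sanity check on the numbers reported in this comment; the dictionary layout and the loss_pct helper are made up for the example.

```python
# (packets transmitted, packets received) per lwip variant, as reported above.
results = {
    "LWIP_HIGHER_BANDWIDTH": (16680, 16610),
    "LWIP2_LOW_MEMORY": (16677, 16341),
    "LWIP2_HIGHER_BANDWIDTH": (16681, 16296),
}


def loss_pct(transmitted, received):
    """Packet loss as a percentage of transmitted packets."""
    return 100.0 * (transmitted - received) / transmitted


for name, (tx, rx) in results.items():
    print(f"{name}: {tx - rx} lost, {loss_pct(tx, rx):.1f}% packet loss")
```

That works out to 70, 336 and 385 lost packets respectively, which matches the 0.4%, 2.0% and 2.3% figures.
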

Frogmore42 commented 5 years ago

LWIP (not 2) does not work well across long distances, i.e. over VPN. LWIP2 High Bandwidth is supposed to work better over VPN. It is also the path that Espressif has said they will be supporting. If lwip is working for your situation, use it. There are some other security reasons to move to lwip2, but most people are not too worried about security. If you are, there are not a lot of good choices, as most IoT devices have pretty poor security.