xoseperez / espurna

Home automation firmware for ESP8266-based devices
http://tinkerman.cat
GNU General Public License v3.0

Sonoff Switch: Web server not available #24

Closed xoseperez closed 6 years ago

xoseperez commented 7 years ago

Originally reported by: Pavel Eremin (Bitbucket: paveleremin, GitHub: paveleremin)


Steps to reproduce:

  1. Download ESPurna.

  2. Change "platform" from "espressif8266" to "espressif8266_stage" in platformio.ini.

  3. Upload the firmware and the filesystem (upload and uploadfs) with "pio run".

  4. Connect to the WiFi network SONOFF_XXXXXX and complete the setup.

  5. Try to open the web interface over the router's WiFi network. It fails, and something seems wrong with the web server:

Resource interpreted as Document but transferred with MIME type application/octet-stream: "http://192.168.0.104/".

The browser also offers to download a file named "download".
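
One way to confirm the symptom outside the browser is to look at the response headers directly. The following is a minimal sketch, not part of the firmware or its tooling: the helper names `describe_response` and `fetch_headers` are hypothetical, and it assumes the device answers without HTTP Basic Auth (otherwise credentials have to be added).

```python
from urllib.request import urlopen


def describe_response(headers):
    """Summarise the headers that decide how a browser treats the body."""
    # Header names are case-insensitive, so normalise them before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    content_type = lowered.get("content-type", "(missing)")
    return {
        "content_type": content_type,
        "content_encoding": lowered.get("content-encoding", "(none)"),
        # A page is only rendered inline when it is announced as HTML.
        "renders_as_page": content_type.lower().startswith("text/html"),
    }


def fetch_headers(url, timeout=5):
    """Fetch a URL and return its response headers as a plain dict.

    Assumption: no authentication is needed. If the device asks for HTTP
    Basic Auth, urlopen raises HTTPError 401 and credentials must be added.
    """
    with urlopen(url, timeout=timeout) as response:
        return dict(response.headers.items())
```

Calling `describe_response(fetch_headers("http://192.168.0.104/"))` against the affected device should then show whether the root page is really announced as `application/octet-stream`.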

xoseperez commented 7 years ago

Original comment by J.D. (Bitbucket: Harry_Reutter, GitHub: Unknown):


Short update: I did a reset via MQTT and then the web interface was working again, but after 20 minutes the heap is now 19584. If I remember right, you said the heap should always be above 20k?

xoseperez commented 7 years ago

Original comment by J.D. (Bitbucket: Harry_Reutter, GitHub: Unknown):


Hi,

I think my post from yesterday got lost somewhere, so here I go again.

I have a strong feeling that this is connected to some sort of DNS issue. I can replicate the issue 100% of the time on my setup.

When I use the IP address, the problem never happens and I have access to the web interface. When I use the DNS name, the web server crashes and the browser (any browser) keeps loading forever.

Here is the debug log: log.jpg

The first two GET requests used the IP address. As you can see, each gets authenticated and then connected. After each test I closed the browser and did it again. The third GET request used the DNS name. As you can see, the web server hangs and there is no WEBSOCKET entry. The device itself is still working, as you can see in the MQTT, BEAT and NTP entries. The heap is also above 20k.

Since I come from the network world, I don't really understand what the difference is, since DNS just maps a name to an IP address. Server-wise it shouldn't make a difference. Maybe a faulty TCP/IP implementation on the web server side?

I hope this helps,

Harry

xoseperez commented 7 years ago

Could be... I'm not at home now to test. A Skype call is an option, but I can't right now.

xoseperez commented 7 years ago

Original comment by Pavel Eremin (Bitbucket: paveleremin, GitHub: paveleremin):


Maybe this is happening?

  1. When I double-click, the device creates an AP

  2. Then it loses the MQTT connection

  3. It tries to reconnect and drops its own WiFi network

xoseperez commented 7 years ago

Original comment by Pavel Eremin (Bitbucket: paveleremin, GitHub: paveleremin):


Double-clicking does not work for me: the WiFi network is created, but it disappears before I have time to join it. Here is the log from the Arduino IDE on port COM3 (it is better formatted than the one from PuTTY):

#!arduino
[WIFI] Creating access point
[WIFI] MODE AP --------------------------------------
[WIFI] SSID SONOFF_E706A9
[WIFI] PASS myPassword
[WIFI] IP   192.168.4.1
[WIFI] MAC  5E:CF:7F:E7:06:A9
[WIFI] ----------------------------------------------
[MQTT] Disconnected!
[WEBSOCKET] Broadcasting '{"mqttStatus": false}'
[WIFI] Connecting to Amsterdam
[WIFI] MODE STA -------------------------------------
[WIFI] SSID Amsterdam
[WIFI] IP   192.168.0.102
[WIFI] MAC  5C:CF:7F:E7:06:A9
[WIFI] GW   192.168.0.1
[WIFI] MASK 255.255.255.0
[WIFI] DNS  192.168.0.1
[WIFI] HOST SONOFF_E706A9
[WIFI] ----------------------------------------------
[MDNS] OK
[NTP] Error: NTP server not reachable
[MQTT] Connecting to broker at 192.168.0.200
[MQTT] Connected!
[MQTT] Sending /SONOFF_E706A9/ip => 192.168.0.102
[MQTT] Sending /SONOFF_E706A9/version => 1.6.3
[MQTT] Subscribing to /SONOFF_E706A9/action
[WEBSOCKET] Broadcasting '{"mqttStatus": true}'
[MQTT] Sending /SONOFF_E706A9/relay/0 => 1
[MQTT] Subscribing to /SONOFF_E706A9/relay/+
[MQTT] Subscribing to domoticz/out
[MQTT] Subscribing to /SONOFF_E706A9/led/+
[MQTT] Received /SONOFF_E706A9/relay/0 => 1 - SKIPPED
[MQTT] Received /SONOFF_E706A9/relay/0 => 1 - SKIPPED
[NTP] Time: 11:03:10 17/02/2017

Can I propose a Skype call? I really want to solve the problem and help your project, but it's up to you. Also, feel free to ask more questions or request more log info and I will provide it here.

#!arduino
[BEAT] Free heap: 22760

xoseperez commented 7 years ago

I can't think of a reason for that... You said it worked the first time when you connected to the board in AP mode, right? Can you do it again (double-clicking the button should set the ESP8266 in AP mode)? Also: the debug log reports the free heap every 5 minutes. It should be above 20 KB.

xoseperez commented 7 years ago

Original comment by Pavel Eremin (Bitbucket: paveleremin, GitHub: paveleremin):


I am not sure if you have built and uploaded the latest web pages / fs ... it should be a single index.html.gz file now.

Correct, gulp created only one file, index.html.gz.

Do you see the GET / request in the debug log?

Yes, I saw it in the log when connected via PuTTY to COM3.

xoseperez commented 7 years ago

Do you see the GET / request in the debug log?

xoseperez commented 7 years ago

Original comment by f-fish (Bitbucket: f-fish, GitHub: Unknown):


I am not sure if you have built and uploaded the latest web pages / fs ... it should be a single index.html.gz file now.

Let me wipe and test a device quickly... BTW I am using platformio.

Later Ferdie

xoseperez commented 7 years ago

Original comment by Pavel Eremin (Bitbucket: paveleremin, GitHub: paveleremin):


You are right, I mean that all sources are up to date. At first boot the device creates a WiFi network and I set it up as usual. When the device is connected to the router's WiFi network, I pass HTTP Basic Auth, and that's all. The web interface is not available.

xoseperez commented 7 years ago

Hmm... you should have gotten updates to the code (point 2): the last commit to master was 18 hours ago. If you wiped the board memory you have likely lost the configuration, so I wouldn't expect the board to connect to your network. Check if it has created an AP.

xoseperez commented 7 years ago

Original comment by Pavel Eremin (Bitbucket: paveleremin, GitHub: paveleremin):


What I did:

  1. Removed the folders .pioenvs and .piolibdeps

  2. git pull => Your branch is up-to-date with 'origin/master'

  3. esptool.py -p COM3 -b 115200 erase_flash

  4. Built and uploaded the firmware and the file system

  5. Unfortunately, the issue is still there

It seems the error changed. Now Edge doesn't offer to download a file, and Chrome changed the error to: 1.63.error.JPG

xoseperez commented 7 years ago

I have added two changes in version 1.6.3 that should impact this behaviour:

  1. The filesystem uses (again) one single file with the HTML, JS and CSS merged and compressed.
  2. The fauxmoESP library has been updated to v2.1.0, which solves some memory leaks in the TCP connection handling.

Still, this will need further testing since I have the feeling there are different reasons for this.

xoseperez commented 7 years ago

Original comment by Xavier Smith (Bitbucket: xavier, GitHub: Unknown):


After firmware and filesystem update (1.6.0 release) with these commands

#!

pio run -e electrodragon-debug -t upload
pio run -e electrodragon-debug -t uploadfs

this is my output from PlatformIO serial monitor:

#!

--- Miniterm on COM8  115200,8,N,1 ---
--- Quit: Ctrl+C | Menu: Ctrl+T | Help: Ctrl+T followed by Ctrl+H ---
[WEBSERVER] Request: GET /
[WEBSERVER] Request: GET /
[WEBSERVER] Request: GET /
[WEBSERVER] Request: GET /
[WEBSERVER] Request: GET /
[WEBSERVER] Request: GET /
[WEBSERVER] Request: GET /
[DHT] Error reading sensor
[WEBSERVER] Request: GET /
[DHT] Error reading sensor
[WEBSERVER] Request: GET /
[DHT] Error reading sensor
[WEBSERVER] Request: GET /
[DHT] Error reading sensor
[DHT] Error reading sensor
[DHT] Error reading sensor
[DHT] Error reading sensor

After a very long time, I get the same web interface as Pavel Eremin's "best result".

No button is functional: if I press one, there is no reply.

Best result with old 1.4.0 firmware: good web interface at power on, unreachable after some time.

xoseperez commented 7 years ago

Original comment by J.D. (Bitbucket: Harry_Reutter, GitHub: Unknown):


Hi, I ran into the same issue. I don't know if this helps, but it occurs every time I try to load the web page from my Android phone; using Chrome, Firefox, etc. doesn't make a difference. It goes like this: I enter the IP of the web server and the browser is unable to load the page. If you press cancel, you see the screen Pavel posted. After that the web server is not running anymore and the browser gets a closed connection with 'ERR_INVALID_HTTP_RESPONSE'.

I could never reproduce this on my notebook using Chrome, Firefox, Edge or IE. I'm not a web programmer, but I get the strong feeling that this is related to the way the CSS sheets are handled.

I hope this was of some help,

Cheers, Harry

xoseperez commented 7 years ago

@matantal Can you connect your USB2UART board to the device and see the debug output when it boots and when you do a request?

xoseperez commented 7 years ago

Original comment by matantal (Bitbucket: matantal, GitHub: matantal):


These are exactly the commands I used. I meant I flashed twice. The uploadfs was after I flashed the firmware.

I get the same issue as the original post. The webserver does not respond when connected to an SSID.

xoseperez commented 7 years ago

Not the firmware, the filesystem. You have to flash two different images on the device:

pio run -e sonoff-debug -t upload
pio run -e sonoff-debug -t uploadfs

xoseperez commented 7 years ago

Original comment by matantal (Bitbucket: matantal, GitHub: matantal):


@xoseperez Yes, flashed successfully using PlatformIO. This is after flashing the firmware. I followed your wiki step by step.

xoseperez commented 7 years ago

@matantal Have you flashed the filesystem image?

xoseperez commented 7 years ago

Original comment by matantal (Bitbucket: matantal, GitHub: matantal):


@xoseperez Hi Xose, same issue here: the first setup was OK, but then when accessing the web server it only prompts for user and password and the page won't load. Version 1.6, Sonoff TH16.

xoseperez commented 7 years ago

@clueo8 Since version 1.6.0 you can reset the board remotely using MQTT (send a message to your root topic + /action with payload "reset") or using RPC (http://yourip/rpc?apikey=XXXXX&action=reset).
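
The RPC variant of that reset can be scripted. This is only a sketch built around the URL pattern quoted above (`/rpc?apikey=...&action=reset`); the function names, the plain GET request and the example host are assumptions for illustration, not a documented client.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_reset_url(host, apikey):
    """Build the RPC reset URL in the form quoted in the comment above."""
    query = urlencode({"apikey": apikey, "action": "reset"})
    return f"http://{host}/rpc?{query}"


def reset_device(host, apikey, timeout=5):
    """Send the reset request and return the HTTP status code.

    Assumption: the endpoint accepts a plain GET. The device may reboot
    before answering, in which case urlopen raises an error instead.
    """
    with urlopen(build_reset_url(host, apikey), timeout=timeout) as response:
        return response.status


# Hypothetical host and placeholder key, matching the example in the thread.
print(build_reset_url("192.168.0.104", "XXXXX"))
```

Keeping the URL construction separate from the request makes it easy to check the exact string before pointing it at a real device.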

@clueo8 That will only work if you have an internet connection on your PC, and that is not the case if you are connected to the device in AP mode to configure it the first time...

xoseperez commented 7 years ago

Original comment by Tim K (Bitbucket: clueo8, GitHub: clueo8):


Would it be helpful to off-load jquery to a cdn instead of loading it into spiffs? For example, remove jquery and just include:

#!html

<script src="//ajax.googleapis.com/ajax/libs/jquery/1.12.4/jquery.min.js"></script>

xoseperez commented 7 years ago

Original comment by Tim K (Bitbucket: clueo8, GitHub: clueo8):


Is there any way to reset the device remotely, freeing up the memory to get the web server back (API, websocket)?

xoseperez commented 7 years ago

The Amazon Dot (I guess the Echo as well) runs the discovery process from time to time on its own. No need for you to trigger it. There is a huge memory leak in the fauxmoESP library, and after 10 discoveries the free heap is below 5k and the web page becomes unresponsive.

xoseperez commented 7 years ago

Original comment by Tim K (Bitbucket: clueo8, GitHub: clueo8):


I only really use this for the Alexa integration (no MQTT setup)... It seems to happen very frequently that the web interface becomes unreachable. I have not run the actual Alexa (Dot) discovery process in some time.

xoseperez commented 7 years ago

This is a dump from the debug log of one device I have running:

[BEAT] Free heap: 21776
[NTP] Time: 09:46:11 31/01/2017
[MQTT] Sending /test/switch/TH16/status => 1
[BEAT] Free heap: 21776
[NTP] Time: 09:51:11 31/01/2017
[MQTT] Sending /test/switch/TH16/status => 1
[BEAT] Free heap: 21776
[NTP] Time: 09:56:11 31/01/2017
[FAUXMO] Search request from 192.168.1.118
[FAUXMO] UDP response for device #0 (TH16)
[FAUXMO] Search request from 192.168.1.118
[FAUXMO] UDP response for device #0 (TH16)
[FAUXMO] UDP response for device #0 (TH16)
[FAUXMO] UDP response for device #0 (TH16)
[FAUXMO] UDP response for device #0 (TH16)
[FAUXMO] /setup.xml response for device #0 (TH16)
[FAUXMO] /setup.xml response for device #0 (TH16)
[FAUXMO] /setup.xml response for device #0 (TH16)
[FAUXMO] /setup.xml response for device #0 (TH16)
[FAUXMO] /setup.xml response for device #0 (TH16)
[FAUXMO] Search request from 192.168.1.100
[FAUXMO] UDP response for device #0 (TH16)
[FAUXMO] UDP response for device #0 (TH16)
[FAUXMO] Search request from 192.168.1.100
[FAUXMO] UDP response for device #0 (TH16)
[FAUXMO] UDP response for device #0 (TH16)
[FAUXMO] UDP response for device #0 (TH16)
[FAUXMO] UDP response for device #0 (TH16)
[FAUXMO] /setup.xml response for device #0 (TH16)
[FAUXMO] /setup.xml response for device #0 (TH16)
[FAUXMO] /setup.xml response for device #0 (TH16)
[FAUXMO] /setup.xml response for device #0 (TH16)
[FAUXMO] /setup.xml response for device #0 (TH16)
[FAUXMO] /setup.xml response for device #0 (TH16)
[MQTT] Sending /test/switch/TH16/status => 1
[BEAT] Free heap: 18784
[NTP] Time: 10:01:11 31/01/2017
[MQTT] Sending /test/switch/TH16/status => 1
[BEAT] Free heap: 18784
[NTP] Time: 10:06:11 31/01/2017
[NTP] Time: 10:11:05 31/01/2017
[MQTT] Sending /test/switch/TH16/status => 1

As you can see, after the Alexa device (an Amazon Dot in my case) did a discovery (it does it from time to time), the free heap is about 3K lower (21776 down to 18784 bytes)...

I'm opening an issue in the fauxmoESP library repo (https://bitbucket.org/xoseperez/fauxmoesp/issues/5/memory-leak). I won't close this one, though.
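
To spot this kind of drop in a longer capture, the `[BEAT] Free heap` lines can be extracted and compared pairwise. The snippet below is a small sketch for that; the function names are made up for illustration and the sample is a shortened copy of the log above.

```python
import re

# Matches the heartbeat lines printed in the debug log, e.g.
# "[BEAT] Free heap: 21776".
HEAP_LINE = re.compile(r"^\[BEAT\] Free heap: (\d+)$")


def heap_readings(log_text):
    """Return the free-heap values in the order they appear in the log."""
    readings = []
    for line in log_text.splitlines():
        match = HEAP_LINE.match(line.strip())
        if match:
            readings.append(int(match.group(1)))
    return readings


def heap_drops(readings):
    """Return (before, after, bytes_lost) for each drop between readings."""
    return [(b, a, b - a) for b, a in zip(readings, readings[1:]) if a < b]


SAMPLE = """\
[BEAT] Free heap: 21776
[NTP] Time: 09:56:11 31/01/2017
[FAUXMO] Search request from 192.168.1.118
[MQTT] Sending /test/switch/TH16/status => 1
[BEAT] Free heap: 18784
"""

print(heap_drops(heap_readings(SAMPLE)))  # [(21776, 18784, 2992)]
```

On this excerpt the single drop works out to 2992 bytes between two consecutive heartbeats.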

xoseperez commented 7 years ago

Original comment by Pavel Eremin (Bitbucket: paveleremin, GitHub: paveleremin):


Nope. The first device (Sonoff Switch) didn't even seem to have this option, and on the second one (Sonoff POW) I turned it off in the web interface during the first-time setup.

xoseperez commented 7 years ago

OK, I'm 90% sure it's a memory leak that empties the free heap and prevents the ESPAsyncWebServer library from serving big files. I'm doing tests right now but I suspect it's the Alexa integration. Are you using it?

xoseperez commented 7 years ago

Original comment by Pavel Eremin (Bitbucket: paveleremin, GitHub: paveleremin):


@xoseperez Because when the Sonoff is connected to the router's access point I can't open the web interface, because of this issue. But when it creates its own WiFi network, all is OK.

xoseperez commented 7 years ago

@paveleremin Why do you turn off your router to enter the Sonoff web interface?

xoseperez commented 7 years ago

This one is really hard to catch... I've been running a Sonoff TH for days without any issue :( If anyone has steps to reproduce it, I would be very grateful.

xoseperez commented 7 years ago

Original comment by Tim K (Bitbucket: clueo8, GitHub: clueo8):


I too am experiencing this issue. The web page works on a fresh restart, but after some time it becomes unavailable. API calls (i.e. Alexa) still work (the response is slow sometimes), but the web server is inaccessible. Are there any workarounds for this? Possibly host the web server external to the device and just have it use the API to control the relays?

xoseperez commented 7 years ago

Original comment by Pavel Eremin (Bitbucket: paveleremin, GitHub: paveleremin):


Today I got interesting behaviour:

1) I need to open the web interface

2) So I turn OFF mains on the device and turn off the router

3) Then I turn ON mains on the device and expect to see a new WiFi network like SONOFF_POW_ABCD12

4) But it's not created :question:

5) I repeat steps 1-3 three times and nothing

6) I turn OFF mains on the device and connect via USB UART (3.3 V, RX, TX, GND)

7) Voila, the new WiFi network is created

@xoseperez is this the correct behaviour? What do you think?

xoseperez commented 7 years ago

I'm working on this issue right now but I still cannot reproduce it reliably, i.e. it happens randomly after the device has been working fine for some hours (at least for me). When it happens, only the requests for SPIFFS files fail (the browser tries to download a binary file of no known format). But the requests for dynamically generated content work just fine (API calls, for instance). The device works as expected otherwise (MQTT, relays, LED, ...). So my guess right now is still the SPIFFS handling.

xoseperez commented 7 years ago

Original comment by Pavel Eremin (Bitbucket: paveleremin, GitHub: paveleremin):


Same issue with the Sonoff POW :( @xoseperez how can I fix it temporarily? Maybe I could comment out some lines of code?

xoseperez commented 7 years ago

I'm thinking this might be due to the same problem reported here: https://github.com/me-no-dev/ESPAsyncWebServer/issues/115. A combination of several files being requested in parallel makes the uC run out of memory and the result is a corrupted response... The web interface is quite heavy, and lately I split it into several files to implement the "default password change" functionality (0f5a0e8).

xoseperez commented 7 years ago

Original comment by Pavel Eremin (Bitbucket: paveleremin, GitHub: paveleremin):


@xoseperez http://{id}.local does not work for me, so I use the IP.

This is how it looks in the Edge browser: 1234.JPG

That's all I see on the COM port:

[WEBSERVER] Request: GET /

Here is the content of the "download" file when I try to load the URL http://192.168.0.101/styles.css

3da4 9f3b 1f0b d6f2 a5f2 7333 12d2 3f95
350b f8df e939 17f7 4493 427c 0cc6 fdbb
9f6c 617c 2f03 7903 bbe0 3f84 541c 4a7c
bfea bcbd c23b 8f03 b0e0 93cd 2c76 aa57
31de cf35 3db3 7f96 f817 9109 bc76 8580
92ef 7fc3 3b17 56c4 72f7 0b97 bac8 6cf0
e909 f1e1 0300 7ef3 fe1b 28fa fb13 00cc
4c2f 7c62 8ca6 f6c7 2d6c 5810 51b8 f4ab
cc11 578a f83d 66ba fd06 8544 2b5f 6adf
6881 15cc 8731 3a0d f697 ba62 ddc4 75ac
3fed 7bae c465 7657 0cae a1a8 1917 4a21
bf0c d4e5 c8df ad03 aaec 92d9 6eb6 ebaa
2455 f6cc de48 20eb 2e99 4c10 2ecb 1962
6aa4 69a8 ed06 f3b2 dbae e6d3 8e2c 65b9
938c f42f 8cce 4997 ee97 f1d4 dc24 3b90
6be5 22d6 5cff c61f eb37 a68d 7371 839a
202c f9f4 8cff bb82 f032 8600 8523 4730
fc8b d240 a43c 1f45 cd92 6738 e762 042b
867b c30c f838 0ff9 2c5a 6c5d 3811 4e28
4d9a 2373 7d0b 6d34 56b3 c9f5 95b7 ff01
46db cd82 7670 0000
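
A quick check on a dump like this is whether it starts like a gzip stream, whose first two bytes are always 1f 8b. This is a sketch under two assumptions the thread does not confirm: that styles.css was stored gzip-compressed, and that the dump groups bytes in 16-bit words, which some hex tools print byte-swapped. Both byte orders are therefore tried; the function names are hypothetical.

```python
GZIP_MAGIC = bytes([0x1F, 0x8B])  # first two bytes of every gzip stream


def words_to_bytes(dump_text, swap_pairs=False):
    """Turn whitespace-separated 4-digit hex words into raw bytes.

    swap_pairs=True reverses each 2-byte word, for tools that print
    little-endian 16-bit words (an assumption about how the dump was made).
    """
    out = bytearray()
    for word in dump_text.split():
        pair = bytes.fromhex(word)
        out += pair[::-1] if swap_pairs else pair
    return bytes(out)


def starts_with_gzip_magic(data):
    """True when the data begins with the gzip magic bytes."""
    return data[:2] == GZIP_MAGIC


# First line of the dump above.
FIRST_LINE = "3da4 9f3b 1f0b d6f2 a5f2 7333 12d2 3f95"

for swap in (False, True):
    data = words_to_bytes(FIRST_LINE, swap_pairs=swap)
    print(swap, starts_with_gzip_magic(data))
```

Neither byte order yields the gzip magic at the start of this dump, which fits a truncated or corrupted body but does not prove it, given the assumptions above.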

xoseperez commented 7 years ago

That's somewhat weird. In the screenshot it looks like only the HTML got loaded, no style, no data (so no websocket connection either),...

I would gather some more info:

One question: does it happen when using the IP of the device and/or the local name (like http://_hostname_.local)?

xoseperez commented 7 years ago

Original comment by Pavel Eremin (Bitbucket: paveleremin, GitHub: paveleremin):


@xoseperez correct

xoseperez commented 7 years ago

You mean you can see the web interface correctly only when connected to the AP the device creates, but not when it connects to your home wifi network?

xoseperez commented 7 years ago

Original comment by Pavel Eremin (Bitbucket: paveleremin, GitHub: paveleremin):


  1. I turned off the router

  2. A new WiFi network was created by the device

  3. There, the whole web UI is available :confused:

I can't figure out where the problem is

xoseperez commented 7 years ago

Original comment by Pavel Eremin (Bitbucket: paveleremin, GitHub: paveleremin):


This is the "best result" that I got from the web server: 123.JPG