esp8266 / Arduino

ESP8266 core for Arduino
GNU Lesser General Public License v2.1

ESP8266HTTPClient error -1 #6180

Closed Loucotolo closed 4 years ago

Loucotolo commented 5 years ago

Problem Description

Hello, I have the following problem: I am using an ESP8266 with HTTPClient to connect to a server, but after a random amount of time it returns the following error: httpCode: -1 (Connection Refused). With Wireshark I noticed that the capture flags [TCP Port numbers reused] 57490 -> 8000 [SYN].

I would like to know what I am doing wrong in my code to receive this error.

[Sketch]

#include <Arduino.h>
#include <ESP8266WiFi.h> /// fix http://blog.flynnmetrics.com/uncategorized/esp8266-exception-3/
#include <ESP8266HTTPClient.h>
//#include <WiFiClient.h>
#include <ESPAsyncWebServer.h>
#include <ArduinoJson.h>
#include <ESP8266httpUpdate.h>
#include <Hash.h>
#include <FS.h>
#include <user_interface.h> // Library needed to access the timers.

#include "OneButton.h" // https://github.com/mathertel/OneButton
#include <EEPROM.h>
#include <WiFiUdp.h>

bool connect2WiFI()
{

  WiFi.setAutoReconnect( true );

#if (ADEBUG == 1 )
  Serial.printf("Wifi State changed to %s\n", WlStatusToStr(WiFi.status()));
#endif

  WiFi.persistent(false);   // Solve possible wifi init errors (re-add at 6.2.1.16 #4044, #4083)
  WiFi.disconnect(true);    // Delete SDK wifi config
  delay(200);
  WiFi.mode(WIFI_STA);      // Disable AP mode
  WiFi.setSleepMode(WIFI_MODEM_SLEEP);  // Modem sleep (ESP8266/Arduino core and SDK default)

  WiFi.begin("DEMO", "DEMO");

  uint32_t AwifiTimeout = 20000;
  uint32_t maxTime = millis() + AwifiTimeout;

  while ((WiFi.status()  != WL_CONNECTED) && (millis() < maxTime)) {
    yield();
  }

  if (WiFi.status()  != WL_CONNECTED)
  {

    return false;
  }

  Serial.println(WiFi.localIP());         // Send the IP address of the ESP8266 to the computer

  return true;

}

bool test()
{

more_send_data:

  int txlen = 0;

  WiFiClient client;

  HTTPClient http;

  http.setTimeout(500); // 500ms
  String md_ip = "192.168.0.31";
  String md_port = "8000";
  String path = "http://" + md_ip + ":" + md_port + "/index.php";

  const char * headerkeys[] = {"Set-Cookie", "Cookie"} ;
  size_t headerkeyssize = sizeof(headerkeys) / sizeof(char*);

  http.begin(client, path);    //Specify request destination
  //  http.begin(path);
  http.addHeader("Content-Type", "application/json");  //Specify content-type header

  http.collectHeaders(headerkeys, headerkeyssize);

  int httpCode = http.POST("Hello");  //Send the request
  String payload = http.getString();                                        //Get the response payload

  if (httpCode > 0)
  {

    if (httpCode == HTTP_CODE_OK || httpCode == HTTP_CODE_MOVED_PERMANENTLY) {

      http.end();  //Close connection

      return true;

    } // end  payload != ""

    client.stop();
    http.end();  //Close connection

    return true;
  }

  Serial.print("httpCode:");
  Serial.println(httpCode, DEC );

  http.end();  //Close connection
  return false;
}

void setup() {
  // put your setup code here, to run once:

  WiFi.mode(WIFI_STA);

  EEPROM.begin(512);

  Serial.begin(115200);
  connect2WiFI();
}

void loop() {
  // put your main code here, to run repeatedly:

  test();

  delay(500);
}

Debug Messages

16:33:26.338 -> 192.168.0.192
16:33:34.254 -> pm open,type:2 0
16:33:37.970 -> httpCode:-1

Wireshark report: https://mega.nz/#!jJ4wRIyQ!6ayDTjoxEgaxO82yBo-0I4PwqH33TpPVC9XijBe2EXg

devyte commented 5 years ago

Please don't post links to external sources, especially to uploaded files, because there is no assurance the files will live in the future. Instead, reduce the log by filtering out irrelevant things (if you haven't already done so) and post the log directly here with markup.

jpboucher commented 5 years ago

You are probably running out of sockets after a while. Try calling http.setReuse(true) before you call http.begin

d-a-v commented 5 years ago

Same sketch is running flawlessly here. I think your server is busy for more than 0.5 seconds and the esp times out as you asked it to.

Failing request starts:

17:33:37.439458 IP 192.168.0.192.49882 > 192.168.0.31.8000: Flags [S], seq 6630, win 2144, options [mss 536,nop,nop,sackOK], length 0
17:33:37.439571 IP 192.168.0.31.8000 > 192.168.0.192.49882: Flags [S.], seq 1154906697, ack 6631, win 65392, options [mss 1460,nop,nop,sackOK], length 0

lots of https at the same time:

17:33:37.623692 IP 149.11.65.227.https > 192.168.0.31.49821: Flags [P.], seq 22745748:22745784, ack 1, win 34, length 36
17:33:37.623693 IP 149.11.65.227.https > 192.168.0.31.49821: Flags [P.], seq 22745784:22746194, ack 1, win 34, length 410
17:33:37.623694 IP 149.11.65.227.https > 192.168.0.31.49821: Flags [P.], seq 22746194:22746230, ack 1, win 34, length 36
... (lots)
17:33:37.868468 IP 192.168.0.31.49821 > 149.11.65.227.https: Flags [.], ack 22752711, win 1024, length 0
17:33:37.916391 IP 149.11.65.227.https > 192.168.0.31.49821: Flags [P.], seq 22752711:22753078, ack 1, win 34, length 367
17:33:37.916445 IP 192.168.0.31.49821 > 149.11.65.227.https: Flags [.], ack 22753078, win 1023, length 0

esp times out:

17:33:37.944117 IP 192.168.0.192.49882 > 192.168.0.31.8000: Flags [R.], seq 1, ack 3140060599, win 24584, length 0

next requests are honored:

...
17:33:38.446020 IP 192.168.0.192.50004 > 192.168.0.31.8000: Flags [S], seq 6642, win 2144, options [mss 536,nop,nop,sackOK], length 0
17:33:38.446099 IP 192.168.0.31.8000 > 192.168.0.192.50004: Flags [S.], seq 1775072509, ack 6643, win 65392, options [mss 1460,nop,nop,sackOK], length 0
17:33:38.448943 IP 192.168.0.192.50004 > 192.168.0.31.8000: Flags [.], ack 1, win 2144, length 0
17:33:38.454408 IP 192.168.0.192.50004 > 192.168.0.31.8000: Flags [P.], seq 1:206, ack 1, win 2144, length 205
17:33:38.454408 IP 192.168.0.192.50004 > 192.168.0.31.8000: Flags [P.], seq 206:211, ack 1, win 2144, length 5
17:33:38.454437 IP 192.168.0.31.8000 > 192.168.0.192.50004: Flags [.], ack 211, win 65182, length 0
17:33:38.455149 IP 192.168.0.31.8000 > 192.168.0.192.50004: Flags [P.], seq 1:231, ack 211, win 65182, length 230
17:33:38.455223 IP 192.168.0.31.8000 > 192.168.0.192.50004: Flags [F.], seq 231, ack 211, win 65182, length 0
17:33:38.469175 IP 192.168.0.192.50004 > 192.168.0.31.8000: Flags [R.], seq 211, ack 232, win 24584, length 0
...

Be nicer with your server, increase ESP timeout.
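The capture timestamps above are consistent with that reading. As a rough check, the sketch below subtracts the time of the ESP's SYN (17:33:37.439458) from the time of the reset it sent (17:33:37.944117). The helper names are made up for illustration and are not part of any library.

```cpp
#include <cstdint>

// Convert a capture timestamp (hh:mm:ss.uuuuuu) to microseconds since midnight.
constexpr int64_t toMicros(int h, int m, int s, int us) {
    return ((static_cast<int64_t>(h) * 60 + m) * 60 + s) * 1000000LL + us;
}

// Time between the ESP's SYN and the RST it sent, both taken from the capture above.
constexpr int64_t synToResetMicros() {
    return toMicros(17, 33, 37, 944117) - toMicros(17, 33, 37, 439458);
}

// 504659 us, i.e. about 504.7 ms.
static_assert(synToResetMicros() == 504659, "SYN-to-RST gap in microseconds");
```

The gap is just past the 500 ms passed to http.setTimeout(500) in the original sketch, which fits the explanation that the ESP gave up on its own timeout.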

TD-er commented 5 years ago

@jpboucher

You are probably running out of sockets after a while. Try calling http.setReuse(true) before you call http.begin

I've looked into the code and it seems it does keep the connection open. What's the default for max. open connections? What will happen if that maximum is hit?

jpboucher commented 5 years ago

@jpboucher

You are probably running out of sockets after a while. Try calling http.setReuse(true) before you call http.begin

I've looked into the code and it seems it does keep the connection open. What's the default for max. open connections? What will happen if that maximum is hit?

The default is 5. It can be increased up to 15 using espconn_tcp_set_max_con() from the ESP SDK. If the maximum is hit, you will not be able to create a new socket and the connection will fail. The request will return HTTPC_ERROR_CONNECTION_REFUSED (-1)

d-a-v commented 5 years ago

I've looked into the code and it seems it does keep the connection open.

It doesn't. Variable scope implies the destructor is called and the connection closed at every loop (on test() ending, because WiFiClient and HTTPClient are declared in there).
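The scope argument can be illustrated with a small stand-alone sketch. FakeConnection is a hypothetical stand-in, not the real WiFiClient or HTTPClient; it only counts how many objects are alive, assuming that destroying the object closes its connection as described above.

```cpp
#include <cassert>

// Hypothetical stand-in for a client object; not the real WiFiClient/HTTPClient.
struct FakeConnection {
    static int openCount;                // connections currently open
    FakeConnection()  { ++openCount; }   // "connect" on construction
    ~FakeConnection() { --openCount; }   // "close" when the object goes out of scope
};
int FakeConnection::openCount = 0;

// Mirrors the shape of test(): the object is local to the function.
bool doRequest() {
    FakeConnection conn;
    return FakeConnection::openCount == 1;   // open only while inside the function
}

// Calls doRequest() n times and returns how many connections remain open afterwards.
int openAfterRequests(int n) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        doRequest();
    }
    return FakeConnection::openCount;
}
```

However many times doRequest() runs, nothing stays open afterwards, because each local object is destroyed when the function returns.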

The default is 5. It can be increased up to 15 using espconn_tcp_set_max_con() from the ESP SDK.

We don't depend on that. It's for espconn API. Your only limitation is RAM.

jpboucher commented 5 years ago

My bad. Since everything is indeed cleared at every loop I would also suspect that the server is too slow.

TD-er commented 5 years ago

It doesn't. Variable scope implies the destructor is called and the connection closed at every loop (on test() ending, because WiFiClient and HTTPClient are declared in there).

I meant the code of HTTPClient. So as long as the client object does exist, it will keep the connection open so it seems. (with _reuse set)

devyte commented 4 years ago

This was reported for 2.5.2. There have been several critical stability fixes merged since then. Please retest with 2.6.3 or latest git.

earlephilhower commented 4 years ago

Closing as last update was Jun 18, there have been many HTTPClient updates, and there's been no response to @devyte's request for a confirmation this is still an issue in 2.7+ for ~4 months. If it does still show up in latest master, please open a new bug with MCVE/etc. so we can look at it.

beicnet commented 4 years ago

@earlephilhower, @devyte The issue with -1 still persists. I went from 2.3.0 to 2.7.1 today, but it got worse: now I have delays and freezing issues too.

[Screenshot: esp_response_minus_one]

This function is called every 10 seconds:

void checkForUpdates()
{
  if (WiFi.status() == WL_CONNECTED) {
    String fwVerURL = "";
    fwVerURL.concat("http://");
    fwVerURL.concat("xxxxxx.ddns.net");
    fwVerURL.concat(":");
    fwVerURL.concat("80");
    fwVerURL.concat("/xxx/version.asp");

    HTTPClient updHttp;
    updHttp.setReuse(true);
    updHttp.begin(fwVerURL);

    int httpCode = updHttp.GET();

    Serial.println("HTTP Respose Code: " + String(httpCode));

    if (httpCode == 200) {
      String newFwTmpVersion = updHttp.getString();
      // ===> Here I got 3 second whole ESP freezing <===
      Serial.println("HTTP Update Version: " + newFwTmpVersion);
    } else {
      Serial.println("Error HTTP Update!");
    }
    updHttp.end();
  }
}

earlephilhower commented 4 years ago

@beicnet, Can you please open a new issue with MCVE (and public file maybe on your GH acct or something so we can run it in our own environments) and also enable full debugging to get more useful logs? This is a closed issue from a long time ago, so it won't get attention otherwise.

beicnet commented 4 years ago

Hey @earlephilhower thank you for your fast response! ;)

MCVE? Huh, there are 12 libraries included in my project, and there are also a lot of modules attached to the WeMos D1 mini board.

Can you tell me which one is the full debug in Tools > Debug level?

I'm still using Arduino 1.8.2 with 2.3.0 and everything was fine except that HTTP random response -1, and -2 code (as we speak I got -11 too).

earlephilhower commented 4 years ago

If you're using 2.3.0 then first move to 2.7.1 before trying anything else. 2.3.0 is ancient.

For debug, just select the longest one (2nd from last in the menu, it's got about 12 things listed).

If it still has issues w/the latest version, then you'll have to strip it down to a simple example. If it's just the http stuff, you should be able to reproduce the failure with just a loop() that calls your update stuff. If you can't strip things down, again there's not much we can do, but you can use https://esp8266.com forums for more help.

beicnet commented 4 years ago

Yes, @earlephilhower I moved directly to 2.7.1.

I could not use the 2nd from last in the menu because I got a BSSL error when compiling, so I used the 3rd from last in the menu.

I managed to strip down the whole code, and it's just the http stuff in the loop.

Stripped code and Debug information follows:

extern "C" {
#include "user_interface.h"
}

#include "ESP8266WiFi.h"
#include "ESP8266HTTPClient.h"

const char* ssid     = "xxxxxx";
const char* password = "xxxxxx";

const char* devhostname = "DEVICE-002";
uint8_t Mac[] = {0xBE, 0x1C, 0x00, 0x00, 0x01, 0x01};

//  Custom delay of 10s in loop procedure (non-blocking)
int upd_ready_period = 10000;
unsigned long upd_ready_time_now = 0;

void setup()
{
  delay(10);

  Serial.begin(9600);

  while (!Serial)
  {
    delay(10);
  }

  Serial.println("Booting up...");

  //  Fix for "Can't reconnect to the router"
  WiFi.persistent(false);
  WiFi.mode(WIFI_OFF);
  WiFi.softAPdisconnect(true);
  WiFi.mode(WIFI_STA);
  //  =======================================

  //  Set WiFi MAC, hostname, SSID and password
  wifi_set_macaddr(STATION_IF, &Mac[0]);
  WiFi.hostname(devhostname);
  WiFi.begin(ssid, password);
  Serial.print(F("Connecting to "));
  Serial.print(ssid); Serial.println(F(" ..."));

  int i = 0;

  while (WiFi.status() != WL_CONNECTED) {
    delay(1000);
    Serial.print(++i);
    Serial.print(' ');
  }

  Serial.println("");
  Serial.println(F("Connected!"));
  Serial.println("SSID: " + String(WiFi.SSID()));
  Serial.println("RSSI: " + String(WiFi.RSSI()));
  Serial.println("Channel: " + String(wifi_get_channel()));
  Serial.println("IP Address: " + Ip2Str(WiFi.localIP()));
  Serial.println("MAC Address: " + Mac2Str(WiFi.macAddress(Mac)));
}

void loop()
{
  if (millis() > upd_ready_time_now + upd_ready_period)
  {
    upd_ready_time_now = millis();
    checkForUpdates();
  }
}

void checkForUpdates()
{
  if (WiFi.status() == WL_CONNECTED) {
    String fwVerURL = "";
    fwVerURL.concat("http://");
    fwVerURL.concat("xxxxxx.ddns.net");
    fwVerURL.concat(":");
    fwVerURL.concat("80");
    fwVerURL.concat("/xxxxxx/version.asp");

    HTTPClient updHttp;
    updHttp.setReuse(true);
    updHttp.begin(fwVerURL);

    int httpCode = updHttp.GET();

    Serial.println("HTTP Respose Code: " + String(httpCode));

    if (httpCode == 200) {
      String newFwTmpVersion = updHttp.getString();
      Serial.println("HTTP Update Version: " + newFwTmpVersion);
    } else {
      Serial.println("Error HTTP Update!");
    }
    updHttp.end();
  }
}

String Ip2Str(IPAddress ip)
{
  String s = "";
  for (int i = 0; i < 4; i++)
    s += i  ? "." + String(ip[i]) : String(ip[i]);
  return s;
}

String Mac2Str(byte ar[])
{
  String s = "";
  for (byte i = 0; i < 6; ++i) {
    char buf[3];
    sprintf(buf, "%02X", ar[i]);
    s += buf;
    if (i < 5) s += ':';
  }
  return s;
}

Debug information:

:ref 1
:wr 196 0
:wrc 196 196 0
:ack 196
:rn 249
:rch 249, 3
:rcl pb=0x3ffefbf4 sz=252
:c 1, 249, 252
HTTP Respose Code: 200
:c0 1, 3
HTTP Update Version: 3.6
:close
:ur 1
:dsrcv 0
:del
:ref 1
:wr 196 0
:wrc 196 196 0
:ack 196
:rn 249
:rch 249, 3
:rcl pb=0x3ffeff6c sz=252
:c 1, 249, 252
HTTP Respose Code: 200
:c0 1, 3
HTTP Update Version: 3.6
:close
:ur 1
:dsrcv 0
:del
:ref 1
:wr 196 0
:wrc 196 196 0
:ack 196
:rn 249
:rch 249, 3
:rcl pb=0x3ffeff6c sz=252
:c 1, 249, 252
HTTP Respose Code: 200
:c0 1, 3
HTTP Update Version: 3.6
:close
:ur 1
:dsrcv 0
:del
:ref 1
:wr 196 0
:wrc 196 196 0
:ack 196
:rn 249
:rch 249, 3
:rcl pb=0x3ffeff6c sz=252
:c 1, 249, 252
HTTP Respose Code: 200
:c0 1, 3
HTTP Update Version: 3.6
:close

earlephilhower commented 4 years ago

Looks like it's fine, no -1 response shown. You should look to capture a -1 code...

beicnet commented 4 years ago

@earlephilhower There is a 3 second total freeze at the arrow shown:

HTTP Respose Code: 200
:c0 1, 3 <=== here 3 second freeze in every loop
HTTP Update Version: 3.6

earlephilhower commented 4 years ago

That's not an error. You'd need to go over the server it's connecting to and what it's transmitting (wireshark). As a guess I would say that the HTTP server is doing something funky here and not identifying the content size properly so the client waits a couple seconds until timeout and return of the string. Wireshark on the network to capture both sides of the transaction is your best bet to debugging this. That said, it seems like this is really something you're going to have to go to esp8266.com or https://gitter.im/esp8266/Arduino to track down since it's not really a core thing so far.

beicnet commented 4 years ago

Ok, just installed Wireshark, any suggestion for the filter and what should I look for? (first time with Wireshark) @earlephilhower

TD-er commented 4 years ago

Ok, just installed Wireshark, any suggestion for the filter and what should I look for? (first time with Wireshark) @earlephilhower

Just one thing to test first before experimenting with filter. Make sure you actually see the traffic of your node. So one filter to use at least is filter on the MAC of your ESP in src and dst direction. And just to be sure, maybe also add the same filter for your AP MAC of the ESP. (if it is enabled, as you're calling WiFi.softAPdisconnect(true);)

beicnet commented 4 years ago

Hey @TD-er I managed to filter traffic for 8008 port, the Source device (client) is 192.168.0.25 and the destination device (server) is 192.168.0.132

[Screenshot: ws_8008_http]

TD-er commented 4 years ago

The reason I mentioned it was that it is also possible the network packets do not reach the host you're running Wireshark on. A typical switch only forwards data to the ports that have the destination MAC address connected. So if you try to grab traffic between host A and B while running Wireshark on host C, then you may need to use some tricks to see that traffic also (e.g. port mirroring available in some switches)

But if you can see the packets using Wireshark, then you know you can trust the setup and start filtering.

beicnet commented 4 years ago

I don't know @TD-er, I'm new to Wireshark and have no time to learn it, but what I can say is that I went back to core 2.3.0 and it's working without any issues again for now.

What I have noticed so far is the latency and freezing with 2.7.1:

Core 2.7.1
30.005311 (Request - Start)
30.020311 (Respond - End)

Core 2.3.0
7272.479139  (Request - Start)
7272.494111 (Respond - End)

Used the same stripped sketch posted above.

TD-er commented 4 years ago

Just one thought. Core 2.3.0 didn't have chunked transfers, while 2.4.0 and newer have. As you can see in the Wireshark screenshot, you highlighted a packet with content length 0. That's signalling the end of a chunked transfer.

I can imagine that trying to get the resulting string may wait until the transfer is considered complete. If there is a bug (or a missed packet) in the code here that does not consider the chunked transfer to be complete, then you may run into a timeout.

Maybe it would help to also include the millis() in the log strings, so we can see where it is taking so long. Maybe also try to run a delay(0); before getting the string?
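A minimal sketch of the timestamped logging idea, written as plain C++ so it can be tried off-device. The function name stamped is hypothetical; on the ESP8266 the first argument would be the current value of millis() and the result would go to Serial.println().

```cpp
#include <cstdint>
#include <string>

// Prefixes a log message with a millisecond timestamp.
// On the ESP8266 the first argument would be the value returned by millis().
std::string stamped(uint32_t ms, const std::string& msg) {
    return "[" + std::to_string(ms) + " ms] " + msg;
}
```

Logging one such line right before GET(), one right after it, and one after getString() would show which call accounts for the delay.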

d-a-v commented 4 years ago

@beicnet why didn't you open a new issue ?

It's not 3 seconds but 5 seconds according to the Wireshark log (which is the default timeout in the HTTP client). @beicnet your HTTP request does not seem to produce any answer other than the 200/OK, and the ESP waits 5 seconds for one before giving up. What is the output of this command: curl -D - http://your-url ?

beicnet commented 4 years ago

Sorry about that @d-a-v , I'm totally lost because of my job.

Requested CURL output:

curl -D - http://192.168.0.132:8008/inovi/update/version.asp

HTTP/1.0 200 OK
Server: Quick 'n Easy Web Server
Content-Type: text/plain
Cache-Control: no-Cache
Pragma: no-cache
Expires: Fri, 12 Jun 2020 17:05:51 GMT
Set-Cookie: SESSIONID=00002633; path=/;version=1
Date: Sat, 13 Jun 2020 15:05:51 GMT

3.6

ASP page source follows:

<%@ Language=VBScript %>
<%
 Option Explicit

 Response.Buffer=True
 Response.ContentType="text/plain" 
 Response.CacheControl = "no-Cache"
 Response.AddHeader "Pragma" , "no-cache"
 Response.ExpiresAbsolute = Now() -1

 Call Response.Write("3.6")

 Response.Flush
 Response.Clear
 Response.End
%>

and it's happening randomly using core 2.3.0

HTTP Respose Code: 200
HTTP Update Version: 3.6
HTTP Respose Code: 200
HTTP Update Version: 3.6
HTTP Respose Code: -2
Error HTTP Update!
HTTP Respose Code: -1
Error HTTP Update!
HTTP Respose Code: 200
HTTP Update Version: 3.6
HTTP Respose Code: 200
HTTP Update Version: 3.6
HTTP Respose Code: 200
HTTP Update Version: 3.6
HTTP Respose Code: 200
HTTP Update Version: 3.6
HTTP Respose Code: 200
HTTP Update Version: 3.6
HTTP Respose Code: 200
HTTP Update Version: 3.6
HTTP Respose Code: 200
HTTP Update Version: 3.6
HTTP Respose Code: 200
HTTP Update Version: 3.6
HTTP Respose Code: 200
HTTP Update Version: 3.6
HTTP Respose Code: 200
HTTP Update Version: 3.6

Using the latest core 2.7.1, it freezes the whole sketch.

d-a-v commented 4 years ago

From what I can see it's not a chunked answer so you must have a Content-Length in your header
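To check a response for this, the header block printed by curl -D - can be scanned for a Content-Length line. The sketch below is a stand-alone illustration, not code from HTTPClient; the function name and the -1 "absent" convention are assumptions, and it expects a numeric value after the colon.

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Returns the Content-Length value found in a raw header block, or -1 if absent.
// Header names are compared case-insensitively; scanning stops at the blank
// line that ends the headers, so the body is never inspected.
long contentLength(const std::string& headers) {
    const std::string key = "content-length:";
    std::istringstream in(headers);
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line == "\r") break;   // end of headers
        std::string lower;
        for (unsigned char c : line) lower += static_cast<char>(std::tolower(c));
        if (lower.compare(0, key.size(), key) == 0) {
            return std::stol(line.substr(key.size()));   // skips leading spaces
        }
    }
    return -1;
}
```

Applied to the first curl output in this thread, it returns -1, since that response carries no Content-Length header.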

beicnet commented 4 years ago

Edited the source ASP page @d-a-v

curl -D - http://192.168.0.132:8008/inovi/update/version.asp

HTTP/1.0 200 OK
Server: Quick 'n Easy Web Server
Content-Type: text/plain
Cache-Control: no-Cache
Pragma: no-cache
Content-Length: 3
Expires: Fri, 12 Jun 2020 18:17:45 GMT
Set-Cookie: SESSIONID=00003261; path=/;version=1
Date: Sat, 13 Jun 2020 16:17:45 GMT

3.7

Result:

HTTP Respose Code: 200
HTTP Update Version: 3.7
HTTP Respose Code: -2
Error HTTP Update!
HTTP Respose Code: 200
HTTP Update Version: 3.7
HTTP Respose Code: 200
HTTP Update Version: 3.7
HTTP Respose Code: 200
HTTP Update Version: 3.7
HTTP Respose Code: 200
HTTP Update Version: 3.7
HTTP Respose Code: -2
Error HTTP Update!
HTTP Respose Code: 200
HTTP Update Version: 3.7
HTTP Respose Code: 200
HTTP Update Version: 3.7
HTTP Respose Code: -2
Error HTTP Update!
HTTP Respose Code: 200
HTTP Update Version: 3.7
HTTP Respose Code: 200
HTTP Update Version: 3.7
HTTP Respose Code: 200
HTTP Update Version: 3.7
HTTP Respose Code: 200
HTTP Update Version: 3.7
HTTP Respose Code: -2
Error HTTP Update!

earlephilhower commented 4 years ago

The -2 error == HTTPC_ERROR_SEND_HEADER_FAILED.

It means the client write of the headers did not write the full amount of headers to the server. See HTTPClient::sendHeader.... _client->write(<data>) != <data>

So, if Wifi goes down and isn't reconnected, that's one possibility. If the server sends a TCP disconnect (maybe it's rate-limiting or maybe it's slow), that's another possibility. If the WiFi signal isn't strong enough and the TCP ACK is lost, that's another option.

@d-a-v, can client->write fragment and only write a lesser portion of the data? If so, then the code here needs to be reworked to attempt multiple writes as long as the written bytes != 0... https://github.com/esp8266/Arduino/blob/89d0c78703e7a4bb627f9c69e237618add0f8de3/libraries/ESP8266HTTPClient/src/ESP8266HTTPClient.cpp#L1315
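A rough sketch of the retry loop being proposed, in plain C++. ChunkedSink is a hypothetical test double that accepts a limited number of bytes per call; it is not the real client class, and whether the real write() can return a short count is exactly the open question here.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical sink that accepts at most maxChunk bytes per write() call.
struct ChunkedSink {
    size_t maxChunk;
    size_t calls;
    explicit ChunkedSink(size_t m) : maxChunk(m), calls(0) {}
    size_t write(const uint8_t*, size_t len) {
        ++calls;
        return std::min(len, maxChunk);   // number of bytes "accepted"
    }
};

// Keeps writing the remainder until everything is sent or a write makes no progress.
template <typename Sink>
size_t writeAll(Sink& sink, const uint8_t* data, size_t len) {
    size_t sent = 0;
    while (sent < len) {
        size_t n = sink.write(data + sent, len - sent);
        if (n == 0) break;   // no progress: treat as a failed or closed connection
        sent += n;
    }
    return sent;
}
```

The caller would then compare the return value of writeAll() with the header length, as the existing check does with a single write.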

beicnet commented 4 years ago

@earlephilhower, @d-a-v I have a weird guess here: is it possible that the AT firmware is not the same across WeMos D1 mini devices? I bought 150 of them, which I am currently testing, but I have one device running the same code that rarely gives me an error response code (maybe 1 in 10000 loops), versus these new ones (about 15 in 50 loops)?!

Is that possible?

I don't know anything anymore; I have been struggling with this for more than 10 days...

HTTP Respose Code: 200
HTTP Update Version: 3.7
HTTP Respose Code: 200
HTTP Update Version: 3.7
HTTP Respose Code: 200
HTTP Update Version: 3.7
HTTP Respose Code: -2
Error HTTP Update!
HTTP Respose Code: -1
Error HTTP Update!
HTTP Respose Code: -1
Error HTTP Update!
HTTP Respose Code: 200
HTTP Update Version: 3.7
HTTP Respose Code: -2
Error HTTP Update!
HTTP Respose Code: -2
Error HTTP Update!
HTTP Respose Code: -1
Error HTTP Update!
HTTP Respose Code: -1
Error HTTP Update!
HTTP Respose Code: -1
Error HTTP Update!
HTTP Respose Code: -1
Error HTTP Update!
HTTP Respose Code: -1
Error HTTP Update!
HTTP Respose Code: -1
Error HTTP Update!
HTTP Respose Code: 200
HTTP Update Version: 3.7
HTTP Respose Code: 200
HTTP Update Version: 3.7
HTTP Respose Code: 200
HTTP Update Version: 3.7

TD-er commented 4 years ago

Try to set upd_ready_period to 10000 + millis() when you successfully connect to wifi. Or whatever value, as long as it includes millis() plus at least one second.

This way you know for sure it cannot be impacted by some modules needing more time to connect.

beicnet commented 4 years ago

Regarding signal strength, my device's RSSI is -76 @earlephilhower

@TD-er Did it as per your suggestion, but I got the same result right from the beginning:

int upd_ready_period = 10000 + millis();

Results:

HTTP Response Code: 200
HTTP Update Version: 3.7
HTTP Response Code: 200
HTTP Update Version: 3.7
HTTP Response Code: -2
Error HTTP Update!
HTTP Response Code: -1
Error HTTP Update!
HTTP Response Code: 200
HTTP Update Version: 3.7

d-a-v commented 4 years ago

@beicnet there is no different hardware. Can you enable the debug options ? Did you try to proceed to a flash full-erase ?

@earlephilhower ClientContext::write(,,len) is supposed to wait for the whole len to be sent because it's a blocking function unless a fatal error occurs.

Maybe we should add the result in the debug messages ?

 size_t ret = _client->write((const uint8_t*)header.c_str(), header.length());
 DEBUGV(..., header.length(), ret);
 return ret == header.length();

beicnet commented 4 years ago

@d-a-v

Can you enable the debug options ?

Yes.

Did you try to proceed to a flash full-erase ?

Yes.

Let's wait a little for the output to cover a few loop cycles...

beicnet commented 4 years ago

:ref 1
:ctmo
:abort
:ur 1
:dsrcv 0
:del
HTTP Response Code: -1
Error HTTP Update!
:ref 1
:ctmo
:abort
:ur 1
:dsrcv 0
:del
HTTP Response Code: -1
Error HTTP Update!
:ref 1
:wr 193 0
:wrc 193 193 0
:ack 193
:rn 268
:rch 268, 3
:rcl pb=0x3fff045c sz=271
:c 1, 268, 271
HTTP Response Code: 200
:c0 1, 3
HTTP Update Version: 3.7
:close
:ur 1
:dsrcv 0
:del
:ref 1
:wr 193 0
:wrc 193 193 0
:ack 193
:rn 268
:rch 268, 3
:rcl pb=0x3fff053c sz=271
:c 1, 268, 271
HTTP Response Code: 200
:c0 1, 3
HTTP Update Version: 3.7
:close
:ur 1
:dsrcv 0
:del
:ref 1
:ctmo
:abort
:ur 1
:dsrcv 0
:del
HTTP Response Code: -1
Error HTTP Update!
:ref 1
:wr 193 0
:wrc 193 193 0
:ack 193
:rn 268
:rch 268, 3
:rcl pb=0x3fff04dc sz=271
:c 1, 268, 271
HTTP Response Code: 200
:c0 1, 3
HTTP Update Version: 3.7
:close
:ur 1
:dsrcv 0
:del
:ref 1
:ctmo
:abort
:ur 1
:dsrcv 0
:del
HTTP Response Code: -1
Error HTTP Update!
:ref 1
:ctmo
:abort
:ur 1
:dsrcv 0
:del
HTTP Response Code: -1
Error HTTP Update!
:ref 1
:ctmo
:abort
:ur 1
:dsrcv 0
:del
HTTP Response Code: -1
Error HTTP Update!
:ref 1
:wr 193 0
:wrc 193 193 0
:ack 193
:rn 268
:rch 268, 3
:rcl pb=0x3fff045c sz=271
:c 1, 268, 271
HTTP Response Code: 200
:c0 1, 3
HTTP Update Version: 3.7
:close
:ur 1
:dsrcv 0
:del
:ref 1
:wr 193 0
:wrc 193 193 0
:wustmo
:close
HTTP Response Code: -11
Error HTTP Update!
:ur 1
:dsrcv 0
:del
:ref 1
:ctmo
:abort
:ur 1
:dsrcv 0
:del
HTTP Response Code: -1
Error HTTP Update!
:ref 1
:ctmo
:abort
:ur 1
:dsrcv 0
:del
HTTP Response Code: -1
Error HTTP Update!
:ref 1
:wr 193 0
:wrc 193 193 0
:wustmo
:close
HTTP Response Code: -11
Error HTTP Update!
:ur 1
:dsrcv 0
:del
:ref 1
:wr 193 0
:wrc 193 193 0
:wustmo
:close
HTTP Response Code: -11
Error HTTP Update!
:ur 1
:dsrcv 0
:del
:ref 1
:ctmo
:abort
:ur 1
:dsrcv 0
:del
HTTP Response Code: -1
Error HTTP Update!
:ref 1
:wr 193 0
:wrc 193 193 0
:ack 193
:rn 268
:rch 268, 3
:rcl pb=0x3fff08b4 sz=271
:c 1, 268, 271
HTTP Response Code: 200
:c0 1, 3
HTTP Update Version: 3.7
:close
:ur 1
:dsrcv 0
:del
:ref 1

beicnet commented 4 years ago

@earlephilhower @d-a-v Any news?!

d-a-v commented 4 years ago

:ctmo means Connect Timeout.

It seems that your webserver is not responding to incoming connections sometimes. Reported error is -1 "connection refused".

Also, error -11 (HTTP read timeout) following :wustmo points to a slow TCP peer (your server).
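For reference, the codes discussed so far can be collected in one place. This is a sketch limited to what has been said in this thread; describeHttpError is a hypothetical helper, not a function of the library, and codes not mentioned here are deliberately left unexplained.

```cpp
#include <string>

// Meanings as described in this thread; only the codes discussed here are listed.
std::string describeHttpError(int code) {
    switch (code) {
        case -1:  return "connection refused (connect failed or timed out, :ctmo in the log)";
        case -2:  return "send header failed (write returned fewer bytes than requested)";
        case -11: return "read timeout (slow peer, seen after :wustmo in the log)";
        default:  return code > 0 ? "HTTP status code from the server"
                                  : "other client-side error";
    }
}
```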

void loop()
{
  if (millis() > upd_ready_time_now + upd_ready_period)
  {
    upd_ready_time_now = millis();
    checkForUpdates();
  }
}

What are the values ? Can you print millis() before and after checkForUpdates() ? Are you shaking your VB server too often ? If you were using the Arduino IDE, I'd also ask for the HTTPClient debug mode, and for tagging each debug line with the time.

Did you try to use HTTPClient::setTimeout() ?

beicnet commented 4 years ago

@d-a-v Could it be the 10s request time period? (there will be 150 devices with same request). Now I set the request period to be 25s, and now both prototype devices are working 24h without any errors shown.

What values? Is HTTPClient::setTimeout() non blocking?

But I have another theory here...

d-a-v commented 4 years ago

Is HTTPClient::setTimeout() non blocking?

HTTPClient blocks for at most the timeout.

But I have another theory here...

My theory:

  // if (millis() > upd_ready_time_now + upd_ready_period)    <= wrong test
  if (millis() - upd_ready_time_now > upd_ready_period)
  {
    upd_ready_time_now = millis(); // or upd_ready_time_now+=upd_ready_period
    checkForUpdates();
  }
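The difference between the two tests shows up when the 32-bit millisecond counter wraps, which happens roughly every 49.7 days. The sketch below reproduces both comparisons with explicit uint32_t values so it can be run off-device; the function names are made up for illustration.

```cpp
#include <cstdint>

// Original test: start + period can overflow 32 bits and wrap to a small number.
bool dueAdditive(uint32_t now, uint32_t start, uint32_t period) {
    return now > static_cast<uint32_t>(start + period);
}

// Suggested test: unsigned subtraction gives the elapsed time even across the wrap.
bool dueSubtractive(uint32_t now, uint32_t start, uint32_t period) {
    return static_cast<uint32_t>(now - start) > period;
}
```

With start = 4294963200 (4096 ms before the wrap) and a period of 10000 ms, the additive test already fires 100 ms later, because start + period wraps to 5904. The subtractive test stays false until 10001 ms have really elapsed. Away from the wrap the two tests behave the same.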
beicnet commented 4 years ago

@d-a-v Nah, I think the NFC module is jamming the ESP module...

d-a-v commented 4 years ago

My theory is still good, try to change your if. And your theory is interesting, keep us informed. Is your chip shielded like an esp-12 ?

beicnet commented 4 years ago

Yes, @d-a-v I changed the suggested "if", but it's the same.

HTTP Response Code: -1
Error HTTP Update!
HTTP Response Code: 200
HTTP Update Version: 3.8
HTTP Response Code: 200
HTTP Update Version: 3.8
HTTP Response Code: 200
HTTP Update Version: 3.8
HTTP Response Code: 200
HTTP Update Version: 3.8
HTTP Response Code: 200
HTTP Update Version: 3.8
HTTP Response Code: -1
Error HTTP Update!
HTTP Response Code: -2
Error HTTP Update!
HTTP Response Code: 200
HTTP Update Version: 3.8
HTTP Response Code: 200
HTTP Update Version: 3.8

If I comment out the rfid function from the loop then there are no issues whatsoever.

Yes, the chip is shielded!

[Screenshot]

beicnet commented 4 years ago

@TD-er Can you please verify for me something?

TD-er commented 4 years ago

@beicnet What do you want me to verify?

beicnet commented 4 years ago

@TD-er Do you have a PN532 module by your side?

TD-er commented 4 years ago

Not sure, have to dig in the drawer(s) of sensors.

beicnet commented 4 years ago

@TD-er I think there is an incompatibility between Adafruit's PN532 library and the HTTPClient library, but I'm not quite sure, because nobody else has confirmed it. That's why I asked whether you have the module and could test it in SPI mode.

I can give you the MCVE to test it if you would be willing to do it for me.

You need WeMos D1 mini, PN532 module and a 6-pin Dupont cable.

earlephilhower commented 4 years ago

I've gone over this and what I get as the end result is that when a specific NFC sensor placed next to the 8266 is enabled, WiFi has issues, which are reflected in HTTPClient (correctly) returning failure occasionally. I'm not seeing any core issue here, more of an RF or power supply one.

Given that, this is not the right place for a HW discussion like this, so I suggest moving to https://esp8266.com or https://gitter.im/esp8266/Arduino . Closing.

beicnet commented 4 years ago

@earlephilhower If you connect the NFC reader with a 10 cm Dupont cable, you get the same results. Did you test it? How did you come to that conclusion?

beicnet commented 4 years ago

@earlephilhower @d-a-v I also tried across 20cm Dupont cable and the results are the same.

So, I don't think it's a hardware or power supply issue.

I think it's library incompatibility (some sort of timing) issue.

You can read my full description on the forum.