knolleary / pubsubclient

A client library for the Arduino Ethernet Shield that provides support for MQTT.
http://pubsubclient.knolleary.net/
MIT License

PubSubClient::connect is blocking the main loop :( #583

Open gibman opened 5 years ago

gibman commented 5 years ago

hi..

Today I discovered that the call to PubSubClient::connect blocks when my MQTT broker is down.

Each failed call to connect takes 5 seconds. Obviously, when the MQTT broker is running, everything behaves nicely. But I want it to handle these unfortunate circumstances as well.

This means that if I have other time-critical work going on in the main loop, I'm out of luck.

Any ideas ?

knolleary commented 5 years ago

Connect is a blocking call. Sorry.

There is already an issue asking for nonblocking connect: https://github.com/knolleary/pubsubclient/issues/579

Not much else I can suggest.

Domochip commented 5 years ago

An intermediate solution would be to change the timeout of your Client to reduce this 5-second delay, e.g. myWiFiClient.setTimeout(1000);
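A shorter timeout bounds how long one attempt blocks; how often attempts are made can be bounded separately. The following is a minimal sketch in portable C++, not code from this thread or from the library: `ReconnectGate` and its members are hypothetical names, and the caller is assumed to pass in a millisecond counter such as Arduino's `millis()`.

```cpp
#include <cstdint>

// Hypothetical helper: decides when the next connection attempt is allowed,
// so the main loop never sits in a `while (!connected)` retry loop.
class ReconnectGate {
public:
  explicit ReconnectGate(uint32_t intervalMs) : intervalMs_(intervalMs) {}

  // Returns true if an attempt may start at time `nowMs`, and records it.
  // Unsigned subtraction keeps the comparison correct when the counter wraps.
  bool shouldAttempt(uint32_t nowMs) {
    if (attempted_ && (nowMs - lastAttemptMs_) < intervalMs_) {
      return false;  // too soon since the previous attempt
    }
    attempted_ = true;
    lastAttemptMs_ = nowMs;
    return true;
  }

private:
  uint32_t intervalMs_;
  uint32_t lastAttemptMs_ = 0;
  bool attempted_ = false;
};
```

In a sketch this could be used as `if (!client.connected() && gate.shouldAttempt(millis())) { client.connect(...); }`. Each attempt still blocks for up to the client timeout, but at most once per interval, and the rest of `loop()` runs in between.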

mlewand commented 5 years ago

@knolleary I believe the first issue requesting a non-blocking connect feature is #147. I think it makes sense to close this issue and #579 as duplicates of #147 and focus the discussion there.

spatnynick commented 5 years ago

It seems I have worked around the blocking by using hw_timer and/or Ticker; both work. The main loop is still blocked, so I simply moved my logic into the timer callbacks: https://circuits4you.com/2018/01/02/esp8266-timer-ticker-example/

mlewand commented 5 years ago

@spatnynick would you like to share the code?

spatnynick commented 5 years ago

OK, here is the code.

It's a MOSFET LED dimmer for the strips in my aquarium, controlled only via MQTT. Timing is fine-tuned for the Wemos D1 mini. With other boards I guess the timing might need to be adjusted.

I use hw_timer to dim the LEDs smoothly; this processing should have constant run time so that the LEDs do not flicker. In the ticker routine I do the network communication with MQTT.

All is green, up and running for a week now ;)

#include <ESP8266WiFi.h>
#include <Ticker.h>  //Ticker Library
#include <PubSubClient.h>

#define TOPIC_CMD_LED_INTENSITY "ESPdimmer01/LEDintensity"
#define TOPIC_STAT_LED_INTENSITY "ESPdimmer01/stat/LEDintensity"

#define LED 5  // wemos D1
//#define LED 2 // esp01

#define STEP_TIME 455          // fine-tune this value; lower = dimmer blinks less
#define FADE_STEP 5            // fine-tune this value; lower = slower dimming (intensity change)

Ticker tickMaintain;

// Replace the <...> placeholders with your own settings.
String ssid = "<WIFI_SSID>";
String password = "<WIFI_PASSWD>";
String mqttUname = "<MQTT_USERNAME>";
String mqttPasswd = "<MQTT_PASSWD>";
const char* mqtt_server = "<MQTT_IP>";

unsigned long mqttLastStatTime = 0;

WiFiClient espClient;
PubSubClient client(espClient);

unsigned long lastConnectivityCheckTime = 0;

volatile bool bLedStatus;
volatile int iLedIntensity;
volatile int iLedIntensity_target;
volatile int iLedLoop;

//=======================================================================
//                               ticker callback
//=======================================================================
void tckMaintain() {
  // this processing is kept out of the hw_timer ISR so that the ISR run time stays constant

  // from time to time send a status message to MQTT
  if (client.connected() && ( (mqttLastStatTime == 0) || ((millis() - mqttLastStatTime) > 90000)) )
  {
    char bPayload[5];
    String(iLedIntensity_target).toCharArray(bPayload, 5);
    if (client.publish(TOPIC_STAT_LED_INTENSITY, bPayload, true))
      mqttLastStatTime = millis();
  }
}

//=======================================================================
//                               hw_timer callback
//=======================================================================
void ICACHE_RAM_ATTR onTimerISR(){
  // to keep timing stable, this code should have roughly constant run time (no network communication etc.)

  if ((!bLedStatus || (iLedIntensity == 100)) && (iLedIntensity != iLedIntensity_target)) {
    if (iLedIntensity > iLedIntensity_target) iLedIntensity--; else iLedIntensity++;
    if (iLedIntensity == iLedIntensity_target)
      Serial.println("LED target reached: " + String(iLedIntensity));
  }

  // if the dimmer is not active (turned off or fully on), allow a longer timer delay
  if (iLedIntensity == 0) {
    digitalWrite(LED, LOW);
    bLedStatus = false;
    timer1_write( 1000 * STEP_TIME);
  } else if (iLedIntensity == 100) {
    digitalWrite(LED, HIGH);
    bLedStatus = true;
    timer1_write( 1000 * STEP_TIME);
  } else {
    // dimmer is active; we have to blink correctly
    bLedStatus = !bLedStatus;

    digitalWrite(LED, bLedStatus);

    if (bLedStatus)
      timer1_write( iLedIntensity * STEP_TIME);
    else
      timer1_write((100 - iLedIntensity) * STEP_TIME);
  }

}

//=======================================================================
//                               MQTT message processing
//=======================================================================
void mqttCallback(char* topic, byte* payload, unsigned int length) {
  Serial.print("MQTT Message arrived [" + String(topic) + "] ");

  String sPayload_digits;
  String sPayload;
  int iLedIntensity_new;
  for (unsigned int i = 0; i < length; i++) {
    sPayload += (char)payload[i];
    if (isDigit((char)payload[i]))  // keep only digits for the numeric value
      sPayload_digits += (char)payload[i];
  }

  Serial.print("payload(string): [" + sPayload + "]: ");

  if (String(topic) == TOPIC_CMD_LED_INTENSITY) {
    if (sPayload_digits != "") {
      iLedIntensity_new = sPayload_digits.toInt();
      if (iLedIntensity_new < 0) iLedIntensity_new = 0;
      if (iLedIntensity_new > 100) iLedIntensity_new = 100;
      if (iLedIntensity_target == iLedIntensity_new)
        Serial.println("no change in intensity");
      else {
        Serial.println("changing intensity: " + String(iLedIntensity) + " => " + String(iLedIntensity_new) );
        iLedIntensity_target = iLedIntensity_new;
        mqttLastStatTime = 0;  // send update to MQTT asap = on next ticker processing
        digitalWrite(LED_BUILTIN, LOW); // blink internal led; we received the message
        delay(10);
        digitalWrite(LED_BUILTIN, HIGH);
      }
    } else {
      Serial.println("invalid payload");
    }
  } else {
      Serial.println("unknown topic");
  }

}

//=======================================================================
//                               Wifi...
//=======================================================================
int connectToWiFi() 
{
  delay(10);

  WiFi.softAPdisconnect (true);
  WiFi.mode(WIFI_STA);      // IMPORTANT! otherwise I had very unstable wifi, high packet loss

  WiFi.begin(ssid.c_str(), password.c_str());
  Serial.println("");
  Serial.println("WiFi reconnect (" + ssid + ")");
  mqttLastStatTime = 0;

  int i=0;
  while (WiFi.status() != WL_CONNECTED) 
  {
    if (i == 30) 
    {
      return -1;
    }
    delay(1000);
    Serial.print(".");
    i++;
  } 
  Serial.println("");
  Serial.println("Connected to " + ssid);
  Serial.println("IP address: ");
  Serial.println(WiFi.localIP());

  return 0;
}

//=======================================================================
//                               establish connection to MQTT
//=======================================================================
void connectToMQTT() {
  // Loop until we're reconnected
  while (!client.connected()) {
    mqttLastStatTime = 0;
    Serial.print("Attempting MQTT connection...");
    // Create a random client ID
    String clientId = "ESPdimm01-";
    clientId += String(random(0xffff), HEX);
    // Attempt to connect
    if (client.connect(clientId.c_str(), mqttUname.c_str(), mqttPasswd.c_str())) {
      Serial.println("connected");
      client.subscribe(TOPIC_CMD_LED_INTENSITY);
    } else {
      Serial.print("failed, rc=");
      Serial.print(client.state());
      Serial.println(" try again in 2 seconds");
      delay(2000);
    }
  }
}

//=======================================================================
//                               Setup
//=======================================================================
void setup() {
  // put your setup code here, to run once:
  Serial.begin(115200);
  Serial.println();

  WiFi.setSleepMode(WIFI_NONE_SLEEP);

  bLedStatus = false;
  iLedIntensity = 0;

  mqttLastStatTime = 0;
  iLedLoop = 0;

  pinMode(LED, OUTPUT);
  digitalWrite(LED, LOW);
  pinMode(LED_BUILTIN, OUTPUT);
  digitalWrite(LED_BUILTIN, HIGH); // Turn the LED off by making the voltage HIGH

  bLedStatus = digitalRead(LED);

  // Initialize hardware timer1
  timer1_attachInterrupt(onTimerISR);
  timer1_enable(TIM_DIV16, TIM_EDGE, TIM_SINGLE);
  timer1_write(200); // 200 ticks = 40 us at TIM_DIV16 (5 ticks per us); the ISR re-arms the timer

  tickMaintain.attach(2, tckMaintain);

  randomSeed(micros());

  client.setServer(mqtt_server, 1883);
  client.setCallback(mqttCallback);
}

//=======================================================================
//                               Loop
//=======================================================================
void loop() {
  if(millis() - lastConnectivityCheckTime > 1000)
  {
    lastConnectivityCheckTime = millis();

    if(WiFi.status() != WL_CONNECTED) {
      connectToWiFi();
    }

  }

  if(WiFi.status() == WL_CONNECTED)  {
    if (!client.connected()) {
      connectToMQTT();
    }
    client.loop();
  } else delay(500);

}
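
The payload handling in mqttCallback above (collect the digits, convert, clamp to 0..100) can also be written without String objects. This is a minimal sketch in portable C++; `parseIntensity` is a hypothetical helper name, not part of the posted sketch or of PubSubClient.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: extracts the decimal digits from an MQTT payload and
// clamps the resulting value to 0..100. Non-digit bytes are skipped, as in the
// sketch above. Returns -1 when the payload contains no digits at all.
int parseIntensity(const uint8_t* payload, size_t length) {
  long value = 0;
  bool sawDigit = false;
  for (size_t i = 0; i < length; ++i) {
    const uint8_t c = payload[i];
    if (c >= '0' && c <= '9') {
      sawDigit = true;
      if (value <= 100) {  // stop accumulating once past the clamp; avoids overflow
        value = value * 10 + (c - '0');
      }
    }
  }
  if (!sawDigit) return -1;
  return value > 100 ? 100 : static_cast<int>(value);
}
```

Inside the callback this would be called as `parseIntensity(payload, length)`, treating a negative result as an invalid payload.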
mlewand commented 5 years ago

Thanks for sharing @spatnynick! Blocking behavior is a really big issue for my projects, as an unresponsive device seems to be a broken device. I'll give it a try soon 🙂

diskgokey commented 5 years ago

Look at the ability of ESP32 with 2 cores. Standard program runs on core 1. But you are able to run specific tasks on core 0. https://randomnerdtutorials.com/esp32-dual-core-arduino-ide/ Did it and it works great. Now reduced my watchdog timer of the main program to just a few ms...

bobcroft commented 4 years ago

would this fix work on an ESP32?

bobcroft commented 4 years ago

diskgokey, could you share some example code around the MQTT functions please for the ESP32

zomco commented 4 years ago

I found an API for setting the connection timeout, declared in arduino-esp32/libraries/WiFi/src/WiFiClient.h:

virtual int connect(const char *host, uint16_t port, int32_t timeout) = 0;

I replaced Client::connect with ESPLwIPClient::connect in pubsubclient/src/PubSubClient.cpp to control the connection timeout, e.g.

result = _client->connect(this->domain, this->port);

became

result = _client->connect(this->domain, this->port, (int32_t) MQTT_SOCKET_TIMEOUT*1000UL);

It works.

DietmarHoch commented 4 years ago

Hi, I have the same issue in my ESP32 project.

@zomco: could you please check whether your changes also work with the newest version? The thing is, I cannot find "Client::connect" in PubSubClient.cpp, only "PubSubClient::connect", and it does not work if I change line 127 to "result = _client->connect(this->ip, this->port, (int32_t) MQTT_SOCKET_TIMEOUT*1000UL);"

Stay healthy

danbicks commented 4 years ago

Look at the ability of ESP32 with 2 cores. Standard program runs on core 1. But you are able to run specific tasks on core 0. https://randomnerdtutorials.com/esp32-dual-core-arduino-ide/ Did it and it works great. Now reduced my watchdog timer of the main program to just a few ms...

Here is the solution. Pin task to core 0

// Global scope
TaskHandle_t Task1;  // handle for the task pinned to core 0

// In setup(), add:
xTaskCreatePinnedToCore(MQTT_TASK, "Task1", 10000, NULL, 1, &Task1, 0);

// MQTT task running on core 0
void MQTT_TASK(void *pvParameters) {
  for (;;) {
    if (FL_WiFi_Network_Connected && !client.connected()) {
      MQTTconnect();
    }
    vTaskDelay(1000);  // 1000 ticks between retries (about 1 s at a 1 kHz tick rate)
  }
}

Works like a dream :)
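
The idea behind pinning the connect call to another core can be tried out on a desktop machine as well. The sketch below is a portable analog that uses std::thread in place of a FreeRTOS task; it is not ESP32 code, and `blockingConnect` and `runLoopWhileConnecting` are hypothetical names. The 50 ms sleep stands in for a connect call that blocks.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Stand-in for a blocking connect call; the delay is illustrative only.
bool blockingConnect() {
  std::this_thread::sleep_for(std::chrono::milliseconds(50));
  return true;
}

// Runs the blocking connect on a worker thread while the caller keeps looping.
// Returns how many main-loop iterations ran before the connection completed.
int runLoopWhileConnecting() {
  std::atomic<bool> connected{false};
  std::thread worker([&connected] { connected.store(blockingConnect()); });

  int iterations = 0;
  do {
    ++iterations;  // time-critical main-loop work would go here
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  } while (!connected.load());

  worker.join();
  return iterations;
}
```

The main loop keeps iterating while the worker is stuck in the connect call, which is the effect described above. One assumption worth checking in a real sketch is that the MQTT client object is only ever used from one task at a time.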

bobcroft commented 4 years ago

Hi Dan,

Thanks for posting this, I haven’t tried it yet but having faith in your success I am sure it will solve the MQTT reconnect issue. That has been a PITA to me causing various issues with managing program execution. Out of interest have you done anything else on core 0?

Kind regards

Bob

danbicks commented 4 years ago

Hi Bob,

Thanks for your comments. I only stumbled on this issue the other day when my daughter unplugged my Node-RED server running on a Pi. All was working really well until then; I realized the pushbutton used to display different OLED information was not working and the MQTT icon showed it was down. After some quick debugging I found that client.connect was the blocking call. Reading this forum gave me some pointers towards a solution.

I have designed an ESP-NOW network of devices and have been working on it for some time now: Alarm, Controller and Environment nodes and a Gateway, with structured payload data. A key point of this design was to store the nodes within the "MESH" in SPIFFS and restore them on restart without rescanning all the time. I also have an append mode, which keeps previously stored nodes and appends any new ones found. Reading through these posts, it became clear that a quick fix for the gateway would be to invoke the MQTT connect task on core 0. This core is normally used for all WiFi-related operations on the ESP32, while Arduino code tends to run on core 1. Currently only the blocking function has been moved to core 0, but I am looking at moving other functionality over to it to see how the device performs.

Hope this solves your problem. Works really well now.

Cheers

Dans

bobcroft commented 4 years ago

Hi again Dan,

Thank you for your very interesting reply. I have started some work with ESP-NOW as a proof of concept and my initial results are very good. I followed some material I found online and built a module with a BME sensor as a sender and a second unit as a receiver. The sender was battery powered and 'slept' most of the time; it woke up every 15 minutes, sent its data and went back to sleep. The receiver is mains powered; it receives the BME data and then sends it by MQTT to Node-RED. However, in order to do that it has to switch from ESP-NOW mode to 'normal' wireless mode to join my LAN. When I last looked, a few weeks ago, I could not find any way to have both modes available simultaneously. Does your gateway operate with ESP-NOW and normal WiFi simultaneously? If it does, I would be very interested to see that code, if you are able to share it.

Kind regards

Bob

danbicks commented 4 years ago

Hi Bob,

Nice project you have, it sounds really cool. Yes, my gateway has both ESP-NOW communication and WiFi communication with Node-RED services running at the same time. This was a really important design goal for me; I wanted no downtime between the two. Unfortunately I can't share the code, but I am more than happy to give you some tips on getting both working at the same time. It took me a little debugging to nail it all down. The key factor for getting both to work is that ESP-NOW must use the same WiFi channel as the router. I created a routine that scans for my router and returns its channel. ESP-NOW is then started on that channel, and all other nodes use it as well. There is a sequence to follow, though, to make this all work. Steps:

That is pretty much it, although I did need to include another WiFi disconnect and set WiFi to (WIFI_AP_STA) when starting access point mode. It works really well since I moved client.connect to core 0. I also moved client.loop() there, because it only makes sense to poll for messages if you are connected in the first place. This also seems to have cured a random PBUF restart every 5 hours.

Hope this helps, let me know how you get on.

Cheers

Dans

eku commented 4 years ago

@Domochip wrote:

An intermediate solution would be to change the timeout of your Client to reduce this time of 5sec.

This proposal is well-intentioned, but leads to other problems in practice. If you look at the definition of the TCP protocol, the timeout for a connect is 2 minutes and for good reason.

danbicks commented 4 years ago

Hi Erik, thanks for the suggestion. However, even 5 seconds will still block the main loop, which is not an option, so the solution you have provided is really not viable. Pinning this blocking task to core 0 works perfectly, with no void loop downtime. The ideal solution is to have a non-blocking MQTT connect, but as engineers we must find solutions. Regards, Dans

bobcroft commented 4 years ago

Hi Dans,

Thank you for your response and offer of help. I am not a developer or anything like that, just a hobbyist with a lifelong interest in electronics and computing. I can't actually claim much credit for the project I described, as I essentially took code from several online resources and fitted it together to achieve my goal. My next objective would be to get a gateway working the way yours does.

I can access my router and see what channel it is using easily, actually for 2.4 GHz it is channel 6.

When you say set WiFi to WiFi_AP_STA I assume you mean that mode in the ESP32 code.

Then start ESP-now,

then start access point mode with a prefixed MAC address and a strong, long password.

Then attempt the WiFi connection; if that works, try MQTT.

All the above in the ESP32 code.

I’ll try that as soon as I get chance.

As an aside, another way I created a multiple-sensor system was to use a Raspberry Pi as a wireless AP with its own DHCP and DNS. This AP has a different IP range from the main LAN. My ESP8266 and ESP32 devices get an IP address from the AP via WiFi manager; they can then communicate with my broker on the main LAN via the Ethernet port on the Raspberry Pi AP. I can't reach the devices on the AP LAN from the main LAN, but that doesn't matter, apart from not being able to use OTA. I wanted to create an outside LAN for external sensors, and I have a high-gain antenna to try with the AP to increase range. Getting the data into Node-RED and Home Assistant is now easy.

Kind regards

Bob

From: danbicks notifications@github.com Sent: 18 May 2020 16:28 To: knolleary/pubsubclient pubsubclient@noreply.github.com Cc: bobcroft rdg3@lineone.net; Comment comment@noreply.github.com Subject: Re: [knolleary/pubsubclient] PubSubClient::connect is blocking the main loop :( (#583)

Hi Bob,

Nice project you have sounds really cool. Yes my gateway has both ESPNOW communication and WIFI communicating with node red services at the same time. This was a real important design goal for me, I wanted no downtime between the 2. Unfortunately I can't share the code however I am more than happy to give you some tips on getting both working at the same time. Took me a little debugging to nail it all out. Key factor for both of these to works is to have the same WIFI channel as the router uses. I created a routine to scan for my router and return the channel. This then invokes ESPNOW to use the same and all other nodes respectively. There is a sequence to follow though to enable this all to work. Steps:

That is pretty much it although I did need to include another Wifi Disconnect and Set WIFI to (WIFI_AP_STA) in start Access point mode. Works really well since I have moved the client.connect to core 0. I also moved the client.loop() to this as well because it only makes sense to poll for messages if you are connected in the first place. This also seems to have cured a random PBUF restart every 5 hours.

Hope this helps, let me know how you get on.

Cheers

Dans

On Mon, May 18, 2020 at 2:18 PM bobcroft notifications@github.com wrote:

Hi again Dan,

Thank you for your very interesting reply. I have started some work with ESP-NOW as a proof of concept and my initial results are very good. I followed some material I found online and built a module with a BME sensor as a sender and a second unit as a receiver. The sender was battery powered and 'slept' most of the time; it woke up every 15 minutes, sent its data, then went back to sleep. The receiver is mains powered; it receives the BME data and then sends it by MQTT to Node-RED. However, in order to do that it has to switch from ESP-NOW mode to 'normal' wireless mode to join my LAN. When I last looked a few weeks ago, I could not find any way to have both modes available simultaneously. Does your gateway operate with ESP-NOW and normal WiFi simultaneously? If it does, I would be very interested to see that code, if you are able to share it.

Kind regards

Bob

From: danbicks notifications@github.com Sent: 16 May 2020 17:02 To: knolleary/pubsubclient pubsubclient@noreply.github.com Cc: bobcroft rdg3@lineone.net; Comment comment@noreply.github.com Subject: Re: [knolleary/pubsubclient] PubSubClient::connect is blocking the main loop :( (#583)

Hi Bob,

Thanks for your comments. I only stumbled on this issue the other day when my daughter unplugged my Node-RED server running on a Pi. All was working really well until then; I realized the pushbutton that switches the OLED display between pages was not working and the MQTT icon showed the connection was down. After some quick debugging I found that client.connect was the blocking call. Reading this forum gave me some pointers toward a solution.

I have designed an ESP-NOW network of devices and have been working on it for some time now: alarm, controller and environment nodes, plus a gateway, all using structured payload data. A key point of this design was to store the nodes within the "MESH" in SPIFFS and restore them on restart without rescanning every time. I also have an append mode which keeps previously stored nodes and appends any new ones found. Reading through these posts, it became clear that a quick fix for the gateway would be to run the MQTT connect task on core 0. On the ESP32 this core is normally used for all WiFi-related operations, while Arduino code tends to run on core 1. Currently only the blocking function has been moved to core 0; however, I am looking at moving other functionality over as well to see how the device performs.

Hope this solves your problem. Works really well now.

Cheers

Dans

On Sat, May 16, 2020 at 12:19 PM bobcroft notifications@github.com wrote:

Hi Dan,

Thanks for posting this, I haven’t tried it yet but having faith in your success I am sure it will solve the MQTT reconnect issue. That has been a PITA to me causing various issues with managing program execution. Out of interest have you done anything else on core 0?

Kind regards

Bob

From: danbicks notifications@github.com Sent: 16 May 2020 10:47 To: knolleary/pubsubclient pubsubclient@noreply.github.com Cc: bobcroft rdg3@lineone.net; Comment comment@noreply.github.com Subject: Re: [knolleary/pubsubclient] PubSubClient::connect is blocking the main loop :( (#583)

Look at the ability of the ESP32 with its 2 cores. The standard program runs on core 1, but you are able to run specific tasks on core 0. https://randomnerdtutorials.com/esp32-dual-core-arduino-ide/ Did it and it works great. Now reduced my watchdog timer of the main program to just a few ms...

Here is the solution. Pin task to core 0

```cpp
// Global scope
TaskHandle_t Task1;  // handle for the core 0 task

// In setup(), add:
xTaskCreatePinnedToCore(MQTT_TASK, "Task1", 10000, NULL, 1, &Task1, 0);

// MQTT task running on core 0
void MQTT_TASK(void * pvParameters) {
  for (;;) {
    if (FL_WiFi_Network_Connected && !client.connected()) {
      MQTTconnect();
    }
    vTaskDelay(1000);  // wait 1000 ticks before retrying (about 1 second at a 1 kHz tick rate, not 10 seconds)
  }
}
```

Works like a dream :)
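A later reply in this thread also moves `client.loop()` into the same task, so that polling only happens while connected. One iteration of such a task could look roughly like the sketch below. It is written as portable C++ so it can be tried off-device; `mqttTaskStep`, `Step` and `FakeClient` are hypothetical names, and the zero-argument `connect()` is an assumed wrapper (the real `PubSubClient::connect` takes at least a client ID).

```cpp
#include <cassert>

// What a single pass of the worker task did.
enum class Step { Idle, ConnectAttempted, Serviced };

// Runs one pass of the worker task. Client is any type that offers
// connected(), connect() and loop(); connect() may block, which is
// acceptable here because this code is meant to run off the main loop.
template <typename Client>
Step mqttTaskStep(Client& client, bool wifiUp) {
    if (!wifiUp) {
        return Step::Idle;              // no network: nothing to do yet
    }
    if (!client.connected()) {
        client.connect();               // blocking attempt, result checked next pass
        return Step::ConnectAttempted;
    }
    client.loop();                      // only poll for messages while connected
    return Step::Serviced;
}

// Stand-in client for trying the logic on a desktop (not a real MQTT client).
struct FakeClient {
    bool up = false;
    int connects = 0;
    int loops = 0;
    bool connected() const { return up; }
    bool connect() { ++connects; up = true; return up; }
    bool loop() { assert(up); ++loops; return true; }
};
```

Inside the FreeRTOS task, the body of the `for (;;)` loop would call `mqttTaskStep` and then `vTaskDelay`, presumably with a shorter delay once connected so that incoming messages are handled promptly.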


bleckers commented 4 years ago

To be honest, there is always going to be some form of blocking because of the networking tasks (for example, if the network goes down it will wait for a timeout, or with the Ethernet library just lock up completely), unless you use some sort of scheduler or threading.

I've been using this very successfully on the Teensy - https://github.com/ftrias/TeensyThreads/blob/master/TeensyThreads.cpp

All my networking/mqtt tasks run in a background thread and there are zero issues with these blocking the main tasks (updating a very resource intensive LED matrix display and reading sensor data).

Just be sure to mutex any and all hardware resources across threads and deal with that appropriately.
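As a rough illustration of that advice, here is a minimal sketch of a mutex-guarded hand-off between a main thread and a networking thread. It uses standard C++ `std::mutex` as a stand-in; on a Teensy the lock type would come from the threading library instead. `Outbox` and its capacity limit are hypothetical and only serve the example.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical shared resource: messages queued by the main thread and sent
// later by the networking thread. Every access takes the same mutex.
class Outbox {
public:
    explicit Outbox(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Called from the main thread. Returns false (message dropped) when full,
    // so the caller never waits on the network.
    bool push(const std::string& msg) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (pending_.size() >= capacity_) {
            return false;
        }
        pending_.push_back(msg);
        return true;
    }

    // Called from the networking thread. Takes everything queued so far while
    // holding the lock, so publishing can happen after the lock is released.
    std::vector<std::string> drain() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::vector<std::string> out;
        out.swap(pending_);
        return out;
    }

private:
    const std::size_t capacity_;
    std::mutex mutex_;
    std::vector<std::string> pending_;
};
```

The lock is held only while the queue itself is touched, never during a publish or connect, so a stalled network cannot hold up the thread that produces the data.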

kamshory commented 2 years ago

I put PubSubClient::connect on a separate task (multitasking) and it works for me. My repo is https://github.com/kamshory/OTP-Mini

Frtrillo commented 11 months ago

> An intermediate solution would be to change the Timout of your Client to reduce this time of 5sec. ex : myWiFiClient.setTimeout(1000);

Btw, setTimeout is in seconds, so this would be 1000 seconds. Instead, do something like myWiFiClient.setTimeout(4); which is 4 seconds.
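A shorter timeout only bounds how long each attempt blocks. For setups without a second core or threads, the attempts can also be spaced out so the main loop is stalled at most once per interval. The following is a sketch under stated assumptions, not part of PubSubClient: `ReconnectThrottle` is a hypothetical helper, `nowMs` would come from `millis()` on an Arduino, and `tryConnect` would wrap the actual `client.connect(...)` call.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Limits how often a blocking connect attempt is made (hypothetical helper).
class ReconnectThrottle {
public:
    ReconnectThrottle(uint32_t intervalMs, std::function<bool()> tryConnect)
        : intervalMs_(intervalMs), tryConnect_(std::move(tryConnect)) {
        assert(static_cast<bool>(tryConnect_));
    }

    // Call once per loop iteration while disconnected. Returns true only when
    // an attempt was made on this call and it succeeded.
    bool poll(uint32_t nowMs) {
        // Unsigned subtraction stays correct when the millisecond counter wraps.
        if (attempted_ && nowMs - lastAttemptMs_ < intervalMs_) {
            return false;  // too soon: skip the blocking call entirely
        }
        attempted_ = true;
        lastAttemptMs_ = nowMs;
        return tryConnect_();
    }

private:
    uint32_t intervalMs_;
    std::function<bool()> tryConnect_;
    uint32_t lastAttemptMs_ = 0;
    bool attempted_ = false;
};
```

With a 5000 ms interval and a timeout of a few seconds, the rest of the loop still gets regular time to run while the broker is down, though each attempt itself remains blocking.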