knolleary / pubsubclient

A client library for the Arduino Ethernet Shield that provides support for MQTT.
http://pubsubclient.knolleary.net/
MIT License

Reconnect Problem. #1047

Open tmsd2001 opened 3 months ago

tmsd2001 commented 3 months ago

I've been seeing this problem for a long time and I'm surprised it hasn't been mentioned yet, but maybe I'm misunderstanding something. When the MQTT connection is lost, the usual reconnect routine loops until the connection is back. But if the connection dropped because the WiFi link itself is down, you can never get out of that loop. My workaround: inside the reconnect routine I check whether WiFi is connected and, if it is not, I reconnect WiFi first.
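A minimal sketch of that check, written as plain C++ so the decision can be tested off-device. All names here (`LinkState`, `Action`, `nextAction`, `kRetryIntervalMs`) are hypothetical and not part of PubSubClient. In a sketch, `wifiUp` would presumably come from `WiFi.status() == WL_CONNECTED`, `mqttUp` from `client.connected()`, and the timestamps from `millis()`. The idea is that each pass of `loop()` makes one decision and returns, instead of spinning in a `while (!client.connected())` loop that cannot exit while WiFi is down.

```cpp
#include <cstdint>

// Hypothetical names throughout: this models only the decision,
// not the real WiFi or PubSubClient calls.
enum class Action { ReconnectWiFi, ReconnectMqtt, RunMqttLoop, Wait };

struct LinkState {
    bool wifiUp;         // is the WiFi link up?
    bool mqttUp;         // is the MQTT session up?
    uint32_t nowMs;      // current time in milliseconds
    uint32_t lastTryMs;  // time of the previous reconnect attempt
};

constexpr uint32_t kRetryIntervalMs = 5000;  // assumed back-off, tune as needed

// Decide what a single pass of loop() should do, without ever blocking.
Action nextAction(const LinkState& s) {
    if (s.wifiUp && s.mqttUp) return Action::RunMqttLoop;
    // Unsigned subtraction stays correct across a 32-bit timer rollover.
    if (s.nowMs - s.lastTryMs < kRetryIntervalMs) return Action::Wait;
    // WiFi is checked first: MQTT cannot come back while WiFi is down.
    return s.wifiUp ? Action::ReconnectMqtt : Action::ReconnectWiFi;
}
```

The caller would act on the returned value and store the attempt time whenever it starts a reconnect, so the rest of the sketch keeps running between attempts.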

jescarri commented 3 months ago

I've seen this happen in the sketch's loop() function too. I assumed it was a check the sketch itself has to do:

  1. Check whether WiFi is active, then carry on with network operations:
    if wifi_connected then
        connect_to_mqtt with X retries
        if mqtt_connected then
            do all sketch operations that require network and MQTT
        end if
    else
        deep_sleep for some time
    end if

PhySix66 commented 3 months ago

I have a similar issue. When I change my MQTT settings via HTTP and re-enable/restart the service, my ESP8266 reboots.

I've traced the source of the crash to this:

boolean PubSubClient::connected() 
{
    ...
    else {  
        rc = (int)_client->connected(); // ERROR here when reconnecting -> (?) Client::connected()
        // this is the line where it crashes

        // Fix test 1
        // rc = (int)_client[0].connected(); // Tried this as well, no good
        // Fix test 2: also fails
        /*
        if((int)_client->connected() > 0){
            rc = true;
            //Serial.println(F("_client[0].connected() == true"));
        }else{
            rc = false;
            //Serial.println(F("_client[0].connected() == false"));
        }*/
    ...
    }
    return rc;
}

Specifically: rc = (int)_client->connected();

If my deductions are correct, then this points to this library:

In Client.h

class Client: public Stream {
    public:
    ...
    virtual uint8_t connected() = 0;    //  No idea where this points to
    // this is where my rabbit hole ends
    ...

};

But after this I was unable to go further.

I thought I might have used up the available TCP/UDP connections, so I disabled DDNS and my RUDP connection, but nothing changed.

abdosn commented 2 months ago

@PhySix66
In my sketch I use this:

WiFiClient G_espClient; 
PubSubClient G_PubSubClient(G_espClient);

When you create the PubSubClient object, you pass it the WiFiClient object.

So

virtual uint8_t connected() = 0; // No idea where this points to

is not the function that actually gets called. The WiFiClient class has a function, uint8_t WiFiClient::connected(), that overrides the one in the Client class.

It is:

uint8_t WiFiClient::connected()
{
    if (!_client || _client->state() == CLOSED)
        return 0;
    return _client->state() == ESTABLISHED || available();
}
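To illustrate the point about overriding, here is a small self-contained model of the same arrangement. `Client` below is a cut-down stand-in for the Arduino class, and `FakeWiFiClient` and `MqttWrapper` are hypothetical names playing the roles of WiFiClient and PubSubClient; none of this is the libraries' actual code. It shows that a call through a base-class pointer to a pure virtual function runs the derived class's override.

```cpp
#include <cstdint>

// Simplified stand-in for the Arduino Client class; only the shape matters.
class Client {
public:
    virtual ~Client() = default;
    virtual uint8_t connected() = 0;  // pure virtual: no body in the base class
};

// Hypothetical class playing the role of WiFiClient.
class FakeWiFiClient : public Client {
public:
    explicit FakeWiFiClient(bool up) : up_(up) {}
    uint8_t connected() override { return up_ ? 1 : 0; }
private:
    bool up_;
};

// Hypothetical class playing the role of PubSubClient: it stores a Client*.
class MqttWrapper {
public:
    explicit MqttWrapper(Client& c) : client_(&c) {}
    // The call below is dispatched at run time to FakeWiFiClient::connected().
    bool connected() { return client_ != nullptr && client_->connected() > 0; }
private:
    Client* client_;
};
```

Because the wrapper keeps only a pointer, it works only as long as the object it points to is still alive, which matters for the crash discussed in this thread.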
PhySix66 commented 2 months ago

Thanks for the information. This was my 2nd or 3rd guess.

Still, it didn't bring me closer to the solution.

The problem is the same. When I try to restart/reinit MQTT, which I use via the Home Assistant library, the ESP reboots at:

boolean PubSubClient::connected()
{
    ....
    if((int)_client->connected() > 0){
    }
    ...
}

I've also added some debug Serial prints to WiFiClient.cpp:

uint8_t WiFiClient::connected()
{
#ifdef WFC_DBG
    if(wfc_dbg & WFC_DBG){ Serial.println(F("WiFiClient::connected()")); }
#endif
    if (!_client || _client->state() == CLOSED) {
#ifdef WFC_DBG
        if(wfc_dbg & WFC_DBG){ Serial.println(F("WiFiClient::connected() == false")); }
#endif
        return 0;
    }
#ifdef WFC_DBG
    if(wfc_dbg & WFC_DBG){
        Serial.println(F("WiFiClient::connected() == true"));
        Serial.print(F("_client->state(): "));
        Serial.println((int)_client->state());
    }
#endif
    return _client->state() == ESTABLISHED || available();
}

On the first run, these lines are printed:

WiFiClient::connected()
WiFiClient::connected() == false

I'm pretty sure it's the enemy of most (or mostly beginner) programmers: THE POINTER


abdosn commented 2 months ago

I think that when you restart MQTT, something happens to _client inside the PubSubClient object. Maybe it gets destroyed or becomes null. Try checking that variable.

If you can, share the code where you restart MQTT.

I suggest using the stack dump tool (https://arduino-esp8266.readthedocs.io/en/latest/Troubleshooting/stack_dump.html) to find out which line your code stopped at before the reset.

PhySix66 commented 2 months ago

"Destruction it is."

In the boolean PubSubClient::connected() function I already had this:

boolean PubSubClient::connected()
{
#ifdef PSC_DBG
    if(psc_dbg & PSC_DBG){ Serial.println(F("PubSubClient::connected()")); }
#endif
    boolean rc;
    if (_client == NULL ) {
#ifdef PSC_DBG
        if(psc_dbg & PSC_DBG){ Serial.println(F("_client == NULL")); } // << never got this line
#endif
        rc = false;
    } else {
#ifdef PSC_DBG
        if(psc_dbg & PSC_DBG){ Serial.println(F("} else {")); }
#endif
        if((int)_client->connected() > 0){ // << ERROR here when reconnecting
            rc = true;
#ifdef PSC_DBG
            if(psc_dbg & PSC_DBG){ Serial.println(F("_client->connected() == true")); }
#endif
        }else{
            rc = false;
#ifdef PSC_DBG
            if(psc_dbg & PSC_DBG){ Serial.println(F("_client->connected() == false")); }
#endif
        }
        ...
}

But I never got "_client == NULL" on the serial output.

Some guidance:
_client is the private variable in PubSubClient.
&client is the variable passed as the argument to HAMqtt::verifyClient(Client& client).
(I can only hope that I haven't mixed them up at some point...)

To verify this, I made this debug function (it seems to work, but I'm not sure):

void PubSubClient::verifyClient(Client& client)
{
    Serial.println(F("PubSubClient::verifyClient(Client& client)"));
    if(_client == NULL) {
        Serial.println(F("_client == NULL"));
    } else if(&client == NULL) {
        Serial.println(F("&client == NULL"));
    } else if(_client == &client) {
        Serial.println(F("_client == &client"));
        Serial.print(F("&client: ")); Serial.println(client);
        Serial.print(F("_client: ")); Serial.println(_client);
    } else {
        Serial.println(F("_client != &client"));
        Serial.print(F("&client: ")); Serial.println(client);
        Serial.print(F("_client: ")); Serial.println(_client);
    }
}

In HAMqtt.cpp, the call is made by this function:

void HAMqtt::verifyClient(Client& client)
{
    _mqtt->verifyClient(client);
}

On the first run, the serial output is something like this:

PubSubClient::verifyClient(Client& client)
_client == &client
&client: WiFiClient::connected()
WiFiClient::connected() == false
0
_client: WiFiClient::connected()
WiFiClient::connected() == false
0

This is OK.

On disconnecting, the serial output is:

PubSubClient::verifyClient(Client& client)
_client == &client
&client: 1
_client: 1
MQTT current State: 0
AHA: disconnecting
AHA: MQTT state changed to -1, previous state: 0
AHA: MQTT disconnected
setState(ConnectionState state)
MQTT current State: -1
PubSubClient::verifyClient(Client& client)
_client == &client
&client: 0
_client: 0
MQTT current State: -1
MQTT DisConnected

This is odd: it only prints numbers for the clients.

And on reconnect:

PubSubClient::verifyClient(Client& client)
_client != &client
&client: WiFiClient::connected()
WiFiClient::connected() == false
0
_client:
--------------- CUT HERE FOR EXCEPTION DECODER ---------------

Exception (9):
epc1=0x4021845a epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000001 depc=0x00000000

>>>stack>>>

ctx: cont
sp: 3ffffdf0 end: 3fffffd0 offset: 0150
3fffff40:  3fff1b1c 0000000e 3fff2d0c 3ffe8661
3fffff50:  3fff1b1c 3fff2d0c 3ffe865a 4020a4d0
3fffff60:  3fff38a4 4020c540 402210a8 418db751
3fffff70:  3fffdad0 3fff4100 3fff5cc4 3fff2f30
3fffff80:  3fffdad0 3fff2968 3fff2f58 4020a564
3fffff90:  3fffdad0 00000000 3fff2f58 4020ace9
3fffffa0:  3fffdad0 00000000 3fff2f04 615783a7
3fffffb0:  00000000 00000000 3fff2f04 4021dcc4
3fffffc0:  feefeffe feefeffe 3fffdab0 4010140d
<<<stack<<<

--------------- CUT HERE FOR EXCEPTION DECODER ---------------

So &client (the client passed as the argument) is OK, but _client is gone... The output of the exception decoder has changed now: it references the startMQTT() function at the line where HAMqtt::verifyClient(Client& client) is called. No wonder it crashes here, since I'm calling through a _client that has been destroyed.

I'm trying to find where this variable gets destroyed... Or maybe I'll add a new function, something along the lines of:

void PubSubClient::reloadClient(Client& client)
{
    if(_client != &client) {
        PubSubClient(client);
    }
}


abdosn commented 2 months ago

It doesn't have to be null to cause an exception. It could be memory that was malloc'ed and then freed, or something similar.

Also, according to the Kolban book (a very good reference if you're working with the ESP), Exception 9 is LoadStoreAlignmentCause, so I guess it's a memory problem.
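As a quick arithmetic check of that reading (an illustration, not a diagnosis): the exception dump earlier in the thread reports excvaddr=0x00000001, and an address of 1 is not a multiple of 4, so a 32-bit load or store through it would be misaligned. The helper name `isAligned` is hypothetical.

```cpp
#include <cstdint>

// True when `address` is a whole multiple of `accessSize` bytes.
// accessSize must be non-zero (4 for a 32-bit load or store).
constexpr bool isAligned(uint32_t address, uint32_t accessSize) {
    return address % accessSize == 0;
}
```

With the values from the dump, `isAligned(0x00000001, 4)` is false, while a typical stack word such as 0x3fff1b1c passes the same check. That is consistent with the code reading through a pointer that no longer holds a valid object address, although the dump alone does not prove where the bad value came from.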

As you said, the problem is _client. I think that when you restart the service, the pointer to the WiFiClient object changes or the object gets freed.

So give this a try:

Create a new WiFiClient object and pass it to PubSubClient::setClient() in your reconnect function.
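The suggestion can be sketched with a host-side model. PubSubClient does provide `setClient(Client&)`; everything else below is hypothetical and simplified: `Client` is a cut-down stand-in, `FakeTransport` plays the role of WiFiClient, and `MqttWrapper` with its `clearClient()` helper models only the pointer handling, not the real library. The point is the order of operations: the wrapper never holds the address of an object that has already been destroyed.

```cpp
#include <cstdint>
#include <memory>

struct Client {                       // simplified stand-in for Arduino's Client
    virtual ~Client() = default;
    virtual uint8_t connected() = 0;
};

struct FakeTransport : Client {       // hypothetical transport, always "up"
    uint8_t connected() override { return 1; }
};

class MqttWrapper {                   // models only the pointer handling
public:
    MqttWrapper& setClient(Client& c) { client_ = &c; return *this; }
    void clearClient() { client_ = nullptr; }   // hypothetical helper
    bool connected() const { return client_ != nullptr && client_->connected() > 0; }
private:
    Client* client_ = nullptr;        // non-owning: the caller keeps the object alive
};

// Replace the transport without leaving a stale pointer in the wrapper.
void restartTransport(MqttWrapper& mqtt, std::unique_ptr<Client>& transport) {
    mqtt.clearClient();                    // 1. drop the old address first
    transport.reset(new FakeTransport());  // 2. the old object is destroyed here
    mqtt.setClient(*transport);            // 3. rebind to the live object
}
```

In a real sketch the transport would more likely be a long-lived global WiFiClient, in which case the address never changes and no rebinding is needed; the model only covers the case where the object really is recreated.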

PhySix66 commented 2 months ago

Short version: PubSubClient::setClient() helps, but now the crash happens somewhere in beginPublish(topic, payloadLength, retained), when stat_t should be sent, or right at the next function.

So it's reasonable to think that more gets destroyed/freed/reallocated than just the _client variable...


PhySix66 commented 2 months ago

Got very confused, so I went back to square one.

I did a clean reinstall of the arduino-home-assistant library (https://github.com/dawidchyrzynski/arduino-home-assistant) and performed a simple test.

void DoStuffEveryMin()
{
    if(rtcpreMin != rtcMin) {
        rtcpreMin = rtcMin;
        // other code is executed here
        if(WiFi.isConnected()) {
            if(rtcMin % 2 == 0) {
                if(mqtt.isConnected()) {
                    mqtt_flags &= ~(MQTT_FLAG_SERVER_AV | MQTT_FLAG_INIT);
                }
                mqtt_flags |= MQTT_FLAG_EN;
            }
        }
    }
}

void startMQTT()
{
    if(mqtt_flags & MQTT_FLAG_EN) {
        if(!(AreBitSet(mqtt_flags, (MQTT_FLAG_SERVER_AV | MQTT_FLAG_INIT)))) {
            Serial.println(F("mqtt.h: startMQTT() Not Inited()"));

            IPAddress tempIP(0,0,0,0);
            if(!WiFi.hostByName(BROKER_ADDR, tempIP, 800)) {
                Serial.print(F("MQTT Req DNS lookup failed for "));
                Serial.println(BROKER_ADDR);
                mqtt_flags &= ~MQTT_FLAG_SERVER_AV;
            } else {
                Serial.print(F("MQTT Req DNS lookup Success for "));
                Serial.println(BROKER_ADDR);
                Serial.print(F("IP Addr is "));
                Serial.println(tempIP);
                mqtt_flags |= MQTT_FLAG_SERVER_AV;
            }

            if(mqtt_flags & MQTT_FLAG_SERVER_AV) {
                InitMQTT_Device();
                InitMQTT_Switches();

                if(mqtt_flags & MQTT_FLAG_USE_CREDENTIALS) {
                    // use this for mqtt-with-credentials
                    mqtt.begin(MQTT_ServerName, mqtt_port, mqtt_server_user_name, mqtt_server_password);
                } else {
                    if(mqtt.begin(MQTT_ServerName) == true) {
                        Serial.println(F("mqtt.begin == true"));
                    } else {
                        Serial.println(F("mqtt.begin == false"));
                    }
                }
                mqtt_flags |= MQTT_FLAG_INIT;
                Serial.print(F("mqtt_flags: "));
                Serial.println(mqtt_flags);
                Serial.println(F("MQTT is Inited"));
            }
        }
    }
}

startMQTT() runs in the main loop().

This setup results in the same crash... The strange part is that mqtt.begin() basically has a failsafe(?) that prevents it from fully executing once it has been initialized:

if (_initialized) {
    ARDUINOHA_DEBUG_PRINTLN(F("AHA: already initialized"))
    return false;
}

What on earth is going on here...


PhySix66 commented 2 months ago

Solved!

Everything works fine now. The problem was: ME.

In startMQTT() I had added these two functions:

void InitMQTT_Device()
{
    device.setUniqueId(esp_mac.b, sizeof(esp_mac));
    Serial.print(F("MQTT UniqueID :")); Serial.println(device.getUniqueId());

    // set device's details (optional)
    //device.setName(ESP_HostName.c_str());
    device.setName(ESP_HostName);
    Serial.print(F("ESP_HostName :")); Serial.println(ESP_HostName);
    device.setManufacturer("PhySix66");
    device.setModel(esp_model_name);
    device.setSoftwareVersion("0.0.1");

    // This method enables availability for all device types registered on the device.
    // For example, if you have 5 sensors on the same device, you can enable
    // shared availability and change availability state of all sensors using
    // single method call "device.setAvailability(false|true)"
    //device.enableSharedAvailability();

    // Optionally, you can enable MQTT LWT feature. If device will lose connection
    // to the broker, all device types related to it will be marked as offline in
    // the Home Assistant Panel.
    //device.enableLastWill();
}

void InitMQTT_Switches()
{
    // handle switch state (multi-switch) - OutPut via MCP23x17
    switch0.setName("Switch0");
    //switch0.setIcon("mdi:lightbulb");
    switch0.setIcon("mdi:toggle-switch");
    //switch0.setRetain(true); // Sets retain flag for the switch command. If set to true the command produced by Home Assistant will be retained.
    switch0.onCommand(onSwitchCommand);

    switch1.setName("Switch1");
    switch1.setIcon("mdi:toggle-switch");
    switch1.onCommand(onSwitchCommand);

    switch2.setName("Switch2");
    switch2.setIcon("mdi:toggle-switch");
    switch2.onCommand(onSwitchCommand);

    switch3.setName("Switch3");
    switch3.setIcon("mdi:toggle-switch");
    switch3.onCommand(onSwitchCommand);

    switch4.setName("Switch4");
    switch4.setIcon("mdi:toggle-switch");
    switch4.onCommand(onSwitchCommand);

    switch5.setName("Switch5");
    switch5.setIcon("mdi:toggle-switch");
    switch5.onCommand(onSwitchCommand);

    switch6.setName("Switch6");
    switch6.setIcon("mdi:toggle-switch");
    switch6.onCommand(onSwitchCommand);

    switch7.setName("Switch7");
    switch7.setIcon("mdi:toggle-switch");
    switch7.onCommand(onSwitchCommand);
}

Every time I reinitialized MQTT, these would run again too and possibly mess up some allocated memory, I think. I don't know how. So it was not disconnect()'s fault.

Thanks for the help. In a day or two I'm going to update my GitHub comment.


PhySix66 commented 2 months ago

Problem Solved! Problem Source: USER (a.k.a ME)

I made a bad, untested assumption based on one of the Arduino examples, namely: example/home-assistant-integration/multi-switch

Using this as the basis:

void setup() {
    // you don't need to verify return status
    Ethernet.begin(mac);

    switch1.setName("Pretty label 1");
    switch1.setIcon("mdi:lightbulb");
    switch1.onCommand(onSwitchCommand);

    switch2.setName("Pretty label 2");
    switch2.setIcon("mdi:lightbulb");
    switch2.onCommand(onSwitchCommand);    

    mqtt.begin(BROKER_ADDR);
}

I made these functions:

void InitMQTT_Device()
{
          device.setUniqueId(esp_mac.b, sizeof(esp_mac));
          Serial.print(F("MQTT UniqueID :")); Serial.println(device.getUniqueId());

          // set device's details (optional)
          //device.setName(ESP_HostName.c_str());
          device.setName(ESP_HostName);
          Serial.print(F("ESP_HostName :")); Serial.println(ESP_HostName);
          device.setManufacturer("PhySix66");
          device.setModel(esp_model_name);
          device.setSoftwareVersion("0.0.1");

          // This method enables availability for all device types registered on the device.
          // For example, if you have 5 sensors on the same device, you can enable
          // shared availability and change availability state of all sensors using
          // single method call "device.setAvailability(false|true)"
          //device.enableSharedAvailability();

          // Optionally, you can enable MQTT LWT feature. If device will lose connection
          // to the broker, all device types related to it will be marked as offline in
          // the Home Assistant Panel.
          //device.enableLastWill();
}

void InitMQTT_Switches()
{        
          // handle switch state (multi-switch) - OutPut via MCP23x17
          switch0.setName("Switch0");
          //switch0.setIcon("mdi:lightbulb");
          switch0.setIcon("mdi:toggle-switch");
          //switch0.setRetain(true);              //  Sets retain flag for the switch command. If set to true the command produced by Home Assistant will be retained.
          switch0.onCommand(onSwitchCommand);

          switch1.setName("Switch1");
          switch1.setIcon("mdi:toggle-switch");
          switch1.onCommand(onSwitchCommand);    
          //....
          switch7.setName("Switch7");
          switch7.setIcon("mdi:toggle-switch");
          switch7.onCommand(onSwitchCommand);
}

void startMQTT()
{
     if(mqtt_flags & MQTT_FLAG_EN)
     {      
       if(!(AreBitSet(mqtt_flags, (MQTT_FLAG_SERVER_AV | MQTT_FLAG_INIT))))
       {
         Serial.println(F("mqtt.h: startMQTT() Not Inited()"));

         IPAddress tempIP(0,0,0,0);
         if(!WiFi.hostByName(BROKER_ADDR, tempIP, 800)) { // Resolve the IP address of the MQTT broker
          Serial.print(F("MQTT Req DNS lookup failed for "));   Serial.println(BROKER_ADDR);
          mqtt_flags &= ~MQTT_FLAG_SERVER_AV;
         }
         else
         {
            Serial.print(F("MQTT Req DNS lookup Success for "));   Serial.println(BROKER_ADDR);
            Serial.print(F("IP Addr is "));                        Serial.println(tempIP);
            mqtt_flags |= MQTT_FLAG_SERVER_AV;
         }

        if(mqtt_flags & MQTT_FLAG_SERVER_AV)
        {        
          InitMQTT_Device();    // <<-  One of these caused the ERROR.
          InitMQTT_Switches();  // <<-  One of these caused the ERROR.

          if(mqtt_flags & MQTT_FLAG_USE_CREDENTIALS)
          {
            // use this for mqtt-with-credentials
            mqtt.begin(MQTT_ServerName, mqtt_port, mqtt_server_user_name, mqtt_server_password);
          }
          else
          {
            if(mqtt.begin(MQTT_ServerName) == true)
            {
              Serial.println(F("mqtt.begin == true"));
            }
            else
            {
              Serial.println(F("mqtt.begin == false"));
            }
          }

          mqtt_flags |= MQTT_FLAG_INIT;
          Serial.print(F("mqtt_flags: "));    Serial.println(mqtt_flags);
          Serial.println(F("MQTT is Inited"));
        }
      }
    }
    else if(mqtt.isConnected())
    {
      mqtt_flags &= ~(MQTT_FLAG_SERVER_AV | MQTT_FLAG_INIT);
      mqtt.disconnect();
      Serial.println(F("MQTT DisConnected"));
    }
}

And every time I re-enabled MQTT via startMQTT(), I also "reinitialized" its device and switches... or so I thought. I'm not sure about the details of what happens, but I assume that during the second call of InitMQTT_Device() and InitMQTT_Switches() some memory allocation/corruption occurs that messes up the Client* _client pointer in PubSubClient.cpp.

Note to noobs like me: to find the problem, I reinstalled the libraries I had poked around in (where I had added some extra code), home-assistant-integration and pubsubclient, and went back to square one. After reflashing, with basically the same code as in the example, it still crashed. This got me thinking and led me to the problem.
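One way to avoid repeating the setup calls is to separate one-time configuration from the repeatable connect/disconnect step. The sketch below is a host-side model with hypothetical names (`MqttService`, `configureEntities`, `configureCalls`); the comments mark which calls from the code above each part stands in for. The thread does not establish why the repeated calls corrupt memory, so this only shows how to make sure they run once.

```cpp
// Hypothetical model: one-time registration is kept apart from the
// repeatable start/stop, so re-enabling MQTT never repeats the setup.
class MqttService {
public:
    // Safe to call on every enable; the one-time part runs only once.
    void start() {
        if (!configured_) {
            configureEntities();  // stands in for InitMQTT_Device() and InitMQTT_Switches()
            configured_ = true;
        }
        running_ = true;          // stands in for mqtt.begin(...)
    }

    void stop() { running_ = false; }  // stands in for mqtt.disconnect()

    int configureCalls() const { return configureCalls_; }
    bool running() const { return running_; }

private:
    void configureEntities() { ++configureCalls_; }

    bool configured_ = false;
    bool running_ = false;
    int configureCalls_ = 0;
};
```

In the sketch from this thread, the equivalent change would presumably be to call InitMQTT_Device() and InitMQTT_Switches() once from setup(), as the multi-switch example does, and leave only the mqtt.begin() and mqtt.disconnect() handling in startMQTT().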