VSChina / ESP32_AzureIoT_Arduino

The library for ESP32 Azure IoT for Arduino
MIT License
37 stars 43 forks source link

ESP32 continues restarts once reached subscription limit (e.g. 8000 pcs / day) #18

Open salvq opened 4 years ago

salvq commented 4 years ago

Hello, I reached limit of azure daily msg (8.000) and I am getting error / restarts of ESP32 device.

I am having flashed esp32 device with latest up to date GetStarted.ino.

What has to be done to keep trying to connect rather then restarting ?

Thanks

`22:32:42.858 -> ESP32 Device 22:32:42.858 -> Initializing... 22:32:42.858 -> 5471 22:32:42.858 -> 31665 22:32:42.858 -> 495720 22:32:42.858 -> 582509 22:32:42.858 -> 1968036 22:32:42.858 -> 2502000 22:32:42.858 -> 542195 22:32:42.858 -> 597772 22:32:42.858 -> > WiFi 22:32:42.858 -> Connecting... 22:32:43.444 -> .WiFi connected 22:32:43.444 -> IP address: 22:32:43.444 -> 192.168.1.115 22:32:43.444 -> > IoT Hub 22:32:43.444 -> Info: Initializing SNTP

22:32:44.442 -> Info: SNTP initialization complete

22:32:44.442 -> Info: IoT Hub SDK for C, version 1.1.23

22:32:47.331 -> Info: >>>Connection status: connected

22:32:47.331 -> 2019-October-21 20:32:47 22:32:48.293 -> 2019-October-21 20:32:48 22:32:48.293 -> {"deviceId":"Esp32Device", "messageId":1, "Temperature":23.00, "Humidity":60.50, "rssi":-67, "cycle":1, "green":1, "orange":0, "red":0, "blue":0, "reject":0, "equipment":0, "operator":0, "changeover":0, "material":0} 22:32:48.329 -> Info: >>>IoTHubClient_LL_SendEventAsync accepted message for transmission to IoT Hub.

22:32:58.326 -> Error: Time:Mon Oct 21 20:32:58 2019 File:C:\Users\rasti\AppData\Local\Arduino15\packages\esp32\hardware\esp32\1.0.4\libraries\AzureIoT\src\Esp32MQTTClient.cpp Func:SendEventOnce Line:316 Waiting for send confirmation, time is up 10016

22:32:58.326 -> 2019-October-21 20:32:58 22:32:58.361 -> {"deviceId":"Esp32Device", "messageId":2, "Temperature":23.00, "Humidity":60.50, "rssi":-68, "cycle":1, "green":1, "orange":0, "red":0, "blue":0, "reject":0, "equipment":0, "operator":0, "changeover":0, "material":0} 22:32:58.361 -> Info: >>>Re-connect.

22:32:58.361 -> Error: Time:Mon Oct 21 20:32:58 2019 File:C:\Users\rasti\AppData\Local\Arduino15\packages\esp32\hardware\esp32\1.0.4\libraries\AzureIoT\src\az_iot\c-utility\pal\src\tlsio_openssl_compact.c Func:tlsio_openssl_destroy Line:213 tlsio_openssl_destroy called while not in TLSIO_STATE_CLOSED.

22:32:58.399 -> Info: >>>Confirmation[0] received for message tracking id = 0 with result = IOTHUB_CLIENT_CONFIRMATION_BECAUSE_DESTROY

22:32:58.399 -> Info: Initializing SNTP

22:32:58.399 -> assertion "Operating mode must not be set while SNTP client is running" failed: file "/home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/lwip/lwip/src/apps/sntp/sntp.c", line 600, function: sntp_setoperatingmode 22:32:58.433 -> abort() was called at PC 0x4010b16f on core 1 22:32:58.433 -> 22:32:58.433 -> Backtrace: 0x4008c820:0x3ffb1c90 0x4008ca51:0x3ffb1cb0 0x4010b16f:0x3ffb1cd0 0x40123c15:0x3ffb1d00 0x400d4272:0x3ffb1d20 0x400d42ce:0x3ffb1d40 0x400d32ee:0x3ffb1d60 0x400d3622:0x3ffb1da0 0x400d3659:0x3ffb1dc0 0x400d376c:0x3ffb1e00 0x400d2140:0x3ffb1e20 0x400d2434:0x3ffb1f90 0x400dfe1d:0x3ffb1fb0 0x40088f69:0x3ffb1fd0 22:32:58.467 -> 22:32:58.467 -> Rebooting... 22:32:58.467 -> ets Jun 8 2016 00:22:57 22:32:58.467 -> 22:32:58.467 -> rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT) 22:32:58.467 -> configsip: 0, SPIWP:0xee 22:32:58.467 -> clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00 22:32:58.467 -> mode:DIO, clock div:1 22:32:58.467 -> load:0x3fff0018,len:4 22:32:58.467 -> load:0x3fff001c,len:1216 22:32:58.467 -> ho 0 tail 12 room 4 22:32:58.467 -> load:0x40078000,len:9720 22:32:58.498 -> ho 0 tail 12 room 4 22:32:58.498 -> load:0x40080400,len:6352 22:32:58.498 -> entry 0x400806b8 22:32:58.772 -> ESP32 Device 22:32:58.772 -> Initializing... `

beegee-tokyo commented 4 years ago

I am getting the exact same restart message, every now and then, but I am definitely below the subscription limit (2880 pckgs per day).

Looking into the log output

assertion "Operating mode must not be set while SNTP client is running" failed: file "/home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/lwip/lwip/src/apps/sntp/sntp.c"

points to a internal state machine problem in the library.

Hope someone still maintains this library, last update is one year old ;(

sntp.c is pre-compiled in the Arduino-ESP32 framework, so no chance to check why this is happening.

beegee-tokyo commented 4 years ago

Problem is in _SNTPINIT() in ...\libraries\AzureIoT\src\az_iot\c-utility\pal\lwip\sntp_lwip.c . _sntp_setoperatingmode(SNTP_OPMODEPOLL); cannot be called while the SNTP is active. That leads to the ASSERTION error and reboot.

Changing _SNTPINIT() from

/*Codes_SRS_SNTP_LWIP_30_004: [ SNTP_Init shall initialize the SNTP client, contact the NTP server to set system time, then return 0 to indicate success (lwIP has no failure path). ]*/
int SNTP_Init()
{
    LogInfo("Initializing SNTP");
    sntp_setoperatingmode(SNTP_OPMODE_POLL);
    sntp_init();
    time_t ts = 0;
    // Before 1980 is uninitialized
    while (ts < 10 * 365 * 24 * 3600)
    {
        ThreadAPI_Sleep(1000);
        ts = get_time(NULL);

    }
    LogInfo("SNTP initialization complete");
    return 0;
}

by adding a _SNTPDeinit() should fix the problem.

/*Codes_SRS_SNTP_LWIP_30_004: [ SNTP_Init shall initialize the SNTP client, contact the NTP server to set system time, then return 0 to indicate success (lwIP has no failure path). ]*/
int SNTP_Init()
{
    LogInfo("Initializing SNTP");
    SNTP_Deinit();
    sntp_setoperatingmode(SNTP_OPMODE_POLL);
    sntp_init();
    time_t ts = 0;
    // Before 1980 is uninitialized
    while (ts < 10 * 365 * 24 * 3600)
    {
        ThreadAPI_Sleep(1000);
        ts = get_time(NULL);

    }
    LogInfo("SNTP initialization complete");
    return 0;
}

Running tests right now.

salvq commented 4 years ago

That's great, you are right, restarts have been here and there.

I am also testing the change...

beegee-tokyo commented 4 years ago

Mine is up now for 5 hours without restart. Never got it running that long without restart. My internet connection is very bad and on and off, so it the ESP32 has to reconnect quite often.

:thumbsup:

salvq commented 4 years ago

Mine just got restarted after an hour, hmmm :(

09:42:27.651 -> {"deviceId":"Esp32Device", "messageId":48, "Temperature":22.70, "Humidity":74.10, "rssi":-64, "cycle":1, "green":1, "orange":0, "red":0, "blue":0, "reject":0, "equipment":0, "operator":0, "changeover":0, "material":0}
09:42:27.685 -> Info: >>>IoTHubClient_LL_SendEventAsync accepted message for transmission to IoT Hub.

09:42:27.992 -> Info: >>>Confirmation[47] received for message tracking id = 47 with result = IOTHUB_CLIENT_CONFIRMATION_OK

09:42:27.992 -> Send Confirmation Callback finished.
09:43:21.101 -> {"deviceId":"Esp32Device", "messageId":49, "Temperature":22.80, "Humidity":74.40, "rssi":-64, "cycle":1, "green":1, "orange":0, "red":0, "blue":0, "reject":0, "equipment":0, "operator":0, "changeover":0, "material":0}
09:43:21.135 -> Info: >>>IoTHubClient_LL_SendEventAsync accepted message for transmission to IoT Hub.

09:43:21.135 -> Info: >>>Connection status: timeout

09:44:36.006 -> {"deviceId":"Esp32Device", "messageId":50, "Temperature":23.00, "Humidity":73.60, "rssi":-66, "cycle":1, "green":1, "orange":0, "red":0, "blue":0, "reject":0, "equipment":0, "operator":0, "changeover":0, "material":0}
09:44:36.040 -> Info: >>>Re-connect.

09:44:36.040 -> Info: >>>Confirmation[48] received for message tracking id = 48 with result = IOTHUB_CLIENT_CONFIRMATION_BECAUSE_DESTROY

09:44:36.040 -> Info: Initializing SNTP

09:44:36.040 -> assertion "Operating mode must not be set while SNTP client is running" failed: file "/home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/lwip/lwip/src/apps/sntp/sntp.c", line 600, function: sntp_setoperatingmode
09:44:36.075 -> abort() was called at PC 0x4010af03 on core 1
09:44:36.075 -> 
09:44:36.075 -> Backtrace: 0x4008c820:0x3ffb1c90 0x4008ca51:0x3ffb1cb0 0x4010af03:0x3ffb1cd0 0x40123741:0x3ffb1d00 0x400d4236:0x3ffb1d20 0x400d4292:0x3ffb1d40 0x400d325e:0x3ffb1d60 0x400d3592:0x3ffb1da0 0x400d35c9:0x3ffb1dc0 0x400d36dc:0x3ffb1e00 0x400d20a6:0x3ffb1e20 0x400d23a4:0x3ffb1f90 0x400dfbad:0x3ffb1fb0 0x40088f69:0x3ffb1fd0
09:44:36.110 -> 
09:44:36.110 -> Rebooting...
09:44:36.110 -> ets Jun  8 2016 00:22:57
09:44:36.110 -> 
09:44:36.110 -> rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
09:44:36.110 -> configsip: 0, SPIWP:0xee
09:44:36.110 -> clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
09:44:36.110 -> mode:DIO, clock div:1
09:44:36.110 -> load:0x3fff0018,len:4
09:44:36.110 -> load:0x3fff001c,len:1216
09:44:36.110 -> ho 0 tail 12 room 4
09:44:36.144 -> load:0x40078000,len:9720
09:44:36.144 -> ho 0 tail 12 room 4
09:44:36.144 -> load:0x40080400,len:6352
09:44:36.144 -> entry 0x400806b8
09:44:36.419 -> ESP32 Device
09:44:36.419 -> Initializing...

How to verify if the change in library has been compiled into esp32 ?`

salvq commented 4 years ago

I just put few more logging information to see if the routine is being executed

beegee-tokyo commented 4 years ago

Don't know for ArduinoIDE how to check. I am using PlatformIO and it is still stable running now.

salvq commented 4 years ago

Hi, I don't get it, this is what I have added into file sntp_lwip.c situated in folder \libraries\ESP32_AzureIoT_Arduino\src\az_iot\c-utility\pal\lwip\

int SNTP_Init()
{
    LogInfo("Initializing SNTP");
    LogInfo("Calling SNTP_Deinit");
    SNTP_Deinit();
    LogInfo("Finished SNTP_Deinit");
    sntp_setoperatingmode(SNTP_OPMODE_POLL);
    sntp_init();
    time_t ts = 0;
    // Before 1980 is uninitialized
    while (ts < 10 * 365 * 24 * 3600)
    {
        ThreadAPI_Sleep(1000);
        ts = get_time(NULL);

    }
    LogInfo("SNTP initialization complete");
    return 0;
}

/*Codes_SRS_SNTP_LWIP_30_005: [ SNTP_Denit shall deinitialize the SNTP client. ]*/
void SNTP_Deinit()
{
    LogInfo("In SNTP_Deinit call");
    sntp_stop();
    LogInfo("Out SNTP_Deinit call");
} 

However when I got restart I do not see any logging, what am I doing wrong ? Please help...

12:16:13.195 -> {"deviceId":"Esp32Device", "messageId":49, "Temperature":22.20, "Humidity":75.70, "rssi":-68, "cycle":1, "green":1, "orange":0, "red":0, "blue":0, "reject":0, "equipment":0, "operator":0, "changeover":0, "material":0}
12:16:13.229 -> Info: >>>IoTHubClient_LL_SendEventAsync accepted message for transmission to IoT Hub.

12:16:13.434 -> Info: >>>Confirmation[48] received for message tracking id = 48 with result = IOTHUB_CLIENT_CONFIRMATION_OK

12:16:13.434 -> Send Confirmation Callback finished.
12:17:26.760 -> {"deviceId":"Esp32Device", "messageId":50, "Temperature":22.30, "Humidity":75.70, "rssi":-62, "cycle":1, "green":1, "orange":0, "red":0, "blue":0, "reject":0, "equipment":0, "operator":0, "changeover":0, "material":0}
12:17:26.794 -> Info: >>>IoTHubClient_LL_SendEventAsync accepted message for transmission to IoT Hub.

12:17:26.794 -> Info: >>>Connection status: timeout

12:18:24.156 -> {"deviceId":"Esp32Device", "messageId":51, "Temperature":22.30, "Humidity":76.40, "rssi":-72, "cycle":1, "green":1, "orange":0, "red":0, "blue":0, "reject":0, "equipment":0, "operator":0, "changeover":0, "material":0}
12:18:24.191 -> Info: >>>Re-connect.

12:18:24.191 -> Info: >>>Confirmation[49] received for message tracking id = 49 with result = IOTHUB_CLIENT_CONFIRMATION_BECAUSE_DESTROY

12:18:24.191 -> Info: Initializing SNTP

12:18:24.191 -> assertion "Operating mode must not be set while SNTP client is running" failed: file "/home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/lwip/lwip/src/apps/sntp/sntp.c", line 600, function: sntp_setoperatingmode
12:18:24.226 -> abort() was called at PC 0x4010af03 on core 1
12:18:24.226 -> 
12:18:24.226 -> Backtrace: 0x4008c820:0x3ffb1c90 0x4008ca51:0x3ffb1cb0 0x4010af03:0x3ffb1cd0 0x40123741:0x3ffb1d00 0x400d4236:0x3ffb1d20 0x400d4292:0x3ffb1d40 0x400d325e:0x3ffb1d60 0x400d3592:0x3ffb1da0 0x400d35c9:0x3ffb1dc0 0x400d36dc:0x3ffb1e00 0x400d20a6:0x3ffb1e20 0x400d23a4:0x3ffb1f90 0x400dfbad:0x3ffb1fb0 0x40088f69:0x3ffb1fd0
12:18:24.260 -> 
12:18:24.260 -> Rebooting...
12:18:24.260 -> ets Jun  8 2016 00:22:57
12:18:24.260 -> 
12:18:24.260 -> rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
12:18:24.260 -> configsip: 0, SPIWP:0xee
12:18:24.260 -> clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
12:18:24.260 -> mode:DIO, clock div:1
12:18:24.260 -> load:0x3fff0018,len:4
12:18:24.294 -> load:0x3fff001c,len:1216
12:18:24.294 -> ho 0 tail 12 room 4
12:18:24.294 -> load:0x40078000,len:9720
12:18:24.294 -> ho 0 tail 12 room 4
12:18:24.294 -> load:0x40080400,len:6352
12:18:24.294 -> entry 0x400806b8
12:18:24.571 -> ESP32 Device
12:18:24.571 -> Initializing...
salvq commented 4 years ago

It was cached in tmp folder therefore changes have never been reflected :)

Anyway testing again and hope no restarts...

beegee-tokyo commented 4 years ago

On PlatformIO I have a clean button which deletes all cached prior compiled sources and starts from scratch.

With the changed library I had no restart in 12 hours.

salvq commented 4 years ago

Same here, no restarts anymore but I have checked all the records in azure and there are some missing records when there is an initiate, can you check on your end ?

In my azure storage I am seeing messageId 44 and then 46, messageId 45 is missing..,this is not good, I do not know if this was the behavioue before because I did not checked message before and after restart

19:20:22.057 -> {"deviceId":"Esp32Device", "messageId":44, "Temperature":22.30, "Humidity":72.20, "rssi":-72, "cycle":1}
19:20:22.420 -> Info: >>>IoTHubClient_LL_SendEventAsync accepted message for transmission to IoT Hub.

19:20:22.468 -> Info: >>>Confirmation[43] received for message tracking id = 43 with result = IOTHUB_CLIENT_CONFIRMATION_OK

19:20:22.503 -> Send Confirmation Callback finished.
19:21:41.330 -> {"deviceId":"Esp32Device", "messageId":45, "Temperature":22.30, "Humidity":72.10, "rssi":-72, "cycle":1}
19:21:41.511 -> Info: >>>IoTHubClient_LL_SendEventAsync accepted message for transmission to IoT Hub.

19:21:41.511 -> Info: >>>Connection status: timeout

19:22:52.104 -> {"deviceId":"Esp32Device", "messageId":46, "Temperature":22.30, "Humidity":71.80, "rssi":-72, "cycle":1}
19:22:52.138 -> Info: >>>Re-connect.

19:22:52.138 -> Info: >>>Confirmation[44] received for message tracking id = 44 with result = IOTHUB_CLIENT_CONFIRMATION_BECAUSE_DESTROY

19:22:52.138 -> Info: Initializing SNTP1

19:22:52.138 -> Info: Calling SNTP_Deinit

19:22:52.138 -> Info: In SNTP_Deinit call

19:22:52.138 -> Info: Out SNTP_Deinit call

19:22:52.138 -> Info: Finished SNTP_Deinit

19:22:53.137 -> Info: SNTP initialization complete

19:22:53.137 -> Info: IoT Hub SDK for C, version 1.1.23

19:22:56.461 -> Info: >>>Connection status: connected

19:22:56.461 -> Info: >>>IoTHubClient_LL_SendEventAsync accepted message for transmission to IoT Hub.

19:22:56.974 -> Info: >>>Confirmation[0] received for message tracking id = 45 with result = IOTHUB_CLIENT_CONFIRMATION_OK

19:22:56.974 -> Send Confirmation Callback finished.
19:24:05.493 -> {"deviceId":"Esp32Device", "messageId":47, "Temperature":22.30, "Humidity":71.60, "rssi":-72, "cycle":1}
19:24:05.527 -> Info: >>>IoTHubClient_LL_SendEventAsync accepted message for transmission to IoT Hub.

19:24:05.527 -> {"desired":{"$version":1},"reported":{"$version":1}}
19:24:05.836 -> Info: >>>Confirmation[1] received for message tracking id = 0 with result = IOTHUB_CLIENT_CONFIRMATION_OK

19:24:05.836 -> Send Confirmation Callback finished.
19:25:22.846 -> {"deviceId":"Esp32Device", "messageId":48, "Temperature":22.30, "Humidity":71.40, "rssi":-73, "cycle":1}
salvq commented 4 years ago

I found why I am getting timeouts every 48min, it is because of SAS token expiration.

This is specified in iothubtransport_mqtt_common.c as

#define SAS_TOKEN_DEFAULT_LIFETIME          3600
#define SAS_REFRESH_MULTIPLIER              .8

I just change it now to bigger value to avoid so frequent reconnects (this is due to security as far as i can see). Testing now, this is not an issue rather feature...

salvq commented 4 years ago

The best way how to keep having all the data publish is to open connection, send message & close connection. This was I was able to stream data for one day without loosing any...

For closing connection I have used function after sending and then connect again: Esp32MQTTClient_Close();

Closing an issue as it was fixed by @beegee-tokyo

beegee-tokyo commented 4 years ago

@anthrene @lirenhe @salvq I would not close it until someone here has merged the fix.

salvq commented 4 years ago

Right I see, reopening....

snympi commented 3 years ago

Has this change been merged into the current repo?