thingsboard / thingsboard-client-sdk

Client SDK to connect with ThingsBoard IoT Platform from IoT devices (Arduino, Espressif, etc.)
MIT License

OTA update downloads but fails at the end #152

Closed jortega11 closed 1 year ago

jortega11 commented 1 year ago

I am trying to perform an OTA update on my ESP8266. The download starts, but the update fails at the end, reporting that the firmware update failed.

Most of the chunks fail to download at first, but after retrying they are downloaded and the firmware download continues. However, when the download reaches the end, the Serial Monitor reports a failure:

[TB] Receive chunk (121), with size (4096) bytes      
[TB] Error during Update.write
Progress 98.39%
[TB] Callback onMQTTMessage from topic: (v2/fw/response/0/chunk/121)
[TB] Receive chunk (121), with size (4096) bytes      
[TB] Error during Update.write
[TB] Callback onMQTTMessage from topic: (v2/fw/response/0/chunk/122)
[TB] Receive chunk (122), with size (4096) bytes      
[TB] Error during Update.write
Progress 99.19%
[TB] Callback onMQTTMessage from topic: (v2/fw/response/0/chunk/122)
[TB] Receive chunk (122), with size (4096) bytes      
[TB] Error during Update.write
[TB] Callback onMQTTMessage from topic: (v2/fw/response/0/chunk/123)
[TB] Receive chunk (123), with size (1728) bytes      
[TB] Error during Update.write
Downloading firmware failed

At the beginning of the update, the Serial Monitor also reports an error during Update.begin, so I think that is the cause of the problem.

[TB] A new Firmware is available:
[TB] (1.1) => (1.2)
[TB] Attempting to download over MQTT...
Progress 0.81%
[TB] Callback onMQTTMessage from topic: (v2/fw/response/0/chunk/1)
[TB] Receive chunk (1), with size (4096) bytes        
[TB] Error during Update.write
[TB] Callback onMQTTMessage from topic: (v2/fw/response/0/chunk/0)
[TB] Receive chunk (0), with size (4096) bytes        
[TB] Error during Update.begin
[TB] Callback onMQTTMessage from topic: (v2/fw/response/0/chunk/1)
[TB] Receive chunk (1), with size (4096) bytes        
[TB] Error during Update.write
Progress 1.61%
[TB] Callback onMQTTMessage from topic: (v2/fw/response/0/chunk/1)
[TB] Receive chunk (1), with size (4096) bytes        
[TB] Error during Update.write
[TB] Callback onMQTTMessage from topic: (v2/fw/response/0/chunk/2)

The code I am using right now is the following: (I have removed credentials like the WiFi SSID, Password, Server IP and Token)

#include <ESP8266WiFi.h>
#define THINGSBOARD_ENABLE_PROGMEM 0
#include <ThingsBoard.h>
#include <Arduino.h>
#include <Adafruit_AHTX0.h>
#include <TBPubSubClient.h>

// Firmware title and version used to compare with remote version, to check if an update is needed.
// Title needs to be the same and version needs to be different --> downgrading is possible
constexpr char CURRENT_FIRMWARE_TITLE[] = "Test";
constexpr char CURRENT_FIRMWARE_VERSION[] = "1.2";

// Firmware state sent at the start of the firmware, to inform the cloud about the current firmware and that it was installed correctly,
// especially important when using OTA update, because the OTA update sends the last firmware state as UPDATING, meaning the device is restarting.
// If the device restarted correctly and has the new given firmware title and version, it should then send those to the cloud with the state UPDATED,
// to inform any end user that the device has successfully restarted and does actually contain the version it was flashed to
constexpr char FW_STATE_UPDATED[] = "UPDATED";

// Maximum number of retries we attempt to download each firmware chunk over MQTT
constexpr uint8_t FIRMWARE_FAILURE_RETRIES = 255U;

// Size of each firmware chunk downloaded over MQTT,
// an increased packet size might increase download speed
constexpr uint16_t FIRMWARE_PACKET_SIZE = 4096U;

// PROGMEM can only be added when using the ESP32 WiFiClient,
// will cause a crash if using the ESP8266WiFiSTAClass instead.
constexpr char WIFI_SSID[] = "";
constexpr char WIFI_PASSWORD[] = "";

constexpr char TOKEN[] = "";

// ThingsBoard server we want to establish a connection to
constexpr char THINGSBOARD_SERVER[] = "";

// MQTT port used to communicate with the server, 1883 is the default unencrypted MQTT port,
// whereas 8883 would be the default encrypted SSL MQTT port
constexpr uint16_t THINGSBOARD_PORT = 1883U;

// Maximum size of packets that will ever be sent or received by the underlying MQTT client,
// if the size is too small, messages might not be sent or received messages will be discarded
constexpr uint32_t MAX_MESSAGE_SIZE = 512U;

// Baud rate for the debugging serial connection
// If the Serial output is mangled, make sure the monitor speed matches this value
constexpr uint32_t SERIAL_DEBUG_BAUD = 115200U;

// Initialize underlying client, used to establish a connection
WiFiClient espClient;

// Initialize ThingsBoard instance with the maximum needed buffer size
ThingsBoard tb(espClient, MAX_MESSAGE_SIZE);

// Statuses for updating
bool currentFWSent = false;
bool updateRequestSent = false;

void MQTTconnect();
void printTempValues(int sleep);
bool Chrono(int T);
void goToSleep();

/// @brief Initializes the WiFi connection,
/// will endlessly delay until a connection has been successfully established
void InitWiFi() {
  Serial.println("Connecting to AP ...");
  // Attempting to establish a connection to the given WiFi network
  WiFi.begin(WIFI_SSID, WIFI_PASSWORD);
  while (WiFi.status() != WL_CONNECTED) {
    // Delay 500ms until a connection has been successfully established
    delay(500);
    Serial.print(".");
  }
  Serial.println("Connected to AP");
}

/// Reconnects the WiFi using InitWiFi if the connection has been lost
/// Returns true as soon as a connection has been established again
bool reconnect() {
  // Check to ensure we aren't connected yet
  const wl_status_t status = WiFi.status();
  if (status == WL_CONNECTED) {
    return true;
  }

  // If we aren't, establish a new connection to the given WiFi network
  InitWiFi();
  return true;
}

/// Updated callback that will be called as soon as the firmware update finishes
/// success Either true (update successful) or false (update failed)
void updatedCallback(const bool& success) {
  if (success) {
    Serial.println("Done, Reboot now");
    ESP.restart();
    return;
  }
  Serial.println("Downloading firmware failed");
}

/// Progress callback that will be called every time we downloaded a new chunk successfully
void progressCallback(const uint32_t& currentChunk, const uint32_t& totalChunks) {
  Serial.printf("Progress %.2f%%\n", static_cast<float>(currentChunk * 100U) / totalChunks);
}

const OTA_Update_Callback callback(&progressCallback, &updatedCallback, CURRENT_FIRMWARE_TITLE, CURRENT_FIRMWARE_VERSION, FIRMWARE_FAILURE_RETRIES, FIRMWARE_PACKET_SIZE, 60000);

Adafruit_AHTX0 aht;

// -- AHT Red LED pin.
//      It is configured to be pulled LOW when BME280 sensor is NOT detected.
#define RED_LED_AHT D3//0//

// -- Power Supply pin.
//      It is configured to be pulled High to feed ADS1115, so when ESP8266 is on DeepSleep mode
//      it will be pulled Low and battery led will be turned off.
#define POWER_SUP_PIN D6//12//

// -- ADS Red LED pin.
//      It is configured to be pulled low when BME280 sensor is NOT detected.
#define RED_LED_ADS D8//15//D7//13//

PubSubClient client(espClient);

// System variables
bool needReset = false;
float SleepTime = 3e7; // 1.8e9 is 30 minutes
int BlinkStatus = 0;
unsigned aht_status;

void setup() {
  // Initialize serial connection for debugging
  Serial.begin(SERIAL_DEBUG_BAUD);

  delay(100);

 // -- AHT20 initialization
  aht_status = aht.begin();
  delay(200);
  Serial.print("AHT20 Status: "); Serial.println(aht_status);

  delay(200);

  InitWiFi();

}

void loop() {
  //client.loop();
  if (needReset) {
    Serial.println("Rebooting after 1 second.");
    delay(1000);
    ESP.restart();
  }

  if (!reconnect()) {
    return;
  }

  if (!tb.connected()) {
    // Reconnect to the ThingsBoard server,
    // if a connection was disrupted or has not yet been established
    Serial.printf("Connecting to: (%s) with token (%s)\n", THINGSBOARD_SERVER, TOKEN);
    if (!tb.connect(THINGSBOARD_SERVER, TOKEN, THINGSBOARD_PORT)) {
      Serial.println("Failed to connect");
      return;
    }
  }
  else {
    printTempValues(1); // argument 1 sends the device to deep sleep
  }

  if (!currentFWSent) {
    currentFWSent = tb.Firmware_Send_Info(CURRENT_FIRMWARE_TITLE, CURRENT_FIRMWARE_VERSION) && tb.Firmware_Send_State(FW_STATE_UPDATED);
  }

  if (!updateRequestSent) {
    Serial.println("Firmware Update Subscription...");
    // See https://thingsboard.io/docs/user-guide/ota-updates/
    // to understand how to create a new OTA package and assign it to a device so it can download it.
    updateRequestSent = tb.Subscribe_Firmware_Update(callback);
  }

  client.loop();
  tb.loop();
}

//////////////////// Function for mapping between two ranges ////////////////////
float mapfloat(float x, float in_min, float in_max, float out_min, float out_max)
{
  return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min;
}

//////////////////// Print values of AHT20 sensor ////////////////////
void printTempValues(int sleep) {

  // -- Now, get temperature and humidity values.
  sensors_event_t humidity, temp;
  aht.getEvent(&humidity, &temp);
  Serial.print("Temperature: "); Serial.print(temp.temperature); Serial.println(" °C");

  Serial.print("Humidity: "); Serial.print(humidity.relative_humidity); Serial.println("% rH");

  Serial.print("Sent telemetry for 'temperature': ");Serial.println(tb.sendTelemetryFloat("temperature", temp.temperature));
  Serial.print("Sent telemetry for 'humidity': ");Serial.println(tb.sendTelemetryFloat("humidity", humidity.relative_humidity));
  delay(1000);
  if (sleep == 1){
    delay(1000);
    goToSleep();
  }      
}

//////////////////// Check sleep time and send ESP to sleep ////////////////////
void goToSleep() {
  Serial.println("Now I will sleep for 30 seconds");
  delay(200);
  ESP.deepSleep(SleepTime); // 3.6e9 is 1 hour =  3.600.000.000 microseconds

}

MathewHDYT commented 1 year ago

Can you check whether the update works if you do not enter deepSleep on the ESP8266? I'm not sure whether flashing the firmware onto the device is supported while the device is in deep sleep mode.

It would also be good to know which version of the library you are using. I am assuming you are still using v0.10.2. I would recommend upgrading to v0.11.1 and implementing the fix in #149. At least for both users in that issue, the OTA update then worked.

jortega11 commented 1 year ago

Yes, I forgot to mention I was using v0.10.2.

After upgrading the version of the library I was getting the error as expected:

[TB] No keys that we subscribed too were changed, skipping callback
[TB] Unable to de-serialize received json data with error (InvalidInput)

After implementing the fix in #149 I get:

[TB] Failed to initalize flash updater      
[TB] Receive chunk (0), with size (4096) bytes

Regarding the deepSleep: The goal of this project is to have a remote device that will be sleeping most of the time and only wake up to take a few measurements and go back to sleep right after. The user should have the ability to update the code remotely using OTA updates.

Since the device is subscribed to the firmware update topic, it will not update the firmware while it is sleeping, because it is not connected to the Internet and cannot get the data it needs. So we were thinking about uploading the new code to ThingsBoard and starting the update once the device wakes up. (We would have some flags to prevent the ESP8266 from going to sleep before we make sure the device has updated correctly, but we have not implemented that yet.)

So far we have only tried to update the device while it is not sleeping, to check whether it can download and apply the update correctly, and that is when we get this error. (After this we will work on updating the code while the device is sleeping so that it updates after waking up but, as I said, that is not implemented yet because we are currently working on getting a simpler version working.)
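The flags mentioned above could be modelled roughly as follows. This is only a minimal sketch that runs on a host compiler; `SleepGate`, its members, and the 10-second grace period are hypothetical names and values chosen for illustration, not part of the ThingsBoard SDK.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper that decides whether the device may enter deep sleep.
struct SleepGate {
  bool updateInProgress = false;  // true while firmware chunks are arriving
  uint32_t wakeMillis = 0;        // timestamp (ms) at which the device woke up
  uint32_t graceMillis = 10000;   // assumed time to wait for an update notice

  // Intended to be called from the OTA progress callback.
  void onUpdateProgress() { updateInProgress = true; }

  // Intended to be called from the OTA finished callback; after a failed
  // update the gate is released so the device can sleep and retry later.
  void onUpdateFinished() { updateInProgress = false; }

  // True once the grace period has elapsed and no download is running.
  // Unsigned subtraction keeps the comparison valid across timer wrap-around.
  bool maySleep(uint32_t nowMillis) const {
    return !updateInProgress && (nowMillis - wakeMillis) >= graceMillis;
  }
};
```

In the sketch posted earlier, printTempValues would then call goToSleep() only when maySleep(millis()) returns true, so that a download that has already started is not interrupted by deep sleep.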

MathewHDYT commented 1 year ago

Okay, bad news: this is almost certainly not a problem with this library, because it is the underlying component that should flash the memory that fails to initialize.

Good news, however: it is probably not an implementation problem but a problem with how you are using the update.

The initial flash begin can fail for multiple reasons, but the one I currently find most likely is that the binary you receive from the server is too big for your OTA data partition, or that you have not created a 2nd OTA data partition at all.
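The "binary too big" case can be checked by hand. The following is a simplified, host-runnable model, assuming 4096-byte flash sectors and assuming that the new image is staged directly below the filesystem region and must not overlap the running sketch. The function names and the layout numbers in the usage note are hypothetical and are not taken from the ThingsBoard SDK or the Arduino core.

```cpp
#include <cassert>
#include <cstdint>

// Flash is erased in sectors; 4096 bytes is assumed here.
constexpr uint32_t kSectorSize = 4096U;

// Rounds a byte count up to the next multiple of the sector size.
constexpr uint32_t roundUpToSector(uint32_t bytes) {
  return (bytes + kSectorSize - 1U) & ~(kSectorSize - 1U);
}

// Returns true if a new image of newImageSize bytes can be staged between
// the end of the running sketch and the start of the filesystem region.
// filesystemStart is a byte offset from the start of the sketch area.
constexpr bool updateFits(uint32_t currentSketchSize, uint32_t newImageSize,
                          uint32_t filesystemStart) {
  return roundUpToSector(newImageSize) <= filesystemStart &&
         (filesystemStart - roundUpToSector(newImageSize)) >=
             roundUpToSector(currentSketchSize);
}
```

The log in the first comment implies an image of 123 chunks of 4096 bytes plus a final chunk of 1728 bytes, that is 505536 bytes. Under the assumptions above, two images of that size fit when the filesystem starts at 1 MB (`updateFits(505536, 505536, 1048576)` is true) but not when it starts at 768 KB (`updateFits(505536, 505536, 786432)` is false).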

What kind of partitions.csv file are you using? I am assuming you are using the default one and did not change its configuration. If so, that might be why the update initialisation fails: to do an OTA update, the partitions.csv needs two app partitions, and the new firmware is flashed onto the non-active one. Once that has completed, we reboot into the newly flashed firmware.

Therefore the OTA update will of course not work if there is no 2nd app partition. An example of what the partition table should look like for an OTA update can be found below.

# Name, Type, SubType, Offset, Size, Flags
otadata,            data,       ota,            0xE000,           0x2000,
app0,               app,        ota_0,                ,           0x300000,
app1,               app,        ota_1,                ,           0x300000,
spiffs,             data,       spiffs,               ,           0x3F6000,
nvs,                data,       nvs,                  ,           0x25000,
nvs_key,            data,       nvs_keys,             ,           0x25000,
test,               app,        test,                 ,           0x300000,
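As a rough sanity check, the flash size such a table needs can be worked out by laying the partitions out back to back. The sketch below assumes the ESP-IDF convention that a blank offset places a partition right after the previous one, with app partitions aligned to 0x10000 and data partitions to 0x1000. The `Partition` struct and the helper functions are hypothetical and exist only for this calculation.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Minimal description of one row of the table.
struct Partition {
  bool isApp;     // app partitions use the larger alignment
  uint32_t size;  // size in bytes
};

// Rounds value up to the next multiple of alignment.
constexpr uint32_t alignUp(uint32_t value, uint32_t alignment) {
  return (value + alignment - 1U) / alignment * alignment;
}

// Lays the partitions out in order, starting at firstOffset, and returns
// the offset just past the last partition.
uint32_t tableEnd(uint32_t firstOffset, const std::vector<Partition>& parts) {
  uint32_t offset = firstOffset;
  for (const Partition& p : parts) {
    offset = alignUp(offset, p.isApp ? 0x10000U : 0x1000U) + p.size;
  }
  return offset;
}

// The rows of the example table above, in the same order.
std::vector<Partition> exampleTable() {
  return {
      {false, 0x2000U},    // otadata
      {true, 0x300000U},   // app0
      {true, 0x300000U},   // app1
      {false, 0x3F6000U},  // spiffs
      {false, 0x25000U},   // nvs
      {false, 0x25000U},   // nvs_key
      {true, 0x300000U},   // test
  };
}
```

With the first partition at 0xE000, `tableEnd(0xE000, exampleTable())` evaluates to 0xD50000, a little over 13 MB. The example table therefore presumes a 16 MB flash chip, and the sizes would have to be reduced for a smaller module.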

jortega11 commented 1 year ago

Okay, good news: that was the problem. I did not fix it exactly as you said, because apparently the flash layout for the ESP8266 module is changed in a different way. So I changed the board_build.ldscript option in my platformio.ini, and now the OTA update works.

MathewHDYT commented 1 year ago

Nice to know that the partitions file fixed your issue. Would be nice if you could close this issue as resolved.

jortega11 commented 1 year ago

Yes, thank you!