Closed imammy-hacomono closed 1 year ago
@kstribrnAmzn @pvyawaha @paulbartell I'm sorry to bother you suddenly, but I'm troubled by this situation. Do you happen to know anything about it? Thank you.
Can someone please help me? I'm having a very difficult time...
Hello @imammy-hacomono - sorry to hear you're having troubles and I hope we can help sort this out quickly.
From what I can see from the logs your provided (thank you) the failure steps from the state of the device and the OTA job being out of sync. The log message...
W (48372) AWS_OTA: Rejecting new image and rebooting:The job is in the self-test state while the platform is not.
Occurs when their is a mismatch between the OTA agent state and PAL states, which you can see here. You can also read a bit more on this on the docs.
I'd recommend setting up a debugger and setting a breakpoint in this function and to find out why it's returning a different image state, and what image state that is. Somehow your PAL isn't setting the correct image state, leading to this issue. You may also consider checking the setPlatformImageState
function implementation.
It would also be helpful if you could provide a link to the demo that you're running. This would allow me to look into the PAL/HAL functions more.
@kstribrnAmzn Thanks for responding!!
Here is the base for a running demo. Only the transport layer of this has been modified to use the cellular interface and is running. https://github.com/espressif/esp-aws-iot/tree/release/202210.01-LTS/examples/ota/ota_mqtt
OtaPalStatus_t otaPal_SetPlatformImageState(OtaFileContext_t* const pFileContext, OtaImageState_t eState) {
OtaPalMainStatus_t mainErr = OtaPalSuccess;
int state;
(void) pFileContext;
ESP_LOGI(TAG, "%s, %d", __func__, eState);
switch (eState) {
case OtaImageStateAccepted:
ESP_LOGI(TAG, "Set image as valid one!");
state = ESP_OTA_IMG_VALID;
break;
case OtaImageStateRejected:
ESP_LOGW(TAG, "Set image as invalid!");
state = ESP_OTA_IMG_INVALID;
break;
case OtaImageStateAborted:
ESP_LOGW(TAG, "Set image as aborted!");
state = ESP_OTA_IMG_ABORTED;
break;
case OtaImageStateTesting:
ESP_LOGW(TAG, "Set image as testing!");
return OTA_PAL_COMBINE_ERR(OtaPalSuccess, 0);
default:
ESP_LOGW(TAG, "Set image invalid state!");
return OTA_PAL_COMBINE_ERR(OtaPalBadImageState, 0);
}
uint32_t ota_flags;
/* Get current active (running) firmware image flags */
esp_err_t ret = aws_esp_ota_get_boot_flags(&ota_flags, true);
if (ret != ESP_OK) {
ESP_LOGE(TAG, "Failed to get ota flags %d", ret);
return OTA_PAL_COMBINE_ERR(OtaPalCommitFailed, 0);
}
/* If this is first request to set platform state, post bootup and there is not OTA being
* triggered yet, then operate on active image flags, else use passive image flags */
if ((ota_ctx.cur_ota == NULL) && (ota_ctx.data_write_len == 0)) {
if (ota_flags == ESP_OTA_IMG_PENDING_VERIFY) {
ret = aws_esp_ota_set_boot_flags(state, true);
if (ret != ESP_OK) {
ESP_LOGE(TAG, "Failed to set ota flags %d", ret);
return OTA_PAL_COMBINE_ERR(OtaPalCommitFailed, 0);
} else {
/* RTC watchdog timer can now be stopped */
disable_rtc_wdt();
}
} else {
ESP_LOGW(TAG, "Image not in self test mode %" PRIu32 "", ota_flags);
mainErr = ota_flags == ESP_OTA_IMG_VALID ? OtaPalSuccess : OtaPalCommitFailed;
}
/* For debug purpose only, get current flags */
aws_esp_ota_get_boot_flags(&ota_flags, true);
} else {
if ((eState == OtaImageStateAccepted) && (ota_ctx.valid_image == false)) {
/* Incorrect update image or not yet validated */
return OTA_PAL_COMBINE_ERR(OtaPalCommitFailed, 0);
}
if (ota_flags != ESP_OTA_IMG_VALID) {
ESP_LOGE(TAG, "Currently executing firmware not marked as valid, abort");
return OTA_PAL_COMBINE_ERR(OtaPalCommitFailed, 0);
}
ret = aws_esp_ota_set_boot_flags(state, false);
if (ret != ESP_OK) {
ESP_LOGE(TAG, "Failed to set ota flags %d", ret);
return OTA_PAL_COMBINE_ERR(OtaPalCommitFailed, 0);
}
}
return OTA_PAL_COMBINE_ERR(mainErr, 0);
}
I have tried to log the timing of setPlatformImageState and getPlatformImageState calls.
I (279617) [MQTT OTA]: Received: 132 Queued: 132 Processed: 131 Dropped: 0
I (280067) ota_pal: Signature verification succeeded.
I (280067) AWS_OTA: Received entire update and validated the signature.
I (284887) [MQTT OTA]: Received: 132 Queued: 132 Processed: 131 Dropped: 0
I (286177) [MQTT OTA]: PINGRESP should not be handled by the application callback when using MQTT_ProcessLoop.
I (286187) [MQTT OTA]: Sent PUBLISH packet to broker $aws/things/device-linkage-test/jobs/AFR_OTA-ota-test/update to broker.
I (286197) [MQTT OTA]: Sent PUBLISH message: {"status":"IN_PROGRESS","statusDetails":{"self_test":"ready","updatedBy":"0x00010002"}}
I (286227) [MQTT OTA]: Received OtaJobEventActivate callback from OTA Agent.
I (286237) esp_image: segment 0: paddr=00110020 vaddr=3f400020 size=27788h (161672) map
I (286287) esp_image: segment 1: paddr=001377b0 vaddr=3ffb0000 size=021e4h ( 8676)
I (286297) esp_image: segment 2: paddr=0013999c vaddr=40080000 size=0667ch ( 26236)
I (286307) esp_image: segment 3: paddr=00140020 vaddr=400d0020 size=4bbc4h (310212) map
I (286397) esp_image: segment 4: paddr=0018bbec vaddr=4008667c size=07c40h ( 31808)
I (286407) esp_image: segment 0: paddr=00110020 vaddr=3f400020 size=27788h (161672) map
I (286457) esp_image: segment 1: paddr=001377b0 vaddr=3ffb0000 size=021e4h ( 8676)
I (286457) esp_image: segment 2: paddr=0013999c vaddr=40080000 size=0667ch ( 26236)
I (286477) esp_image: segment 3: paddr=00140020 vaddr=400d0020 size=4bbc4h (310212) map
I (286567) esp_image: segment 4: paddr=0018bbec vaddr=4008667c size=07c40h ( 31808)
I (286637) [MQTT OTA]: PUBACK received for packet id 27.
I (286637) [MQTT OTA]: Received: 132 Queued: 132 Processed: 132 Dropped: 0
I (286897) [MQTT Sub Manager]: Invoking subscription callback of matching topic filter: TopicFilter=$aws/things/+/jobs/#, TopicName=$aws/things/device-linkage-test/jobs/AFR_OTA-ota-test/update/accepted
E (286917) [MQTT OTA]: prvMqttJobCallback
W (286917) [MQTT OTA]: Received job message $aws/things/device-linkage-test/jobs/AFR_OTA-ota-test/update/accepted{"timestamp":1681262648}or�accrdyaf size 24.
I (286937) [MQTT OTA]: Received: 132 Queued: 132 Processed: 132 Dropped: 0
////////////////////////////////////////////////////////////////////////////////
Block transfer complete!
////////////////////////////////////////////////////////////////////////////////
////////////////////////////////////////////////////////////////////////////////
Reboot!
////////////////////////////////////////////////////////////////////////////////
ets Jul 29 2019 12:21:46
rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:7176
load:0x40078000,len:15500
ho 0 tail 12 room 4
load:0x40080400,len:4072
0x40080400: _init at ??:?
entry 0x40080670
I (29) boot: ESP-IDF v5.0.1-dirty 2nd stage bootloader
I (29) boot: compile time 17:05:26
I (29) boot: chip revision: v3.0
I (31) boot.esp32: SPI Speed : 40MHz
I (36) boot.esp32: SPI Mode : DIO
I (40) boot.esp32: SPI Flash Size : 4MB
I (45) boot: Enabling RNG early entropy source...
I (50) boot: Partition Table:
I (54) boot: ## Label Usage Type ST Offset Length
I (61) boot: 0 nvs WiFi data 01 02 00009000 00004000
I (69) boot: 1 otadata OTA data 01 00 0000d000 00002000
I (76) boot: 2 phy_init RF data 01 01 0000f000 00001000
I (84) boot: 3 factory factory app 00 00 00010000 00100000
I (91) boot: 4 ota_0 OTA app 00 10 00110000 00100000
I (99) boot: 5 ota_1 OTA app 00 11 00210000 00100000
I (106) boot: 6 storage Unknown data 01 82 00310000 000f0000
I (114) boot: End of partition table
I (118) esp_image: segment 0: paddr=00110020 vaddr=3f400020 size=27788h (161672) map
I (185) esp_image: segment 1: paddr=001377b0 vaddr=3ffb0000 size=021e4h ( 8676) load
I (189) esp_image: segment 2: paddr=0013999c vaddr=40080000 size=0667ch ( 26236) load
I (202) esp_image: segment 3: paddr=00140020 vaddr=400d0020 size=4bbc4h (310212) map
I (314) esp_image: segment 4: paddr=0018bbec vaddr=4008667c size=07c40h ( 31808) load
I (334) boot: Loaded app from partition at offset 0x110000
I (334) boot: Disabling RNG early entropy source...
I (346) cpu_start: Pro cpu up.
I (346) cpu_start: Starting app cpu, entry point is 0x400813f4
0x400813f4: call_start_cpu1 at /Users/keigo.imaizumi/esp/esp-idf/components/esp_system/port/cpu_start.c:142
I (332) cpu_start: App cpu up.
I (363) cpu_start: Pro cpu start user code
I (363) cpu_start: cpu freq: 160000000 Hz
I (363) cpu_start: Application information:
I (367) cpu_start: Project name: iot-device-controller-fw-v3
I (374) cpu_start: App version: 5bdbdf7-dirty
I (380) cpu_start: Compile time: Apr 12 2023 10:12:38
I (386) cpu_start: ELF file SHA256: 1f1d15768af98020...
Warning: checksum mismatch between flashed and built applications. Checksum of built application is 120552fc3fb5b75e24bb1a1cce0866f4472dd65f6f4aefe48e1ccf7d46d14f5a
I (392) cpu_start: ESP-IDF: v5.0.1-dirty
I (397) cpu_start: Min chip rev: v0.0
I (402) cpu_start: Max chip rev: v3.99
I (407) cpu_start: Chip rev: v3.0
I (412) heap_init: Initializing. RAM available for dynamic allocation:
I (419) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
I (425) heap_init: At 3FFB7A28 len 000285D8 (161 KiB): DRAM
I (431) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (437) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (444) heap_init: At 4008E2BC len 00011D44 (71 KiB): IRAM
I (451) spi_flash: detected chip: generic
I (455) spi_flash: flash io: dio
W (458) spi_flash: Detected size(16384k) larger than the size in the binary image header(4096k). Using the size in the binary image header.
I (473) cpu_start: Starting scheduler on PRO CPU.
I (0) cpu_start: Starting scheduler on APP CPU.
I (483) [Initialize Task]: ESP-IDF Version:v5.0.1-dirty
I (503) [cellular_bg96_api]: Cellular_Init
I (503) [cellular_common_api]: Cellular_CommonInit
I (503) [cellular_common]: _Cellular_LibInit
I (503) [cellular_pktio]: _Cellular_PktioInit
I (513) [cellular_platform]: Creating new thread.
I (513) [cellular_platform]: New thread created.
I (513) [cellular_platform]: prvThreadRoutineWrapper
I (523) [cellular_pktio]: _pktioReadThread
I (523) [cellular_pktio]: Reader thread created
I (533) [bg96_uart]: prvBg96UartCellularCommInterfaceOpen
I (23863) [MQTT OTA]: OTA over MQTT demo, Application version 4.1.2
I (23873) [MQTT OTA]: Establishing a TLS session to hogehoge.iot.ap-northeast-1.amazonaws.com:8883.
W (23883) [MbedtlsTransport]: TLS_FreeRTOS_Connect
I (23893) [TcpSocketWrapper]: Created CELLULAR Socket 0x3ffc4478.
I (23893) [TcpSocketWrapper]: Ip address hogehoge.iot.ap-northeast-1.amazonaws.com port 8883
I (23903) [cellular_pkhandler]: >>>>>Start sending [AT+QIOPEN=1,0,"TCP","hogehoge.iot.ap-northeast-1.amazonaws.com",8883,0,0]<<<<<
I (31323) [MQTT OTA]: MQTT connection successfully established with broker.
I (31333) [MQTT OTA]: Success creating MQTT connection to hogehoge.iot.ap-northeast-1.amazonaws.com.
W (31343) ota_pal: otaPal_GetPlatformImageState
W (31343) esp_ota_ops: aws_esp_ota_get_boot_flags: 1
W (31353) esp_ota_ops: _esp_get_otadata_partition: find_partition: Address:d000, Size:8192
W (31373) esp_ota_ops: _esp_get_otadata_partition: gen_0_seq:1, gen_1_seq:0
W (31383) esp_ota_ops: _esp_get_otadata_partition: s_ota_select[0] Memory Dump
I (31393) esp_ota_ops: 0x3ffc3f98 01 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff |................|
I (31403) esp_ota_ops: 0x3ffc3fa8 ff ff ff ff ff ff ff ff ff ff ff ff 9a 98 43 47 |..............CG|
W (31413) esp_ota_ops: _esp_get_otadata_partition: s_ota_select[1] Memory Dump
I (31413) esp_ota_ops: 0x3ffc3fb8 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
I (31423) esp_ota_ops: 0x3ffc3fc8 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
I (31433) esp_ota_ops: [0] aflags/seq:0xffffffff/0x1, pflags/seq:0xffffffff/0x0
W (31443) AWS_OTA: otaPal_GetPlatformImageState -> eImageState: OtaPalImageStateInvalid
I (31453) AWS_OTA: Current State=[RequestingJob], Event=[Start], New state=[RequestingJob]
I (39753) AWS_OTA: Extracted parameter: [key: value]=[execution.jobDocument.afr_ota.protocols: ["MQTT"]]
I (39763) AWS_OTA: Extracted parameter: [key: value]=[filepath: /update]
I (39773) AWS_OTA: Extracted parameter: [key: value]=[filesize: 538704]
I (39783) AWS_OTA: Extracted parameter: [key: value]=[fileid: 0]
I (39783) AWS_OTA: Extracted parameter: [key: value]=[certfile: /OTA_Cert/auth.pem]
I (39793) AWS_OTA: Extracted parameter [ sig-sha256-ecdsa: MEQCIGpoXquTdZr8Kalx+yUq2EFLKaf2... ]
I (39803) AWS_OTA: In self test mode.
I (39813) AWS_OTA: New image has a higher version number than the current image: New image version=4.1.2, Previous image version=0.1.2
I (39823) AWS_OTA: Image version is valid: Begin testing file: File ID=0
W (39833) AWS_OTA: handleSelfTestJobDoc -> setImageStateWithReason: OtaImageStateTesting
W (39833) AWS_OTA: setImageStateWithReason -> setPlatformImageState:1
I (39843) ota_pal: otaPal_SetPlatformImageState, 1
W (39853) ota_pal: Set image as testing!
I (46233) [MQTT OTA]: PINGRESP should not be handled by the application callback when using MQTT_ProcessLoop.
I (46243) [MQTT OTA]: Sent PUBLISH packet to broker $aws/things/device-linkage-test/jobs/AFR_OTA-ota-test/update to broker.
I (46253) [MQTT OTA]: Sent PUBLISH message: {"status":"IN_PROGRESS","statusDetails":{"self_test":"active","updatedBy":"0x04010002"}}
I (46283) AWS_OTA: Job parsing success: OtaJobParseErr_t=OtaJobParseErrNone, Job name=AFR_OTA-ota-test
I (46293) [MQTT OTA]: Received OtaJobEventReceivedJob callback from OTA Agent.
W (46293) ota_pal: otaPal_GetPlatformImageState
W (46303) esp_ota_ops: aws_esp_ota_get_boot_flags: 1
W (46313) esp_ota_ops: _esp_get_otadata_partition: find_partition: Address:d000, Size:8192
W (46313) esp_ota_ops: _esp_get_otadata_partition: gen_0_seq:1, gen_1_seq:0
W (46323) esp_ota_ops: _esp_get_otadata_partition: s_ota_select[0] Memory Dump
I (46333) esp_ota_ops: 0x3ffc3f78 01 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff |................|
I (46343) esp_ota_ops: 0x3ffc3f88 ff ff ff ff ff ff ff ff ff ff ff ff 9a 98 43 47 |..............CG|
W (46353) esp_ota_ops: _esp_get_otadata_partition: s_ota_select[1] Memory Dump
I (46363) esp_ota_ops: 0x3ffc3f98 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
I (46383) esp_ota_ops: 0x3ffc3fa8 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
I (46393) esp_ota_ops: [0] aflags/seq:0xffffffff/0x1, pflags/seq:0xffffffff/0x0
W (46403) AWS_OTA: otaPal_GetPlatformImageState -> eImageState: OtaPalImageStateInvalid
I (46403) [MQTT OTA]: Received OtaJobEventProcessed callback from OTA Agent.
I (46413) AWS_OTA: Current State=[CreatingFile], Event=[ReceivedJobDocument], New state=[CreatingFile]
I (46423) AWS_OTA: Beginning self-test.
W (46433) ota_pal: otaPal_GetPlatformImageState
W (46433) esp_ota_ops: aws_esp_ota_get_boot_flags: 1
W (46443) esp_ota_ops: _esp_get_otadata_partition: find_partition: Address:d000, Size:8192
W (46453) esp_ota_ops: _esp_get_otadata_partition: gen_0_seq:1, gen_1_seq:0
W (46453) esp_ota_ops: _esp_get_otadata_partition: s_ota_select[0] Memory Dump
I (46463) esp_ota_ops: 0x3ffc3fa8 01 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff |................|
I (46473) esp_ota_ops: 0x3ffc3fb8 ff ff ff ff ff ff ff ff ff ff ff ff 9a 98 43 47 |..............CG|
W (46483) esp_ota_ops: _esp_get_otadata_partition: s_ota_select[1] Memory Dump
I (46493) esp_ota_ops: 0x3ffc3fc8 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
I (46503) esp_ota_ops: 0x3ffc3fd8 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
I (46513) esp_ota_ops: [0] aflags/seq:0xffffffff/0x1, pflags/seq:0xffffffff/0x0
W (46523) AWS_OTA: otaPal_GetPlatformImageState -> eImageState: OtaPalImageStateInvalid
W (46533) AWS_OTA: Rejecting new image and rebooting:The job is in the self-test state while the platform is not.
W (46543) AWS_OTA: inSelfTestHandler -> setImageStateWithReason: OtaImageStateRejected
W (46553) AWS_OTA: setImageStateWithReason -> setPlatformImageState:3
I (46553) ota_pal: otaPal_SetPlatformImageState, 3
W (46563) ota_pal: Set image as invalid!
W (46563) esp_ota_ops: aws_esp_ota_get_boot_flags: 1
W (46573) esp_ota_ops: _esp_get_otadata_partition: find_partition: Address:d000, Size:8192
W (46583) esp_ota_ops: _esp_get_otadata_partition: gen_0_seq:1, gen_1_seq:0
W (46593) esp_ota_ops: _esp_get_otadata_partition: s_ota_select[0] Memory Dump
I (46593) esp_ota_ops: 0x3ffc3f88 01 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff |................|
I (46603) esp_ota_ops: 0x3ffc3f98 ff ff ff ff ff ff ff ff ff ff ff ff 9a 98 43 47 |..............CG|
W (46613) esp_ota_ops: _esp_get_otadata_partition: s_ota_select[1] Memory Dump
I (46623) esp_ota_ops: 0x3ffc3fa8 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
I (46633) esp_ota_ops: 0x3ffc3fb8 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
I (46643) esp_ota_ops: [0] aflags/seq:0xffffffff/0x1, pflags/seq:0xffffffff/0x0
W (46653) ota_pal: Image not in self test mode 4294967295
W (46663) esp_ota_ops: aws_esp_ota_get_boot_flags: 1
W (46663) esp_ota_ops: _esp_get_otadata_partition: find_partition: Address:d000, Size:8192
W (46673) esp_ota_ops: _esp_get_otadata_partition: gen_0_seq:1, gen_1_seq:0
W (46683) esp_ota_ops: _esp_get_otadata_partition: s_ota_select[0] Memory Dump
I (46693) esp_ota_ops: 0x3ffc3f88 01 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff |................|
I (46703) esp_ota_ops: 0x3ffc3f98 ff ff ff ff ff ff ff ff ff ff ff ff 9a 98 43 47 |..............CG|
W (46713) esp_ota_ops: _esp_get_otadata_partition: s_ota_select[1] Memory Dump
I (46723) esp_ota_ops: 0x3ffc3fa8 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
I (46733) esp_ota_ops: 0x3ffc3fb8 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
I (46743) esp_ota_ops: [0] aflags/seq:0xffffffff/0x1, pflags/seq:0xffffffff/0x0
I (48253) [MQTT OTA]: Sent PUBLISH message: {"status":"FAILED","statusDetails":{"reason":"rejected: 0x00000011"}}
@kstribrnAmzn We found the cause. It was because CONFIG_BOOTLOADER_APP_ROLLBACK_ENABLE was not enabled. I had thought it was just a setting to enable rollback, so I had disabled it first, But I see that by enabling it, the function to update the ota_state flag in the boot loader is also enabled. I learned a great deal. https://docs.espressif.com/projects/esp-idf/en/v4.2/esp32/api-reference/system/ota.html#app-rollback
Thanks for taking time to report back :)
Closing this issue as your problem is solved
I tried running the MQTT OTA sample using the Cellular interface Library. When I executed the job, block transfer started using MQTT communication. After receiving all the blocks, the device restarted, and I confirmed that the app version had been updated to a new one. However, for the OTA job, it published status: FAILED, and it also showed as failed on the AWS console.
I suspect that the cause of this issue is that the values of aflag and pflag read from _esp_get_otadata_partition are both 0xffffffff. However, I am just running the sample as is. Do I need any additional implementation?