aws / aws-iot-device-sdk-embedded-C

SDK for connecting to AWS IoT from a device using embedded C.
MIT License

Segmentation fault after OtaJobEventSelfTestFailed within 20s #1814

Closed Nomidia closed 1 year ago

Nomidia commented 2 years ago

Steps to reproduce:

  1. After downloading the firmware, the otaThread will exit.
  2. Then restart the demo; it will be in self-test mode, and OtaSelfTestTimer will be created.
  3. Since the version number has not changed, OTA_Shutdown will be invoked.
  4. In shutdownHandler, otaAgent will be cleared.
  5. After several seconds the OtaSelfTestTimer expires, and otaAgent.pOtaInterface->pal.reset causes a segfault.

segfault.log

I think it would be better if we could stop the timer.
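
To illustrate the idea, here is a minimal sketch of the ordering problem and the suggested fix. It is not the SDK's code: `DemoAgent`, `DemoOtaInterface`, `demo_shutdown`, `demo_self_test_timer_expired` and the other `demo*` names are hypothetical stand-ins, and the timer is modelled as a plain flag rather than a real OS timer. The point is only that shutdown disarms the timer before clearing the agent, and that the expiry path checks the pointers before calling through them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the agent structures; not the SDK's real types. */
typedef struct {
    int (*reset)(void);              /* platform reset hook, in the role of pal.reset */
} DemoPal;

typedef struct {
    DemoPal pal;
} DemoOtaInterface;

typedef struct {
    const DemoOtaInterface *pOtaInterface;
} DemoAgent;

static DemoAgent demoAgent;          /* plays the role of otaAgent */
static bool demoTimerArmed;          /* plays the role of the self-test timer */
static int demoResetCalls;           /* counts calls to the reset hook */

static int demo_reset(void)
{
    demoResetCalls++;
    return 0;
}

static const DemoOtaInterface demoInterface = { { demo_reset } };

/* Enter self-test mode: remember the interface and arm the timer. */
static void demo_start_self_test(const DemoOtaInterface *iface)
{
    assert(iface != NULL);
    demoAgent.pOtaInterface = iface;
    demoTimerArmed = true;
}

/* Shutdown: stop the timer first, then clear the agent state. */
static void demo_shutdown(void)
{
    demoTimerArmed = false;
    demoAgent.pOtaInterface = NULL;
}

/* Timer expiry path. Returns true only if the reset hook was actually called. */
static bool demo_self_test_timer_expired(void)
{
    if (!demoTimerArmed) {
        return false;                /* timer was stopped, nothing to do */
    }
    demoTimerArmed = false;
    if (demoAgent.pOtaInterface == NULL || demoAgent.pOtaInterface->pal.reset == NULL) {
        return false;                /* agent cleared or no reset hook: do not dereference */
    }
    demoAgent.pOtaInterface->pal.reset();
    return true;
}
```

With this ordering, calling `demo_start_self_test(&demoInterface)`, then `demo_shutdown()`, then `demo_self_test_timer_expired()` returns false and never touches the cleared agent. Without the first line of `demo_shutdown` and without the NULL checks, the same sequence would call through a NULL `pOtaInterface`, which matches the sequence described in the steps above.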

dachalco commented 2 years ago

Hi @Nomidia

Thank you for the detailed backtrace and logs. This issue mainly applies to the Linux platform port we supply for the demos, as it does not implement pal.reset. On an actual MCU, this trace should end at step 4, where the MCU is reset after returning from the user's OtaJobEventSelfTestFailed handler.

However, I think you've still pointed out a valid race condition: it's theoretically possible for the self-test timer to expire before reset is called if the test duration were unreasonably short. I think you're right, the self-test timer should be stopped here. I'll prepare a PR.

ActoryOu commented 1 year ago

I believe this has been fixed by a commit in the OTA repo, which has also been patched into this repo, so I'm going to close this issue.

Thanks a lot!