dhewg / esphome-miot

ESPHome components for MIoT devices
Other
42 stars 17 forks source link

Guru Meditation Error in miot.cpp #11

Closed pchmura4 closed 8 months ago

pchmura4 commented 8 months ago

Hi, I have flashed the firmware on Mi Purifier 3C (single core ESP32), and it's experiencing a bootloop. The flash chip was erased completely before proceeding.

[C][api:140]:   Address: purifier.local:6053
[C][api:142]:   Using noise encryption: YES
Guru Meditation Error: Core  0 panic'ed (LoadProhibited). Exception was unhandled.

Core  0 register dump:
PC      : 0x40001277  PS      : 0x00060f30  A0      : 0x800db40b  A1      : 0x3ffbccb0
A2      : 0x00000000  A3      : 0x3f405006  A4      : 0x3ffbccf8  A5      : 0x00000004
A6      : 0x00ff0000  A7      : 0xff000000  A8      : 0x8000c71a  A9      : 0x3ffbcc80
A10     : 0x00000000  A11     : 0x3f41f66b  A12     : 0x3ffbccf8  A13     : 0x00000001
A14     : 0x00ff0000  A15     : 0xff000000  SAR     : 0x0000000a  EXCCAUSE: 0x0000001c
EXCVADDR: 0x00000000  LBEG    : 0x400014fd  LEND    : 0x4000150d  LCOUNT  : 0xffffffff

Backtrace: 0x40001274:0x3ffbccb0 0x400db408:0x3ffbccc0 0x400db5b1:0x3ffbcd20 0x4015f91d:0x3ffbcd50 0x4015f9a5:0x3ffbcd70 0x400e2020:0x3ffbcd90 0x400e4862:0x3ffbcdc0 0x400d9242:0x3ffbcde0

Decoded stack trace below:

0x400db408: esphome::miot::Miot::process_message_(char*) at /config/.esphome/build/purifier/src/esphome/components/miot/miot.cpp:324
0x400db5b1: esphome::miot::Miot::loop() at /config/.esphome/build/purifier/src/esphome/components/miot/miot.cpp:96
0x4015f91d: esphome::Component::call_loop() at /config/.esphome/build/purifier/src/esphome/core/component.cpp:76
0x4015f9a5: esphome::Component::call() at /config/.esphome/build/purifier/src/esphome/core/component.cpp:98
0x400e2020: esphome::Application::loop() at /config/.esphome/build/purifier/src/esphome/core/application.cpp:74
0x400e4862: loop() at /config/.esphome/build/purifier/zhimi.airp.mb4a.yaml:44
0x400d9242: esphome::loop_task(void*) at /config/.esphome/build/purifier/src/esphome/components/esp32/core.cpp:69
dhewg commented 8 months ago

I assume you're using the most recent commit, in which case my first guess is that the esp idf strcmp function assumes the args are valid and non-null.

Does this help?

diff --git a/components/miot/miot.cpp b/components/miot/miot.cpp
index f33239a..c604b5c 100644
--- a/components/miot/miot.cpp
+++ b/components/miot/miot.cpp
@@ -321,7 +321,8 @@ void Miot::process_message_(char *msg) {
   } else if (cmd == "net") {
     send_reply_(get_net_reply_());
   } else if (cmd == "time") {
-    bool posix = std::strcmp(strtok_r(nullptr, " ", &saveptr), "posix") == 0;
+    const char *arg = strtok_r(nullptr, " ", &saveptr);
+    bool posix = arg && std::strcmp(arg, "posix") == 0;
     send_reply_(get_time_reply_(posix).c_str());
   } else if (cmd == "mac") {
     send_reply_(get_mac_address().c_str());
pchmura4 commented 8 months ago

Yes, I cloned the repo just today, esphome is the newest official Docker image. Unfortunately this did not help. The stack trace still points to the line where the string is compared

dhewg commented 8 months ago

What if you just use bool posix = false; then? And can you please increase the log level to VERBOSE and paste the few lines before the crash?

pchmura4 commented 8 months ago

Now it connects to HA, and reboots The config steps are taken from there https://sourcediver.org/posts/240218_unclouded_purifier/

[D][select:015]: 'Mode': Sending state Auto (index 0)
[D][number:012]: 'Favorite Speed': Sending state 2000.000000
[D][sensor:094]: 'PM2.5 Density': Sending state 37.00000 µg/m³ with 0 decimals of accuracy
[D][sensor:094]: 'Filter Life Level': Sending state 43.00000 % with 0 decimals of accuracy
[D][sensor:094]: 'Filter Used Time': Sending state 2004.00000 h with 0 decimals of accuracy
[W][miot:201]: Received property value without component: 7 1 true
[D][number:012]: 'Display Brightness': Sending state 8.000000

 Guru Meditation Error: Core  0 panic'ed (LoadProhibited). Exception was unhandled.

Core  0 register dump:
PC      : 0x40001277  PS      : 0x00060f30  A0      : 0x800db40b  A1      : 0x3ffbccb0
A2      : 0x00000000  A3      : 0x3f405006  A4      : 0x3ffbccf8  A5      : 0x00000004
A6      : 0x00ff0000  A7      : 0xff000000  A8      : 0x8000c71a  A9      : 0x3ffbcc80
A10     : 0x00000000  A11     : 0x3f41f66b  A12     : 0x3ffbccf8  A13     : 0x00000001
A14     : 0x00ff0000  A15     : 0xff000000  SAR     : 0x00000018  EXCCAUSE: 0x0000001c
EXCVADDR: 0x00000000  LBEG    : 0x400014fd  LEND    : 0x4000150d  LCOUNT  : 0xffffffff

Backtrace: 0x40001274:0x3ffbccb0 0x400db408:0x3ffbccc0 0x400db5b1:0x3ffbcd20 0x4015f91d:0x3ffbcd50 0x4015f9a5:0x3ffbcd70 0x400e2020:0x3ffbcd90 0x400e4862:0x3ffbcdc0 0x400d9242:0x3ffbcde0
dhewg commented 8 months ago

Hm, that looks fishy... What esphome version do you use to build the firmware? (run esphome version) Does a esphome clean my_purifier.yaml followed by a rebuild and upload fix it? /cc @mguentner does a firmware including the latest commit work on your 3c?

pchmura4 commented 8 months ago

I was running esphome clean my_purifier.yaml each time I was changing something, just to be sure.

# esphome version
Version: 2024.2.1
dhewg commented 8 months ago

Hm okay....

A build with that esphome version works just fine on my 4lite, let's wait for mguentner and his 3c test results.

pchmura4 commented 8 months ago

BTW this looks like a really old ESP32 revision, maybe that could be also something to look at

esptool.py v4.7.0
Serial port COM7
Connecting.......
Detecting chip type... Unsupported detection protocol, switching and trying again...
Connecting...
Detecting chip type... ESP32
Chip is unknown ESP32 (revision v1.0)
Features: WiFi, BT, Single Core, 240MHz, VRef calibration in efuse, Coding Scheme 3/4
Crystal is 40MHz
MAC: 58:b6:23:30:09:65
Uploading stub...
Running stub...
Stub running...
Configuring flash size...
Flash will be erased from 0x00010000 to 0x000dffff...
Compressed 848048 bytes to 565746...
Wrote 848048 bytes (565746 compressed) at 0x00010000 in 50.3 seconds (effective 134.8 kbit/s)...
Hash of data verified.
mguentner commented 8 months ago

I am running 2023.12.9 originally with a77cb9c, now 7eb6e96. Both give me the same panic as @pchmura4 has observed:

                   Guru Meditation Error: Core  0 panic'ed (LoadProhibited). Exception was unhandled.

Core  0 register dump:
PC      : 0x40001277  PS      : 0x00060f30  A0      : 0x800db3fb  A1      : 0x3ffbcc10  
A2      : 0x00000000  A3      : 0x3f404ffa  A4      : 0x3ffbcc58  A5      : 0x00000004  
A6      : 0x00ff0000  A7      : 0xff000000  A8      : 0x8000c71a  A9      : 0x3ffbcbe0  
A10     : 0x00000000  A11     : 0x3f41f58b  A12     : 0x3ffbcc58  A13     : 0x00000001  
A14     : 0x00ff0000  A15     : 0xff000000  SAR     : 0x0000000a  EXCCAUSE: 0x0000001c  
EXCVADDR: 0x00000000  LBEG    : 0x400014fd  LEND    : 0x4000150d  LCOUNT  : 0xffffffff  

I wouldn't have noticed if not for this issue as I never realized that my device was doing reboots :sweat_smile: . Since only the ESP32 reboots, the overall state remains good.

dhewg commented 8 months ago

I can only guess with those bits of information.

Maybe it's related to the serial communications, and I just pushed the test (so replace the external components url part @main with @test) branch with some tiny additions. Does that improve the situation? Please always run with VERBOSE and paste more log lines, let's say at least 20 before the crash.

Maybe it's due to the dual uarts (debug & mcu), does reducing the log level yield less crashes?

pchmura4 commented 8 months ago

Works fine now, that was a good find!

dhewg commented 8 months ago

Huh :D

There're three commits on that test branch, can you pin which one actually fixes the issue?

pchmura4 commented 8 months ago

We tested this before without success https://github.com/dhewg/esphome-miot/commit/4317000b3825eb5429f17d7b123d5efd490cf4e7 My bet would be the two that's left probably, hard to tell now after assembling the unit 😅

dhewg commented 8 months ago

We tested this before without success 4317000

Not exactly, that one has an addition check.

But I just pushed the other two commit to main, does that still work?

mguentner commented 8 months ago

I still see the issue both on main and test. Serial Log with VERBOSE

[V][miot:189]: Sending reply 'lan' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'net'
[V][miot:189]: Sending reply 'lan' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'net'
[V][miot:189]: Sending reply 'lan' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'net'
[V][miot:189]: Sending reply 'lan' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'get_down'
[V][miot:189]: Sending reply 'down none' to MCU
[V][miot:095]: Received MCU message 'net'
[V][miot:189]: Sending reply 'lan' to MCU
[V][miot:095]: Received MCU message 'time'Guru Meditation Error: Core  0 panic'ed (LoadProhibited). Excep
tion was unhandled.

Core  0 register dump:
PC      : 0x40001277  PS      : 0x00060f30  A0      : 0x800db7f7  A1      : 0x3ffbcc10  
A2      : 0x00000000  A3      : 0x3f4052f0  A4      : 0x3ffbcc58  A5      : 0x00000004  
A6      : 0x00ff0000  A7      : 0xff000000  A8      : 0x8000c71a  A9      : 0x3ffbcbe0  
A10     : 0x00000000  A11     : 0x3f420037  A12     : 0x3ffbcc58  A13     : 0x00000001  
A14     : 0x00ff0000  A15     : 0xff000000  SAR     : 0x00000004  EXCCAUSE: 0x0000001c  
EXCVADDR: 0x00000000  LBEG    : 0x400014fd  LEND    : 0x4000150d  LCOUNT  : 0xffffffff  

Backtrace: 0x40001274:0x3ffbcc10 0x400db7f4:0x3ffbcc20 0x400db9b1:0x3ffbcc80 0x4015faa9:0x3ffbccb0 0x4015
fb31:0x3ffbccd0 0x400e29bc:0x3ffbccf0 0x400e5212:0x3ffbcd20 0x400d94ea:0x3ffbcd40

I seems to be related to the time message.

pchmura4 commented 8 months ago

I can second that, it's related to time. In my case I have a warning that it's not able to get MCU time, probably that's why it's not crashing in my case

mguentner commented 8 months ago

Now it works. I actually compiled the wrong version of esphome-miot as the device yaml again referenced github instead of a local path.

dhewg commented 8 months ago

I was about to say that every test result contradicts another... but okay, that's good! ;)

Just to make sure I understood correctly: main as-is works, and the additional strcmp patch on the test branch isn't required?

mguentner commented 8 months ago

Just tested: main: exception is thrown test: works fine

dhewg commented 8 months ago

Alright, thanks for the confirmation!

It's all part of the main branch now, let me know if there're any further issues.