sharandac / My-TTGO-Watch

A GUI named hedge for smartwatch like devices based on ESP32. Currently support for T-Watch2020 (V1,V2,V3), T-Watch2021, M5Paper, M5Core2 and native Linux support for testing.
GNU General Public License v2.0

Reboot loop with 2020 v2 #368

Closed nicolasff closed 1 year ago

nicolasff commented 1 year ago

Hello!

Thanks for building this software. I've been trying to run it recently but am having some persistent issues that I haven't been able to track down in this large code base.

Build tool used:

used Hardware:

Description of problem:

I used PlatformIO and Visual Studio Code to flash a build to my 2020 v2 watch, using a checkout on the latest commit (7ab2f2fc23b5cafc0d6293efd1107e7b750d5526). I am using the env:t-watch2020-v2 build target, and have only used this build target.

The main issues are:

I did make some changes before building:

Additional information and things you've tried:

I also tried formatting the SPIFFS as recommended in this issue, but it didn't help.

I also tried listing the files saved on SPIFFS at startup (in setup()) and saw that even though there were some JSON files (some new ones with each successive reboot), ~they all seemed to be empty (I used File f = SPIFFS.open(…) and then saw that f.size() was always zero)~ edit: I was reading them wrong by using the /spiffs prefix; the files that get created are not actually empty.
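The path mix-up mentioned in the edit can be illustrated on its own. With the Arduino-ESP32 SPIFFS wrapper, the file system is mounted under a base path (`/spiffs` by default), while `SPIFFS.open()` expects paths relative to the file system root, such as `/device.json`. A minimal host-side sketch, assuming the default mount point; `to_fs_path` is a hypothetical helper, not part of the firmware or the Arduino core:

```cpp
#include <string>

// Hypothetical helper: strips an assumed mount point ("/spiffs") so the
// result can be handed to an API that expects file-system-relative paths.
std::string to_fs_path(const std::string& path,
                       const std::string& mount = "/spiffs") {
    // Only strip when the prefix is a whole path component,
    // so "/spiffsx/a" is left alone.
    if (path.compare(0, mount.size(), mount) == 0 &&
        (path.size() == mount.size() || path[mount.size()] == '/')) {
        std::string rest = path.substr(mount.size());
        return rest.empty() ? "/" : rest;
    }
    return path;  // already relative to the file system root
}
```

With this, `to_fs_path("/spiffs/device.json")` yields `/device.json`, which is the form the config loader's log lines in this thread use.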

I captured a long sequence of logs from a fresh build that I ran immediately after doing a SPIFFS format, with the extra debug info listing the files in SPIFFS at each reboot. The file is very long, so I'm attaching it separately in this Gist.

Sequence of errors:

The first boot shows that the firmware tries to load a number of JSON files, none of which are found. The lines all look like this:

[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /device.json
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /motor.json
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /display.json

Then on first boot the watch crashes with a Core 1 panic'ed (LoadProhibited) and the following stack trace:

Backtrace: 0x4000127a:0x3ffda0f0 0x401149fa:0x3ffda100 0x4019bb41:0x3ffda140 0x4019bc20:0x3ffda260 0x40094d42:0x3ffda290
  #0  0x4000127a:0x3ffda0f0 in ?? ??:0
  #1  0x401149fa:0x3ffda100 in std::_Function_handler<void (system_event_id_t, system_event_info_t), wifictl_setup()::{lambda(system_event_id_t, system_event_info_t)#2}>::_M_invoke(std::_Any_data const&, system_event_id_t&&, system_event_info_t&&) at src/hardware/wifictl.cpp:141
      (inlined by) _M_invoke at /Users/localuser/.platformio/packages/toolchain-xtensa32/xtensa-esp32-elf/include/c++/5.2.0/functional:1871
  #2  0x4019bb41:0x3ffda140 in std::function<void (system_event_id_t, system_event_info_t)>::operator()(system_event_id_t, system_event_info_t) const at /Users/localuser/.platformio/packages/toolchain-xtensa32/xtensa-esp32-elf/include/c++/5.2.0/functional:2114
      (inlined by) WiFiGenericClass::_eventCallback(void*, system_event_t*, wifi_prov_event_t*) at /Users/localuser/.platformio/packages/framework-arduinoespressif32/libraries/WiFi/src/WiFiGeneric.cpp:475
  #3  0x4019bc20:0x3ffda260 in _network_event_task(void*) at /Users/localuser/.platformio/packages/toolchain-xtensa32/xtensa-esp32-elf/include/c++/5.2.0/functional:2114
  #4  0x40094d42:0x3ffda290 in vPortTaskWrapper at /home/sharan/temp/esp32-arduino-lib-builder/esp-idf/components/freertos/port.c:355 (discriminator 1)
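
Each entry in an ESP32 `Backtrace:` line is a `PC:SP` pair (program counter, then stack pointer); the program counters are what get resolved to the function names and source lines in the decoded frames above. A minimal host-side sketch that extracts the pairs; `Frame` and `parse_backtrace` are illustrative names, not part of any existing tool:

```cpp
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

struct Frame {
    std::uint32_t pc;  // program counter of the frame
    std::uint32_t sp;  // stack pointer of the frame
};

// Parses "Backtrace: 0xPC:0xSP 0xPC:0xSP ..." into frames.
// Tokens that are not well-formed hex pairs are skipped.
std::vector<Frame> parse_backtrace(const std::string& line) {
    std::vector<Frame> frames;
    std::istringstream in(line);
    std::string token;
    while (in >> token) {
        const auto colon = token.find(':');
        if (colon == std::string::npos || colon == 0 ||
            colon + 1 >= token.size())
            continue;  // e.g. the leading "Backtrace:" label
        const std::string pc_text = token.substr(0, colon);
        const std::string sp_text = token.substr(colon + 1);
        try {
            std::size_t used_pc = 0, used_sp = 0;
            const unsigned long pc = std::stoul(pc_text, &used_pc, 16);
            const unsigned long sp = std::stoul(sp_text, &used_sp, 16);
            if (used_pc == pc_text.size() && used_sp == sp_text.size())
                frames.push_back({static_cast<std::uint32_t>(pc),
                                  static_cast<std::uint32_t>(sp)});
        } catch (const std::exception&) {
            // not a hex number; ignore the token
        }
    }
    return frames;
}
```

The extracted program counters can then be passed to the toolchain's `addr2line` together with the matching `firmware.elf` to reproduce a decoded trace like the one above.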

The second boot found the following files on SPIFFS at startup:

It panicked after a few seconds with InstrFetchProhibited and this stack trace:

Backtrace: 0x3a385a21:0x3ffd4920 0x401929fe:0x3ffd4980 0x40195535:0x3ffd49c0 0x401925d2:0x3ffd49e0 0x4019650d:0x3ffd4a20 0x4017040a:0x3ffd4a40 0x40195960:0x3ffd4a70 0x400e06f1:0x3ffd4af0 0x400deccf:0x3ffd4b30 0x40115787:0x3ffd4b50 0x401d5ada:0x3ffd4ca0 0x40094d42:0x3ffd4cc0
  #0  0x3a385a21:0x3ffd4920 in ?? ??:0
  #1  0x401929fe:0x3ffd4980 in lv_page_scrollable_signal at .pio/libdeps/t-watch2020-v2/TTGO TWatch Library/src/lvgl/src/lv_widgets/lv_page.c:958
  #2  0x40195535:0x3ffd49c0 in lv_textarea_scrollable_signal at .pio/libdeps/t-watch2020-v2/TTGO TWatch Library/src/lvgl/src/lv_widgets/lv_textarea.c:1154
  #3  0x401925d2:0x3ffd49e0 in lv_page_signal at .pio/libdeps/t-watch2020-v2/TTGO TWatch Library/src/lvgl/src/lv_widgets/lv_page.c:849
  #4  0x4019650d:0x3ffd4a20 in lv_textarea_signal at .pio/libdeps/t-watch2020-v2/TTGO TWatch Library/src/lvgl/src/lv_widgets/lv_textarea.c:1411
  #5  0x4017040a:0x3ffd4a40 in lv_obj_set_size at .pio/libdeps/t-watch2020-v2/TTGO TWatch Library/src/lvgl/src/lv_core/lv_obj.c:3744
  #6  0x40195960:0x3ffd4a70 in lv_textarea_create at .pio/libdeps/t-watch2020-v2/TTGO TWatch Library/src/lvgl/src/lv_widgets/lv_textarea.c:1154
  #7  0x400e06f1:0x3ffd4af0 in kodi_remote_app_setup_setup(unsigned int) at src/app/kodi_remote/kodi_remote_app_setup.cpp:94
  #8  0x400deccf:0x3ffd4b30 in kodi_remote_app_setup() at src/app/kodi_remote/kodi_remote_app.cpp:89
  #9  0x40115787:0x3ffd4b50 in setup() at src/main.cpp:137
  #10 0x401d5ada:0x3ffd4ca0 in loopTask(void*) at /Users/localuser/.platformio/packages/framework-arduinoespressif32/cores/esp32/main.cpp:18
  #11 0x40094d42:0x3ffd4cc0 in vPortTaskWrapper at /home/sharan/temp/esp32-arduino-lib-builder/esp-idf/components/freertos/port.c:355 (discriminator 1)

The third boot found the same two files, plus:

before crashing with InstrFetchProhibited at:

Backtrace: 0x40275c71:0x3ffb5e90 0x4027555f:0x3ffb5eb0 0x402154d1:0x3ffb5f70 0x4021558d:0x3ffb5fa0 0x4021af0d:0x3ffb5fd0 0x40281406:0x3ffb5ff0 0x40094d42:0x3ffb6020
  #0  0x40275c71:0x3ffb5e90 in register_chipv7_phy at /home/aiqin/git_tree/chip7.1_phy/chip_7.1/board_code/app_test/pp/phy/phy_chip_v7.c:1362
  #1  0x4027555f:0x3ffb5eb0 in bb_init at /home/aiqin/git_tree/chip7.1_phy/chip_7.1/board_code/app_test/pp/phy/phy_chip_v7.c:1362
  #2  0x402154d1:0x3ffb5f70 in esp_phy_rf_init at /home/sharan/temp/esp32-arduino-lib-builder/esp-idf/components/esp32/phy_init.c:524
  #3  0x4021558d:0x3ffb5fa0 in esp_modem_sleep_exit at /home/sharan/temp/esp32-arduino-lib-builder/esp-idf/components/esp32/phy_init.c:524
  #4  0x4021af0d:0x3ffb5fd0 in btdm_sleep_exit_phase3_wrapper at /home/sharan/temp/esp32-arduino-lib-builder/esp-idf/components/bt/bt.c:1708
  #5  0x40281406:0x3ffb5ff0 in btdm_controller_task at ??:?
  #6  0x40094d42:0x3ffb6020 in vPortTaskWrapper at /home/sharan/temp/esp32-arduino-lib-builder/esp-idf/components/freertos/port.c:355 (discriminator 1)

There's a lot more; the Gist with more logs has all the details for 8 successive boot sequences.

Is there anything I did wrong in setting this up, or is there anything I can try to resolve this?

Thanks in advance.

sharandac commented 1 year ago

Thank you for your feedback! I tried to reproduce the error, but unfortunately I did not succeed.

Could you please test the following and give me feedback? This should give us a defined initial state. And if possible, please use the latest master. The absence of the JSON files is normal: if they are not available, a default config is used. The JSON files are only created when changes occur.

flash and run

1.) erase Flash
2.) build and upload flash firmware

After that the Watch should start and reformat the SPIFFS first, this may take a while. And if everything works, the watch should start correctly. At least that's the hope.

E (2685) SPIFFS: mount failed, -10025
[E][SPIFFS.cpp:89] begin(): Mounting SPIFFS failed! Error: -1
[E][vfs_api.cpp:22] open(): File system is not mounted
[E][basejsonconfig.cpp:72] load(): Can't open file: /device.json!
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /device.json
[I][device.cpp:40] device_setup(): set device name to 'T-Watch2020V2'
[W][sd_diskio.cpp:471] ff_sd_initialize(): GO_IDLE_STATE failed
[E][sd_diskio.cpp:741] sdcard_mount(): f_mount failed 0x(3)
[E][sdcard.cpp:95] sdcard_setup(): SD Card Mount Failed
[E][vfs_api.cpp:22] open(): File system is not mounted
[E][basejsonconfig.cpp:72] load(): Can't open file: /motor.json!
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /motor.json
[E][vfs_api.cpp:22] open(): File system is not mounted
[E][basejsonconfig.cpp:72] load(): Can't open file: /display.json!
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /display.json
[I][framebuffer.cpp:160] framebuffer_setup(): framebuffer 1: 0x0x3ffd4e30 (4800 bytes, 240x10px)
[I][framebuffer.cpp:177] framebuffer_setup(): framebuffer 2: 0x0x3ffd6100 (4800 bytes, 240x10px)
[E][callback.cpp:248] callback_send(): no callback structure found
[E][callback.cpp:248] callback_send(): no callback structure found
[I][splashscreen.cpp:75] splash_screen_stage_one(): use default boot logo
E (5261) SPIFFS: mount failed, -10025
[E][SPIFFS.cpp:89] begin(): Mounting SPIFFS failed! Error: -1
[I][hardware.cpp:250] hardware_setup(): format SPIFFS
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /pmu.json
[I][pmu.cpp:106] pmu_setup(): init AXP202 pmu controller
[E][callback.cpp:248] callback_send(): no callback structure found
[W][motion.cpp:132] bma_setup(): stepcounter not valid. reset
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /bma.json
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /wificfg.json
[E][callback.cpp:248] callback_send(): no callback structure found
[E][callback.cpp:248] callback_send(): no callback structure found
[E][callback.cpp:248] callback_send(): no callback structure found
[E][callback.cpp:248] callback_send(): no callback structure found
[E][callback.cpp:248] callback_send(): no callback structure found
[E][callback.cpp:248] callback_send(): no callback structure found
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /touch.json
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /rtcctr.json
[E][callback.cpp:248] callback_send(): no callback structure found
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /timesync.json
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /blectl.json
[I][gui.cpp:146] gui_setup(): mainbar setup
[I][gui.cpp:151] gui_setup(): mainbar tile setup
[E][main_tile.cpp:527] main_tile_update_time(): maintile not initialized
[I][gui.cpp:153] gui_setup(): app tile setup
[I][gui.cpp:155] gui_setup(): note tile setup
[I][gui.cpp:157] gui_setup(): setup tile setup
[I][setup_tile.cpp:108] setup_tile_setup(): setup tile finish
[I][gui.cpp:162] gui_setup(): statusbar setup
[I][gui.cpp:164] gui_setup(): quickbar setup
[I][gui.cpp:166] gui_setup(): keyboard setup
[I][gui.cpp:168] gui_setup(): num keyboard setup
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /style.json
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /update.json
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /watchface/watchface_theme.json
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /watchface/watchface_theme.json
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /watchface.json
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /osmmap.json
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /weather.json
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /alarm.json
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /activity.json
[E][calendar_db.cpp:63] calendar_db_setup(): databsae not exist, create database and tables
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /ir-remote.json
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /fx-rates.json
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /powermeter.json
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /kodi_remote.json
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /sound.json
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /gpsctl.json
[E][gpsctl.cpp:116] gpsctl_setup(): set default gps RX on pin 36/TX on pin 26!
I NimBLEDevice: BLE Host Task Started
I NimBLEDevice: NimBle host synced.
W NimBLEAdvertising: Advertising already active
[I][hardware.cpp:297] hardware_post_setup(): Free heap: 115164
[I][hardware.cpp:298] hardware_post_setup(): Free PSRAM heap: 3883144

nicolasff commented 1 year ago

Thanks for the quick reply! I ran the Erase command as you suggested:

Erasing...
esptool.py v3.1
Serial port /dev/tty.wchusbserial54260125411
Connecting...
Failed to get PID of a device on /dev/tty.wchusbserial54260125411, using standard reset sequence.
.
Chip is ESP32-D0WDQ6-V3 (revision 3)
Features: WiFi, BT, Dual Core, 240MHz, VRef calibration in efuse, Coding Scheme None
Crystal is 40MHz
MAC: 30:c6:XX:XX:XX:XX
Uploading stub...
Running stub...
Stub running...
Erasing flash (this may take a while)...
Chip erase completed successfully in 40.2s
Hard resetting via RTS pin...

Then rebuilt:

[...]
Linking .pio/build/t-watch2020-v2/firmware.elf
Retrieving maximum program size .pio/build/t-watch2020-v2/firmware.elf
Checking size .pio/build/t-watch2020-v2/firmware.elf
Advanced Memory Usage is available via "PlatformIO Home > Project Inspect"
RAM:   [===       ]  25.6% (used 83864 bytes from 327680 bytes)
Flash: [=======   ]  66.4% (used 4348557 bytes from 6553600 bytes)
Building .pio/build/t-watch2020-v2/firmware.bin
esptool.py v3.1
Merged 1 ELF section
================================ [SUCCESS] Took 84.34 seconds ================================

By the way, I've noticed that a number of dependencies are listed as >= rather than exact versions, and I wonder if maybe I was pulling some versions that were more recent than what was expected, maybe with different behavior.

Could you maybe share the dependency graph that it produced for your build? This might help find the difference. The list is a bit long for this comment, so I've attached it to this Gist.

I removed every single change I had made in platformio.ini to get more logs, in order to get as close as possible to the latest master. The only change I have in platformio.ini now is the upload_port device, nothing else. This means that my new logs are less verbose and don't all have line numbers.

I noticed a difference in the first boot after erasing: I hadn't seen the initial SPIFFS erase before, but it shows up clearly in the logs on first boot after an erase:

[E][SPIFFS.cpp:89] begin(): Mounting SPIFFS failed! Error: -1
[I][hardware.cpp:250] hardware_setup(): format SPIFFS

Comparing my first boot sequence with yours, I can see that they both log the same events until this point:

[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /style.json
[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /update.json

and then yours logs the watchface_theme.json file being loaded with defaults:

[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /watchface/watchface_theme.json

while mine crashed before that, with:

[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /update.json
Guru Meditation Error: Core  1 panic'ed (LoadProhibited). Exception was unhandled.
Core 1 register dump:
PC      : 0x4000127a  PS      : 0x00060230  A0      : 0x80110b00  A1      : 0x3ffd9c90  
A2      : 0x3f8040a4  A3      : 0x00000000  A4      : 0x000000ff  A5      : 0x0000ff00  
A6      : 0x00ff0000  A7      : 0xff000000  A8      : 0x0000006e  A9      : 0x00eeeefe  
A10     : 0x00000003  A11     : 0x3ffddca4  A12     : 0x000000e0  A13     : 0x00000001  
A14     : 0x00000006  A15     : 0x00000008  SAR     : 0x0000001b  EXCCAUSE: 0x0000001c  
EXCVADDR: 0x00000000  LBEG    : 0x400012e5  LEND    : 0x40001309  LCOUNT  : 0x80110ae7  

ELF file SHA256: 0000000000000000

Backtrace: 0x4000127a:0x3ffd9c90 0x40110afd:0x3ffd9ca0 0x40198829:0x3ffd9cd0 0x40198908:0x3ffd9dc0 0x40094d1e:0x3ffd9df0
  #0  0x4000127a:0x3ffd9c90 in ?? ??:0
  #1  0x40110afd:0x3ffd9ca0 in std::_Function_handler<void (system_event_id_t, system_event_info_t), wifictl_setup()::{lambda(system_event_id_t, system_event_info_t)#2}>::_M_invoke(std::_Any_data const&, system_event_id_t&&, system_event_info_t&&) at src/hardware/wifictl.cpp:141
      (inlined by) _M_invoke at /Users/localuser/.platformio/packages/toolchain-xtensa32/xtensa-esp32-elf/include/c++/5.2.0/functional:1871
  #2  0x40198829:0x3ffd9cd0 in std::function<void (system_event_id_t, system_event_info_t)>::operator()(system_event_id_t, system_event_info_t) const at /Users/localuser/.platformio/packages/toolchain-xtensa32/xtensa-esp32-elf/include/c++/5.2.0/functional:2114
      (inlined by) WiFiGenericClass::_eventCallback(void*, system_event_t*, wifi_prov_event_t*) at /Users/localuser/.platformio/packages/framework-arduinoespressif32/libraries/WiFi/src/WiFiGeneric.cpp:475
  #3  0x40198908:0x3ffd9dc0 in _network_event_task(void*) at /Users/localuser/.platformio/packages/toolchain-xtensa32/xtensa-esp32-elf/include/c++/5.2.0/functional:2114
  #4  0x40094d1e:0x3ffd9df0 in vPortTaskWrapper at /home/sharan/temp/esp32-arduino-lib-builder/esp-idf/components/freertos/port.c:355 (discriminator 1)
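
Two fields of the register dump above can be read directly: `EXCCAUSE` (0x1c, i.e. 28, the Xtensa code for LoadProhibited, matching the panic message) and `EXCVADDR` (0x00000000, the address whose access faulted, which here points to a NULL pointer). A small lookup sketch for a few common cause codes; the table is deliberately partial, and `exccause_name` and `looks_like_null_deref` are illustrative names:

```cpp
#include <cstdint>
#include <string>

// Partial map of Xtensa EXCCAUSE codes to the names shown in panic messages.
std::string exccause_name(std::uint32_t cause) {
    switch (cause) {
        case 0:  return "IllegalInstruction";
        case 3:  return "LoadStoreError";
        case 6:  return "IntegerDivideByZero";
        case 9:  return "LoadStoreAlignment";
        case 20: return "InstrFetchProhibited";
        case 28: return "LoadProhibited";
        case 29: return "StoreProhibited";
        default: return "Unknown";
    }
}

// A faulting address near zero usually indicates a NULL pointer
// (or a small offset into a NULL struct pointer).
bool looks_like_null_deref(std::uint32_t excvaddr) {
    return excvaddr < 0x1000;  // assumption: threshold chosen for illustration
}
```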

I've uploaded the full logs for the "Upload & Monitor" task that I ran after erasing the flash to a new Gist; they show 17 successive reboots (I stopped collecting logs there, but the reboots continued).

git diff shows that the only changes I have compared to the latest master are:

Nothing else. So the log level is back to -DCORE_DEBUG_LEVEL=3 and build_type is release; I had changed both of those earlier to get detailed logs.

sharandac commented 1 year ago

The log file looks interesting, even if it doesn't help much. What surprises me is that there is no fixed point at which the ESP32 crashes; the crashes look very random, as if the problem were with the power supply. Have you tried running the watch without a battery, only via USB? Or vice versa, only on battery and without USB?
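
One way to check how random the crashes are is to tally the panic reasons across a captured serial log. A minimal host-side sketch, assuming the log contains the `Guru Meditation Error: Core N panic'ed (Reason).` lines seen in this thread; `tally_panics` is an illustrative name:

```cpp
#include <map>
#include <sstream>
#include <string>

// Counts how often each panic reason appears in a serial log.
std::map<std::string, int> tally_panics(const std::string& log_text) {
    std::map<std::string, int> counts;
    const std::string marker = "panic'ed (";
    std::istringstream in(log_text);
    std::string line;
    while (std::getline(in, line)) {
        const auto start = line.find(marker);
        if (start == std::string::npos) continue;  // not a panic line
        const auto begin = start + marker.size();
        const auto end = line.find(')', begin);
        if (end == std::string::npos) continue;    // truncated line
        ++counts[line.substr(begin, end - begin)];
    }
    return counts;
}
```

A tally spread over many unrelated reasons would tend to support a hardware or power explanation, while a single dominant reason would point more toward one software bug.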

Crsarmv7l commented 1 year ago

You listed a lot of code changes you made. Have you tried building and installing just the master version without your changes just to see if it gets through boot? I have caused a few boot loops with my edits.

nicolasff commented 1 year ago

@Crsarmv7l as mentioned in my last post:

git diff shows that the only changes I have compared to the latest master are:

Setting upload_port in platformio.ini
Setting the time zone in src/hardware/config/timesyncconfig.h, as described in the original post
Adding my WiFi credentials to src/hardware/wifictl.cpp in these two lines

None of these changes should lead to so many crashes.

@sharandac I tried 4 different USB cables, each with and without the battery. The behavior is the same.

I thought I could focus on just one crash and see if I can understand it. I erased the flash again and re-uploaded the build, and recognized a crash I had seen before, in wifictl.cpp:

Guru Meditation Error: Core  1 panic'ed (LoadProhibited). Exception was unhandled.
[...]
Backtrace: 0x4000127a:0x3ffd9c90 0x40110afd:0x3ffd9ca0 0x40198829:0x3ffd9cd0 0x40198908:0x3ffd9dc0 0x40094d1e:0x3ffd9df0
  #0  0x4000127a:0x3ffd9c90 in ?? ??:0
  #1  0x40110afd:0x3ffd9ca0 in std::_Function_handler<void (system_event_id_t, system_event_info_t), wifictl_setup()::{lambda(system_event_id_t, system_event_info_t)#2}>::_M_invoke(std::_Any_data const&, system_event_id_t&&, system_event_info_t&&) at src/hardware/wifictl.cpp:141

The crash happens one level below src/hardware/wifictl.cpp:141, which is this line: https://github.com/sharandac/My-TTGO-Watch/blob/7ab2f2fc23b5cafc0d6293efd1107e7b750d5526/src/hardware/wifictl.cpp#L141

There are a lot of different variables here, so I dumped them all:

[I][wifictl.cpp:141] operator()(): strcmp: entry=0, i=0, wifictl_config=0x3f803fa0
[I][wifictl.cpp:143] operator()(): wifictl_config->networklist=0x3f8040a4
[I][wifictl.cpp:145] operator()(): about to de-reference wifictl_config->networklist[entry]
[I][wifictl.cpp:147] operator()(): list[entry]: ssid=0x3ffd9c0c, password=0x3ffd9c4c
[E][wifictl.cpp:157] operator()(): wifictl_config->networklist_tried is NULL
[I][wifictl.cpp:162] operator()(): about to de-reference WiFi.SSID(i).c_str(), with i=0
[I][wifictl.cpp:163] operator()(): WiFi.SSID(i).c_str()=homenetwork
Guru Meditation Error: Core  1 panic'ed (LoadProhibited). Exception was unhandled.

The issue is logged above as an error: wifictl_config->networklist_tried is null. The if condition has two strcmp:

d03n3rfr1tz3 commented 1 year ago

Just some wild guesses from my side, but maybe they help: the reason for the crashes might be your third edit, the one in src/hardware/wifictl.cpp. Or at least that's the difference that explains why sharandac couldn't reproduce it.

Why? Well, the crash looks somewhat random, but what if it just seems random? What if the initial crash occurs after an event whose timing varies, like a finished WiFi scan with a (long? strange?) SSID that crashes while saving the wifictl.json, and everything after that is just the result of the first crash? If wifictl_config->networklist_tried is NULL, that means wifictl_config_t::onLoad did not run. The line you edited results in a call to wifictl_insert_network, in which the copy you mentioned happens, which then leads to the crash.

Theoretically wifictl_config_t::onLoad should have happened as a result of this, and therefore wifictl_config->networklist_tried would be allocated before this. But if wifictl_config::load() found an empty wifictl.json (size 0), it would not call wifictl_config_t::onLoad (because of this), and therefore wifictl_config->networklist_tried would still be NULL, which would lead to the crash in wifictl_insert_network. Why the empty WiFi JSON? Well, a crash while trying to save it might be the reason. The cause of that crash might differ and is probably not easy to find. As mentioned above, maybe it occurs after a WiFi scan or whatever.
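
The failure mode described here can be modeled in isolation. The sketch below is a simplified host-side model, not the firmware's actual code: `Config`, `load_buggy` and `load_safe` are hypothetical stand-ins, and the real `wifictl_config_t` differs. It shows how skipping the load hook for an empty file leaves a pointer unallocated, and how falling back to defaults avoids that:

```cpp
#include <memory>
#include <string>
#include <vector>

struct Network { std::string ssid, password; };

// Hypothetical stand-in for a config whose list is allocated in a load hook.
struct Config {
    std::unique_ptr<std::vector<Network>> tried;  // stays null until allocated
    void on_load(const std::string& /*json*/) { ensure_allocated(); /* parse... */ }
    void on_default() { ensure_allocated(); }
    void ensure_allocated() {
        if (!tried) tried = std::make_unique<std::vector<Network>>();
    }
};

// Models the suspected behavior: an empty file triggers neither hook.
void load_buggy(Config& cfg, const std::string& file_content) {
    if (file_content.empty()) return;  // nothing parsed, nothing allocated
    cfg.on_load(file_content);
}

// Treats an empty file like a missing one and falls back to defaults.
void load_safe(Config& cfg, const std::string& file_content) {
    if (file_content.empty()) cfg.on_default();
    else cfg.on_load(file_content);
}
```

After `load_buggy(cfg, "")` the `tried` pointer is still null, so any later code that dereferences it without a check would fault, much like the LoadProhibited panic reported above.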

So we are back to what sharandac suggested: even if your changes do not seem to cause the crashes, they might still lead to them. I'm 100% sure that I never edited the foo-bar line, but maybe that's something new from the last few months. Obviously it might help to add a NULL check before the access to wifictl_config->networklist_tried, but the empty JSON in your SPIFFS is more alarming. Just try to run the unedited master (after wiping your SPIFFS, of course), and if that works, we can still investigate why that change might lead to all these crashes.
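
The NULL check suggested here could look roughly like the following. This is a sketch under assumptions: `NetworkEntry` and `same_ssid` are hypothetical names, and the real structures in `wifictl.cpp` are laid out differently. The point is only that `strcmp` must never receive a null pointer:

```cpp
#include <cstring>

// Hypothetical entry type; the firmware's own struct differs.
struct NetworkEntry {
    char ssid[64];
    char password[64];
};

// Returns true only when the list exists and the entry's SSID matches.
// Passing a null pointer to strcmp is undefined behavior, so guard first.
bool same_ssid(const NetworkEntry* list, std::size_t index, const char* ssid) {
    if (list == nullptr || ssid == nullptr) return false;
    return std::strcmp(list[index].ssid, ssid) == 0;
}
```

A guard like this would only hide the symptom; it does not explain why the list was never allocated in the first place.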

Crsarmv7l commented 1 year ago

Yep, as @d03n3rfr1tz3 said, if wifictl is crashing, it could be due to the JSON or because of your edit to add your network.

While you may not think your edits are causing the crash, it is a prudent thing to rule out. At the least it should provide new data points if master with no changes also keeps crashing.

nicolasff commented 1 year ago

@d03n3rfr1tz3 thanks for the deep-dive into this wifictl.cpp crash! This made me look closer at it. I thought that just adding my credentials there would be harmless, especially given the comment suggesting "change your network here for first use", but it seems like that might not be the case. I agree that starting with a completely unchanged checkout would be the thing to try next.

After removing the changes listed above (WiFi credentials, time zone), cleaning up all the local builds and cached libraries with Clean All and resetting the watch with Erase Flash, I finally got a first clean boot! But… it didn't get very far.

I even managed to calibrate the touch inputs and then slowly enter my credentials using the WiFi configuration app on the watch itself, although when I saved them I saw the WiFi status icon at the top right change (guessing it connected), and then the watch entered a reboot loop once again.

This time, to be clear, this is with absolutely no changes at all to the code base or config, with a clean master checkout on 7ab2f2fc23b5cafc0d6293efd1107e7b750d5526.

There are a few reboots in this sequence that I had to initiate, since the logs just stopped and the device seemed frozen with nothing displayed on screen and no touch inputs registering whatsoever. Usually those were either immediately after attempting to load /style.json, or shortly after. These are logged with POWERON_RESET, and they happened only when the device was completely unresponsive. The vast majority of reboots happened without any input at all.

Some crashes are definitely surprising, for example this one in framebuffer_setup() from log_printf of all places, which is what log_i expands to. Where this should be logged:

[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /display.json
[I][framebuffer.cpp:160] framebuffer_setup(): framebuffer 1: 0x0x3ffd57a4 (4800 bytes, 240x10px)
[I][framebuffer.cpp:177] framebuffer_setup(): framebuffer 2: 0x0x3ffd6a74 (4800 bytes, 240x10px)

for this crash the first log line was not even seen:

[W][basejsonconfig.cpp:113] load(): reading json failed, call defaults, file: /display.json
Guru Meditation Error: Core  1 panic'ed (LoadProhibited). Exception was unhandled.
[...]
  #0  0x401d755e:0x3ffd4700 in _svfprintf_r at /Users/ivan/e/newlib_xtensa-2.2.0-bin/newlib_xtensa-2.2.0/xtensa-esp32-elf/newlib/libc/stdio/../../../.././newlib/libc/stdio/vfprintf.c:1199 (discriminator 151)
  #1  0x401e14ba:0x3ffd4a10 in _vsnprintf_r at /Users/ivan/e/newlib_xtensa-2.2.0-bin/newlib_xtensa-2.2.0/xtensa-esp32-elf/newlib/libc/stdio/../../../.././newlib/libc/stdio/vsnprintf.c:72
  #2  0x401e14f6:0x3ffd4aa0 in vsnprintf at /Users/ivan/e/newlib_xtensa-2.2.0-bin/newlib_xtensa-2.2.0/xtensa-esp32-elf/newlib/libc/stdio/../../../.././newlib/libc/stdio/vsnprintf.c:41
  #3  0x401d053f:0x3ffd4ae0 in log_printf at /Users/localuser/.platformio/packages/framework-arduinoespressif32/cores/esp32/esp32-hal-uart.c:533
  #4  0x4010ccd5:0x3ffd4b40 in framebuffer_setup() at src/hardware/framebuffer.cpp:259
  #5  0x4010ca07:0x3ffd4ba0 in display_setup() at src/hardware/display.cpp:419

It's unclear whether it happens in the first log_i or in the second with no time to send the first line, but either way neither of these two lines is de-referencing *anything*, so I can't tell what "load" was prohibited: https://github.com/sharandac/My-TTGO-Watch/blob/7ab2f2fc23b5cafc0d6293efd1107e7b750d5526/src/hardware/framebuffer.cpp#L160 and https://github.com/sharandac/My-TTGO-Watch/blob/7ab2f2fc23b5cafc0d6293efd1107e7b750d5526/src/hardware/framebuffer.cpp#L177

Once again here are the full logs with 13 reboots, from a first run of Upload and monitor with a clean checkout and right after an Erase Flash (as evidenced by the first boot formatting SPIFFS).

sharandac commented 1 year ago

Thank you for the feedback. When I look at the log files, it looks totally random again. In between, "POWERON_RESET" appears as described by you, although this should actually only happen when switching on for the first time. The randomly appearing sequence of "IllegalInstruction" and "LoadProhibited" actually suggests that there is a power problem. Could you change the following lines in the file and then rebuild everything again?

Please comment out

https://github.com/sharandac/My-TTGO-Watch/blob/7ab2f2fc23b5cafc0d6293efd1107e7b750d5526/src/hardware/hardware.cpp#L282

and change DISPLAY_MAX_BRIGHTNESS to 16

https://github.com/sharandac/My-TTGO-Watch/blob/7ab2f2fc23b5cafc0d6293efd1107e7b750d5526/src/hardware/config/displayconfig.h#L32

This should limit the power consumption at the first start. I think the search for the actual error could be quite difficult now :)