cesanta / mDash

Arduino / ESP-IDF library for mdash.net IoT service
https://mdash.net
Other
34 stars 17 forks source link

Stopping the loop() task execution when using mDashShadowUpdate() #22

Open Thermelgy-Repo opened 1 year ago

Thermelgy-Repo commented 1 year ago

Dear @cpq @novlean, Thanks for your great efforts.

My code was running perfectly with Mdash V1.2.14 & the esp-Arduino V1.0.6.

Recently, I have upgraded the esp-Arduino version from V1.0.6 to V2.0.9 & the mDash version from V1.2.14 to V1.2.16. Now, when I am using mDashShadowUpdate("{%Q:{%Q:{%Q:%Q}}}","state", "reported", "Key", "Success"); . It is stopping the loop() task & the Mdash goes offline & comes back. The Mdash is working such as FS & RPC, But the execution of loop() task is stopped.

Some Insights: I have checked the mDashNotify() function, saw MDashMutexLock() initialization. I have a doubt that may be the Mutex is stopping the loop() function or some other freeRTOS related problems.

Build Logs:

Processing esp32dev (platform: espressif32@6.3.1; board: esp32dev; framework: arduino)
-------------------------------------------------------------------------------------------------------------------------------------------
Verbose mode can be enabled via `-v, --verbose` option
CONFIGURATION: https://docs.platformio.org/page/boards/espressif32/esp32dev.html
PLATFORM: Espressif 32 (6.3.1) > Espressif ESP32 Dev Module
HARDWARE: ESP32 240MHz, 320KB RAM, 16MB Flash
DEBUG: Current (esp-prog) External (cmsis-dap, esp-bridge, esp-prog, iot-bus-jtag, jlink, minimodule, olimex-arm-usb-ocd, olimex-arm-usb-ocd-h, olimex-arm-usb-tiny-h, olimex-jtag-tiny, tumpa)
PACKAGES:
 - framework-arduinoespressif32 @ 3.20009.0 (2.0.9)
 - tool-esptoolpy @ 1.40501.0 (4.5.1)
 - toolchain-xtensa-esp32 @ 8.4.0+2021r2-patch5
LDF: Library Dependency Finder -> https://bit.ly/configure-pio-ldf
LDF Modes: Finder ~ chain, Compatibility ~ soft
Found 59 compatible libraries
Scanning dependencies...
Dependency Graph
|-- mDash @ 1.2.16+sha.8bd6647
|-- eModbus @ 1.3.0    

Platform.ini file:

[platformio]
default_envs = 

[env:esp32dev]
platform = espressif32@6.3.1
board = esp32dev
framework = arduino
board_upload.flash_size = 16MB
board_upload.maximum_size = 16777216
board_build.partitions = F:\Aspiration Energy\Heat Pump Monitoring\Hardware Development\Thermelgy tMY Future\Firmware\Development\Thermelgy Gen2 Gateway\Partition\Thermelgy_Custom_16MB.csv 
monitor_speed = 115200
monitor_filters = esp32_exception_decoder
build_flags =
    -Isdkconfig
    -DLOG_LEVEL=LOG_LEVEL_ERROR
    -DCORE_DEBUG_LEVEL=3
    -DARDUINOJSON_USE_DOUBLE=0
lib_deps = 
    https://github.com/cesanta/mDash.git#1.2.16
    eModbus@1.3.0

Kindly help me to solve the issue.

younesmaia commented 1 year ago

I can confirm the same problem is happening to me on mDash v1.2.16 and Platformio ESP32 core 6.3.1 (latest). Opened a ticket about it: https://github.com/cesanta/mDash/issues/21

The loop() task indeed stops when mDash functions are called. I'm having random issues even with mDashConfigGet.

Side note: If possible, bring back the CLI as well. Not having it breaks the device provisioning flow we had.

ssbrewco commented 1 year ago

Side note: If possible, bring back the CLI as well. Not having it breaks the device provisioning flow we had.

Issue #20 open on this one.

cpq commented 1 year ago

Gents, the code is wide open.

Sends your PRs.

younesmaia commented 1 year ago

I've temporarily commented lines 104 and 107 (xSemaphoreTakeRecursive and xSemaphoreGiveRecursive) for testing and mDash works perfectly (for as long as you don't call any mDash function while another is still running 😀). So, this is definetely an Mutex-related issue. But, I could not find anything wrong with the implementation in mDash.c itself. The calls for taking and giving the semaphore are properly placed and there should be no holdup...

Thermelgy-Repo commented 10 months ago

I have solved the issue, plz check the PR. https://github.com/cesanta/mDash/pull/24

cpq commented 10 months ago

You just added a duplicate of notify without the locks - that's not a proper fix.

I guess that the core of the issue is here: https://github.com/cesanta/mDash/blob/8bd664773c5b041ed582b59d446478cf7ef582c3/src/mDash.c#L736-L740

All functions that may be called by the mDashPoll(), for example mDashShadowUpdate(), will deadlock. So I guess that the proper fix would be to just remove the locks from mDashShadowUpdate() and document that the mutex locks must be called manually before and after, in case the function is called outside the mdash thread.

Thermelgy-Repo commented 10 months ago

You just added a duplicate of notify without the locks - that's not a proper fix.

I guess that the core of the issue is here:

https://github.com/cesanta/mDash/blob/8bd664773c5b041ed582b59d446478cf7ef582c3/src/mDash.c#L736-L740

All functions that may be called by the mDashPoll(), for example mDashShadowUpdate(), will deadlock. So I guess that the proper fix would be to just remove the locks from mDashShadowUpdate() and document that the mutex locks must be called manually before and after, in case the function is called outside the mdash thread.

Thankq @cpq for your suggestion, So you meant to directly remove the MutexLocks from the mDashNotify() as well as mDashPoll() functions?

cpq commented 10 months ago

@Thermelgy-Repo from the mDashNotify only. It would be a user responsibility to surround the mDashNotify call with locks manually if it gets called from the outside of Mongoose task.