Closed MichaelUray closed 1 year ago
but also mb_master_main() together with some other tasks
I have to revise that: during my further tests I was only able to see this error when MQTT and Modbus run at the same time, but not together with any of my other tasks. So I think the issue is caused by MQTT and Modbus communication running at the same time in two separate tasks.
What could cause that and how to isolate the issue?
I am not quite sure which example you are talking about. Is it the Modbus server or the Modbus client? Is it serial or network?
Sorry, I did not describe it precisely. I meant the Modbus serial RTU client, and I used the ESP-IDF MQTT SSL example, which I modified a little bit.
Can you please show the tasks' code?
I have attached the full project there.
The relevant files are mb_master.cpp and mqtt_ssl.c.
GK_PV.zip
For now it does not seem to me that the problem is with the library itself. Since you are using many tasks with a lot of memory reserved for them it might be anything from wrong task priorities to memory corruption (according to errors that you get). Please do some more debugging and write if you are sure that the problem is directly connected to the library functionality.
Since you are using many tasks with a lot of memory reserved
I did increase the memory a lot, just to make sure I do not run into stack issues like the ones I already had.
According to the task list I don't need that much and everything was running before with less stack as well.
Name             State  Prio  Stack  Num
count_task       X      9     2872   15
IDLE             R      0     1104   5
IDLE             R      0     1112   6
mb_master_main   B      9     6000   16
main             B      1     1688   4
nvs_main         B      9     4392   12
tiT              B      18    2824   8
server_handle_t  B      6     6384   14
ipc1             B      24    1060   2
uart_queue_task  B      10    3616   19
ipc0             B      24    1060   1
modbus_slave_ta  B      9     3548   18
mb_slave_main    B      9     5972   17
Tmr Svc          B      1     1568   7
sys_evt          B      20    1924   9
wifi             B      23    4664   10
esp_timer        S      22    3392   3
ws_server_task   B      5     5552   11
server_task      B      9     4312   13
It also shows me plenty of RAM left, which is why I was not worried about assigning too much RAM.
I (10104) count_task: RAM left 161144
Please do some more debugging
I am pretty much a beginner with FreeRTOS as well as C++, but I have some C programming experience. Any recommendation where to start off with debugging?
Do you actually use interrupts in your communication which could lead to an Interrupt wdt timeout?
anything from wrong task priorities
Could it be an issue to run all tasks at the same priority? From my FreeRTOS understanding it should not matter which priority a task has. I have already tried to change some of them, but it did not help.
You have two global variables named "client" (in mb_master.cpp and in mqtt_ssl.c). Make them local or rename them or do something else to make sure that they do not interfere with each other.
Oh boy, you are absolutely right, this was causing the problem. I was looking for days for this issue, but I did not figure it out, thank you very much for your help!
I understand now that I have to declare it as static if I want to limit its scope to the file. But why doesn't the compiler throw an error if I declare the same variable in two files with different data types?
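For illustration, a minimal C sketch of the static fix discussed here. The type and function names (demo_mqtt_client_t, demo_mqtt_start, demo_mqtt_stop, demo_mqtt_is_connected) are hypothetical stand-ins and are not taken from the attached project or from the ESP-IDF MQTT API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the real MQTT client handle type. */
typedef struct { int connected; } demo_mqtt_client_t;

static demo_mqtt_client_t demo_instance;

/* `static` at file scope gives `client` internal linkage: the name is not
 * exported to the linker, so a `client` in another source file is a
 * completely separate object. */
static demo_mqtt_client_t *client = NULL;

void demo_mqtt_start(void)
{
    demo_instance.connected = 1;
    client = &demo_instance;
}

void demo_mqtt_stop(void)
{
    assert(client != NULL); /* stopping before starting is a programming error */
    client->connected = 0;
    client = NULL;
}

/* Other files reach the state only through functions like this one. */
int demo_mqtt_is_connected(void)
{
    return client != NULL && client->connected;
}
```

With this layout, mb_master.cpp could keep its own static client without either file noticing the other.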
I don't know why there is no error. I don't have much experience in mixing C and C++ in one program, but maybe it has something to do with this.
It looks as if the behavior is undefined if a declaration is done like that without static in each (or probably at least one) file.
https://stackoverflow.com/questions/74412226/how-throw-a-gcc-error-if-a-global-variable-with-the-same-name-gets-declared-twi
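To make the failure mode concrete, here is a single-file C sketch that only simulates the situation: a union stands in for the linker resolving both client names to the same address. The two view types and all function names are hypothetical, chosen for the demonstration, and do not reflect the real types in mb_master.cpp or mqtt_ssl.c:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the two unrelated types behind the name "client". */
typedef struct { uintptr_t handle; } mqtt_view_t;                        /* MQTT file's idea   */
typedef struct { uint8_t slave_addr; uint8_t frame[7]; } modbus_view_t;  /* Modbus file's idea */

/* The Modbus view must cover the whole handle for the demonstration below. */
static_assert(sizeof(modbus_view_t) >= sizeof(mqtt_view_t),
              "modbus view must overlap the entire handle");

/* The union plays the role of the linker: one storage location, two views. */
static union {
    mqtt_view_t   mqtt;
    modbus_view_t modbus;
} client;

/* "MQTT task": remember a handle. */
void mqtt_store_handle(void *h)
{
    client.mqtt.handle = (uintptr_t)h;
}

/* "Modbus task": fill in its own idea of the same variable. */
void modbus_write_request(void)
{
    client.modbus.slave_addr = 0x11;
    memset(client.modbus.frame, 0xA5, sizeof client.modbus.frame);
}

/* Returns 1 if the MQTT side still sees the handle it stored. */
int mqtt_handle_intact(const void *expected)
{
    return client.mqtt.handle == (uintptr_t)expected;
}
```

After modbus_write_request() runs, the stored handle no longer matches. In the real program such a handle would later be dereferenced or called through, which is consistent with crashes like InstructionFetchError that appear only when both tasks are active.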
The solution is to add -fno-common as a compiler option, which was not the default prior to GCC 10.
This would raise an error from the compiler in such a situation.
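As background, a small C sketch of the language rule involved, with purely illustrative names. A file-scope object declared without an initializer is a "tentative definition", and C accepts several of them in one file. With -fcommon, GCC places such objects in a common block that the linker merges across files, which is why two files could each declare client without any diagnostic:

```c
#include <assert.h>
#include <limits.h>

int shared_counter;   /* tentative definition: no initializer, no storage class */
int shared_counter;   /* repeating it is still legal C within one file */

/* Both declarations above refer to one zero-initialized object. */
void bump_counter(void)
{
    assert(shared_counter < INT_MAX); /* guard against signed overflow */
    shared_counter++;
}

int read_counter(void)
{
    return shared_counter;
}
```

With -fno-common, a tentative definition in a second file is treated as a regular definition, so the linker reports a multiple-definition error instead of silently merging the two.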
The current ESP-IDF version 4.4.2 toolchain uses GCC 8.4, where this option is not enabled by default.
Add this option to CMakeLists.txt:
idf_build_set_property(COMPILE_OPTIONS "-fno-common" APPEND)
When I run the example task as a single application, it works fine, but when I start it together with other tasks (e.g. MQTT), it resets with the following message.
Guru Meditation Error: Core 0 panic'ed (Interrupt wdt timeout on CPU0)
Sometimes it also reports the following message before a reboot:
Guru Meditation Error: Core 0 panic'ed (InstructionFetchError). Exception was unhandled.
I have tried to start just four very simple tasks, and then it does not happen, for example like this:
But if I start more complex tasks like this, then this problem occurs.
Even mqtt_ssl_main() and mb_master_main() together cause this problem, but also mb_master_main() together with some other tasks.
What could cause this problem?