espressif / esp-modbus

ESP-Modbus - the officially supported library for Modbus protocol (serial RS485 + TCP over WiFi or Ethernet).
Apache License 2.0

(BUG) Modbus master crashes when a slave loses connection and then reconnects (IDFGH-11778) #47

Closed Lagunaxx closed 3 months ago

Lagunaxx commented 7 months ago

Hi. I have a project in which the Modbus master crashes. There are two ESP32 devices: (1) a TCP master and (2) a TCP slave. (1) works in WiFi AP mode, (2) in STA mode.

1) Switch (1) on and wait until it boots, then switch (2) on. It connects normally and, as a result, Modbus works well.

2) Physically switch the power of (2) off. Wait any length of time, or do not wait, but during this time the Modbus master on (1) must make at least one attempt to read data from the Modbus slave on (2). Then switch the power of (2) back on. As a result:
a) (2) boots,
b) (2) tries to connect to the AP on (1),
c) the AP on (1) disconnects it (the old, lost connection); in this case we run modbus destroy,
d) (2) tries to reconnect,
e) (1) accepts the connection,
f) (2) connected? <-- (1) crashes somewhere here.


I (208579) MB_TCP_MASTER_PORT: Connecting to slaves...
-D (208586) MB_TCP_MASTER_PORT: Slave #0, Socket(#56)(192.168.4.2), connected 1 slave(s), error = 0x0.
I (208594) MB_TCP_MASTER_PORT: Connected 1 slaves, start polling...
D (214042) MB_PORT_COMMON: xMBMasterRunResTake:Take MB resource (5000 ticks).
D (214043) MB_PORT_COMMON: xMBMasterRunResTake:Take MB resource (5000 ticks).
D (214046) MB_PORT_COMMON: 213164006:EV_MASTER_FRAME_TRANSMIT
D (214052) POLL transmit buffer: 02 00 00 00 01 
D (214060) MB_TCP_MASTER_PORT: Slave #0, Socket(#56)(192.168.4.2), send data successful: TID=0x00, 12 (bytes), errno 0
D (214067) MB_PORT_COMMON: vMBMasterPortTimersRespondTimeoutEnable Respond enable timeout.
D (214075) MB_PORT_COMMON: 213164006:EV_MASTER_FRAME_SENT
D (214076) MB_TCP_MASTER_PORT: FSM Synchronized with sent event.
D (214082) POLL sent buffer: 02 00 00 00 01 
D (214087) MB_TCP_MASTER_PORT: Set select timeout, left time: 2980 ms.
D (215979) event: running post WIFI_EVENT:15 with handler 0x400d9a20 and context 0x3ffd7354 on loop 0x3ffd342c
0x400d9a20: System::system_handler(void*, char const*, long, void*) at /home/esp32/eclipse-workspace/BaseScreen/main/System.cpp:1135

D (215982) wifi:ap recv assoc/reassoc request
D (215983) wifi:Assoc req from a client already connected with PMF. Check if it is an attack.
I (215992) wifi:starting SA query procedure with STA(a4:cf:12:5e:94:7c)
I (215998) wifi:Send SA Query req with transaction id 5456
D (216105) MB_PORT_COMMON: TCP Master port disable.
I (216198) wifi:Send SA Query req with transaction id 5457
I (216398) wifi:Send SA Query req with transaction id 5458
I (216598) wifi:Send SA Query req with transaction id 5459
I (216798) wifi:Send SA Query req with transaction id 545a
I (216998) wifi:Send SA Query req with transaction id 545b
I (217022) wifi:STA not responded to 6 SA Query attempts, Reset connection sending disassoc
I (217025) wifi:station: a4:cf:12:5e:94:7c leave, AID = 1, bss_flags is 658547, bss:0x3ffe2ce8
I (217028) wifi:new:<1,0>, old:<1,1>, ap:<1,1>, sta:<255,255>, prof:1
I (217035) wifi:<ba-del>idx:2, tid:0
D (217075) MBM_TIMER: Timer mode: (1) triggered
D (217075) MB_PORT_COMMON: 213164006:EV_MASTER_ERROR_PROCESS
D (217076) MB_PORT_COMMON: vMBMasterErrorCBRespondTimeout:Callback respond timeout.
D (217080) MB_TCP_MASTER_PORT: Select timeout, left time: 0 ms.
D (217082) MB_PORT_COMMON: Transaction (213164006), processing time(us) = 3029657
D (217089) MB_TCP_MASTER_PORT: Shutdown stack from vMBTCPPortMasterTask(837)
D (217076) MB_PD (217104) MB_PORT_COMMON: vMBPortSetMode: Port eOnter critical.
D (217110) MB_PORT_COMMON: vMBPortSetMode: Port exit critical
RT_COMMON: eMBMasterWaitRequestFinish: returned event = 0x100

assert failed: xEventGroupSetBits event_groups.c:559 (xEventGroup)

Backtrace: 0x40081b06:0x3ffd6c40 0x40089ee1:0x3ffd6c60 0x40091155:0x3ffd6c80 0x4008d365:0x3ffd6da0 0x400eaa1a:0x3ffd6dc0 0x400eaf99:0x3ffd6df0 0x400e7ec1:0x3ffd6e20 0x400e81f4:0x3ffd6e50 0x400da4dc:0x3ffd6ec0 0x400d96d7:0x3ffd6ef0 0x400e5faa:0x3ffd7030 0x4008c92e:0x3ffd7060
0x40081b06: panic_abort at /home/esp32/esp-idf/components/esp_system/panic.c:452

0x40089ee1: esp_system_abort at /home/esp32/esp-idf/components/esp_system/port/esp_system_chip.c:84

0x40091155: __assert_func at /home/esp32/esp-idf/components/newlib/assert.c:81

0x4008d365: xEventGroupSetBits at /home/esp32/esp-idf/components/freertos/FreeRTOS-Kernel/event_groups.c:559 (discriminator 1)

0x400eaa1a: eMBMasterWaitRequestFinish at /home/esp32/eclipse-workspace/BaseScreen/components/ESP32Modbus/freemodbus/port/portevent_m.c:273 (discriminator 15)

0x400eaf99: eMBMasterReqReadDiscreteInputs at /home/esp32/eclipse-workspace/BaseScreen/components/ESP32Modbus/freemodbus/modbus/functions/mbfuncdisc_m.c:96

0x400e7ec1: mbc_tcp_master_send_request at /home/esp32/eclipse-workspace/BaseScreen/components/ESP32Modbus/freemodbus/tcp_master/modbus_controller/mbc_tcp_master.c:294

0x400e81f4: mbc_tcp_master_get_parameter at /home/esp32/eclipse-workspace/BaseScreen/components/ESP32Modbus/freemodbus/tcp_master/modbus_controller/mbc_tcp_master.c:497

0x400da4dc: mbc_master_get_parameter at /home/esp32/eclipse-workspace/BaseScreen/components/ESP32Modbus/freemodbus/common/esp_modbus_master.c:72 (discriminator 2)

0x400d96d7: System::handler(Device::Hardware::t_Data*) at /home/esp32/eclipse-workspace/BaseScreen/main/System.cpp:824

0x400e5faa: Device::Hardware::class_IO::IO_loop(void*) at /home/esp32/eclipse-workspace/BaseScreen/components/ESP32_System/Extensions/IO.cpp:206

0x4008c92e: vPortTaskWrapper at /home/esp32/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:162

alisitsyn commented 6 months ago

Hi @Lagunaxx ,

Thank you for the issue and sorry for the late response.

Update: I tried to reproduce the issue you described as per log below but did not get the same result.

log.txt

c) (1) AP disconnects it (old lost connection) (in this case we run modbus destroy),

In my code I do not destroy the stack on disconnection, and it is able to reconnect and restart the polling cycle once the STA (Modbus slave) is connected again.

I don't have your code, but if you destroy the stack on disconnect, where do you initialize the stack again after the connection is re-established?

According to your log, you successfully destroyed the stack on the disconnect event, and your polling task then tries to read discrete inputs. At that point the event group handle is no longer valid, and this causes the crash: after the destroy, the stack interface handle and all objects that depend on it are invalid. You need to change your algorithm: either remove the stack destroy on disconnection, or handle disconnection and reconnection correctly, including stack initialization and stopping the polling task before the destroy and restarting it after the stack is initialized again.
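
To illustrate the second option, here is a minimal sketch (not the library's own mechanism) that serializes stack destroy/re-init against the polling task with a lock and a "ready" flag. It is written with a pthread mutex and stub functions so it builds on a host; on the ESP32 the lock would be a FreeRTOS mutex. All names in it (`stub_stack_create`, `stub_stack_destroy`, `stub_read_inputs`, `on_slave_connected`, `on_slave_disconnected`, `poll_once`) are hypothetical, and the comments only indicate which esp-modbus calls each stub is assumed to stand in for.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins so the sketch builds on a host. On the target they
 * are assumed to wrap the esp-modbus calls named in the comments. */
static int reads_done = 0;
static int  stub_stack_create(void)  { return 0; }  /* mbc_master_init_tcp() + mbc_master_setup() + mbc_master_start() */
static void stub_stack_destroy(void) { }            /* mbc_master_destroy() */
static int  stub_read_inputs(void)   { reads_done++; return 0; }  /* mbc_master_get_parameter() */

static pthread_mutex_t stack_lock = PTHREAD_MUTEX_INITIALIZER;
static bool stack_ready = false;   /* true only between create and destroy */

/* Call when the slave (STA) has connected. Returns 0 on success. */
int on_slave_connected(void)
{
    int rc = 0;
    pthread_mutex_lock(&stack_lock);
    if (!stack_ready) {
        rc = stub_stack_create();
        stack_ready = (rc == 0);
    }
    pthread_mutex_unlock(&stack_lock);
    return rc;
}

/* Call when the slave has disconnected. Taking the lock makes the destroy
 * wait until a request that is already in flight has returned. */
void on_slave_disconnected(void)
{
    pthread_mutex_lock(&stack_lock);
    if (stack_ready) {
        stack_ready = false;
        stub_stack_destroy();
    }
    pthread_mutex_unlock(&stack_lock);
}

/* One iteration of the polling task. Returns -1 without touching the stack
 * when it is not initialized, otherwise the result of the read. */
int poll_once(void)
{
    int rc = -1;
    pthread_mutex_lock(&stack_lock);
    if (stack_ready) {
        rc = stub_read_inputs();
    }
    pthread_mutex_unlock(&stack_lock);
    return rc;
}
```

With this arrangement the polling task never reaches the read call while the stack is destroyed, and the destroy cannot run in the middle of a pending request. One trade-off: the disconnect path may block for up to the response timeout, so on the target it is probably better to run it from a separate task than directly inside the WiFi event handler.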

alisitsyn commented 3 months ago

@Lagunaxx ,

Do you have an update for the issue?

alisitsyn commented 3 months ago

@Lagunaxx, the issue will be closed because of a lack of feedback. Feel free to reopen it if you have any test results.