espressif / esp-modbus

ESP-Modbus - the officially suppported library for Modbus protocol (serial RS485 + TCP over WiFi or Ethernet).
Apache License 2.0
85 stars 46 forks source link

Modbus TCP Master not closing sockets correctly? Unable to create socket: #-1, errno 23 (IDFGH-11539) #42

Closed fromeijn closed 7 months ago

fromeijn commented 8 months ago

On ESP-IDF V4.3.6 with current master of esp-modbus ea3f06c getting these problem when modbus connection is closed and openend again.

Log after a while:

E (649360) MB_TCP_MASTER_PORT: Slave #0, Socket(#-1)(10.10.0.141), unexpected error = 0xfffffff4.
E (649460) MB_TCP_MASTER_PORT: Unable to create socket: #-1, errno 23
E (649460) MB_TCP_MASTER_PORT: Slave #0, Socket(#-1)(10.10.0.141), unexpected error = 0xfffffff4.
E (649560) MB_TCP_MASTER_PORT: Unable to create socket: #-1, errno 23
E (649560) MB_TCP_MASTER_PORT: Slave #0, Socket(#-1)(10.10.0.141), unexpected error = 0xfffffff4.
E (649660) MB_TCP_MASTER_PORT: Unable to create socket: #-1, errno 23
E (649660) MB_TCP_MASTER_PORT: Slave #0, Socket(#-1)(10.10.0.141), unexpected error = 0xfffffff4.
E (649760) MB_TCP_MASTER_PORT: Unable to create socket: #-1, errno 23
E (649760) MB_TCP_MASTER_PORT: Slave #0, Socket(#-1)(10.10.0.141), unexpected error = 0xfffffff4.
E (649860) MB_TCP_MASTER_PORT: Unable to create socket: #-1, errno 23
E (649860) MB_TCP_MASTER_PORT: Slave #0, Socket(#-1)(10.10.0.141), unexpected error = 0xfffffff4.
E (649960) MB_TCP_MASTER_PORT: Unable to create socket: #-1, errno 23
E (649960) MB_TCP_MASTER_PORT: Slave #0, Socket(#-1)(10.10.0.141), unexpected error = 0xfffffff4.

it seems that the current implementation is not closing the sockets correctly and the device is running out on them after a while

I use mbc_master_destroy() to close master but it seems it is not calling xMBTCPPortMasterCloseConnection()

alisitsyn commented 8 months ago

it seems that the current implementation is not closing the sockets correctly and the device is running out on them after a while

I just quickly checked the related code and see that the master correctly closes all slave connections when destroy is called here. How often do you open and close Modbus in your application? Could you send more messages from the log to clarify what events happened prior the errors showed in your log above? Log with debug verbosity can clarify this.

I think I need to reproduce your situation and will do it when possible.

Thanks.

fromeijn commented 7 months ago

Thank you for your reply and patience! Im able now to replicate this problem quite well with the general example code and adding a loop trying to read, the IP to which the Modbus TCP master tries to connect to exists, but does not have a Modbus slave running.

        ESP_ERROR_CHECK(init_services(ip_addr_type));

    for (size_t i = 0; i < 20; i++) {
        ESP_LOGI(TAG, "Wait 10 seconds to start...");
        vTaskDelay(pdMS_TO_TICKS(10000));
        ESP_LOGI(TAG, "Starting modbus...");

        mb_communication_info_t comm_info = { 0 };
        comm_info.ip_port = MB_TCP_PORT;
        comm_info.ip_addr_type = ip_addr_type;
        comm_info.ip_mode = MB_MODE_TCP;
        comm_info.ip_addr = (void *)slave_ip_address_table;
        comm_info.ip_netif_ptr = (void *)get_example_netif();
        esp_err_t res = master_init(&comm_info);
        if (res != ESP_OK) {
            ESP_LOGE(TAG, "Modbus master initialization fail. %s", esp_err_to_name(res));
        } else {
            ESP_LOGI(TAG, "Modbus master initialization success.");
            vTaskDelay(50);
            master_operation_func(NULL);
        }

        ESP_ERROR_CHECK(master_destroy());
    }

i made one change to the code to prevent watchdog kicking in:

index f0d3367..8593018 100644
--- a/freemodbus/tcp_master/port/port_tcp_master.c
+++ b/freemodbus/tcp_master/port/port_tcp_master.c
@@ -754,6 +754,7 @@ static void vMBTCPPortMasterTask(void *pvParameters)
                     break;
                 }
                 putchar(ucDot);
+               vTaskDelay(pdMS_TO_TICKS(100)); /* if we don't yield we run the risk of hogging CPU */
                 xErr = xMBTCPPortMasterConnect(pxInfo);
                 switch(xErr)

and setting CONFIG_LWIP_MAX_SOCKETS=1 will make the issue more pronounced

log output:

I (6329) esp_netif_handlers: example_connect: sta ip: 10.10.0.183, mask: 255.255.255.0, gw: 10.10.0.1
I (6329) example_connect: Got IPv4 event: Interface "example_connect: sta" address: 10.10.0.183
I (6339) example_connect: Connected to example_connect: sta
I (6339) example_connect: - IPv4 address: 10.10.0.183
I (6349) wifi:Set ps type: 0

I (6349) MASTER_TEST: Leave IP(0) = [10.10.0.141] set manually.
I (6359) MASTER_TEST: IP(1) is not set in the table.
I (6369) MASTER_TEST: Configured 1 IP addresse(s).
I (6379) heap: internal heap -- total free bytes: 230912, minimum free bytes: 219132, percent used: 22.16%
I (6379) MASTER_TEST: Wait 10 seconds to start...
I (10399) wifi:<ba-add>idx:0 (ifx:0, c8:7f:54:7c:e7:d8), tid:0, ssn:1, winSize:64
I (16389) MASTER_TEST: Starting modbus...
I (16389) MASTER_TEST: Modbus master stack initialized...
I (16389) MB_TCP_MASTER_PORT: TCP master stack initialized.
I (16399) MB_TCP_MASTER_PORT: Host[IP]: "10.10.0.141"[10.10.0.141]
I (16399) MB_TCP_MASTER_PORT: Add slave IP: 10.10.0.141
I (16409) MB_TCP_MASTER_PORT: Connecting to slaves...
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-E (36399) MB_CONTROLLER_MASTER: mbc_tcp_master_start(180): mb stack could not connect to slaves for 20 seconds.
.E (36409) MB_CONTROLLER_MASTER: mbc_master_start(177): Master start failure, error=(0x103) (ESP_ERR_INVALID_STATE).
E (36429) MASTER_TEST: master_init(684): mb controller start fail, returns(0x103).
E (36429) MASTER_TEST: Modbus master initialization fail. ESP_ERR_INVALID_STATE
I (36509) MASTER_TEST: Modbus master stack destroy...
I (36509) heap: internal heap -- total free bytes: 225224, minimum free bytes: 216692, percent used: 24.06%
I (36519) MASTER_TEST: Wait 10 seconds to start...
I (46519) MASTER_TEST: Starting modbus...
I (46519) MASTER_TEST: Modbus master stack initialized...
I (46519) MB_TCP_MASTER_PORT: TCP master stack initialized.
I (46529) MB_TCP_MASTER_PORT: Host[IP]: "10.10.0.141"[10.10.0.141]
I (46529) MB_TCP_MASTER_PORT: Add slave IP: 10.10.0.141
I (46529) MB_TCP_MASTER_PORT: Connecting to slaves...
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.E (66529) MB_CONTROLLER_MASTER: mbc_tcp_master_start(180): mb stack could not connect to slaves for 20 seconds.
E (66539) MB_CONTROLLER_MASTER: mbc_master_start(177): Master start failure, error=(0x103) (ESP_ERR_INVALID_STATE).
-E (66549) MASTER_TEST: master_init(684): mb controller start fail, returns(0x103).
E (66549) MASTER_TEST: Modbus master initialization fail. ESP_ERR_INVALID_STATE
I (66649) MASTER_TEST: Modbus master stack destroy...
I (66649) heap: internal heap -- total free bytes: 224676, minimum free bytes: 216400, percent used: 24.24%
I (66669) MASTER_TEST: Wait 10 seconds to start...
I (76669) MASTER_TEST: Starting modbus...
I (76669) MASTER_TEST: Modbus master stack initialized...
I (76669) MB_TCP_MASTER_PORT: TCP master stack initialized.
I (76679) MB_TCP_MASTER_PORT: Host[IP]: "10.10.0.141"[10.10.0.141]
I (76679) MB_TCP_MASTER_PORT: Add slave IP: 10.10.0.141
I (76689) MB_TCP_MASTER_PORT: Connecting to slaves...
-E (76789) MB_TCP_MASTER_PORT: Unable to create socket: #-1, errno 23
E (76789) MB_TCP_MASTER_PORT: Slave #0, Socket(#-1)(10.10.0.141), unexpected error = 0xfffffff4.
.E (76899) MB_TCP_MASTER_PORT: Unable to create socket: #-1, errno 23
E (76899) MB_TCP_MASTER_PORT: Slave #0, Socket(#-1)(10.10.0.141), unexpected error = 0xfffffff4.
-E (77009) MB_TCP_MASTER_PORT: Unable to create socket: #-1, errno 23
E (77009) MB_TCP_MASTER_PORT: Slave #0, Socket(#-1)(10.10.0.141), unexpected error = 0xfffffff4.
.E (77119) MB_TCP_MASTER_PORT: Unable to create socket: #-1, errno 23
E (77119) MB_TCP_MASTER_PORT: Slave #0, Socket(#-1)(10.10.0.141), unexpected error = 0xfffffff4.
fromeijn commented 7 months ago

Probably found the culprit, vTaskDelete was called too early in xMBTCPPortMasterShutdown:

index f0d3367..7e2d2f7 100644
--- a/freemodbus/tcp_master/port/port_tcp_master.c
+++ b/freemodbus/tcp_master/port/port_tcp_master.c
@@ -236,7 +236,6 @@ static BOOL xMBTCPPortMasterCloseConnection(MbSlaveInfo_t *pxInfo)
 static void xMBTCPPortMasterShutdown(void)
 {
     xSemaphoreGive(xShutdownSema);
-    vTaskDelete(NULL);
     xMbPortConfig.xMbTcpTaskHandle = NULL;

     for (USHORT ucCnt = 0; ucCnt < MB_TCP_PORT_MAX_CONN; ucCnt++) {
@@ -251,6 +250,7 @@ static void xMBTCPPortMasterShutdown(void)
         }
     }
     free(xMbPortConfig.pxMbSlaveInfo);
+    vTaskDelete(NULL);
 }

 void vMBTCPPortMasterSetNetOpt(void *pvNetIf, eMBPortIpVer xIpVersion, eMBPortProto xProto)
@@ -754,6 +754,7 @@ static void vMBTCPPortMasterTask(void *pvParameters)
                     break;
                 }
                 putchar(ucDot);
+               vTaskDelay(pdMS_TO_TICKS(100)); /* if we don't yield we run the risk of hogging CPU */
                 xErr = xMBTCPPortMasterConnect(pxInfo);
                 switch(xErr)
                 {

It seems to be fixed now, I'm currently running some tests if everything else is still working

fromeijn commented 7 months ago

44

alisitsyn commented 7 months ago

@fromeijn ,

Thank you for your investigation and update. The close sockets were previously performed in vMBMasterTCPPortDisable() and in last update the code was moved to shutdown function but with incorrect order due to copy & paste or incorrect merge the deletion of task is performed on start of the function and the sockets are never closed then.

alisitsyn commented 7 months ago

Merged with commit 135a8f8165a701bda027f87a1e336723e397cbb2