espressif / esp-modbus

ESP-Modbus - the officially supported library for Modbus protocol (serial RS485 + TCP over WiFi or Ethernet).
Apache License 2.0

Modbus master communication random interruption causes a shift on the read registers (IDFGH-9133) #16

Closed ZanellaSimone closed 1 year ago

ZanellaSimone commented 1 year ago

Expected Behavior

The application must read 90 registers every second.

Problem Description

With version 4.2.0 of framework-espidf, the freemodbus library has always been used. After testing random interruptions of the Modbus communication, I noticed that the values of the read registers were shifted into the next register (the value read from register one shifted into register two, and so on). After reading some issues related to Modbus master problems, I looked at the patch from #https://github.com/espressif/esp-idf/issues/5986 and tried changing the following settings:

Even after applying this patch, the problem is not solved. So I decided to exclude the freemodbus library from my project (by adding the set(EXCLUDE_COMPONENTS freemodbus) instruction to the project CMakeLists.txt) and to include esp-modbus version 1.0.8 (downloaded from the official repository on GitHub). The problem is still present: updating to the latest version of the Modbus communication library has had no effect.

I need to receive a solution from you as soon as possible, as some of our customers have reported this problem and I have been forced to stop production.

Debug Logs

debug.txt

Here you can find a log of the application. 9 registers are read (they are registers with fixed values). The communication is then interrupted. When the read operation resumes, the register values do not match the values read before.

Other useful item

sdkconfig.txt

alisitsyn commented 1 year ago

Hello @ZanellaSimone,

The Modbus RTU protocol cannot guarantee delivery of critical data, especially if the communication is interrupted. The register-shift problem is known and cannot be solved completely at the Modbus RTU layer. The Modbus RTU response frame contains neither a transaction ID field nor the start register offset, so the protocol stack cannot check that a response from the slave corresponds to its current request. If the response frame contains the same command and length field and its CRC16 is correct, the Modbus protocol cannot tell whether the response is for the current request or for a previous one. If your application needs to deliver data reliably over the Modbus RTU or ASCII protocol, you need to protect the critical data structure on the application side with a checksum (CRC16 or CRC32 is recommended).
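To illustrate the point about the response frame, here is a minimal sketch that builds a "Read Holding Registers" (function 0x03) RTU response by hand. It is not code from esp-modbus. The helper names `build_read_response` and `responses_identical` are made up for this example. The frame layout (slave address, function code, byte count, big-endian register data, CRC16 with the low byte first) follows the Modbus serial line specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Standard Modbus CRC16: reflected polynomial 0xA001, initial value 0xFFFF. */
static uint16_t modbus_crc16(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1u) ? (uint16_t)((crc >> 1) ^ 0xA001u)
                             : (uint16_t)(crc >> 1);
        }
    }
    return crc;
}

/* Hypothetical helper: build a 0x03 response frame.
 * start_reg is accepted only to make the point that it is never
 * written into the frame: the response has no field for it. */
size_t build_read_response(uint8_t slave_addr, uint16_t start_reg,
                           const uint16_t *regs, uint8_t reg_count,
                           uint8_t *out, size_t out_size)
{
    (void)start_reg;
    assert(regs != NULL && out != NULL && reg_count <= 125);
    size_t len = 3u + 2u * reg_count + 2u;
    assert(out_size >= len);

    out[0] = slave_addr;
    out[1] = 0x03;                       /* function code */
    out[2] = (uint8_t)(2u * reg_count);  /* byte count */
    for (uint8_t i = 0; i < reg_count; i++) {
        out[3 + 2 * i] = (uint8_t)(regs[i] >> 8);   /* high byte first */
        out[4 + 2 * i] = (uint8_t)(regs[i] & 0xFF);
    }
    uint16_t crc = modbus_crc16(out, len - 2);
    out[len - 2] = (uint8_t)(crc & 0xFF);           /* CRC low byte first */
    out[len - 1] = (uint8_t)(crc >> 8);
    return len;
}

/* Returns true when responses to reads at two different start registers
 * (carrying the same register values) are byte-for-byte identical. */
bool responses_identical(uint16_t start_a, uint16_t start_b,
                         const uint16_t *regs, uint8_t reg_count)
{
    uint8_t a[256], b[256];
    size_t la = build_read_response(1, start_a, regs, reg_count, a, sizeof a);
    size_t lb = build_read_response(1, start_b, regs, reg_count, b, sizeof b);
    return la == lb && memcmp(a, b, la) == 0;
}
```

A master that receives such a frame late, after it has already moved on to another request with the same function code and length, has nothing in the frame to reject it with.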

The approach to deliver the critical data can be described as below:

  1. The holding register structure with the data needs to include a crc field.
  2. The Slave registers the register area for this whole structure and initializes the Modbus stack.
  3. Whenever the Slave updates any field in the critical structure, it calculates the CRC of the whole data structure except the crc field and stores the result in the crc field. The update of the structure should be performed as an atomic operation.
  4. The Master registers CID1 in its object dictionary, corresponding to the same critical data structure as on the Slave device.
  5. The Master sends the request for this CID1 to the Slave, which returns a response with the whole critical data structure. The returned data is saved in the data structure by Modbus.
  6. The Master calculates the CRC over the bytes of the data structure (excluding the crc field, as in step 3) and checks whether it equals the crc field in the critical data structure received from the Slave.
  7. If the calculated CRC is correct, the Master can use the data in the critical data structure. Otherwise it ignores the incorrect data, clears the UART receiver FIFO (for example, by calling the vMBMasterRxFlush( void ) function) and then repeats the request for CID1.
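The seal-and-verify part of the steps above (steps 3 and 6) could be sketched as follows. This is only an illustration under stated assumptions: the `critical_data_t` layout, its field names and the `critical_data_*` helpers are hypothetical and not part of esp-modbus. The CRC is fed register by register, high byte first, so that master and slave get the same result regardless of CPU byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical critical data structure mapped onto holding registers.
 * All fields are 16-bit, so the struct has no padding. */
typedef struct {
    uint16_t voltage;
    uint16_t current;
    uint16_t status;
    uint16_t crc;      /* CRC16 of all fields above; keep it last */
} critical_data_t;

/* One step of the Modbus CRC16 (reflected polynomial 0xA001). */
static uint16_t crc16_update(uint16_t crc, uint8_t byte)
{
    crc ^= byte;
    for (int bit = 0; bit < 8; bit++) {
        crc = (crc & 1u) ? (uint16_t)((crc >> 1) ^ 0xA001u)
                         : (uint16_t)(crc >> 1);
    }
    return crc;
}

/* CRC over every field except crc, high byte of each register first. */
uint16_t critical_data_crc(const critical_data_t *d)
{
    assert(d != NULL);
    const uint16_t regs[] = { d->voltage, d->current, d->status };
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < sizeof regs / sizeof regs[0]; i++) {
        crc = crc16_update(crc, (uint8_t)(regs[i] >> 8));
        crc = crc16_update(crc, (uint8_t)(regs[i] & 0xFF));
    }
    return crc;
}

/* Slave side (step 3): call after every update of the structure.
 * In a real application the update plus this call should happen
 * inside a critical section so the master never reads a half-updated
 * structure. */
void critical_data_seal(critical_data_t *d)
{
    d->crc = critical_data_crc(d);
}

/* Master side (step 6): true when the received structure is consistent. */
bool critical_data_is_valid(const critical_data_t *d)
{
    return d->crc == critical_data_crc(d);
}
```

When `critical_data_is_valid()` returns false, the master would discard the data, flush the receiver and repeat the request, as step 7 describes.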

Another possible approach:

  1. Detect the communication error (caused by a communication fault or a disconnection).
  2. Wait for the maximum slave response time.
  3. Clear the UART queue and FIFO.
  4. Retry the last request.
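The wait, flush and retry sequence above can be sketched independently of any particular Modbus stack. Everything in this block is hypothetical: `retry_ops_t`, `request_with_recovery` and the `fake_*` functions are illustration names, and the fake link only simulates a bus that fails a fixed number of times. On ESP-IDF the `flush_rx` callback would presumably wrap the vMBMasterRxFlush() function mentioned in this thread and `wait_ms` would wrap a task delay, but that mapping is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Callbacks supplied by the application (hypothetical interface). */
typedef struct {
    bool (*send_request)(void *ctx);          /* true on a valid response */
    void (*wait_ms)(void *ctx, uint32_t ms);  /* block for ms milliseconds */
    void (*flush_rx)(void *ctx);              /* drop stale received bytes */
    void *ctx;
} retry_ops_t;

/* Send a request; on failure wait out the slave's maximum response time,
 * flush the receiver so a late response cannot be mistaken for the next
 * one, then retry. Gives up after max_attempts tries. */
bool request_with_recovery(const retry_ops_t *ops,
                           uint32_t max_response_ms,
                           unsigned max_attempts)
{
    assert(ops != NULL && ops->send_request != NULL);
    assert(ops->wait_ms != NULL && ops->flush_rx != NULL);
    for (unsigned attempt = 0; attempt < max_attempts; attempt++) {
        if (ops->send_request(ops->ctx)) {
            return true;
        }
        ops->wait_ms(ops->ctx, max_response_ms);
        ops->flush_rx(ops->ctx);
    }
    return false;
}

/* Simulated link, used only to exercise the loop without hardware. */
typedef struct {
    int failures_left;    /* requests that fail before the link recovers */
    int flush_count;      /* how many times the receiver was flushed */
    uint32_t waited_ms;   /* total simulated waiting time */
} fake_link_t;

bool fake_send(void *ctx)
{
    fake_link_t *link = ctx;
    if (link->failures_left > 0) {
        link->failures_left--;
        return false;
    }
    return true;
}

void fake_wait(void *ctx, uint32_t ms)
{
    ((fake_link_t *)ctx)->waited_ms += ms;
}

void fake_flush(void *ctx)
{
    ((fake_link_t *)ctx)->flush_count++;
}
```

The order matters: flushing before the slave's maximum response time has elapsed could leave a late frame arriving after the flush, which is exactly the frame that causes the shift.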

Let me know if you need more information.

alisitsyn commented 1 year ago

Hello,

Could you provide an update on the issue? The solution above, where you send the whole structure instead of independent registers, should solve your issue with shifting registers. Please let me know if you need more information.

ZanellaSimone commented 1 year ago

Hello @alisitsyn, thanks for the explanation. I saw that the situation can be improved by increasing the priority of the serial task, specifically the CONFIG_FMB_PORT_TASK_PRIO configuration parameter (I raised it from the default value to the highest priority). With this change the system becomes more robust and the problem occurs less frequently. To be sure of catching the register shift, after each reading cycle I check some of the registers (ones that keep a fixed value in my application). If their value differs from what I expect, I re-initialize the Modbus stack.
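A check of this kind might look like the sketch below. It is not the code from the application discussed here: `sentinel_t` and `sentinels_match` are hypothetical names, and the register indices and expected values are placeholders.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A register that is known to hold a fixed value (hypothetical type). */
typedef struct {
    size_t index;       /* position in the block of registers just read */
    uint16_t expected;  /* value the register must always contain */
} sentinel_t;

/* Returns true when every sentinel register holds its expected value.
 * A false result suggests the block is shifted or otherwise corrupted
 * and should be discarded. */
bool sentinels_match(const uint16_t *regs, size_t reg_count,
                     const sentinel_t *sentinels, size_t sentinel_count)
{
    assert(regs != NULL && sentinels != NULL);
    for (size_t i = 0; i < sentinel_count; i++) {
        if (sentinels[i].index >= reg_count ||
            regs[sentinels[i].index] != sentinels[i].expected) {
            return false;
        }
    }
    return true;
}
```

This only catches a shift if the sentinel values differ from their neighbours, so distinctive constants at both ends of the block give better coverage than a single check.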

alisitsyn commented 1 year ago

Hello ZanellaSimone,

Thank you for the update. Could you provide more information about your project and environment? This will help me give you more relevant advice.

  1. What HW communication interface are you using for Modbus (RS485)?

  2. Do you have other services and components in your application?

  3. Apart from the one Modbus master, do you have any devices in the Modbus segment that can send data without a request? If yes, which protocol do they use?

ZanellaSimone commented 1 year ago

Hello @alisitsyn, as I explained in my last comment, my idea is to reinitialize the Modbus communication when I detect the register shift problem. The sequence I used to reinitialize Modbus is the following:

With this simple sequence of instructions, when the Modbus communication starts working again, the register shift problem appears from the first reading (it has not resolved itself, as I would have expected after a new initialization). Any idea how to re-initialize the Modbus communication correctly?

alisitsyn commented 1 year ago

Hello @ZanellaSimone ,

Thank you for the information. In order to understand the reason for this issue I need:

  1. Answers to the questions here.
  2. A serial communication log, which would help to understand what is happening on the communication bus.
  3. A debug log. Set the log verbosity for your project: run idf.py menuconfig, then in the menu Component config → Log output → Default log verbosity select (X) Debug. Save the config, then recompile the code. Start the project with idf.py flash monitor and reproduce the issue. Send the log here.

I think you do not need to restart the Modbus stack. How to recover from the data-shifting issue:

  1. Read the structure of registers from the slave instead of individual registers in the master project. The data of the structure will be transferred in one response from the slave and be protected by the CRC16 checksum of Modbus.
  2. If the read method returns a read error, wait for the timeout and then call the vMBMasterRxFlush() function (declare it with extern void vMBMasterRxFlush( void );) to reset the UART FIFO, then restart from step 1.
  3. Check the integrity of the data. Your simple method of checking for known register values can be used.
  4. If the data is incorrect, repeat from step 1 to send the request for the data structure again (i.e. mbc_master_get_parameter(CID_FOR_YOUR_STRUCTURE, .....);).

Do you need an example of this?

ZanellaSimone commented 1 year ago

Hi @alisitsyn, can you send me an example of using the above procedure? Thanks

alisitsyn commented 1 year ago

Hi @ZanellaSimone,

No problem, I will try to prepare an example for you ASAP, but I still need your answers and logs to know more about your environment.

Thanks.

ZanellaSimone commented 1 year ago

Hi @alisitsyn, thanks for your help. I updated my esp-modbus library to version 1.0.8 and followed the procedure you indicated above. Now everything works correctly.

alisitsyn commented 1 year ago

@ZanellaSimone,

Thank you for the update. Please take a look at the example prepared for you here.

Mr-Techtron commented 1 year ago

Any update on this? I am facing the same issue. I am using the Modbus protocol to poll UPS parameters (three-phase input/output voltage, battery status, etc.). As @alisitsyn mentioned, I used the "Read the structure of registers from slave" method instead of reading individual registers, but still no luck. I used the Modbus Slave software and intentionally increased the response delay, during which the esp32 failed to get the UPS parameters. Then I decreased the response delay again, and at that point the esp32 still failed to get the parameters, giving me an ESP_ERR_INVALID_RESPONSE error. I am using the v4.4.4 tag.

alisitsyn commented 1 year ago

@Mr-Techtron ,

Do you use the esp-modbus component or the Modbus component from v4.4.4? To be able to check the issue we need to use the same version of the component. Please follow the steps below.

  1. Which interface is used to communicate with your slave? Please share the Modbus mapping table or the manual of your UPS.
  2. Please provide the log of your Modbus master application with the errors during communication. I need the log from idf.py monitor and the serial communication log between master and slave (serial line).
  3. In order to check the issue please create the components folder in root of your project folder and then clone the esp-modbus component there proj_root/components/esp-modbus.
  4. Add the line below to the project’s CMakeLists.txt: set(EXCLUDE_COMPONENTS freemodbus)
  5. Apply the patch below to your esp-modbus folder. fix_message_processing_aftertout.patch.log
  6. Run idf.py fullclean, then idf.py build, then flash and check the communication log again.
  7. Publish your project with the components folder in some location.
  8. Store your communication log in the location mentioned above.

Thank you.

See this post for more information.

Mr-Techtron commented 1 year ago

Hi @alisitsyn, thank you for your solution. Now it is working as expected, and I don't have to reinitialize the Modbus stack anymore.

alisitsyn commented 1 year ago

Hello @ZanellaSimone , @Mr-Techtron,

The mentioned issues are fixed in v1.0.10. I would appreciate your feedback with the results of testing. Thank you.

alisitsyn commented 1 year ago

Hello @ZanellaSimone , @Mr-Techtron,

The issue has been fixed and is closed. Feel free to reopen it if the problem still exists in your project after updating to v1.0.11.