alejoseb / Modbus-STM32-HAL-FreeRTOS

Modbus TCP and RTU, Master and Slave for STM32 using Cube HAL and FreeRTOS
GNU Lesser General Public License v2.1

Extra zero byte "received" from RTU corrupts checksum #21

Closed brynwolfe closed 3 years ago

brynwolfe commented 3 years ago

I'm sending a modbus message through this library in master mode at 9600 8N1. The modbus is set up with the following parameters:

  m_handler.uModbusType = MB_MASTER;
  m_handler.xTypeHW = USART_HW;
  m_handler.port = &huart4;
  // Master ID must be 0.
  m_handler.u8id = 0;
  m_handler.u16timeOut = 1000;
  // RS485 enable pin (UART_DE/UART_RTS).
  m_handler.EN_Port = EE_485_RDis_TEn_GPIO_Port;
  m_handler.EN_Pin = EE_485_RDis_TEn_Pin;
  m_handler.u32timeOut = 0;
  m_handler.u16regs = m_data;
  m_handler.u16regsize = sizeof(m_data) / sizeof(m_data[0]);

The device on the other end acknowledges the message as shown in the attached image. Unfortunately, the RX data buffer shows the data with an extra zero (0) byte at the beginning of the message so it is 9 bytes long instead of 8. It appears to be reading the one character delay between transmit and receive as another byte of transmission.

Unfortunately, this causes the validateAnswer function called from StartTaskModbusMaster to fail on the checksum calculation. Is there a modbus setup parameter that might cause this extra initial zero byte? What am I missing?
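To illustrate why a single leading zero byte is enough to fail the checksum, here is a minimal sketch of the standard Modbus RTU CRC-16 check. It is not the library's validateAnswer code; `modbus_crc16` and `frame_crc_ok` are hypothetical helper names, and the frame is a generic example (read one holding register from slave 1), not data captured in this issue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Standard Modbus RTU CRC-16: initial value 0xFFFF, reflected polynomial 0xA001. */
uint16_t modbus_crc16(const uint8_t *data, size_t len)
{
    assert(data != NULL);
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u) {
                crc = (uint16_t)((crc >> 1) ^ 0xA001u);
            } else {
                crc >>= 1;
            }
        }
    }
    return crc;
}

/* Hypothetical helper: true when the last two bytes of the buffer match the
 * CRC of everything before them. On the wire the CRC low byte comes first. */
bool frame_crc_ok(const uint8_t *frame, size_t len)
{
    if (frame == NULL || len < 4) {
        return false;
    }
    uint16_t received = (uint16_t)(frame[len - 2] | (frame[len - 1] << 8));
    return modbus_crc16(frame, len - 2) == received;
}

/* Illustrative frame: slave 1, function 0x03, address 0, count 1, CRC 84 0A. */
const uint8_t good_frame[8] = {0x01, 0x03, 0x00, 0x00, 0x00, 0x01, 0x84, 0x0A};

/* Same bytes with a spurious 0x00 in front, as described above (9 bytes). */
const uint8_t shifted_frame[9] = {0x00, 0x01, 0x03, 0x00, 0x00,
                                  0x00, 0x01, 0x84, 0x0A};
```

With these buffers, `frame_crc_ok(good_frame, 8)` holds while `frame_crc_ok(shifted_frame, 9)` does not: the extra byte changes the running CRC before the real address byte is processed, so the trailing two bytes no longer match.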

[Image: Modbus oscope]

alejoseb commented 3 years ago

Hi. According to the oscilloscope image you attached, the slave you are talking to does not respect the Modbus protocol timing. A Modbus-compliant slave must wait at least 3.5 character times after the last byte received from the master before responding. In your case the slave waits about 1 character time (i.e. the spurious 0 in the red rectangle) and then answers immediately, so the master treats the zero and the rest of the frame as a single response from the slave. To solve this, either check the code of your slave and add the corresponding delay there, or reduce the T35 parameter in the master's modbusconfig.h. The latter solution will also make your master non-compliant.
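As a rough worked example of the timing involved, the sketch below computes one character time and the 3.5 character silent interval for a given baud rate. The function names are hypothetical, and the result is the interval from the Modbus serial line guidance (including its fixed 1750 us value above 19200 baud), not the value or units of the library's T35 setting, which this thread does not spell out.

```c
#include <assert.h>
#include <stdint.h>

/* Time to transmit one character, in microseconds, rounded up.
 * bits_per_char counts start, data, parity and stop bits:
 * 10 for 8N1, 11 for 8E1. */
uint32_t char_time_us(uint32_t baud, uint32_t bits_per_char)
{
    assert(baud > 0);
    return (uint32_t)(((uint64_t)bits_per_char * 1000000u + baud - 1) / baud);
}

/* Inter-frame silent interval (3.5 character times), in microseconds,
 * rounded up. Above 19200 baud a fixed 1750 us is used instead of
 * scaling with the baud rate. */
uint32_t t35_us(uint32_t baud, uint32_t bits_per_char)
{
    assert(baud > 0);
    if (baud > 19200u) {
        return 1750u;
    }
    return (uint32_t)(((uint64_t)bits_per_char * 3500000u + baud - 1) / baud);
}
```

At 9600 8N1 this gives a character time of about 1042 us and a silent interval of about 3646 us, so a reply that starts roughly one character time after the request is well inside the window in which the master still considers the frame open.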

brynwolfe commented 3 years ago

Ah, I see that now. I'm talking to a Waveshare Modbus RTU Relay device, so I have contacted the vendor for an explanation. Thanks

alejoseb commented 3 years ago

No problem. Close the issue when you solve it, and do not forget to add a star to this repo if you find it useful :smiley:!

brynwolfe commented 3 years ago

Incidentally, if the message comes in too early, shouldn't the modbus logic raise an error instead of parsing an invalid response? I'm confused about how a response that arrived early is still being treated as consumable data.

alejoseb commented 3 years ago

That is an extra check that could be implemented, but it would not solve the issue with your non-compliant slave: with the check in place, the first 2.5 characters of a valid response from your slave would be ignored. Currently the library parses anything that arrives at the serial port and looks for a silence of approximately 3.5 character times between characters (not a zero) to separate frames. Your oscilloscope clearly shows a small glitch (after the 3A character) in the yellow line, which is decoded as a spurious zero. That zero could be due to poor RS485 transceivers and electrical connections. The following image shows the library working with 2 timing-compliant devices on my side, without any issue and with no spurious zero, at 9600 baud. I used a T35 equal to 3. I am using MAX485 transceivers.

[Image: modbus]

brynwolfe commented 3 years ago

I tried this with a device whose response IS modbus compliant and I still see the zero byte as the first entry in the u8Buffer, which still results in an ERR_BAD_CRC condition. Is there any other reason why this zero byte is showing up in the buffer? The transmission is 1Mbps 8E1.

[Image: PXL_20210705_210855076]

alejoseb commented 3 years ago

The zero is probably due to your transceivers and electrical connections. Besides, for the speed you are using, the 3.5 character time is fixed at 1.75 ms, which I think matches your oscilloscope capture. You need to use a T35 equal to 1 or 2; that could help to manage a spurious zero.

brynwolfe commented 3 years ago

Actually the delay between the request and response on the scope is 170us, not 1700us. Not clear how your modbus library handles early responses but they are getting through to my higher level code usually.

On another note, I noticed that the code in sendTxBuffer that drives the RS485 transmit enable pin has no effect on the actual pin. I commented out both HAL_GPIO_WritePin calls and it didn't affect transmission. That signal is driven by the HAL_UART_Transmit_IT call.

alejoseb commented 3 years ago

If that is 170us then that device is not following the modbus recommendation.

[Image: image]

I said that the library does not explicitly manage early answers from slaves. The library will not be affected if the zero arrives early, as long as there is still a silence long enough to separate the frames. In that case the library treats the zero as a standalone frame, checks its size and discards it because it is too short. Then the following bytes arrive and the library does the same thing, but this time the frame has the correct length and content. Since your devices are not modbus compliant, the library has no chance to separate the zero from the frame. A workaround could be to disable RX interrupts right after sending the master request, delay long enough to ignore the zero, clear the ring buffers and enable interrupts again. Since that workaround is not modbus compliant I will not implement it, but you can test it and see if it works.
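The frame separation described here can be sketched on the host as follows. This is only an illustration of the idea, not the library's code: `rx_event_t`, `split_frames` and `MIN_FRAME_LEN` are hypothetical names, the gap is measured between byte arrival times, and the minimum length of 4 bytes (address, function, two CRC bytes) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t t_us;   /* arrival time of the byte, in microseconds */
    uint8_t  value;  /* received byte */
} rx_event_t;

#define MIN_FRAME_LEN 4u  /* assumption: address + function + 2 CRC bytes */

/* Walk the received bytes, close a frame whenever the spacing to the next
 * byte is at least gap_us (or the input ends), and keep only frames that
 * reach MIN_FRAME_LEN. Lengths of kept frames go to out_len (at most
 * max_out entries); shorter frames are counted in *discarded.
 * Returns the number of frames kept. */
size_t split_frames(const rx_event_t *ev, size_t n, uint32_t gap_us,
                    size_t *out_len, size_t max_out, size_t *discarded)
{
    assert(ev != NULL || n == 0);
    assert(out_len != NULL || max_out == 0);
    size_t kept = 0, dropped = 0, cur = 0;
    for (size_t i = 0; i < n; i++) {
        cur++;
        int frame_ends = (i + 1 == n) ||
                         (ev[i + 1].t_us - ev[i].t_us >= gap_us);
        if (!frame_ends) {
            continue;
        }
        if (cur < MIN_FRAME_LEN) {
            dropped++;                 /* e.g. a lone spurious zero */
        } else if (kept < max_out) {
            out_len[kept++] = cur;
        }
        cur = 0;
    }
    if (discarded != NULL) {
        *discarded = dropped;
    }
    return kept;
}
```

With a gap threshold of about 3.6 ms (9600 baud, 10-bit characters), a zero followed 5 ms later by an 8-byte reply yields one discarded 1-byte frame and one kept 8-byte frame. If the reply starts only one character time after the zero, the same input is seen as a single 9-byte frame, which is the case that later fails the CRC.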

Regarding your side note, that is incorrect. If you observe that the pin is still activated after commenting out the code, then I guess you configured your USART to manage it automatically by enabling the RS485 mode or another type of flow control. My library does not use those modes, even though it is possible, because not all STM32 MCUs support them. So to keep it simple and general, I assume the serial port is configured without any hardware-controlled flow control or RS485 mode.

brynwolfe commented 3 years ago

Thanks for the correction regarding the pin activation. I in fact do have the RS485 hardware flow control flag enabled through the STM32CubeMX GUI, so I just changed my code to set the EN_Pin and EN_Port to NULL so the code doesn't try to drive that signal.

I realized the problem I'm having with the modbus library is related to the 1Mbps data rate. This generates bursts of interrupts that are only 11 microseconds apart which I think are occasionally being preempted by higher priority interrupts, losing a byte and thus corrupting the modbus frame. My code recovers from this condition so I have a temporary workaround.

Of course, the ultimate solution is to use DMA with idle line detection enabled so there's only one interrupt at the end of each receive frame. There's a good article about it in the following link.

https://stm32f4-discovery.net/2017/07/stm32-tutorial-efficiently-receive-uart-data-using-dma/

I really appreciate the time you've put in to assist me. Thanks.

alejoseb commented 3 years ago

Yes, at those speeds that kind of issue might appear. I already have an implementation using DMA and idle line detection. Unfortunately, that makes the library less general and compatible with fewer MCUs. I will integrate it here in the near future, stay tuned.

alejoseb commented 3 years ago

Check the latest commit and examples, the library now supports DMA with Idle line detection. I tested it up to 2 Mbps without a single error.

brynwolfe commented 3 years ago

I tried the new DMA version but HAL_UART_ErrorCallback keeps getting called and HAL_UARTEx_RxEventCallback is never called.

brynwolfe commented 3 years ago

It appears to be a frame error based on the huart->ErrorCode = 4 (HAL_UART_ERROR_FE). I believe I understand what is going on. The RS485 transceiver is half duplex, so as is typical with these transceivers (in my case, an ISL83485), the receive enable (REn) and data/transmit enable (DE) pins are tied together so you can only be receiving or transmitting. For the ISL83485, when REn is driven high, meaning in transmit mode, it drives the digital output of its receiver (RO) low. The STM32 UART sees this transition as a valid start of a frame but then detects a frame error because the stop bit isn't where it should be. To be clear, this happens during the transmit phase. Even if I disconnect the remote device so there is no response, the ErrorCallback is invoked. Ultimately though, I think this was the cause of the original post's "extra zero byte".

alejoseb commented 3 years ago

I said: "That zero could be due to poor RS485 transceivers and electrical connections". I use a MAX485 with 120 ohm resistors as per the datasheet recommendation (page 13) https://datasheets.maximintegrated.com/en/ds/MAX1487-MAX491.pdf . That zero should never appear, even in half duplex operation. You must read the datasheet of your ISL83485 and use the recommended components, or use different transceivers.