Open lanwer50 opened 2 months ago
CC @nordic-krch
I will be commenting on this issue soon. I am currently in the process of understanding how the driver works underneath and I might get near the root cause.
I am getting near cause, but still not completely solved.
I came across the documentation of the function uart_fifo_read() which states: "That means that once uart_irq_rx_ready() is detected, uart_fifo_read() must be called until it reads all available data in the FIFO (i.e. until it returns less data than was requested)."
This is however not the case in the implemented uart callback cb_handler_rx, where uart_fifo_read is called only once. A part of the solution would be to call this function as stated in the documentation until its returns <= 0. --> this leads to the following:
while(1) {
n = uart_fifo_read(cfg->dev, cfg->uart_buf_ptr,
(CONFIG_MODBUS_BUFFER_SIZE -
cfg->uart_buf_ctr));
cfg->uart_buf_ptr += n;
cfg->uart_buf_ctr += n;
if (n <= 0) {
break;
}
}
I am still looking for further fixes/optimizations to tackle this problem and looking forward to any input. Once completely solved, I will write my first merge request with this bugfix.
A further improvement leading to the problem happening less frequently is to change the order of the functions called in the uart callback cb_handler_rx in RTU mode:
first call the uart_fifo_read() function as mentioned in my last comment (in a while loop, not once). Then, start the rtu_timer and not the other way around. This is proven to cause less uart buffer overruns. This also does make sense: this way, we are able to read the data as fast as possible and then do the less critical operation (starting the timer).
However, as I mentioned, the problem still occurs, in a less frequency.
Any other thoughts/inputs are much appreciated.
Not sure it's relevant, but do you use HW RX byte counting? If not, can you try? This DevZone answer here tells about the Kconfig options to enable.
Not sure it's relevant, but do you use HW RX byte counting? If not, can you try? This DevZone answer here tells about the Kconfig options to enable.
-> This is indeed part of the solution puzzle.
The final solution for the problem above is using the uarte ASYNC Api along with the hardware timer for counting bytes.
Using Async Api without the hardware timer for counting the bytes is not enough for a stable communication.
Using the default api (interrupt driven) for modbus serial is unfortunately not adequate for high baudrates.
Notes:
This however means rewriting the modbus_serial.c file on my own to make use of the async api (per default only the interrupt driven is implemented). I will try to make my changes public in the next few weeks so that anybody can use serial modbus with UART async api, which is a must for applications with high baudrates and no hardware control.
This solution also forced me to re-implement my own application that was making use of all timers and ppi channels in order to leave a free timer and ppi channel for the hardware counting of bytes.
Another problem I still have to fix though (unrelated to this problem specifically) is described in this ticket: Choose the UARTE API (Async/interrupt driven) during runtime
Notes I simplified the modbus RTU samples (server and client) to a few lines of code (see below) and I am running two nRF52840 development boards: one as server and one as client with serial rx and tx wires connected crosswise.
The difference to the default samples is mainly increasing the baud rate and deleting the irrelevant coil read/write code.
Describe the bug The problem I am experiencing is that some received bytes are missing, which leads to crc mismatch ( modbus_serial: Calculated CRC does not match received CRC). This however happens only when using a relatively high baud rate: 460800. With the default baud rate from the sample (19200) or even with 115200 I was not able to see this problem.
The problem occurs almost all the time, especially with longer packets, but does also appear sometimes with smaller ones. In the edited samples attached, I am experiencing with 64 number of registers (128 bytes to be read). I am guessing it is a timing issue that interrupts are not being serviced as quickly as needed with this baud rate.
This is not a hardware issue: I attached the serial lines to a logic decoder and I was able to see all the rx bytes as expected within the expected time.
To Reproduce Steps to reproduce the behavior:
Expected behavior Receive complete data (without bytes missing) and so without crc mismatch.
Impact Showstopper for anyone who wants to use a high baud rate.
Environment (please complete the following information):
Additional context pin setting is defined in the overlay file attched.
missing_bytes_minimal_examples.zip