Closed abascom closed 2 years ago
Hi @abascom, thanks for reaching us and reporting the issue here.
After discussed with developer, I got the feedback as below:
The developer appreciate if you could help test the suggestion and contribute via PR back to us.
We can certainly give that a try.
With our CAN bus utilization, we estimate we could fill the RX FIFO in ~1ms. Does the developer have any data around how long the busy waits in the 6032 errata workaround could take in the worst case?
@abascom, yes, please have a try. For your question related to the busy waits, I will provide the feedback from developer later, as he is currently on Holiday leave. Appreciate for your patience.
I have tested the suggested change it is has been working well (I've left it running for a couple of days with regular CAN traffic).
I do have another question around the errata workaround. "
I have tested changing step 1 and 5 to disable/enable all interrupts instead of just flexcan interrupts to get rid of context switching. However even doing that, after enabling interrupts in step 5, the OS could context switch and steps 6, 7, and 8 could happen some time much later. Is this fine? Or should the enable all interrupts part actually be moved to the last step?
I am also still curious on the busy wait worst case timing question as soon as the developer is back and able to help answer.
Thanks for your help!
hi @abascom, I'm the SDK flexcan driver developer. For the first question, base on my understanding of the problem, the worst case wait time before sending each frame is 2 frame bits of time (if CAN bitrate is 500kbps, the wait time is 4us).
For the second question, it is no need to move enable all interrupts to the last step. For this errata, there is no issue if the CPU guarantees that any MB configured for transmission will not be aborted or deactivated (write 0x0 into CS Code field is an abort operation) in a short interval. If we do not turn off interrupts before doing step 4, we cannot guarantee that this MB has not just been configured, however, once we have performed step 4, neither reconfiguring the MB nor aborting it again will cause an error to be generated, so there is no need to disable the interrupt.
I've entered pull request (https://github.com/NXPmicro/mcux-sdk/pull/72) for this issue.
Is your enhancement proposal related to a problem? Please describe. We were updating our project from SDK version 2.4.2 to 2.10.0 when we observed intermittent RX FIFO overflows on FLEXCAN. After much debugging, we were able to narrow it down to the chip errata 6032 workaround function causing this issue. We run a multithreaded RTOS, and whenever a thread wanted to send a frame on CAN, it could disable all FLEXCAN interrupts, then context switch to another task leaving FLEXCAN interrupts disabled, eventually causing the 6 frame RX FIFO to overflow.
Describe the solution you'd like Either an option added to disable the 6032 errata workaround, or to have it implemented in such a way that interrupts cannot be disabled long enough to overflow the RX FIFO. I am willing to contribute a pull request for one of these options if you can provide some guidance on which approach is preferred.
Describe alternatives you've considered We have locally modified the SDK to disable the workaround and confirmed we no longer get intermittent RX FIFO overflows.
Additional context Similar data loss issues also appear to be present in the implementation of LPUART. Whenever transmit/receive functions are called from an RTOS task, LPUART interrupts are disabled/enabled and if a task switch occurs or some higher priority task is running, received data on the LPUART can be lost.