KevinOConnor / can2040

Software CAN bus implementation for rp2040 micro-controllers
GNU General Public License v3.0

NAK when connecting can2040 on PIO0 to can2040 on PIO1 #38

Closed (Nate711 closed this issue 1 year ago)

Nate711 commented 1 year ago

Thanks for making this library!

I was doing a simple test where I set up a can2040 on PIO0 and another on PIO1 and then wired them together, using SN65HVD1040 breakout boards as the transceivers. However, when I analyzed the signals, I kept seeing messages being resent because they were not acknowledged (NAK). At a 1 Mbps bitrate and a 100 Hz message frequency, the CAN messages get resent like this:

Screenshot 2023-05-04 at 12 09 34 AM

Whether a NAK occurred seemed to depend on the bitrate and message frequency. At 1 Mbps with 10 Hz and 100 Hz message rates I saw the repeated messages, sometimes up to 3 attempts. However, at a 1 kHz message rate I did not see any repeated messages. At 500 kbps with 10 Hz, 100 Hz, and 1 kHz message rates I did not see any repeats either.

My code is here: https://github.com/Nate711/can2040-test, in particular, the dual_can_test.cc executable.

Any guess as to why this is happening at the lower message rates at 1 Mbps? IRQ latency due to both CAN network IRQs triggering at the same time? Anything I can do to mitigate this? I should also add that my end goal is to control actuators via 2 CAN networks from one Pico, so maybe the IRQs won't be firing in unison so much in my real application. Thanks!
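
For a rough sense of the timing involved at the two bitrates tested, the arithmetic can be sketched as below. This is only a back-of-the-envelope helper, not part of can2040: the function names are hypothetical, and it assumes classic CAN data frames with an 11-bit identifier and an 8-byte payload (the payload size used in the test is not stated above). It does not say anything about where the NAKs come from; it only shows that one bit lasts 1000 ns at 1 Mbps versus 2000 ns at 500 kbps, and how little of the bus the tested message rates occupy.

```c
#include <stdint.h>

/* Duration of a single bit on the bus, in nanoseconds. */
static uint32_t can_bit_time_ns(uint32_t bitrate)
{
    return 1000000000u / bitrate;
}

/*
 * Worst-case length, in bits, of a classic CAN data frame with an
 * 11-bit identifier (assumption), including the 3-bit interframe space.
 * 47 fixed bits + 8 bits per data byte, plus the maximum number of
 * stuff bits over the 34 + 8*n bits that are subject to stuffing.
 */
static uint32_t can_worst_case_frame_bits(uint32_t data_len)
{
    uint32_t unstuffed = 47u + 8u * data_len;
    uint32_t stuffable = 34u + 8u * data_len;
    return unstuffed + (stuffable - 1u) / 4u;
}

/* Approximate bus load in percent (rounded down) for one sender. */
static uint32_t can_bus_load_percent(uint32_t bitrate, uint32_t msgs_per_sec,
                                     uint32_t data_len)
{
    uint64_t bits_per_sec =
        (uint64_t)msgs_per_sec * can_worst_case_frame_bits(data_len);
    return (uint32_t)(bits_per_sec * 100u / bitrate);
}
```

Under these assumptions, even the 1 kHz message rate keeps the bus load well below saturation at both bitrates, so the shrinking bit time at 1 Mbps is the more relevant difference between the two configurations.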

KevinOConnor commented 1 year ago

Certainly, irq latency could be an explanation for the issue. Alas, it's hard to give advice on possible causes of irq latency.

There are some things worth checking:

  1. Make sure you are on the latest version of the can2040 code. There were some tx timing code updates earlier today that could impact your test (#32).
  2. Make sure the canbus has correct wiring - including two 120 ohm resistors ( https://github.com/KevinOConnor/can2040/blob/master/docs/Tools.md#testing-can-bus ).

If it is irq latency, then it may be possible to move each pio to its own rp2040 arm core. That should improve irq latency with respect to can2040 irqs from each PIO, but would likely require additional application work on your side to handle multiple ARM cores.
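
As a rough illustration of the one-PIO-per-core idea, a minimal sketch using the Pico SDK might look like the following. It has not been tested on hardware. The GPIO numbers, bitrate, and system clock value are assumptions to be replaced with the real wiring and configuration, and the receive callback is an empty placeholder. The sketch relies on the Pico SDK IRQ functions acting on the core that calls them, so enabling the PIO1 interrupt from code running on core 1 should route that interrupt to core 1.

```c
#include <stdint.h>
#include "pico/stdlib.h"
#include "pico/multicore.h"
#include "hardware/irq.h"
#include "can2040.h"

/* Assumed values; adjust to the actual board and wiring. */
#define CAN_SYS_CLOCK 125000000u
#define CAN_BITRATE   1000000u
#define CAN0_GPIO_RX  4u
#define CAN0_GPIO_TX  5u
#define CAN1_GPIO_RX  6u
#define CAN1_GPIO_TX  7u

static struct can2040 cbus0, cbus1;

/* Placeholder callback; runs in IRQ context on the core that owns the PIO IRQ. */
static void can_rx_cb(struct can2040 *cd, uint32_t notify,
                      struct can2040_msg *msg)
{
    (void)cd; (void)notify; (void)msg;
}

static void pio0_irq_handler(void) { can2040_pio_irq_handler(&cbus0); }
static void pio1_irq_handler(void) { can2040_pio_irq_handler(&cbus1); }

/* Core 1: owns the can2040 instance on PIO1 and its interrupt. */
static void core1_entry(void)
{
    can2040_setup(&cbus1, 1);
    can2040_callback_config(&cbus1, can_rx_cb);
    irq_set_exclusive_handler(PIO1_IRQ_0, pio1_irq_handler);
    irq_set_enabled(PIO1_IRQ_0, true);   /* enabled on the calling core */
    can2040_start(&cbus1, CAN_SYS_CLOCK, CAN_BITRATE,
                  CAN1_GPIO_RX, CAN1_GPIO_TX);
    while (1)
        tight_loop_contents();
}

/* Core 0: owns the can2040 instance on PIO0 and its interrupt. */
int main(void)
{
    multicore_launch_core1(core1_entry);

    can2040_setup(&cbus0, 0);
    can2040_callback_config(&cbus0, can_rx_cb);
    irq_set_exclusive_handler(PIO0_IRQ_0, pio0_irq_handler);
    irq_set_enabled(PIO0_IRQ_0, true);
    can2040_start(&cbus0, CAN_SYS_CLOCK, CAN_BITRATE,
                  CAN0_GPIO_RX, CAN0_GPIO_TX);
    while (1)
        tight_loop_contents();
}
```

Any data shared between the two callbacks and the rest of the application would still need its own cross-core synchronization, which is the additional application work mentioned above.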

If you think it's something in the can2040 software, you could try to reproduce it with known working software - Klipper - see: https://github.com/KevinOConnor/can2040/blob/master/docs/Tools.md . Klipper only runs one can2040 PIO though.

Cheers, -Kevin

Nate711 commented 1 year ago

Thanks! I'll retry with the updated code

Nate711 commented 1 year ago

Problems solved!

Not only did the repeat messages go away at 1 Mbps, I also no longer get invalid messages at 500 kbps. Previously, my logic analyzer flagged all the messages during my 500 kbps tests as invalid CAN messages. I thought this was a fluke because the messages were received and parsed correctly by the receiving can2040. However, with the updated code (my previous tests were done with the latest code at the time), the 500 kbps messages are read as valid by the logic analyzer.

With previous can2040 code:

Screenshot 2023-05-05 at 11 34 41 PM

With updated code:

Screenshot 2023-05-05 at 11 34 48 PM