KevinOConnor / can2040

Software CAN bus implementation for rp2040 micro-controllers
GNU General Public License v3.0

Add mechanism for obtaining low-level can2040 statistics #45

Closed (KevinOConnor closed 10 months ago)

KevinOConnor commented 1 year ago

This PR adds a new can2040_get_statistics() API function. It is intended to give users of can2040 insight into how well the CAN bus hardware is performing, for example to determine whether there are many errors and/or retransmit attempts.
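As a rough illustration of the kind of insight the counters could give, here is a minimal sketch that derives a retry count from one snapshot. The struct is a local stand-in that only mirrors the four counter names used later in this thread (rx_total, tx_total, tx_attempt, parse_error); the real struct can2040_stats lives in can2040.h and may differ. The helper tx_retry_count() is hypothetical, and it assumes tx_attempt counts every transmit attempt while tx_total counts only completed transmits.

```c
#include <stdint.h>

// Local stand-in for the counters discussed in this thread. This is NOT
// the can2040 definition; it only mirrors the four field names.
struct stats_snapshot {
    uint32_t rx_total;    // messages received
    uint32_t tx_total;    // transmits that completed
    uint32_t tx_attempt;  // transmit attempts, including retries
    uint32_t parse_error; // low-level parse errors
};

// Hypothetical helper: number of transmit attempts that did not end in a
// completed transmit. Assumes tx_attempt >= tx_total in normal operation;
// returns 0 if a snapshot ever shows otherwise.
static uint32_t
tx_retry_count(const struct stats_snapshot *s)
{
    if (s->tx_attempt < s->tx_total)
        return 0;
    return s->tx_attempt - s->tx_total;
}
```

A caller would copy the fields out of whatever can2040_get_statistics() fills in and pass the copy to tx_retry_count().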

This PR is just for discussion. I have not adequately tested this PR at this time.

@linted - FYI, as mentioned in PR #44

-Kevin

linted commented 1 year ago

Hello. This code compiles and works correctly for standard CAN bus configurations. It correctly tracks retransmission attempts. I also noticed that, for some reason, I always get 1 parse error whenever I start the CAN bus, but I can't figure out how to make the parse error count increment otherwise.

The parse and retransmission counters also don't seem to increment when silent mode is active on a CAN transceiver.

KevinOConnor commented 11 months ago

Thanks for the feedback. Those are certainly surprising results. The parse_error counter should increment every time a packet is transmitted that isn't acknowledged (data_state_update_ack() -> data_state_go_error() -> cd->stats.parse_error++). So, if a can2040 node is on a bus with no other node, then any attempt to transmit should result in an ever-increasing parse_error count.

I don't have a test setup immediately handy, but I'll try to reproduce.

-Kevin

KevinOConnor commented 10 months ago

I was able to set up a test environment locally. To test this, I built a canbus with an rp2040 and an stm32f072 device. The rp2040 was running Klipper (in its usb-to-can bridge mode) and the stm32f072 was running candlelight_fw. I connected both devices to a Linux computer and created two Linux canbus devices for testing (stm32f072 as can0 and rp2040 as can1).

I locally modified the Klipper code to support calling can2040_get_statistics().

I was able to reproduce the spurious parse_error of 1 at startup. I've added a commit to this series to fix that anomaly.

I was not able to reproduce an issue with parse_error not incrementing. On can2040 receive (cansend can0 123#121212121212), rx_total increments as expected. On regular transmits (cansend can1 123#121212121212), both tx_total and tx_attempt increment as expected. On an rp2040 transmit to a bus without a receiver, both tx_attempt and parse_error rapidly increment until the stm32f072 receiver is reenabled. I simulated disabling the stm32f072 receiver by calling sudo ip link set can0 down.

So, the statistics code seems to work as expected, at least in my test cases.
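The pattern seen in the no-receiver test (tx_attempt and parse_error climbing while tx_total stays flat) could be checked for by comparing two samples taken some time apart. The following is only a sketch under that assumption: struct stats_sample is a local stand-in that mirrors the counter names, and stats_delta() and looks_unacknowledged() are hypothetical helpers, not part of can2040.

```c
#include <stdbool.h>
#include <stdint.h>

// Local stand-in mirroring the counter names used in this thread; the
// real struct is defined by can2040 and may differ.
struct stats_sample {
    uint32_t rx_total, tx_total, tx_attempt, parse_error;
};

// Per-counter difference between two samples. Unsigned subtraction gives
// the right result even if a 32-bit counter wrapped once in between.
static struct stats_sample
stats_delta(const struct stats_sample *prev, const struct stats_sample *cur)
{
    struct stats_sample d;
    d.rx_total = cur->rx_total - prev->rx_total;
    d.tx_total = cur->tx_total - prev->tx_total;
    d.tx_attempt = cur->tx_attempt - prev->tx_attempt;
    d.parse_error = cur->parse_error - prev->parse_error;
    return d;
}

// Heuristic (an assumption drawn from the test above): attempts and parse
// errors grew, but no transmit completed, so nothing is acknowledging.
static bool
looks_unacknowledged(const struct stats_sample *prev,
                     const struct stats_sample *cur)
{
    struct stats_sample d = stats_delta(prev, cur);
    return d.tx_total == 0 && d.tx_attempt > 0 && d.parse_error > 0;
}
```

The check is deliberately conservative: a single completed transmit between the two samples makes it return false, so it only flags intervals in which every attempt failed.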

Cheers, -Kevin

P.S. If curious, these are the changes to the Klipper code to enable querying of the canbus stats (via Klipper's console.py tool):

--- a/src/rp2040/can.c
+++ b/src/rp2040/can.c
@@ -76,3 +76,14 @@ can_init(void)
                   , CONFIG_RP2040_CANBUS_GPIO_RX, CONFIG_RP2040_CANBUS_GPIO_TX);
 }
 DECL_INIT(can_init);
+
+void
+command_get_canbus_stats(uint32_t *args)
+{
+    struct can2040_stats stats;
+    can2040_get_statistics(&cbus, &stats);
+    sendf("canbus_stats rx=%u tx=%u ta=%u pe=%u"
+          , stats.rx_total, stats.tx_total, stats.tx_attempt
+          , stats.parse_error);
+}
+DECL_COMMAND(command_get_canbus_stats, "get_canbus_stats");