KevinOConnor / can2040

Software CAN bus implementation for rp2040 micro-controllers
GNU General Public License v3.0

can2040 library to use other core #22

Closed HFF-Git closed 1 year ago

HFF-Git commented 1 year ago

Hello Kevin,

great work. Thanks for building the CAN interface. The days of the MCP2515 as the bread-and-butter CAN controller are numbered :-). I used your library in an Arduino project. Other people have already written about the "extern C" setting for the include file; perhaps you could put it in the header file itself. It would be a big help.

In the wrapper I wrote for my projects ( it allows using either an MCP2515 or your library ), the implementation uses a small inbound queue from the Pico SDK to store incoming messages. This way the paradigm allows for a "check availability" and "receive" approach. I was wondering whether it would make sense to make this kind of queue the default callback handler, in case users do not want to write their own callback handler and would rather use the "check / receive" paradigm, just as they do with Serial IO on Arduino.
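A minimal sketch of that "check / receive" side, assuming the Pico SDK queue from pico/util/queue.h (all names here are illustrative, not part of can2040):

#include "pico/util/queue.h"
#include "can2040.h"

static queue_t rx_queue;   // filled from the can2040 rx callback

// "check availability" - how many messages are waiting
static uint can_available( void ) {
    return queue_get_level( &rx_queue );
}

// "receive" - copy one message out, if any
static bool can_receive( struct can2040_msg *out ) {
    return queue_try_remove( &rx_queue, out );
}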

Finally ( to the point of this writing ), do you plan, or have any insight on how, to use the other core on the Pico for your CAN bus library? I can imagine that your outbound queue and an inbound queue as described above would build a nice decoupling between the two cores. But I do not know how to set up the "running" of your code on the other core. Any help would be greatly appreciated.

Thanks again for the great library, Helmut

KevinOConnor commented 1 year ago

Thanks for the feedback.

One of the reasons I went with the current implementation is due to "rx filtering". I suspect nearly all users of the library will need to identify messages of interest to them and discard the rest. A filtering API could have been implemented in can2040 itself, but the filtering would need to be implemented in ARM core C code. I felt it would likely be more work for a user to code to a "filtering API" than to just "filter themselves". However, that does mean that the user will likely want to filter prior to placing messages on their "message queue".

It's possible the documentation could provide some examples on simple queues. I tend to use queues myself. However, the "task wakeup" part of a queue is heavily dependent on the user's preferred "build system" (eg, arduino, pico sdk, bare metal, freertos). So, it's hard to provide meaningful examples to users.
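For example, an rx callback that filters on the CAN id and pushes matching frames onto a Pico SDK queue might look roughly like this (a sketch only; the filter value and queue name are made up):

#include "pico/util/queue.h"
#include "can2040.h"

#define ACCEPT_ID 0x120u        // illustrative filter value
static queue_t rx_queue;        // created elsewhere with queue_init()

static void can_cb( struct can2040 *cd, uint32_t notify, struct can2040_msg *msg )
{
    if ( notify != CAN2040_NOTIFY_RX )
        return;                          // only queue received frames
    if ( msg->id != ACCEPT_ID )
        return;                          // filter before queuing
    queue_try_add( &rx_queue, msg );     // drops the frame if the queue is full
}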

As for creating a thread on the second core - that depends heavily on the build system in use. I typically write bare metal, but I know it's possible with the pico sdk. I'd guess the arduino code has some ability (as well as freertos, etc.). I'm not familiar with all the different embedded OS variants, so I can't really provide much advice.
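For the Pico SDK case specifically, launching an entry function on the second core is a single call; a minimal sketch (not can2040-specific, names are illustrative):

#include "pico/stdlib.h"
#include "pico/multicore.h"

static void core1_entry( void ) {
    // Do the can2040 setup and irq enabling here so that the PIO
    // interrupt is serviced on core 1, then idle forever.
    while ( true )
        tight_loop_contents( );
}

int main( void ) {
    multicore_launch_core1( core1_entry );   // start core 1
    // core 0 carries on with the application
}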

Cheers, -Kevin

HFF-Git commented 1 year ago

Hello Kevin,

very valid comments, thanks. Yes, it is very useful to filter messages upfront; completely agree. However, combining filters and queues is not necessarily a conflict. I have written up a small chapter for your doc with the example that I use in my work. It just shows your code augmented with the queue and how such a combined model could be used by the Arduino folks. Perhaps you also have plans to make your code available as an Arduino library. All that is needed is a "properties" description; let me know if you need help here.

I have attached a markdown file with a queuing example. Feel free to add something like this to your doc, just in case.

Cheers, Helmut

can2040-receivequeue-example.md

HFF-Git commented 1 year ago

Hello Kevin,

I got a multi-core version running, based on the Pico SDK. In order to build it, I took your files and added the necessary changes:

1.) can2040.h -> added the include "pico/lock_core.h".
2.) can2040.h -> added a configuration structure which contains all the data needed to do your setup. It is also a static structure; all data for the setup and status lives in here. It is similar to your can2040 structure and is needed for the routine launched onto the other core.
3.) can2040.h -> added a spin lock data structure to the "can2040" struct. The reason is that we need to control access to the transmit queue: a writer on core 0 and a reader on core 1 need to be synchronized.
4.) can2040.h -> added the spin lock and unlock in the "can2040_check_transmit" routine.
5.) can2040.c -> added the spin lock and unlock calls in the "can2040_transmit" routines.

That's it. All of these changes also work for the single-core version; I tested it.
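As a point of reference, a spin-lock guard with the Pico SDK API from hardware/sync.h can also be sketched as a thin wrapper around can2040_transmit(); this is not exactly the in-library change described above, it just shows the spin-lock calls involved (the names canbus_lock_init, canbus_transmit and tx_lock are mine):

#include "hardware/sync.h"
#include "can2040.h"

static spin_lock_t *tx_lock;

void canbus_lock_init( void ) {
    // Claim a free hardware spin lock once during setup.
    tx_lock = spin_lock_init( spin_lock_claim_unused( true ));
}

// Serializes concurrent callers of can2040_transmit() across cores and
// blocks interrupts on the calling core while inside the lock.
int canbus_transmit( struct can2040 *cd, struct can2040_msg *msg ) {
    uint32_t save = spin_lock_blocking( tx_lock );
    int ret = can2040_transmit( cd, msg );
    spin_unlock( tx_lock, save );
    return ret;
}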

In my library I have the following routines to use your library with the above changes.

( I use a receive queue called from the rx callback, but that is an optional item... the rx callback is outside the scope of these changes )


//----------------------------------------------------------------------------------------------------------
// "canBusSetup" is the routine that encapsulates the CAN bus setup and launch work. For the multi-core
// version it needs to be a routine that can be called from the current core or be launched onto the other
// core. The routine communicates a successful setup with a boolean flag in the configuration descriptor.
// Note that the setup routine must be a void procedure with no parameters; this is expected by the launch
// routine in the Pico SDK.
//----------------------------------------------------------------------------------------------------------
void canBusSetup( ) {

#if DEBUG_CAN_BUS == 1
INTERFACE.print( "canBusSetup -> " );
INTERFACE.print( "pioNum:" );
INTERFACE.print( cfg.mcPioNum );
INTERFACE.print( ", sysClk:" );
INTERFACE.print( cfg.mcSysClock );
INTERFACE.print( ", bitRate:" );
INTERFACE.print( cfg.mcBitRate );
INTERFACE.print( ", rxPin:" );
INTERFACE.print( cfg.mcRxPin );
INTERFACE.print( ", txPin:" );
INTERFACE.print( cfg.mcTxPin );
INTERFACE.print( ", cb:" );
INTERFACE.print((uint32_t) cfg.mcRxCallback );
INTERFACE.print( ", rxQS:" );
INTERFACE.print( cfg.mcRxQueueSize );
INTERFACE.print( ", MC:" );
INTERFACE.println( cfg.mcRunOnCore1 );
#endif

queue_init( &rxQueue, sizeof( can2040_msg ), cfg.mcRxQueueSize );

can2040_setup( &cBus, cfg.mcPioNum );
can2040_callback_config( &cBus, cfg.mcRxCallback );

if ( cfg.mcPioNum == 0 ) {

  irq_set_exclusive_handler( PIO0_IRQ_0, CanBusPIOIrqHandler );
  irq_set_priority( PIO0_IRQ_0, 1 );
  irq_set_enabled( PIO0_IRQ_0, true );
}
else if ( cfg.mcPioNum == 1 ) {

  irq_set_exclusive_handler( PIO1_IRQ_0, CanBusPIOIrqHandler );
  irq_set_priority( PIO1_IRQ_0, 1 );
  irq_set_enabled( PIO1_IRQ_0, true );
}

can2040_start( &cBus, cfg.mcSysClock, cfg.mcBitRate, cfg.mcRxPin, cfg.mcTxPin );

cfg.mcSetupOK = true;

#if DEBUG_CAN_BUS == 1
INTERFACE.println( F( "CAN Bus Initialized" ));
#endif

#if DEBUG_CAN_BUS == 1
INTERFACE.print( F( "Runs on Core: " ));
INTERFACE.println( get_core_num( ));
#endif

if ( cfg.mcRunOnCore1 ) {

  while ( true ) tight_loop_contents( );
}

}


Extract from the Init methods of my CANbus lib object:

-> set up the config structure:

cfg.mcSetupOK     = true;
cfg.mcRxPin       = rxPin;
cfg.mcTxPin       = txPin;
cfg.mcRxCallback  = canBusEventCallback;
cfg.mcSysClock    = CDC::cpuFrequency( );  // gets the CPU frequency
cfg.mcPioNum      = 0;
cfg.mcRxQueueSize = RX_QUEUE_SIZE;

cfg.mcRunOnCore1 = (( fMode == CAN_BUS_LIB_125K_M_CORE ) || ( fMode == CAN_BUS_LIB_250K_M_CORE ) || ( fMode == CAN_BUS_LIB_500K_M_CORE ) || ( fMode == CAN_BUS_LIB_1000K_M_CORE ));

switch ( fMode ) {

case CAN_BUS_LIB_125K:
case CAN_BUS_LIB_125K_M_CORE:   cfg.mcBitRate = 125000; break;

case CAN_BUS_LIB_250K:
case CAN_BUS_LIB_250K_M_CORE:   cfg.mcBitRate = 250000; break;

case CAN_BUS_LIB_500K:
case CAN_BUS_LIB_500K_M_CORE:   cfg.mcBitRate = 500000; break;

case CAN_BUS_LIB_1000K:
case CAN_BUS_LIB_1000K_M_CORE:  cfg.mcBitRate = 1000000; break;

default:                        cfg.mcSetupOK = false;

}

-> if the config is OK, perform a multi-core launch or a single-core setup. The called routine is the same; the only difference is that on a multi-core launch the setup routine does not return.

if ( cfg.mcSetupOK ) {

if ( cfg.mcRunOnCore1 ) multicore_launch_core1( canBusSetup );
else canBusSetup( );

}


Well, that's it. Here are the changed files, just for your reference.

can2040.c - new.pdf can2040.h - new.pdf

My question is whether you would consider putting in the spin lock protection for the transmit queue and the config structure ( or a similar one ).

Since I am not an expert in PIO programming, you may want to look into whether other areas should be protected. I would think not, but it is your code, you know best.

Also, passing the sys_clock to the setup calls may not be necessary; you can easily get it from the Pico SDK. It is nothing a user can change anyway.
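For instance, with the Pico SDK the current system clock can be read at setup time instead of being passed in (a sketch; the helper name is mine):

#include "hardware/clocks.h"

static uint32_t get_sys_clock_hz( void ) {
    // Read the current system clock frequency from the SDK.
    return clock_get_hz( clk_sys );
}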

Let me know what you think and how I can help.

Best regards Helmut

HFF-Git commented 1 year ago

Sorry, I attached the wrong version of can2040.c; not all spin locks had been put in place there. This file version now has all locations covered. Again, the goal is to protect access to the transmit queue. I also apologize: I am just getting into using Git. Perhaps I should clone the original, make the changes and then file a pull request? Any advice would be appreciated. For now, I just send the modified source files. Both versions, using can2040 on one core and on two cores, have been tested; data is successfully exchanged with a node based on an MCP2515 controller.

Cheers, Helmut

can2040.c - new.pdf can2040.h - new.pdf

KevinOConnor commented 1 year ago

For what it is worth, a spinlock is not needed to protect the transmit queue. It is possible to make the queue atomic by adding memory flush instructions and double checking the read of tx_push_pos in tx_schedule_transmit().

Using spinlocks as implemented above would notably increase irq latency (can2040_transmit() is cpu intensive). It would also add a dependency on the pico sdk (not all build environments use the sdk).

Perhaps you could give some background on how you plan to use both cores in your particular application.

-Kevin

HFF-Git commented 1 year ago

Hello Kevin,

certainly fine by me. If there is another way to ensure that the two cores do not corrupt the transmit queue through concurrent access, perfect. You know your code best. I also was not too thrilled to spin lock across most of can2040_transmit.

I am not too familiar with the rp2040 hardware layer. Can you sketch what you have in mind to ensure safe access from both cores? Greatly appreciated.

For the background: I am building a model railway control system. For the base station node there is, in addition to handling the messages on the CAN bus, the requirement to generate the digital signals for the power units, which is CPU intensive. The basic implementation runs timers on a 29-microsecond resolution clock to generate these signals.

For the CAN bus itself there could be quite a few nodes, let's say anything from 10 to 100 nodes on larger layouts. The first design was based on an ATmega and the MCP2515 controller. Using your library I can remove the MCP2515 and save cost per node.

On the Pico we have the opportunity to run the application on one core and your implementation on the other. Since you already hinted in the documentation at potential performance issues when doing too much in the callbacks, etc., why not use the other core…

HFF-Git commented 1 year ago

Hello Kevin,

Based on your feedback, I looked at the code again. For what I am building: the CAN bus runs on core 1 and the app on core 0. can2040_transmit will only be called from core 0 and is the only routine that modifies the tx_push_pos field. The value pointing to the next free slot is set after the processing of the message data is done. The dequeuing and adjusting of tx_pull_pos takes place solely on core 1. From what I can tell, there is no scenario where both cores would go after one end of the queue at the same time. You are right, a spin lock is not needed when the two position indexes are modified atomically. Is this where the barrier comes into play? Am I correct with these assumptions?

The receive callback also runs on core 1 and in my use case just pushes the received message onto a receiver queue. This queue is written by core 1 and read by core 0; the Pico library takes care of the concurrent access to the queue.

Again, your feedback is greatly appreciated.

Best, Helmut


KevinOConnor commented 1 year ago

Right. Something like the following should ensure atomic access to the queue (totally untested):

--- a/src/can2040.c
+++ b/src/can2040.c
@@ -29,12 +29,12 @@

 // Helper functions for writing to "io" memory
 static inline void writel(void *addr, uint32_t val) {
-    barrier();
+    __DSB();
     *(volatile uint32_t *)addr = val;
 }
 static inline uint32_t readl(const void *addr) {
     uint32_t val = *(volatile const uint32_t *)addr;
-    barrier();
+    __DSB();
     return val;
 }

@@ -647,11 +647,14 @@ tx_schedule_transmit(struct can2040 *cd)
     if (cd->tx_state == TS_QUEUED && !pio_tx_did_fail(cd))
         // Already queued or actively transmitting
         return 0;
-    if (cd->tx_push_pos == cd->tx_pull_pos) {
+    if (readl(&cd->tx_push_pos) == cd->tx_pull_pos) {
         // No new messages to transmit
         cd->tx_state = TS_IDLE;
         pio_signal_clear_txpending(cd);
-        return SI_TXPENDING;
+        if (likely(readl(&cd->tx_push_pos) == cd->tx_pull_pos))
+            return SI_TXPENDING;
+        // Raced with can2040_transmit() - msg is now available for transmit
+        pio_signal_set_txpending(cd);
     }
     cd->tx_state = TS_QUEUED;
     struct can2040_transmit *qt = &cd->tx_queue[tx_qpos(cd, cd->tx_pull_pos)];

It would still be necessary for the caller to ensure that can2040_transmit() isn't called simultaneously from multiple cores (nor called reentrant from irq context while already in can2040_transmit() on the same core), but it would be odd for a caller to do that anyway.

-Kevin

KevinOConnor commented 1 year ago

Technically, the write to tx_pull_pos should also be explicitly ordered (though I doubt it would matter in practice):

--- a/src/can2040.c
+++ b/src/can2040.c
@@ -29,12 +29,12 @@

 // Helper functions for writing to "io" memory
 static inline void writel(void *addr, uint32_t val) {
-    barrier();
+    __DSB();
     *(volatile uint32_t *)addr = val;
 }
 static inline uint32_t readl(const void *addr) {
     uint32_t val = *(volatile const uint32_t *)addr;
-    barrier();
+    __DSB();
     return val;
 }

@@ -647,11 +647,14 @@ tx_schedule_transmit(struct can2040 *cd)
     if (cd->tx_state == TS_QUEUED && !pio_tx_did_fail(cd))
         // Already queued or actively transmitting
         return 0;
-    if (cd->tx_push_pos == cd->tx_pull_pos) {
+    if (readl(&cd->tx_push_pos) == cd->tx_pull_pos) {
         // No new messages to transmit
         cd->tx_state = TS_IDLE;
         pio_signal_clear_txpending(cd);
-        return SI_TXPENDING;
+        if (likely(readl(&cd->tx_push_pos) == cd->tx_pull_pos))
+            return SI_TXPENDING;
+        // Raced with can2040_transmit() - msg is now available for transmit
+        pio_signal_set_txpending(cd);
     }
     cd->tx_state = TS_QUEUED;
     struct can2040_transmit *qt = &cd->tx_queue[tx_qpos(cd, cd->tx_pull_pos)];
@@ -717,7 +720,7 @@ report_callback_rx_msg(struct can2040 *cd)
 static void
 report_callback_tx_msg(struct can2040 *cd)
 {
-    cd->tx_pull_pos++;
+    writel(&cd->tx_pull_pos, cd->tx_pull_pos + 1);
     cd->rx_cb(cd, CAN2040_NOTIFY_TX, &cd->parse_msg);
 }

-Kevin

KevinOConnor commented 1 year ago

I created PR #24 for discussion on the modification. (It also has a slightly better implementation.)

-Kevin

HFF-Git commented 1 year ago

Thanks!!!

I will just wait for the new version and test it.

I am also happy to volunteer a small example of how to set up core 1 for the multi-core scenario for the documentation, if you want.

Best Helmut


KevinOConnor commented 1 year ago

I don't have a good test setup for PR #24. So, if it is useful in your setup, it would help if you could provide feedback on it.

Cheers, -Kevin

HFF-Git commented 1 year ago

Hello Kevin,

I already did a quick check right after you created the branch. A simple test sending and receiving data packets ran fine for hours. As you pointed out, the use case is not that both cores can send and receive, but to delegate the CAN protocol exclusively to one core. So we only worry about the send path from core 0 to core 1, assuming we run the CAN protocol on core 1.

I am not familiar enough with the ARM memory barrier concept to identify all corner cases. In my use case, your "output" queue is filled by core 0 and depleted by core 1. As long as both cores see the entry pointer atomically to make their decisions, it should be fine. Let me just do some more stress tests; the initial test showed no issues so far.

Thanks again for writing this library, great stuff.

Best, Helmut


HFF-Git commented 1 year ago

Hello Kevin,

I did check again; all is still running fine. I have a simple loop between two CAN nodes using my core library. Sending at 500 kbit/s works reliably and, as said, for hours non-stop. Messages received on the dedicated core are sent via a queue to the application core with no issues that I can tell.

I am not sure what else I can test. For now, therefore, it is OK from my side :-). When you have merged the changes back into the main line, I am more than happy to run the test again.

Best regards, Helmut


KevinOConnor commented 1 year ago

Great, thanks. I think we can merge that PR. It's probably a few weeks out before that happens though (just due to scheduling constraints on my side).

-Kevin

HFF-Git commented 1 year ago

No problem, any time you are ready. I will then test one more time ...

Cheers Helmut

KevinOConnor commented 1 year ago

FYI, I made some changes to PR #24 and committed those changes to the master branch.

-Kevin

HFF-Git commented 1 year ago

Perfect, I will test one more time. But it may take a while, I am currently a little busy ...

Best regards Helmut

github-actions[bot] commented 1 year ago

Hello,

It looks like there hasn't been any recent updates on this github ticket. We prefer to only list tickets as "open" if they are actively being worked on. Feel free to provide an update on this ticket. Otherwise the ticket will be automatically closed in a few days.

Best regards,

~ Your friendly GitIssueBot

PS: I'm just an automated script, not a human being.