I also see heavy CPU load here when doing RX load testing. Unfortunately, SPI on Linux has rather high overhead. Every request you submit to the SPI core costs you, so anything that groups requests together or reduces their number helps.
Reviewing the driver, there is one easy optimization: instead of issuing a separate SPI core request each time the UINC bit is set for the RX FIFO, batch those writes into a single SPI core request. I'm attaching a rough, quick patch. It may need to be made prettier, but Marc can first comment on whether this sort of thing feels acceptable... mcp25xxfd-rx-fifo-finalize.txt
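To illustrate the batching idea in isolation, here is a minimal userspace sketch in C. It is not the attached patch and not driver code: `demo_xfer`, `demo_spi_submit`, the command bytes, and the batch limit are all hypothetical stand-ins, loosely modelled on the way a kernel SPI message can carry several transfers. It only counts how often the (mock) SPI core is entered.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a single SPI transfer. Loosely modelled on
 * the kernel's struct spi_transfer, but NOT the kernel type. */
struct demo_xfer {
    const uint8_t *tx_buf;
    size_t len;
    int cs_change;          /* release chip select after this transfer */
};

/* Number of times the mock SPI core has been entered. */
static unsigned int demo_core_requests;

/* Mock synchronous submit: one call is one SPI core request,
 * regardless of how many transfers it carries. */
static void demo_spi_submit(const struct demo_xfer *xfers, size_t n)
{
    (void)xfers;
    (void)n;
    demo_core_requests++;
}

/* Assumed upper bound on transfers per request (illustrative only). */
#define DEMO_MAX_BATCH 32

/* Placeholder bytes, not the real MCP25xxFD register write. */
static const uint8_t demo_uinc_cmd[] = { 0x00, 0x00, 0x01 };

/* One SPI core request per received frame. */
unsigned int demo_uinc_separate(size_t frames)
{
    demo_core_requests = 0;
    for (size_t i = 0; i < frames; i++) {
        struct demo_xfer x = { demo_uinc_cmd, sizeof(demo_uinc_cmd), 0 };
        demo_spi_submit(&x, 1);
    }
    return demo_core_requests;
}

/* All UINC writes for a burst grouped into as few requests as possible. */
unsigned int demo_uinc_batched(size_t frames)
{
    struct demo_xfer xfers[DEMO_MAX_BATCH];

    demo_core_requests = 0;
    while (frames > 0) {
        size_t n = frames < DEMO_MAX_BATCH ? frames : DEMO_MAX_BATCH;

        for (size_t i = 0; i < n; i++) {
            xfers[i].tx_buf = demo_uinc_cmd;
            xfers[i].len = sizeof(demo_uinc_cmd);
            /* Toggle chip select between writes so the device sees
             * each one as a separate command. */
            xfers[i].cs_change = (i + 1 < n);
        }
        demo_spi_submit(xfers, n);
        frames -= n;
    }
    return demo_core_requests;
}
```

With 8 frames pending, `demo_uinc_separate(8)` enters the mock core 8 times while `demo_uinc_batched(8)` enters it once. The bytes on the wire are the same in both cases; only the fixed per-request overhead is saved.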
Hey @umaplehurst,
that patch looks interesting. Nice idea, see my initial review here: https://github.com/marckleinebudde/linux/commit/d113bb6a3adc92325fa40e6928fe3dc38c4ac98d
Hey @alexch2018,
CAN-FD is not yet optimized, as I first wanted to get the driver stable and upstream. This is done, now it's time for optimization :smile:
Marc, we use CAN-FD and the CPU is currently pinned at 600 MHz, no turbo. The tests described above were done with Raspbian on a CM3+. We are now working on a separate set of tests with Alpine Linux and will report the results when they are done. And thank you both for figuring out that patch, we will include it in our latest test.
Here's an updated version of the patch:
https://github.com/marckleinebudde/linux/commits/mcp251xfd-uinc
FYI: this patch went mainline with https://github.com/marckleinebudde/linux/commit/68c0c1c7f9668e7a7f2e18dbf951cfee57af1c0e. The same idea was ported to the RX path in https://github.com/marckleinebudde/linux/commit/1f652bb6bae7f211f3131ddbc380bb839680068f, and in https://github.com/marckleinebudde/linux/commit/eb94b74ccda607f3c0e441d793ff9f90fc3b09ea the UINC handling was improved.
Hello, thank you for your effort on improving this CAN driver! I have a question about the CPU load that I observe when testing under load. With just over 3000 inbound frames per second, the irq process consumes over 47% CPU and the spi process consumes over 24% CPU on a CM3+.
In my project we plan to disable dynamic frequency scaling on the CM to get more predictable behavior, and when we do that, these two processes consume more than a full core just receiving this data stream.
Is there a way to optimize this or is there a fundamental reason why this cannot be optimized?
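For a rough sense of scale, the figures quoted above can be turned into a per-frame CPU cost. The sketch below is a back-of-the-envelope calculation only: the helper names are made up, the inputs are the "over 47%", "over 24%" and "just over 3000 frames per second" readings taken as exact values, and it assumes the load scales linearly with frame rate, which may not hold in practice.

```c
/* Hypothetical helper: CPU time spent per received frame, in
 * microseconds, given the fraction of one core in use (0.0 to 1.0)
 * and the frame rate in frames per second. */
double demo_cpu_us_per_frame(double cpu_fraction, double frames_per_sec)
{
    return cpu_fraction / frames_per_sec * 1e6;
}

/* Hypothetical helper: frame rate at which a single core would be
 * fully used, ASSUMING the per-frame cost stays constant. */
double demo_frames_per_core(double us_per_frame)
{
    return 1e6 / us_per_frame;
}
```

Calling `demo_cpu_us_per_frame(0.47 + 0.24, 3000.0)` gives roughly 237 microseconds of CPU time per frame for the irq and spi processes combined, at whatever clock the CPU was running during that measurement. Feeding that into `demo_frames_per_core()` suggests a ceiling of a little over 4200 frames per second on one core under the same conditions.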