siderolabs / talos

Talos Linux is a modern Linux distribution built for Kubernetes.
https://www.talos.dev
Mozilla Public License 2.0

add SocketCAN kernel module #7686

Closed erickaby closed 4 months ago

erickaby commented 1 year ago

Feature Request

I have a use case that requires CAN bus. From my limited investigation, I believe a kernel module needs to be added for the CAN protocol to be available.

Description

Simply put, I need the configuration below to work. I had no issue with this setup running on a k3s cluster on some Raspberry Pi 4s, and I want to replicate it with Talos.

# /etc/network/interfaces.d/can0
allow-hotplug can0
iface can0 can static
    bitrate 1000000
    up ifconfig $IFACE txqueuelen 128
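For reference, a rough sketch of what the stanza above does, expressed as iproute2 commands. `can_setup_cmds` is a hypothetical helper, not part of Talos or ifupdown; it only prints the commands so they can be reviewed first. Running them on a Talos node (for example from a privileged pod with hostNetwork: true) is an assumption that has not been verified in this thread.

```shell
# can_setup_cmds IFACE BITRATE TXQUEUELEN
# Hypothetical helper: prints (does not run) iproute2 commands that roughly
# correspond to the ifupdown stanza for a CAN interface.
can_setup_cmds() {
    iface=$1
    bitrate=$2
    txqueuelen=$3
    # The bitrate is set while the interface is still down.
    echo "ip link set $iface type can bitrate $bitrate"
    # Same effect as "ifconfig $IFACE txqueuelen N" in the stanza.
    echo "ip link set $iface txqueuelen $txqueuelen"
    echo "ip link set $iface up"
}
```

For the values in this issue, that would be `can_setup_cmds can0 1000000 128`.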

I read https://www.talos.dev/v1.5/advanced/proprietary-kernel-modules/ to try adding it myself. However, due to my lack of understanding of the subject, and possibly outdated information on that page, I was unable to see this through.

frezbo commented 1 year ago

This is the kernel config used by Talos: https://github.com/siderolabs/pkgs/blob/main/kernel/build/config-amd64

the modules can be shipped as a Talos System Extension

It might also need some changes on the Talos side to support interface options like bitrate

erickaby commented 1 year ago

Thanks I'll keep digging into the codebase and attempt building it again

erickaby commented 1 year ago

@frezbo So, I gave it another shot. However, I am still facing issues: I cannot see the can0 interface under ls /sys/class/net from a pod with hostNetwork: true. These are the steps I am taking, and I would appreciate any help you can give me to solve this.

I am mostly following this guide.

  1. Cloned siderolabs/pkgs.git and ran make kernel-menuconfig USERNAME=erickaby

  2. Modified the CAN bus options

    [*] Networking support --->
        <M> CAN bus subsystem support ---> 
        <M> Raw CAN Protocol (raw access with CAN-ID filtering)
        <M> CAN Device Drivers --->
  3. Resulting in changes in /kernel/build/config-arm64 like you linked above (this is for the arm platform for RPI)

  4. Built and pushed the kernel to my registry, make kernel USERNAME=erickaby PLATFORM=linux/arm64 PUSH=true

  5. Created and customised the multi-stage Dockerfile.

    Dockerfile

    ```Dockerfile
    FROM scratch AS customization
    # this is needed so that Talos copies base kernel modules info and default modules shipped with Talos
    COPY --from=ghcr.io/erickaby/kernel:v1.5.0-dirty /lib/modules /kernel/lib/modules
    # this copies over the custom modules
    COPY --from=ghcr.io/erickaby/kernel:v1.5.0-dirty /lib/modules /lib/modules

    FROM ghcr.io/siderolabs/installer:v1.5.1
    COPY --from=ghcr.io/erickaby/kernel:v1.5.0-dirty /boot/vmlinuz /usr/install/${TARGETARCH}/vmlinuz
    ```
  6. Built and pushed the above Docker image to my registry

    DOCKER_BUILDKIT=0 docker build --build-arg RM="/lib/modules" -t ghcr.io/erickaby/talos/installer:v1.5.1-canbus.1 .

So from here I assume that the kernel would have CAN modules. What is the best way to verify at this point?

I then went on to create a talos extension to enable these modules.

  1. Cloned siderolabs/talos-extensions.git and created a new module under /network/canbus.

  2. I copied the setup from a similar extension (nvidia) and configured the files to match what I am doing

    pkg.yaml

    ```yaml
    name: canbus
    variant: scratch
    shell: /toolchain/bin/bash
    dependencies:
      - stage: base
      # The pkgs version for a particular release of Talos as defined in
      # https://github.com/siderolabs/talos/blob//pkg/machinery/gendata/data/pkgs
      # - image: "{{ .LOCAL_REPO_PREFIX }}/canbus:{{ .BUILD_ARG_PKGS }}"
    steps:
      - prepare:
          - |
            sed -i 's#$VERSION#{{ .VERSION }}#' /pkg/manifest.yaml
      - install:
          - |
            mkdir -p /rootfs/lib/modules \
              /rootfs/usr/local/lib/modprobe.d

            cp /pkg/files/canbus.conf /rootfs/usr/local/lib/modprobe.d/canbus.conf
            # cp -R /lib/modules/* /rootfs/lib/modules
    finalize:
      - from: /rootfs
        to: /rootfs
      - from: /pkg/manifest.yaml
        to: /
    ```
  3. I then built and pushed it to my registry with make canbus PLATFORM=linux/arm64 TAG=v1.0.0 PUSH=true

I am not sure if this approach is correct, I am just trying things at this point.

To apply to the actual node I tried a few things in the config file:

install:
    disk: /dev/mmcblk0
    image: ghcr.io/erickaby/talos/installer:v1.5.1-canbus.1
    wipe: false
    extensions:
        # - image: ghcr.io/siderolabs/qemu-guest-agent:8.0.2
        - image: ghcr.io/ogkevin/rpi-boot-config-loader:v1.0.0
        - image: ghcr.io/erickaby/canbus:v1.0.0
kernel:
    modules:
    - name: can
    - name: can_raw
    # - name: mttcan

This is the first time I have customised a kernel, and from the few guides [1][2] out there on enabling CAN bus on Linux it looks pretty straightforward.

Any help to guide me in the right direction or tools to help debug would be much appreciated.

frezbo commented 1 year ago

Modified the CAN bus options

did you enable as a module (=m) or built in (=y)?

erickaby commented 1 year ago

I actually tried both options with no success. I have quite a few options enabled for CAN_USB. I was thinking of stripping it down to just CONFIG_CAN=m and CONFIG_CAN_DEV=m to rule out any issues the others could cause, but I haven't tried that yet. Should I be building it as a module, or into the kernel?

frezbo commented 1 year ago

For testing set it as =y, it makes things easier. you can check if the module is in the final kernel by running talosctl read /lib/modules/<kernel-version>-talos/modules.dep (only when built with =y)

this means no need to build the extension (this is probably easier to test).

If you could send over the Kconfig changes, I can build a kernel and send a custom installer image to test

erickaby commented 1 year ago

Thanks for the info, it has helped me confirm and understand more.

From more digging, I've found exactly what I need[1][2] enabled, along with its dependencies.

CONFIG_CAN=y
CONFIG_CAN_DEV=y
CONFIG_CAN_NETLINK=y
CONFIG_CAN_GS_USB=y # candlelight, used for the BTT u2c
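A minimal sketch for double-checking that these four options actually ended up in the kernel config before building. `check_can_config` is a hypothetical helper written for this thread, and the path in the usage note assumes the pkgs checkout layout mentioned earlier.

```shell
# check_can_config FILE
# Hypothetical helper: reports whether each CAN option listed in this thread
# is enabled (=y or =m) in a kernel config file.
# Returns non-zero if any option is missing.
check_can_config() {
    config_file=$1
    missing=0
    for opt in CONFIG_CAN CONFIG_CAN_DEV CONFIG_CAN_NETLINK CONFIG_CAN_GS_USB; do
        # "# CONFIG_X is not set" lines do not match this pattern.
        if grep -Eq "^${opt}=[ym]\$" "$config_file"; then
            echo "$opt enabled"
        else
            echo "$opt MISSING"
            missing=1
        fi
    done
    return "$missing"
}
```

Example: `check_can_config kernel/build/config-arm64`. Real config files have no trailing comments after the value, so the inline `# candlelight` note above would need to be dropped for the line to match.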

After building the kernel into the installer and upgrading the node, I ran the following command as you suggested. Unfortunately, I cannot see the entries I was expecting, which I assume would be files like can.ko.

talosctl -n 10.5.2.1 read /lib/modules/6.1.46-talos/modules.dep

So, a few questions come to mind.

frezbo commented 1 year ago

When using the installer with the custom kernel changes, will a talosctl upgrade be enough to get the changes or will it need a fresh install?

yes, an upgrade would be enough

This is running on a Raspberry Pi. Is there anything specific to the RPI that is missing?

hmm, I'm not so sure about that, might need some extra settings in config.txt???

Since I am building them into the binary with =y, is there any need for any config changes in the node configuration? e.g. the below kernel config and extensions can all be commented out.

Since these are builtin, no need of extensions or kernel modules to be explicitly enabled

frezbo commented 1 year ago

Could you try this installer image: ghcr.io/frezbo/installer:v1.6.0-alpha.0-23-gda73b563d-dirty

erickaby commented 1 year ago

Thanks, I have installed with your image above with no success.

This is the file I am expecting to see under talosctl -n 10.5.2.1 read /lib/modules/6.1.46-talos/modules.dep

kernel/drivers/net/can/usb/gs_usb.c https://github.com/torvalds/linux/blob/master/drivers/net/can/usb/gs_usb.c

I have attempted this on both arm64 (rpi4) and amd64 (server vm) platforms just in case.

frezbo commented 1 year ago

might need to enable something extra, this is the kernel built with:

CONFIG_CAN=y
CONFIG_CAN_DEV=y
CONFIG_CAN_NETLINK=y
CONFIG_CAN_GS_USB=y

frezbo commented 1 year ago

also it seems there's a lot more CAN options:

diff --git a/kernel/build/config-amd64 b/kernel/build/config-amd64
index 750f564..02f2d67 100644
--- a/kernel/build/config-amd64
+++ b/kernel/build/config-amd64
@@ -1655,6 +1655,7 @@ CONFIG_NET_EMATCH_NBYTE=y
 CONFIG_NET_EMATCH_U32=y
 CONFIG_NET_EMATCH_META=y
 CONFIG_NET_EMATCH_TEXT=y
+# CONFIG_NET_EMATCH_CANID is not set
 CONFIG_NET_EMATCH_IPSET=y
 # CONFIG_NET_EMATCH_IPT is not set
 CONFIG_NET_CLS_ACT=y
@@ -1730,7 +1731,12 @@ CONFIG_NET_FLOW_LIMIT=y
 # end of Networking options

 # CONFIG_HAMRADIO is not set
-# CONFIG_CAN is not set
+CONFIG_CAN=y
+CONFIG_CAN_RAW=y
+CONFIG_CAN_BCM=y
+CONFIG_CAN_GW=y
+# CONFIG_CAN_J1939 is not set
+# CONFIG_CAN_ISOTP is not set
 # CONFIG_BT is not set
 # CONFIG_AF_RXRPC is not set
 # CONFIG_AF_KCM is not set
@@ -2702,6 +2708,38 @@ CONFIG_SMSC_PHY=m
 # CONFIG_VITESSE_PHY is not set
 # CONFIG_XILINX_GMII2RGMII is not set
 # CONFIG_PSE_CONTROLLER is not set
+CONFIG_CAN_DEV=y
+# CONFIG_CAN_VCAN is not set
+# CONFIG_CAN_VXCAN is not set
+CONFIG_CAN_NETLINK=y
+CONFIG_CAN_CALC_BITTIMING=y
+# CONFIG_CAN_CAN327 is not set
+# CONFIG_CAN_KVASER_PCIEFD is not set
+# CONFIG_CAN_SLCAN is not set
+# CONFIG_CAN_C_CAN is not set
+# CONFIG_CAN_CC770 is not set
+# CONFIG_CAN_CTUCANFD_PCI is not set
+# CONFIG_CAN_IFI_CANFD is not set
+# CONFIG_CAN_M_CAN is not set
+# CONFIG_CAN_PEAK_PCIEFD is not set
+# CONFIG_CAN_SJA1000 is not set
+# CONFIG_CAN_SOFTING is not set
+
+#
+# CAN USB interfaces
+#
+# CONFIG_CAN_8DEV_USB is not set
+# CONFIG_CAN_EMS_USB is not set
+# CONFIG_CAN_ESD_USB is not set
+# CONFIG_CAN_ETAS_ES58X is not set
+CONFIG_CAN_GS_USB=y
+# CONFIG_CAN_KVASER_USB is not set
+# CONFIG_CAN_MCBA_USB is not set
+# CONFIG_CAN_PEAK_USB is not set
+# CONFIG_CAN_UCAN is not set
+# end of CAN USB interfaces
+
+# CONFIG_CAN_DEBUG_DEVICES is not set
 CONFIG_MDIO_DEVICE=y
 CONFIG_MDIO_BUS=y
 CONFIG_FWNODE_MDIO=y

erickaby commented 1 year ago

Hmm, I also noticed the same thing today but didn't get to mention it. I followed the menuconfig on this page to enable gs_usb, and noticed it set a few more CONFIG variables to =y than I mentioned above. I'm afk atm so can't get the exact ones, but I did build and install it with an upgrade and still hit the same issue.

frezbo commented 1 year ago

Thanks, I have installed with your image above with no success.

This is the file I am expecting to see under talosctl -n 10.5.2.1 read /lib/modules/6.1.46-talos/modules.dep

kernel/drivers/net/can/usb/gs_usb.c https://github.com/torvalds/linux/blob/master/drivers/net/can/usb/gs_usb.c

I have attempted this on both arm64 (rpi4) and amd64 (server vm) platforms just in case.

The installer I sent earlier was for amd64, also did you upgrade via talosctl upgrade --image=ghcr.io/frezbo/installer:v1.6.0-alpha.0-23-gda73b563d-dirty?

erickaby commented 1 year ago

The installer I sent earlier was for amd64, also did you upgrade via talosctl upgrade --image=ghcr.io/frezbo/installer:v1.6.0-alpha.0-23-gda73b563d-dirty?

I ran talosctl -n 10.5.1.10 upgrade --image=ghcr.io/frezbo/installer:v1.6.0-alpha.0-23-gda73b563d-dirty successfully on my amd64 node.

Then talosctl -n 10.5.1.10 read /lib/modules/6.1.51-talos/modules.dep

The list matches the output from a node with the standard installer.

I'm a bit lost now as to why the file kernel/drivers/net/can/usb/gs_usb.c is not appearing in the list. I appreciate your help with this; from what I am doing, it makes sense to me that I should expect it to be there.

frezbo commented 1 year ago

maybe try this: ghcr.io/frezbo/installer:v1.6.0-alpha.0-38-gc5bd0ac5c

erickaby commented 1 year ago

maybe try this: ghcr.io/frezbo/installer:v1.6.0-alpha.0-38-gc5bd0ac5c

Okay so... using your above image on amd64 vm.

The file looks to be listed in modules.builtin, whereas I have been looking at modules.dep.

talosctl -n 10.5.1.10 read /lib/modules/6.1.51-talos/modules.builtin | grep can
kernel/drivers/net/can/dev/can-dev.ko
kernel/drivers/net/can/usb/usb_8dev.ko
kernel/drivers/net/can/usb/ems_usb.ko
kernel/drivers/net/can/usb/esd_usb.ko
kernel/drivers/net/can/usb/etas_es58x/etas_es58x.ko
kernel/drivers/net/can/usb/gs_usb.ko
kernel/drivers/net/can/usb/kvaser_usb/kvaser_usb.ko
kernel/drivers/net/can/usb/mcba_usb.ko
kernel/drivers/net/can/usb/peak_usb/peak_usb.ko
kernel/drivers/net/can/usb/ucan.ko
kernel/net/can/can.ko
kernel/net/can/can-raw.ko
kernel/net/can/can-bcm.ko
kernel/net/can/can-gw.ko
kernel/net/can/j1939/can-j1939.ko
kernel/net/can/can-isotp.ko

With the above knowledge, does this mean the module has successfully been added?

The next part to this is having can0 available under /sys/class/net

talosctl -n 10.5.1.10 ls /sys/class/net           
NODE        NAME
10.5.1.10   .
10.5.1.10   bond0
10.5.1.10   bonding_masters
10.5.1.10   cilium_host
10.5.1.10   cilium_net
10.5.1.10   cilium_vxlan
10.5.1.10   dummy0
10.5.1.10   eth0
10.5.1.10   ip6tnl0
10.5.1.10   lo
10.5.1.10   lxc4234b6a6cfd0
10.5.1.10   lxc69df7465ba62
10.5.1.10   lxc937f96052f51
10.5.1.10   lxc_health
10.5.1.10   sit0
10.5.1.10   teql0
10.5.1.10   tunl0
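A small sketch for checking a listing like the one above for a given interface without reading it by eye. `has_iface` is a hypothetical helper; it assumes the two-column NODE/NAME output format shown here.

```shell
# has_iface NAME
# Hypothetical helper: reads `talosctl ls /sys/class/net` output on stdin and
# succeeds (exit 0) only if NAME appears in the NAME column.
has_iface() {
    # Field 2 is the NAME column; NR > 1 skips the header row.
    awk -v want="$1" 'NR > 1 && $2 == want { found = 1 } END { exit !found }'
}
```

Example: `talosctl -n 10.5.1.10 ls /sys/class/net | has_iface can0 || echo "can0 not present"`.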

I have also tried following similar steps and checking modules.builtin on arm64 (RPi), but that file is not included there. Any ideas? What did you do differently in your image above?

frezbo commented 1 year ago

With the above knowledge, does this mean the module has successfully been added?

yes, when you compile the module statically it won't be in modules.dep
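To make that distinction concrete, here is a minimal sketch that classifies a module from local copies of the two index files. `module_state` is a hypothetical helper; it assumes both files were saved into one directory first, for example with talosctl read /lib/modules/<kernel-version>-talos/modules.builtin > modules.builtin and the same for modules.dep.

```shell
# module_state DIR NAME
# Hypothetical helper: prints "builtin" if NAME.ko is listed in
# DIR/modules.builtin (compiled with =y), "loadable" if it has its own entry
# in DIR/modules.dep (compiled with =m), and "absent" otherwise.
# NAME is the file name without ".ko", e.g. gs_usb or can-raw.
module_state() {
    dir=$1
    name=$2
    # modules.builtin has one path per line, e.g. kernel/net/can/can.ko
    if grep -Eq "(^|/)${name}\.ko\$" "$dir/modules.builtin" 2>/dev/null; then
        echo builtin
    # modules.dep entries look like "path/name.ko: deps..."; the [^:]* part
    # also allows a compression suffix such as .ko.xz before the colon.
    elif grep -Eq "(^|/)${name}\.ko[^:]*:" "$dir/modules.dep" 2>/dev/null; then
        echo loadable
    else
        echo absent
    fi
}
```

With the modules.builtin output shown earlier, `module_state . gs_usb` would print `builtin`, which is consistent with the module not showing up in modules.dep.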

The next part to this is having can0 available under /sys/class/net

might need some udev rule or explicitly enabling them maybe?

does talosctl get links show it?

I have also tried to follow similar steps as above and check modules.builtin on arm64 (rpi) however that file is not included there, any ideas? what did you do differently in your image above?

I just built an installer with a kernel that has the CAN modules enabled. It's amd64 only, but I can build an arm64 installer if you need to check

erickaby commented 1 year ago

does talosctl get links show it?

It did not. However, if I can get the arm64 version working, I have a few ideas I want to try out.

I just built an installer with a kernel that has the CAN modules enabled. It's amd64 only, but I can build an arm64 installer if you need to check

If you can build the arm64 one that will be great, I can then verify if the way I am building the installer with the kernel is correct.

frezbo commented 1 year ago

An arm64 installer: ghcr.io/frezbo/installer:v1.6.0-alpha.0-44-gb580d27da@sha256:9f009b055b8fae619dff0e542cd3132baefd59f9068ed6501833bdab315103aa

github-actions[bot] commented 4 months ago

This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 7 days.