roboticslab-uc3m / teo-main

TEO full-sized humanoid robot: super/meta repository.
http://roboticslab.uc3m.es/roboticslab/robot/teo-humanoid
GNU Lesser General Public License v2.1

CAN bus setup on the new Jetson board (future head PC) #52

Closed PeterBowman closed 2 years ago

PeterBowman commented 3 years ago

Our new head PC will feature an NVIDIA Jetson AGX Xavier mounted on a Rogue carrier board. Note: this is neither the Xavier 8GB nor the RogueX board. For future reference, instructions for installation and flashing can be found here. Before attempting an apt upgrade, please also read this.

This board provides two CAN buses with built-in support via kernel modules. The right module seems to be "mttcan", although it is blacklisted by default (delete /etc/modprobe.d/blacklist-mttcan.conf to load it on boot). It is also straightforward to configure a CAN interface on boot via systemd, see https://github.com/roboticslab-uc3m/yarp-devices/issues/251#issuecomment-919420954.

At the time of writing, I am able to read messages (at a 1 Mbps bitrate), but not to send them.
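The read-works/send-fails symptom can also be probed without can-utils, using a raw SocketCAN socket from Python's standard library. This is only a minimal sketch, assuming a Linux host and an interface (e.g. can0) that is already up at 1 Mbps; the function names are hypothetical helpers, not part of any project in this thread.

```python
import socket
import struct

# Layout of struct can_frame: 32-bit CAN ID, 8-bit DLC, 3 padding bytes, 8 data bytes.
CAN_FRAME_FMT = "=IB3x8s"
CAN_FRAME_SIZE = struct.calcsize(CAN_FRAME_FMT)


def build_frame(can_id, data):
    """Pack a classic CAN frame (at most 8 data bytes)."""
    if len(data) > 8:
        raise ValueError("classic CAN frames carry at most 8 data bytes")
    return struct.pack(CAN_FRAME_FMT, can_id, len(data), data.ljust(8, b"\x00"))


def parse_frame(frame):
    """Unpack a frame into (can_id, payload); can_id may still carry flag bits."""
    can_id, dlc, data = struct.unpack(CAN_FRAME_FMT, frame)
    return can_id, data[:dlc]


def send_and_listen(ifname, can_id, data, timeout=1.0):
    """Send one frame on ifname, then wait up to `timeout` seconds for any frame."""
    with socket.socket(socket.AF_CAN, socket.SOCK_RAW, socket.CAN_RAW) as s:
        s.bind((ifname,))
        s.settimeout(timeout)
        s.send(build_frame(can_id, data))
        try:
            return parse_frame(s.recv(CAN_FRAME_SIZE))
        except socket.timeout:
            return None
```

A call such as send_and_listen("can0", 0x123, b"\x01\x02") uses an arbitrary placeholder ID and payload. Note that a successful send() only means the kernel queued the frame, not that another node acknowledged it on the bus.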

PS: the can0 connector is the one farthest from the passive heat sink. can1 is not configured at all.

PeterBowman commented 3 years ago

I presume the kernel must be properly configured to handle 1 Mbps: forum post, instructions. It might also be necessary to adjust the internal clock frequency. Note that mttcan is started at 50 MHz.

The Jetson needs to be flashed in order to apply these changes.
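To see why the clock frequency matters for 1 Mbps, recall that the bitrate equals the CAN core clock divided by (prescaler × time quanta per bit). The sketch below lists the integer combinations that hit a bitrate exactly. The 8 to 25 time-quanta window and the prescaler limit are assumptions chosen for illustration, not the actual limits of the mttcan controller.

```python
def exact_timings(clock_hz, bitrate, min_tq=8, max_tq=25, max_brp=64):
    """List (prescaler, time quanta per bit) pairs that hit `bitrate` exactly."""
    results = []
    for brp in range(1, max_brp + 1):
        tq_rate, rem = divmod(clock_hz, brp)
        if rem:
            continue  # this prescaler does not divide the clock evenly
        tq_per_bit, rem = divmod(tq_rate, bitrate)
        if rem == 0 and min_tq <= tq_per_bit <= max_tq:
            results.append((brp, tq_per_bit))
    return results


# 50 MHz core clock, 1 Mbps target (the values mentioned in this thread).
print(exact_timings(50_000_000, 1_000_000))
# For comparison: a 38.4 MHz source has no exact integer solution at 1 Mbps.
print(exact_timings(38_400_000, 1_000_000))
```

Under these assumptions a 50 MHz clock can produce 1 Mbps exactly, whereas a 38.4 MHz source cannot, which is presumably why the clock source needs checking.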

PeterBowman commented 2 years ago

Apparently, according to this guide, everything is already set up. I checked the .dtb files (converted to .dts) and PLLAON already seems to be configured. The kernel DTB is located at kernel/dtb/tegra194-agx-cti-AGX101.dtb (per cti/xavier/rogue/base.conf). The BPMPFW DTB should be one of bootloader/t186ref/tegra194-agx-cti-a0*-bpmp-p2888-a0*.dtb, all of which use PLLAON by default as well (certain sources point at tegra194-agx-cti-a02-bpmp-p2888-a04.dtb).

In bootloader/t186ref/tegra194-agx-cti-a02-bpmp-p2888-a04.dtb:

clock@can1 {
    allow_fractional_divider = <0x1>;
    allowed-parents = <0x121 0x5b 0x13a 0x5e>;
    clk-id = <0x9>;
};

clock@can2 {
    allow_fractional_divider = <0x1>;
    allowed-parents = <0x121 0x5b 0x13a 0x5e>;
    clk-id = <0xb>;
};

In kernel/dtb/tegra194-agx-cti-AGX101.dtb:

clocks-init {
    compatible = "nvidia,clocks-config";
    status = "okay";

    disable {
        clocks = <0x4 0x9 0x4 0xb>;
    };
};

mttcan@c310000 {
    compatible = "nvidia,tegra194-mttcan";
    reg = <0x0 0xc310000 0x0 0x400 0x0 0xc311000 0x0 0x32 0x0 0xc312000 0x0 0x1000>;
    reg-names = "can-regs", "glue-regs", "msg-ram";
    interrupts = <0x0 0x28 0x4>;
    pll_source = "pllaon";
    clocks = <0x4 0x11c 0x4 0xa 0x4 0x9 0x4 0x5b 0x4 0x5e>;
    clock-names = "can_core", "can_host", "can", "osc", "pllaon";
    resets = <0x5 0x4>;
    reset-names = "can";
    mram-params = <0x0 0x10 0x10 0x20 0x0 0x0 0x10 0x10 0x10>;
    tx-config = <0x0 0x10 0x0 0x40>;
    rx-config = <0x40 0x40 0x40>;
    status = "okay";
    linux,phandle = <0xe4>;
    phandle = <0xe4>;
};

mttcan@c320000 {
    compatible = "nvidia,tegra194-mttcan";
    reg = <0x0 0xc320000 0x0 0x400 0x0 0xc321000 0x0 0x32 0x0 0xc322000 0x0 0x1000>;
    reg-names = "can-regs", "glue-regs", "msg-ram";
    interrupts = <0x0 0x2a 0x4>;
    pll_source = "pllaon";
    clocks = <0x4 0x11d 0x4 0xc 0x4 0xb 0x4 0x5b 0x4 0x5e>;
    clock-names = "can_core", "can_host", "can", "osc", "pllaon";
    resets = <0x5 0x5>;
    reset-names = "can";
    mram-params = <0x0 0x10 0x10 0x20 0x0 0x0 0x10 0x10 0x10>;
    tx-config = <0x0 0x10 0x0 0x40>;
    rx-config = <0x40 0x40 0x40>;
    status = "okay";
    linux,phandle = <0xe5>;
    phandle = <0xe5>;
};

There is another kernel .dtb file with the same name in the bootloader/ directory; it also has PLLAON configured.
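The manual check of the decompiled device trees can be scripted. The following is a rough sketch, assuming the .dtb has already been converted to text (for instance with dtc -I dtb -O dts) and that the mttcan nodes have no child nodes, as in the excerpts above. The helper name mttcan_summary is made up for this example, and the sample text is abbreviated from the nodes quoted in this comment.

```python
import re

NODE_RE = re.compile(r"^\s*(mttcan@[0-9a-fA-F]+)\s*\{")
PROP_RE = re.compile(r'^\s*(pll_source|status)\s*=\s*"([^"]*)"\s*;')


def mttcan_summary(dts_text):
    """Map each mttcan node to its pll_source and status string properties."""
    nodes = {}
    current = None
    for line in dts_text.splitlines():
        node = NODE_RE.match(line)
        if node:
            current = node.group(1)
            nodes[current] = {}
            continue
        if current is None:
            continue
        if line.strip() == "};":
            current = None  # end of the node (assumes no nested child nodes)
            continue
        prop = PROP_RE.match(line)
        if prop:
            nodes[current][prop.group(1)] = prop.group(2)
    return nodes


SAMPLE_DTS = """
mttcan@c310000 {
    compatible = "nvidia,tegra194-mttcan";
    pll_source = "pllaon";
    status = "okay";
};

mttcan@c320000 {
    compatible = "nvidia,tegra194-mttcan";
    pll_source = "pllaon";
    status = "okay";
};
"""

for name, props in mttcan_summary(SAMPLE_DTS).items():
    print(name, props)
```

Running the same function over both copies of the kernel .dts would show at a glance whether they agree on pll_source and status.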

PeterBowman commented 2 years ago

Perhaps the pinout/wiring is not correct?

PeterBowman commented 2 years ago

These are the contents of /sys/kernel/debug/bpmp/debug/clk/clk_tree: clk_tree.txt.

CAN clocks can be found inside the osc_div section:

clock                                             on       rate bpmp  mrq vdd
-----------------------------------------------------------------------------
osc_div                                            1   38400000    9    0
    pll_aon                                        1  400000000    5    1
        can1                                       1  200000000    3    1 vdd_aon@800000
            can1_host                              1  200000000    1    1 vdd_aon@800000
            can1_core                              1   50000000    1    1 vdd_aon@800000
        can2                                       1  200000000    3    1 vdd_aon@800000
            can2_host                              1  200000000    1    1 vdd_aon@800000
            can2_core                              1   50000000    1    1 vdd_aon@800000
        aon_apb                                    1  200000000    1    0
        aon_cpu_nic                                1  200000000    1    0 vdd_aon@800000
            aon_nic                                1  200000000    1    0

So it looks like everything is correct? Here is all CAN- and PLLAON-related stuff in /sys/kernel/debug/bpmp/debug/clk/: clk.zip.
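The relevant numbers can be pulled out of clk_tree programmatically. This sketch assumes the column layout shown in the excerpt above (name, on, rate, then further columns) and that nesting is expressed purely through leading spaces. The parser is a hypothetical helper, and the sample is a subset of the listing from this comment.

```python
def parse_clk_tree(text):
    """Return {clock_name: {"on": bool, "rate": int, "parent": str or None}}."""
    clocks = {}
    stack = []  # (indent, name) of the ancestors of the current line
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, the dashed rule and blank lines.
        if len(fields) < 3 or not fields[1].isdigit() or not fields[2].isdigit():
            continue
        indent = len(line) - len(line.lstrip())
        while stack and stack[-1][0] >= indent:
            stack.pop()
        parent = stack[-1][1] if stack else None
        clocks[fields[0]] = {
            "on": fields[1] == "1",
            "rate": int(fields[2]),
            "parent": parent,
        }
        stack.append((indent, fields[0]))
    return clocks


SAMPLE = """\
clock                                             on       rate bpmp  mrq vdd
-----------------------------------------------------------------------------
osc_div                                            1   38400000    9    0
    pll_aon                                        1  400000000    5    1
        can1                                       1  200000000    3    1 vdd_aon@800000
            can1_host                              1  200000000    1    1 vdd_aon@800000
            can1_core                              1   50000000    1    1 vdd_aon@800000
        can2                                       1  200000000    3    1 vdd_aon@800000
            can2_core                              1   50000000    1    1 vdd_aon@800000
"""

clocks = parse_clk_tree(SAMPLE)
for name in ("can1", "can1_core", "can2", "can2_core"):
    print(name, clocks[name])
```

With this listing, both CAN clocks are enabled, hang off pll_aon, and their core clocks run at 50 MHz, which matches the manual reading.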

PeterBowman commented 2 years ago

These registers also have the expected values per https://github.com/hmxf/can_xavier:

sudo busybox devmem 0x0c303000
sudo busybox devmem 0x0c303008
sudo busybox devmem 0x0c303010
sudo busybox devmem 0x0c303018

PeterBowman commented 2 years ago

Okay, found it: the D-sub connections are wrong (cc @smcdiaz). I was misled because frames could actually be read from the bus, probably because CAN-H/L have additional connections on the robot's side of the D-sub. The Jetson side needs to be fixed, though.

PeterBowman commented 2 years ago

Sorry, now I got misled by cangen. It worked at first, but stopped responding after ~12 messages. The TX error counter is frozen as well. I am not sure about the connectors anymore.
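A frozen TX error counter can be interpreted through the CAN fault-confinement rules: a node is error-passive once either counter exceeds 127 and bus-off once the transmit counter exceeds 255, with 96 being the usual warning level. The sketch below assumes that ip -details link show can0 prints a "berr-counter tx N rx M" fragment, as iproute2 does for drivers that report the counters. The sample line is invented for illustration and was not captured on the Jetson.

```python
import re

BERR_RE = re.compile(r"berr-counter tx (\d+) rx (\d+)")


def parse_berr_counter(ip_output):
    """Return (tec, rec) from `ip -details link show` text, or None if absent."""
    match = BERR_RE.search(ip_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def can_error_state(tec, rec):
    """Classify the fault-confinement state from the two error counters."""
    if tec > 255:
        return "bus-off"
    if tec > 127 or rec > 127:
        return "error-passive"
    if tec >= 96 or rec >= 96:
        return "error-warning"
    return "error-active"


# Hypothetical sample line, not real output from this board.
SAMPLE_LINE = "can state ERROR-PASSIVE (berr-counter tx 128 rx 0) restart-ms 0"

tec, rec = parse_berr_counter(SAMPLE_LINE)
print(tec, rec, can_error_state(tec, rec))
```

A transmit counter that sits at 128 would be consistent with frames going out but never being acknowledged by another node, though that is only one possible explanation.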

PeterBowman commented 2 years ago

Okay, after a system reboot the error counter was reset and I could run a cangen that kept sending messages without stalling (according to candump). Then I launched the right arm controller, which failed while initializing the iPOS drives. With some luck, it managed to keep ID15 up and running, though, which allowed me to query the joint acceleration.

To sum up, application-level communication has been achieved. I suspect the CAN wire is too long, or the connections are perhaps too weak. We'll test again in a few weeks on a new experimental platform.