OpenAMP / openamp-system-reference

End-to-end system reference material showcasing all the different aspects of OpenAMP, on multiple vendor platforms.

rpmsg-multi-service zephyr firmware hangs with petalinux build flow #52

Closed olneumann closed 1 month ago

olneumann commented 2 months ago

Hi all,

I was trying to get rpmsg-multi-service running with the PetaLinux build flow on the Kria KR260 Evaluation Board (which should be very similar to the KV260). In the end I get no RX response, nor the messages announcing that the virtio channels were created, which should look like the following dump. I suspect wrong IPM or IPI settings, which I am not very familiar with.

[   54.495343] virtio_rpmsg_bus virtio0: rpmsg host is online
[   54.500044] virtio_rpmsg_bus virtio0: creating channel rpmsg-client-sample addr 0x400
[   54.507923] virtio_rpmsg_bus virtio0: creating channel rpmsg-tty addr 0x401
[   54.514795] virtio_rpmsg_bus virtio0: creating channel rpmsg-raw addr 0x402

Steps to Reproduce

PetaLinux Setup

1) I use the PetaLinux 2024.1 tools to create a new project, add the Kria KR260 Starter Kit BSP, and place the following system-user.dtsi in the plnx_ipc/project-spec/meta-user/recipes-bsp/device-tree folder:

system-user.dtsi

```
/include/ "system-conf.dtsi"
/ {
    reserved-memory {
        #address-cells = <0x02>; // number of cells representing a register address in a reg property
        #size-cells = <0x02>;    // number of cells representing a register size in a reg property
        ranges;

        /*
         * xlnx_r5_remoteproc.c
         *
         * - vdev0buffer has to be used https://github.com/Xilinx/linux-xlnx/blob/d18bb880a47087bb8e49b17780c8265978cecaa5/drivers/remoteproc/xlnx_r5_remoteproc.c#L466C1-L467C47
         * - vdev0ringx not specified in the driver
         */
        rproc_0_fw_image: memory@3ed00000 {
            no-map;
            reg = <0x0 0x3ed00000 0x0 0x40000>;
        };
        rpu0vdev0vring0: vdev0vring0@3ed40000 {
            no-map;
            reg = <0x0 0x3ed40000 0x0 0x4000>;
        };
        rpu0vdev0vring1: vdev0vring1@3ed44000 {
            no-map;
            reg = <0x0 0x3ed44000 0x0 0x4000>;
        };
        rpu0vdev0buffer: vdev0buffer@3ed48000 {
            no-map;
            compatible = "shared-dma-pool";
            reg = <0x0 0x3ed48000 0x0 0x100000>;
        };
        rproc_1_fw_image: memory@3ef00000 {
            no-map;
            reg = <0x0 0x3ef00000 0x0 0x40000>;
        };
        rpu1vdev0vring0: vdev0vring0@3ef40000 {
            no-map;
            reg = <0x0 0x3ef40000 0x0 0x4000>;
        };
        rpu1vdev0vring1: vdev0vring1@3ef44000 {
            no-map;
            reg = <0x0 0x3ef44000 0x0 0x4000>;
        };
        rpu1vdev0buffer: vdev0buffer@3ef48000 {
            no-map;
            compatible = "shared-dma-pool";
            reg = <0x0 0x3ef48000 0x0 0x100000>;
        };
    };

    /*
     * IPI mailbox node
     * see Documentation/devicetree/bindings/mailbox/xlnx,zynqmp-ipi-mailbox.yaml
     */
    zynqmp-mailbox {
        compatible = "xlnx,zynqmp-ipi-mailbox";
        interrupt-parent = <&gic>;
        interrupts = <0 29 4>; // TODO: check if this is correct
        xlnx,ipi-id = <7>;     // TODO: check if this is correct
        #address-cells = <1>;
        #size-cells = <1>;
        ranges;
        bootph-all;

        /*
         * Note: The software might reassign the interrupt channels and
         * message buffers except for the PMU interrupts.
         */
        rpu0_apu_mailbox: mailbox@ff990040 {
            reg = <0xff990040 0x20>, // switched local and remote based on zephyr dts
                  <0xff990060 0x20>,
                  <0xff990200 0x20>,
                  <0xff990220 0x20>;
            reg-names = "local_request_region",
                        "local_response_region",
                        "remote_request_region",
                        "remote_response_region";
            #mbox-cells = <1>;
            xlnx,ipi-id = <1>;
            bootph-all;
        };
        rpu1_apu_mailbox: mailbox@ff990080 {
            reg = <0xff990080 0x20>, // TRM, Table 13-3: channel 3
                  <0xff9900a0 0x20>,
                  <0xff990400 0x20>,
                  <0xff990420 0x20>;
            reg-names = "local_request_region",
                        "local_response_region",
                        "remote_request_region",
                        "remote_response_region";
            #mbox-cells = <1>;
            xlnx,ipi-id = <2>;
            bootph-all;
        };
    };

    /*
     * RPU remoteproc node
     * see Documentation/devicetree/bindings/remoteproc/xlnx,zynqmp-r5fss.yaml
     */
    zynqmp-rpu {
        compatible = "xlnx,zynqmp-r5fss";
        xlnx,cluster-mode = <0>; // split mode
        #address-cells = <2>;
        #size-cells = <2>;

        r5f-0 {
            compatible = "xlnx,zynqmp-r5f";
            power-domains = <&zynqmp_firmware 0x7>;
            memory-region = <&rproc_0_fw_image>, <&rpu0vdev0buffer>,
                            <&rpu0vdev0vring0>, <&rpu0vdev0vring1>;
            mboxes = <&rpu0_apu_mailbox 0>, <&rpu0_apu_mailbox 1>;
            mbox-names = "tx", "rx";
        };
        r5f-1 {
            compatible = "xlnx,zynqmp-r5f";
            power-domains = <&zynqmp_firmware 0x8>;
            memory-region = <&rproc_1_fw_image>, <&rpu1vdev0buffer>,
                            <&rpu1vdev0vring0>, <&rpu1vdev0vring1>;
            mboxes = <&rpu1_apu_mailbox 0>, <&rpu1_apu_mailbox 1>;
            mbox-names = "tx", "rx";
        };
    };
};
```

2) Patch the virtio_rpmsg_bus.c file of the xlnx_rebase_v6.6_LTS_2024.1_update branch, allowing xlnx_r5_remoteproc.c to be loaded as a kernel module instead of zynqmp_r5_remoteproc.c. See this patch.
3) Select packagegroup-petalinux-openamp.bb for the rootfs.
4) Build and flash.
5) Build and install the rpmsg-utils tools as root.
6) Before running the Zephyr firmware, check that the kernel modules rpmsg_client_sample.ko, rpmsg_tty.ko, rpmsg_char.ko, and rpmsg_ctrl.ko are inserted (insmod) and running.

Until here I do not get any errors in dmesg.
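As a quick sanity check, the modules from step 6 can be verified against /proc/modules (the file that backs lsmod). This is a hedged sketch, not part of the original report:

```shell
# Sketch: confirm the rpmsg modules from step 6 are loaded before starting
# the firmware. /proc/modules lists one loaded module per line, name first.
for m in rpmsg_client_sample rpmsg_tty rpmsg_char rpmsg_ctrl; do
    if grep -q "^$m " /proc/modules; then
        echo "$m: loaded"
    else
        echo "$m: missing"
    fi
done
```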

Zephyr Firmware Setup

1) Use the provided main_remote.c of the rpmsg-multi-services example in a dedicated Zephyr project, firmware_zynqmp_ipc (branch: backport-zephyr-3.6.0).
2) The custom board is simply a mirror of the KV260 board and uses the following board overlay for the shm settings:

kr260_r5.overlay

```
/ {
    chosen {
        zephyr,ipc = &rpu0_apu_mailbox;
        zephyr,ipc_shm = &rpu0_ipc_shm;
        zephyr,sram = &tcm;
        /delete-property/ zephyr,flash;
    };

    soc {
        tcm: memory@0 {
            compatible = "mmio-sram";
            reg = <0 0x40000>;
        };
    };

    reserved-memory {
        compatible = "reserved-memory";
        #address-cells = <1>;
        #size-cells = <1>;
        status = "okay";
        ranges;

        rpu0_ipc_shm: memory@3ed40000 {
            reg = <0x3ed40000 0x100000>;
        };
    };
};

&rpu0_ipi {
    status = "okay";
};

&rpu0_apu_mailbox {
    status = "okay";
};

/delete-node/ &uart0;
/delete-node/ &gem0;
/delete-node/ &gem1;
/delete-node/ &gem2;
/delete-node/ &gem3;
```

3) Load the .elf into remoteproc0 and start it. The terminal gives me this output:

Starting application threads!

OpenAMP[remote]  linux responder demo started

OpenAMP[remote] Linux sample client responder started

OpenAMP[remote] Linux tty responder started

OpenAMP[remote] Linux raw data responder started

OpenAMP[remote] create a endpoint with address and dest_address set to 0x1

4) For reference, here is the zynqmp_rpu.dtsi link, which is the baseline for the board overlay.
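Step 3 above is usually done through the standard remoteproc sysfs interface. The sketch below assumes a firmware file name and uses an overridable path so it can be dry-run:

```shell
# Sketch of loading/starting firmware via the remoteproc sysfs interface.
# On the target, RPROC_SYS is /sys/class/remoteproc/remoteproc0; a scratch
# directory is the default here so the commands can be dry-run anywhere.
# The firmware file name is an assumption.
RPROC_SYS="${RPROC_SYS:-$(mktemp -d)}"
cp -f rpmsg_multi_services.elf /lib/firmware/ 2>/dev/null || true
echo rpmsg_multi_services.elf > "$RPROC_SYS/firmware"
echo start > "$RPROC_SYS/state"
cat "$RPROC_SYS/state"   # on real hardware this reports "running"
```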

Further Thoughts

So the firmware loads properly and, based on the output, is able to create the endpoint. How can I proceed and debug this? In the dmesg output I see that when I, for example, create a tty endpoint and send something, I get output similar to this:

[  115.757439] rpmsg_tty virtio0.rpmsg-tty.257.-1: TX From 0x101, To 0x35, Len 40, Flags 0, Reserved 0
[  115.757497] rpmsg_virtio TX: 01 01 00 00 35 00 00 00 00 00 00 00 28 00 00 00  ....5.......(...
[  115.757514] rpmsg_virtio TX: 72 70 6d 73 67 2d 74 74 79 00 00 00 00 00 00 00  rpmsg-tty.......
[  115.757528] rpmsg_virtio TX: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[  115.757540] rpmsg_virtio TX: 01 01 00 00 00 00 00 00                          ........

but then I get nothing further indicating that the remoteproc does anything, i.e. nothing like the reference example:

[  115.757568] remoteproc remoteproc0: kicking vq index: 1
[  115.757590] stm32-ipcc 4c001000.mailbox: stm32_ipcc_send_data: chan:1
[  115.757850] stm32-ipcc 4c001000.mailbox: stm32_ipcc_tx_irq: chan:1 tx
[  115.757906] stm32-ipcc 4c001000.mailbox: stm32_ipcc_rx_irq: chan:0 rx
[  115.757969] remoteproc remoteproc0: vq index 0 is interrupted
[  115.757994] virtio_rpmsg_bus virtio0: From: 0x400, To: 0x101, Len: 6, Flags: 0, Reserved: 0
[  115.758022] rpmsg_virtio RX: 00 04 00 00 01 01 00 00 00 00 00 00 06 00 00 00  ................
[  115.758035] rpmsg_virtio RX: 62 6f 75 6e 64 00                                bound.
[  115.758077] virtio_rpmsg_bus virtio0: Received 1 messages

Any help here? Thank you.
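For reference, the tty test described above can be driven as sketched below. The /dev/ttyRPMSG0 node name is an assumption (the index depends on probe order), and a scratch file is used by default so the commands can be dry-run:

```shell
# Sketch: write to the rpmsg-tty channel. On the target set
# TTY=/dev/ttyRPMSG0 (node index depends on probe order); the default is a
# scratch file so the commands can be dry-run anywhere.
TTY="${TTY:-$(mktemp)}"
printf 'hello from linux\n' > "$TTY"
# On real hardware the multi-service firmware echoes the data back on read.
head -c 64 "$TTY"
```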

arnopo commented 2 months ago

Hello @olneumann Please provide Zephyr and Linux traces to better understand which message is sent or not.

tnmysh commented 2 months ago

@olneumann From the traces provided (virtio0.rpmsg-tty.257.-1) it looks like the devices are created on the Linux side, so the demo is working as expected.

Your concern is about the following logs:

[  115.757590] stm32-ipcc 4c001000.mailbox: stm32_ipcc_send_data: chan:1
[  115.757850] stm32-ipcc 4c001000.mailbox: stm32_ipcc_tx_irq: chan:1 tx
[  115.757906] stm32-ipcc 4c001000.mailbox: stm32_ipcc_rx_irq: chan:0 rx
[  115.757969] remoteproc remoteproc0: vq index 0 is interrupted

The above logs are from an STM32 platform. From your post I can tell you are using a Xilinx platform, so these logs are not expected on the Linux side: Xilinx platforms use different drivers.

For further confirmation, once the Zephyr firmware starts, please check the following path on the Linux side for rpmsg devices:

ls /sys/bus/rpmsg/devices/*

You should see rpmsg devices created.

tnmysh commented 2 months ago

Check that before running the Zephyr firmware the kernel modules insmod rpmsg_client_sample.ko rpmsg_tty.ko rpmsg_char.ko rpmsg_ctrl.ko are inserted and running.

Have you inserted, rpmsg_ns.ko and virtio_rpmsg_bus.ko drivers?

olneumann commented 2 months ago

Check that before running the Zephyr firmware the kernel modules insmod rpmsg_client_sample.ko rpmsg_tty.ko rpmsg_char.ko rpmsg_ctrl.ko are inserted and running.

Have you inserted, rpmsg_ns.ko and virtio_rpmsg_bus.ko drivers?

Yes, they are inserted as well. I also saw them under the rpmsg/devices folder: the rpmsg_ctrl and rpmsg_ns ones.

Sadly I don't have access to the setup right now, but I will provide the traces and more logs from next week onwards.

tnmysh commented 2 months ago

One more debug point. On the Zephyr side things mostly look good. Could you paste the Linux-side device-tree IPI nodes?

If the devices are created successfully, then I believe the name-service announcement is working, and so IPI is working too.

To verify, run cat /proc/interrupts after starting the remote processor. This prints all the interrupts that occurred in the system; your IPI interrupt number should show increasing counts.
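For example, the count can be pulled out of the relevant /proc/interrupts line. This sketch parses a captured sample line, since the exact interrupt name and numbers depend on the board ("zynqmp_ipi" and the values here are assumptions):

```shell
# Sketch: extract the CPU0 count from an /proc/interrupts line. The sample
# below mimics the format; on the board you would pipe in the real line,
# e.g.  grep zynqmp_ipi /proc/interrupts
sample=' 46:      12       0  GIC-0  61 Level  zynqmp_ipi'
count=$(echo "$sample" | awk '{print $2}')
echo "IPI fired $count times on CPU0"
```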

tnmysh commented 2 months ago

I missed that the Linux device tree was already posted. Your assessment is correct: it is an IPI issue.

Linux device-tree

        interrupts = <0 29 4>; // TODO: check if this is correct
        xlnx,ipi-id = <7>; // TODO: check if this is correct

Zephyr device-tree:

            rpu0_apu_mailbox: mailbox@ff990200 {
                remote-ipi-id = <0>;

So, the Zephyr firmware expects the APU IPI ID to be 0, but it is set to 7 instead.

The Linux-side device-tree zynqmp-ipi node with IPI ID 0 is introduced here: https://github.com/torvalds/linux/blob/e936e7d4a83b5ff6b7a685722f0ba348383af68c/arch/arm64/boot/dts/xilinx/zynqmp.dtsi#L139

and then, for OpenAMP, additional nodes are added as follows: https://github.com/OpenAMP/openamp-system-reference/blob/d57bae7346f2d6f0513d9eb8b22fda698d4af8e0/examples/linux/dts/xilinx/zynqmp-openamp.dtso#L61

I hope this will fix the issue.
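One way to double-check which IPI ID Linux actually sees is to decode the 32-bit big-endian cell the kernel exposes under /proc/device-tree. The node path in the comment is an assumption, and a sample byte string stands in for the real file so the sketch can be dry-run:

```shell
# Sketch: device-tree cells are 32-bit big-endian. On the board, replace
# the sample file with the real property, e.g.
#   /proc/device-tree/zynqmp-ipi/mailbox@ff990040/xlnx,ipi-id
CELL="$(mktemp)"
printf '\000\000\000\007' > "$CELL"   # sample bytes for xlnx,ipi-id = <7>
hex=$(od -An -tx1 "$CELL" | tr -d ' \n')  # hex dump without offsets/spaces
printf 'IPI ID = %d\n' "0x$hex"           # decode as one big-endian value
```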

olneumann commented 1 month ago

With the right overlay it works.

    zynqmp-ipi {
        #address-cells = <2>;
        #size-cells = <2>;
        ranges;

        ipi_mailbox_rpu0: mailbox@ff990040 {
            reg = <0x00 0xff990040 0x00 0x20>,
                  <0x00 0xff990060 0x00 0x20>,
                  <0x00 0xff990200 0x00 0x20>,
                  <0x00 0xff990220 0x00 0x20>;
            reg-names = "local_request_region",
                    "local_response_region",
                    "remote_request_region",
                    "remote_response_region";
            #mbox-cells = <0x01>;
            xlnx,ipi-id = <0x01>;
        };

        ipi_mailbox_rpu1: mailbox@ff990080 {
            reg = <0x00 0xff990080 0x00 0x20>,
                  <0x00 0xff9900a0 0x00 0x20>,
                  <0x00 0xff990400 0x00 0x20>,
                  <0x00 0xff990420 0x00 0x20>;
            reg-names = "local_request_region",
                    "local_response_region",
                    "remote_request_region",
                    "remote_response_region";
            #mbox-cells = <0x01>;
            xlnx,ipi-id = <0x02>;
        };
    };

tnmysh commented 1 month ago

Sorry I couldn't reply earlier; I am returning this week from a long leave. I am glad it worked!