Xilinx / open-nic-driver

AMD OpenNIC driver includes the Linux kernel driver
GNU General Public License v2.0

insmod onic.ko hanging #28

Closed. 108anup closed this issue 2 years ago.

108anup commented 2 years ago

insmod onic.ko is hanging. I see the following in the dmesg log. It used to work fine earlier. Could it be an issue due to interaction with an updated Linux kernel?

This is using the vanilla open-nic-shell bitstream on a U280 FPGA.

dmesg log:

[  427.355719] OpenNIC Linux Kernel Driver 0.21
[  427.356254] onic 0000:86:00.0 onic134s0f0 (uninitialized): Set MAC address to 00:0a:35:11:d0:b0
[  427.356257] onic 0000:86:00.0: device is a master PF
[  427.356538] onic 0000:86:00.0: Allocated 8 queue vectors
[  427.854578] BUG: unable to handle kernel NULL pointer dereference at 000000000000000a
[  427.857391] IP: qdma_invalidate_fmap_ctxt+0x11/0x60 [onic]
[  427.860124] PGD 0 P4D 0
[  427.862786] Oops: 0002 [#1] SMP NOPTI
[  427.865416] Modules linked in: onic(OE+) nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache ipmi_ssif intel_rapl skx_edac joydev x86_pkg_temp_thermal intel_powerclamp input_leds coretemp ftdi_sio kvm_intel usbserial kvm irqbypass xclmgmt(OE) intel_cstate xocl(OE) intel_rapl_perf fpga_mgr mei_me mei ioatdma lpc_ich shpchp acpi_power_meter ipmi_si ipmi_devintf ipmi_msghandler mac_hid acpi_pad sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi parport_pc ppdev lp parport sunrpc ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic crct10dif_pclmul crc32_pclmul ghash_clmulni_intel usbhid hid pcbc aesni_intel
[  427.884024]  aes_x86_64 crypto_simd glue_helper cryptd ast i2c_algo_bit ttm drm_kms_helper ixgbe syscopyarea sysfillrect sysimgblt fb_sys_fops dca ptp drm pps_core mdio ahci libahci wmi [last unloaded: onic]
[  427.889334] CPU: 16 PID: 108 Comm: kworker/16:0 Tainted: G           OE    4.15.0-177-generic #186-Ubuntu
[  427.891975] Hardware name: Supermicro SYS-2029GP-TR/X11DPG-SN, BIOS 3.4 12/18/2020
[  427.894599] Workqueue: events work_for_cpu_fn
[  427.897189] RIP: 0010:qdma_invalidate_fmap_ctxt+0x11/0x60 [onic]
[  427.899749] RSP: 0018:ffffaedb0cc27d58 EFLAGS: 00010282
[  427.902266] RAX: 00000000fffffff0 RBX: 0000000000000000 RCX: 0000000000000000
[  427.904767] RDX: 0000000000000000 RSI: ffffaedb0f311000 RDI: 0000000000000000
[  427.907223] RBP: ffffaedb0cc27d68 R08: ffff9ef7e0718480 R09: ffffaedb0cc27be0
[  427.909656] R10: fffffffffffc0000 R11: ffffcedaffffffff R12: ffff9ef7dd81a8c0
[  427.912049] R13: ffff9ef7f0640000 R14: 0000000000000000 R15: 0000000000000000
[  427.914412] FS:  0000000000000000(0000) GS:ffff9ef7fee00000(0000) knlGS:0000000000000000
[  427.916754] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  427.919057] CR2: 000000000000000a CR3: 000000154d00a006 CR4: 00000000007606e0
[  427.921340] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  427.923578] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  427.925770] PKRU: 55555554
[  427.927913] Call Trace:
[  427.930022]  onic_init_hardware+0x116/0x900 [onic]
[  427.932103]  onic_probe+0x2b8/0x4f0 [onic]
[  427.934142]  local_pci_probe+0x47/0xa0
[  427.936135]  work_for_cpu_fn+0x1a/0x30
[  427.938083]  process_one_work+0x1de/0x420
[  427.939991]  worker_thread+0x228/0x410
[  427.941854]  kthread+0x121/0x140
[  427.943668]  ? process_one_work+0x420/0x420
[  427.945450]  ? kthread_create_worker_on_cpu+0x70/0x70
[  427.947201]  ret_from_fork+0x1f/0x40
[  427.948904] Code: 65 48 33 14 25 28 00 00 00 75 02 c9 c3 e8 e8 57 15 c4 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 31 c9 31 d2 48 89 e5 48 83 ec 10 <c7> 47 0a 00 00 00 00 65 48 8b 04 25 28 00 00 00 48 89 45 f8 31
[  427.952366] RIP: qdma_invalidate_fmap_ctxt+0x11/0x60 [onic] RSP: ffffaedb0cc27d58
[  427.954050] CR2: 000000000000000a
[  427.955688] ---[ end trace 8a5a28677ac3f2ee ]---
cneely-amd commented 2 years ago

@108anup , thanks. Someone reported a similar error message recently, but I haven't encountered this myself. What version of the kernel were you using successfully earlier?
What version of the kernel is it no longer working on?

108anup commented 2 years ago

The driver is not working on either of the following kernels: 4.15.0-177-generic and 4.15.0-176-generic on Ubuntu 18.04.

I don't think the kernel has actually changed between then and now, so I would imagine the previously working kernel was also 4.15.0-176-generic. But I can't confirm this, as I don't know which kernel was running the last time it worked properly.

I have tried both U50 and U280 boards on a Supermicro SYS-2029GP-TR/X11DPG-SN server.

108anup commented 2 years ago

Does the OpenNIC driver need to be loaded to use pcimem? If not, then I also can't read any address from the device over PCIe using pcimem. This is for both U50 and U280 boards using the vanilla open-nic-shell.

Example:

$> sudo $PCIMEM /sys/bus/pci/devices/$EXTENDED_DEVICE_BDF1/resource2 0x10400
/sys/bus/pci/devices/0000:3b:00.0/resource2 opened.
Target offset is 0x10400, page size is 4096
mmap(0, 4096, 0x3, 0x1, 3, 0x10400)
PCI Memory mapped to address 0x7f2a49067000.
0x10400: 0xFFFFFFFF

This address should return the temperature; 0xFFFFFFFF indicates the read is failing.

cneely-amd commented 2 years ago

The driver doesn't need to be loaded to use pcimem. However, on some kernel versions, to use pcimem or similar I first need to run something like: sudo setpci -s 0a:00.0 COMMAND=0x02;
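For example (a minimal sketch; the BDF 0a:00.0 is from my setup, substitute your own from lspci):

# COMMAND=0x02 sets the Memory Space Enable bit in the PCI command register;
# without it, BAR reads through pcimem's mmap come back as 0xFFFFFFFF
sudo setpci -s 0a:00.0 COMMAND=0x02
# read the register back to verify that bit 1 is now set
sudo setpci -s 0a:00.0 COMMAND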

I'm curious, though: is it that you can't use pcimem before attempting to load onic.ko, or only afterwards? I'm wondering whether loading the driver first is somehow causing your issue with pcimem.

108anup commented 2 years ago

Even before trying to load the driver, I am unable to read registers using pcimem.

Steps: On a fresh cold reboot:

  1. I remove the PCIe device: echo 1 | sudo tee "/sys/bus/pci/devices/${bridge_bdf}/${EXTENDED_DEVICE_BDF1}/remove" > /dev/null
  2. Load the bitstream.
  3. Rescan: echo 1 | sudo tee "/sys/bus/pci/devices/${bridge_bdf}/rescan" > /dev/null
  4. sudo setpci -s $EXTENDED_DEVICE_BDF1 COMMAND=0x02

After this, when I try pcimem, it shows 0xFFFFFFFF irrespective of the address.

Since pcimem should work even without the driver, I am guessing that whatever is preventing the pcimem reads might also be causing insmod to crash. I am not sure, though, why pcimem is not working with the vanilla bitstream; it has worked for me before.
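For reference, one way to check whether memory space decoding is enabled on the endpoint (a sketch, assuming EXTENDED_DEVICE_BDF1 is set as in the steps above):

lspci -s "$EXTENDED_DEVICE_BDF1" -vv | grep -E 'Control:|Memory at'
# 'Mem-' in the Control line, or '[disabled]' after a BAR, means memory
# space access is off; in that case re-run the setpci command from step 4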

108anup commented 2 years ago

Okay, I actually also had to warm reboot after step 4 above. I think a warm reboot is needed whenever the FPGA changes from a non-OpenNIC bitstream to an OpenNIC one. After a cold reboot, the image is reset to the golden platform, e.g., xilinx_u280_xdma_201920_3 for the U280.

After warm reboot both pcimem and driver loading work.

I might have done a cold reboot since the last time I used OpenNIC and forgotten to do a warm reboot after loading the OpenNIC bitstream.

manwu1994 commented 1 year ago

Thanks so much for your previous suggestions. I followed the four steps mentioned and did a warm reboot. The U250 still does not show the interface through ifconfig. Furthermore, when I warm reboot, the onic module disappears and has to be insmod'ed again. Also, insmod of the onic module does not hang here. Could you please give me any suggestions? Thank you so much in advance.

65:00.0 Network controller: Xilinx Corporation Device 903f
        Subsystem: Xilinx Corporation Device 0007
        Flags: bus master, fast devsel, latency 0, IRQ 124, NUMA node 0
        Memory at e0c40000 (64-bit, non-prefetchable) [size=256K]
        Memory at e0800000 (64-bit, non-prefetchable) [size=4M]
        Capabilities: <access denied>
        Kernel driver in use: qdma-pf
        Kernel modules: xdma, qdma_pf

65:00.1 Network controller: Xilinx Corporation Device 913f
        Subsystem: Xilinx Corporation Device 0007
        Flags: bus master, fast devsel, latency 0, IRQ 124, NUMA node 0
        Memory at e0c00000 (64-bit, non-prefetchable) [size=256K]
        Memory at e0400000 (64-bit, non-prefetchable) [size=4M]
        Capabilities: <access denied>
        Kernel driver in use: qdma-pf
        Kernel modules: qdma_pf
108anup commented 1 year ago

Have you loaded the onic bitstream using the Xilinx Vivado toolchain?

Once you do that and do a warm reboot, the FPGA should show up in lspci -vvv as a "Memory controller" instead of a "Network controller". Then, once you insmod the onic.ko kernel module, lspci -vvv should also show the kernel driver in use as onic.
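A quick way to verify both (a sketch; 86:00.0 is the BDF on my machine, substitute yours):

lspci -s 86:00.0 -k
# expect a "Memory controller" class line and, after insmod,
# "Kernel driver in use: onic"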

Reference lspci -vvv output:

86:00.0 Memory controller: Xilinx Corporation Device 903f
        Subsystem: Xilinx Corporation Device 0007
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 32 bytes
        Interrupt: pin A routed to IRQ 199
        NUMA node: 1
        Region 0: Memory at e0c00000 (64-bit, non-prefetchable) [size=256K]
        Region 2: Memory at e0400000 (64-bit, non-prefetchable) [size=4M]
        Capabilities: <access denied>
        Kernel driver in use: onic

86:00.1 Memory controller: Xilinx Corporation Device 913f
        Subsystem: Xilinx Corporation Device 0007
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 32 bytes
        Interrupt: pin A routed to IRQ 199
        NUMA node: 1
        Region 0: Memory at e0c40000 (64-bit, non-prefetchable) [size=256K]
        Region 2: Memory at e0800000 (64-bit, non-prefetchable) [size=4M]
        Capabilities: <access denied>
        Kernel driver in use: onic
cneely-amd commented 1 year ago

There was a pull request that we merged earlier this year that changed it from a "Memory controller" to a "Network controller".

Two suggestions that are probably important are:

  1. First, confirm that the FPGA bit file meets the necessary timing constraints by opening the project file for the design within Vivado's GUI and checking the worst negative slack (WNS). Also, please note that the design needs to be built with Vivado 2022.1 because of the QDMA IP (Vivado 2022.2 and Vivado 2023.1 introduce a major version change to the QDMA IP that doesn't seem to be compatible with this design yet; if you build with an older version of Vivado, just don't upgrade the QDMA IP when opening the project in a newer Vivado). If the implementation results don't meet timing, change the Vivado implementation strategy to something like the "Performance Explore" or "Performance1" options to increase the place-and-route effort, set the new implementation settings to "active", and rebuild.

  2. After the bit file meets timing, the order should be: 1) load the bit file onto the FPGA, 2) warm reboot, 3) run insmod ..., and 4) run ifconfig.
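Putting that order together as commands (a sketch; assumes onic.ko has been built in the current directory, and the interface name is just an example):

# 1) program the bit file from the Vivado hardware manager, then
# 2) warm reboot, e.g. sudo reboot; after the machine comes back up:
sudo insmod onic.ko
ip link show                 # the OpenNIC interfaces should be listed
sudo ifconfig enp11s0f0 up   # substitute the name that ip link shows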

If the interface doesn't appear, here are some suggestions: 5) try checking the link status with pcimem or similar by reading the CMAC status registers:

#writes to enable the CMAC (adjust for your PCI BDF resource path)
sudo ~cneely/pcimem/pcimem /sys/devices/pci0000:00/0000:00:03.2/0000:0b:00.0/resource2 0x8014 w 0x1;
sudo ~cneely/pcimem/pcimem /sys/devices/pci0000:00/0000:00:03.2/0000:0b:00.0/resource2 0x800c w 0x1;
# read the link status, two reads are necessary.  The second read should be 0x3 if you have link
sudo ~cneely/pcimem/pcimem /sys/devices/pci0000:00/0000:00:03.2/0000:0b:00.0/resource2 0x8204;
sudo ~cneely/pcimem/pcimem /sys/devices/pci0000:00/0000:00:03.2/0000:0b:00.0/resource2 0x8204;
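(For context, and assuming I'm reading the CMAC register map correctly: 0x800c and 0x8014 are the TX and RX enable registers of the CMAC block, which sits at offset 0x8000 in the shell, and 0x8204 is the RX status register; its bits are latched, which is why two reads are needed, and 0x3 means RX status good and RX aligned.)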

6) try running ip link show to see the adapter name and number, e.g.

5: enp11s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:0a:35:ad:bf:c8 brd ff:ff:ff:ff:ff:ff
6: enp11s0f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 00:0a:35:73:31:40 brd ff:ff:ff:ff:ff:ff
7) maybe assign a static IP using e.g. netplan: create a YAML file for netplan within /etc/netplan/

    #U250
    network:
      version: 2
      renderer: networkd
      ethernets:
        # in this example my other ethernet device is enp4s0, which needs dhcp
        enp4s0:
          dhcp4: yes
          dhcp6: yes
          addresses: [192.168.1.109/24]

        # in this case my U250 open-nic-shell interfaces are below
        enp11s0f0:
          dhcp4: no
          dhcp6: no
          addresses: [192.168.20.4/24] # this is just what I used for my testing
          #...
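After writing the file (any name works, e.g. a hypothetical /etc/netplan/99-opennic.yaml), apply it:

sudo netplan try     # validates the config and rolls back on failure
sudo netplan apply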
manwu1994 commented 1 year ago

Thank you so much for your reply. I followed the four steps you mentioned. Our Vivado version is 2021.2 and the WNS is 0.034. lspci -vd 10ee:

65:00.0 Network controller: Xilinx Corporation Device 903f
        Subsystem: Xilinx Corporation Device 0007
        Flags: fast devsel, IRQ 124, NUMA node 0
        Memory at e0c40000 (64-bit, non-prefetchable) [virtual] [size=256K]
        Memory at e0800000 (64-bit, non-prefetchable) [virtual] [size=4M]
        Capabilities: [40] Power Management version 3
        Capabilities: [60] MSI-X: Enable- Count=10 Masked-
        Capabilities: [70] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [1c0] Secondary PCI Express
        Capabilities: [200] Virtual Channel
        Kernel driver in use: qdma-pf
        Kernel modules: xdma, qdma_pf

65:00.1 Network controller: Xilinx Corporation Device 913f
        Subsystem: Xilinx Corporation Device 0007
        Flags: fast devsel, IRQ 124, NUMA node 0
        Memory at e0c00000 (64-bit, non-prefetchable) [virtual] [size=256K]
        Memory at e0400000 (64-bit, non-prefetchable) [virtual] [size=4M]
        Capabilities: [40] Power Management version 3
        Capabilities: [60] MSI-X: Enable- Count=9 Masked-
        Capabilities: [70] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Kernel driver in use: qdma-pf
        Kernel modules: qdma_pf

> try checking the link status with pcimem or similar by reading the CMAC status registers

I checked them; they are correct.

Then I checked the onic kernel module via lsmod; it is also installed on the host.

Module                  Size  Used by
ftdi_sio               61440  1
onic                  118784  0

Finally, I still cannot see the network interface via ifconfig.

I think the problem might be with the insmod of onic.ko, although in my case insmod of onic.ko does not hang. I have no idea how to continue; if you have any suggestions, please share them, and if you need any other information, please tell me. Much appreciation for your time and suggestions.

Best regards~

cneely-amd commented 1 year ago

I suspect the issue is due to the xdma and qdma_pf kernel modules being loaded. Can you temporarily blacklist or not load those modules when trying the onic module?
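Something like this should work (a sketch; the modprobe.d file name is arbitrary):

# unload the conflicting drivers for the current session
sudo rmmod qdma_pf xdma
sudo insmod onic.ko

# or blacklist them persistently
printf 'blacklist xdma\nblacklist qdma_pf\n' | sudo tee /etc/modprobe.d/opennic-blacklist.conf
sudo update-initramfs -u   # on Ubuntu, in case the modules are loaded from the initramfs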

manwu1994 commented 1 year ago

> I suspect the issue is due to the xdma and qdma_pf kernel modules being loaded. Can you temporarily blacklist or not load those modules when trying the onic module?

That's it. I blacklisted the xdma and qdma_pf kernel modules, and the onic module now loads successfully, so I can get the interfaces of the U250. Thank you so much. Additionally, is there any solution to make these three kernel modules compatible? Thank you so much again.

manwu1994 commented 1 year ago

Sorry, another question: when I warm reboot the FPGA (or host), the onic module disappears and I have to insmod it again. Is this normal for the onic driver? Thank you so much again.

wnew commented 1 year ago

> Sorry, another question: when I warm reboot the FPGA (or host), the onic module disappears and I have to insmod it again. Is this normal for the onic driver? Thank you so much again.

@manwu1994 in case you haven't resolved this yet, kernel modules loaded with insmod are not permanent. To load kernel modules automatically at boot, see here: https://www.cyberciti.biz/faq/linux-how-to-load-a-kernel-module-automatically-at-boot-time/
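A minimal sketch of that for onic (assuming stock Ubuntu paths):

sudo mkdir -p /lib/modules/$(uname -r)/extra
sudo cp onic.ko /lib/modules/$(uname -r)/extra/
sudo depmod -a                                       # rebuild the module dependency map
echo onic | sudo tee /etc/modules-load.d/onic.conf   # load at boot via systemd-modules-load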

wnew commented 1 year ago

We have managed to narrow this issue down to the memory range of the BARs and/or the prefetch setting in the QDMA IP. I'll update once I have a clearer picture of what is going on.

@cneely-amd do you know why prefetch is not enabled by default in the OpenNIC designs? Is there any reason we can't enable it?

cneely-amd commented 1 year ago

@wnew I don't know what the tradeoffs would be for enabling prefetch vs. leaving it off. I've been maintaining the OpenNIC shell and drivers, but I didn't create the original designs, so I don't know the reasoning behind some of the design choices.
Are you experimenting with prefetch? --Chris