Xilinx / open-nic-shell

AMD OpenNIC Shell includes the HDL source files
Apache License 2.0

The link cannot be established on U200. #6

Closed TimmonSha closed 2 years ago

TimmonSha commented 2 years ago

I built the project with the following command:

vivado -mode batch -source build.tcl -tclargs -board_repo ../XilinxBoardStore-master/ -board au200 -jobs 32 -num_phys_func 2 -num_cmac_port 2

The build succeeded. I programmed the device, inserted the driver, and connected the two ports directly with a QSFP28 DAC cable, but the link could not be established. I then read the CMAC register STAT_RX_STATUS_REG with pcimem, which shows:

[root@localhost pcimem]# ./pcimem /sys/bus/pci/devices/0000:05:00.0/resource2 0x8204 w
/sys/bus/pci/devices/0000:05:00.0/resource2 opened.
Target offset is 0x8204, page size is 4096
mmap(0, 4096, 0x3, 0x1, 3, 0x8204)
PCI Memory mapped to address 0x7fab473ce000.
**_0x8204: 0x000000C0_**

According to the CMAC manual, this means stat_rx_local_fault=1 and stat_rx_internal_local_fault=1. I am sure the DAC cable itself is fine. How can I solve this problem? Thank you very much.
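
To make the reading above easier to check, here is a minimal Python sketch that turns a raw STAT_RX_STATUS_REG value into flag names. The bit positions are an assumption taken from my reading of the CMAC register map; they are not stated in this thread, so please verify them against the product guide for your core version. `decode_rx_status` is a hypothetical helper name.

```python
# Assumed bit layout of STAT_RX_STATUS_REG (low 8 bits only).
# Verify against the CMAC product guide before relying on it.
RX_STATUS_BITS = {
    0: "stat_rx_status",
    1: "stat_rx_aligned",
    2: "stat_rx_misaligned",
    3: "stat_rx_aligned_err",
    4: "stat_rx_hi_ber",
    5: "stat_rx_remote_fault",
    6: "stat_rx_local_fault",
    7: "stat_rx_internal_local_fault",
}


def decode_rx_status(value):
    """Return the names of all flags set in the low 8 bits of `value`."""
    return [name for bit, name in sorted(RX_STATUS_BITS.items())
            if value & (1 << bit)]


if __name__ == "__main__":
    for raw in (0xC0, 0xE0, 0x03):
        print(f"0x{raw:02X}: {', '.join(decode_rx_status(raw))}")
```

Under that assumed layout, 0xC0 decodes to the two local-fault flags quoted above, and 0x03 decodes to stat_rx_status plus stat_rx_aligned.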

cneely-amd commented 2 years ago

Hi, I'm going to make three suggestions based on my experience with using a U250. I don't have a U200 to try.

  1. Try reading the status register twice. The RX status from the CMAC is often not correct on the very first read, so in my example scripts I always read and then read again. For example, I might typically read 0xE0 or 0xC0 the first time. If the connection is good, the second read returns status=0x03. Otherwise, 0xC0 typically indicates that the cable is not working or not well connected.
  2. I strongly suspect that the issue you are hitting is that RS_FEC hasn't been disabled (it is enabled by default). I need to disable it for direct connections between the ports on the Alveo. To disable RS_FEC, load the kernel driver with "sudo insmod onic.ko RS_FEC_ENABLED=0". Please try reading the status register both before and after loading the kernel driver to check whether the status changes.
  3. A third suggestion is to try running a test using the CMAC serdes loopback, which can be enabled by writing 0x1 to both 0x8090 and 0xC090 to configure both CMAC ports. I know this is not the same type of test as the direct cable connection you were trying, though.

Best regards, --Chris

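
The "read twice" advice in the first suggestion can be sketched in Python instead of running pcimem by hand. This is only an illustration: `read_reg32` and `read_twice` are hypothetical helper names, the resource2 path and the 0x8204 offset are the ones quoted earlier in the thread, and going through `struct` on an mmap does not strictly guarantee a single 32-bit bus access, so treat pcimem as the reference tool.

```python
import mmap
import os
import struct

# Offset read earlier in this thread (RX status of the first CMAC port).
CMAC0_RX_STATUS = 0x8204


def read_reg32(path, offset):
    """Read one little-endian 32-bit word at `offset` from a PCI resource file."""
    gran = mmap.ALLOCATIONGRANULARITY
    base = offset & ~(gran - 1)  # the mmap offset must be aligned
    fd = os.open(path, os.O_RDWR | os.O_SYNC)
    try:
        with mmap.mmap(fd, gran, mmap.MAP_SHARED,
                       mmap.PROT_READ | mmap.PROT_WRITE, offset=base) as m:
            return struct.unpack_from("<I", m, offset - base)[0]
    finally:
        os.close(fd)


def read_twice(path, offset):
    """Discard the first (possibly stale) read and return the second one."""
    read_reg32(path, offset)
    return read_reg32(path, offset)


if __name__ == "__main__":
    bar = "/sys/bus/pci/devices/0000:05:00.0/resource2"  # path from the thread
    print(f"0x{CMAC0_RX_STATUS:04X}: 0x{read_twice(bar, CMAC0_RX_STATUS):08X}")
```

Run it as root once before and once after loading onic.ko to compare the status, as described in the second suggestion.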
TimmonSha commented 2 years ago

Thank you for your reply. We connected the U200 to an Ixia tester, and the link was established.