Microsemi / switchtec-kernel

A kernel module for the Microsemi PCIe switch
GNU General Public License v2.0

ntb_perf v2.0 not working with a 2 port NTB link on kernel 5.4 #111

Open jborz27 opened 2 years ago

jborz27 commented 2 years ago

The ntb_hw_switchtec driver is not working with ntb_perf 2.0 on kernel 5.4.

According to the ntb_perf documentation, writing the port number to /sys/kernel/debug/ntb_perf/[bdf]/run should start the performance test.
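As a rough sketch of that step, the snippet below writes an index to the `run` file and reads the statistics back from the same file. The helper names (`run_file`, `start_test`, `read_results`) are hypothetical and not part of ntb_perf; the sketch assumes debugfs is mounted at /sys/kernel/debug, that the script runs as root, and that the value written is the index of the peer (0 on a two-port link).

```python
from pathlib import Path

# Assumed debugfs location of ntb_perf; adjust if debugfs is mounted elsewhere.
NTB_PERF_ROOT = "/sys/kernel/debug/ntb_perf"


def run_file(bdf, root=NTB_PERF_ROOT):
    """Build the path of the 'run' file for a device, e.g. bdf='0000:01:00.1'."""
    return Path(root) / bdf / "run"


def start_test(run_path, peer_index=0):
    """Start a test by writing the index to the run file (like 'echo 0 > run')."""
    Path(run_path).write_text(f"{peer_index}\n")


def read_results(run_path):
    """Read the run file back (like 'cat run') to get the test statistics."""
    return Path(run_path).read_text()
```

On a real system this would be called as `start_test(run_file("0000:01:00.1"))` followed by `print(read_results(run_file("0000:01:00.1")))`. Depending on the driver version, the write may not return until the test finishes.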

Here are excerpts of the ntb_perf info file and the dmesg log from each host.

Host 1

cat /sys/kernel/debug/ntb_perf/0000\:01\:00.1/info

Performance measuring tool info:

Local port 0, Global index 0
Test status: idle
Port 0 (0), Global index 0:
Link status: up
Out buffer addr 0xffffb34e84500000
Out buffer size 0x0000000000080000
Out buffer xlat 0x0000000000000000[p]
In buffer addr: unallocated

dmesg

[118700.910614] ntb_perf 0000:01:00.1: Global port index 0
[118700.910619] ntb_perf 0000:01:00.1: Message service unsupported
[118700.910620] ntb_perf 0000:01:00.1: Scratchpad service initialized
[118700.910628] ntb_perf 0000:01:00.1: DB bits unmasked 0x1
[118700.910632] switchtec switchtec0: enabling link
[118700.912932] switchtec switchtec0: message: 0 00000003
[118700.913178] switchtec switchtec0: doorbell
[118700.913495] ntb_perf 0000:01:00.1: DB vec 0 mask 0xfffffff bits 0x1
[118700.915226] ntb_perf 0000:01:00.1: CMD exec: 0
[118700.915236] switchtec switchtec0: ntb link up
[118700.915469] ntb_perf 0000:01:00.1: CMD send: 0 0x80000

Host 2

cat /sys/kernel/debug/ntb_perf/0000\:03\:00.1/info

Performance measuring tool info:

Local port 0, Global index 0
Test status: idle
Port 0 (0), Global index 0:
Link status: up
Out buffer addr 0xffffb272c1400000
Out buffer size 0x0000000000080000
Out buffer xlat 0x0000000000000000[p]
In buffer addr 0xffff9b0def300000
In buffer size 0x0000000000080000
In buffer xlat 0x00000003ef300000[p]

dmesg

[198323.821290] switchtec switchtec0: ntb link up
[198323.821528] ntb_perf 0000:03:00.1: CMD send: 0 0x80000
[198323.821532] ntb_perf 0000:03:00.1: DB ring peer 0x1
[198323.823662] switchtec switchtec0: message: 0 00000003
[198323.823879] switchtec switchtec0: doorbell
[198323.824210] ntb_perf 0000:03:00.1: DB vec 0 mask 0xfffffff bits 0x1
[198323.824214] ntb_perf 0000:03:00.1: CMD recv: 0 0x80000
[198323.824216] ntb_perf 0000:03:00.1: CMD exec: 1
[198323.824867] switchtec switchtec0: MW 0: part 0 addr 0x00000003ef300000 size 0x0000000000080000
[198323.944368] ntb_perf 0000:03:00.1: CMD exec: 2
[198323.944373] ntb_perf 0000:03:00.1: CMD send: 2 0x3ef300000

lsgunth commented 2 years ago

There were a bunch of fixes after v5.4 for ntb_perf. As I recall it was broken for a while in that era. You probably need to use a newer kernel.

jborz27 commented 2 years ago

Any recommendation for which kernel to use?

lsgunth commented 2 years ago

Try the latest.

jborz27 commented 2 years ago

Got ntb_perf to work on kernel 5.13. The DMA results were taken with default settings, so tuning may yield even better throughput.

I did need to unload and reload ntb_perf twice on one of the servers to get the in/out buffers set up on both ends of the NTB link.
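A quick way to tell whether a reload is needed is to look at the `info` file on each host before starting a test. The sketch below is only illustrative: `buffers_ready` is a hypothetical helper, it assumes a single peer, and it assumes the `info` text on newer kernels uses the same wording as the kernel 5.4 excerpts earlier in this thread ("Link status: up", "In buffer addr: unallocated").

```python
def buffers_ready(info_text):
    """Return True if the link is up and the inbound buffer has been allocated.

    Assumes a single peer and the 'info' wording shown earlier in this thread.
    """
    link_up = "Link status: up" in info_text
    has_in_buffer = "In buffer addr" in info_text
    in_unallocated = "In buffer addr: unallocated" in info_text
    return link_up and has_in_buffer and not in_unallocated
```

Fed the Host 1 excerpt above, this returns False (inbound buffer unallocated); fed the Host 2 excerpt, it returns True.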

Non-DMA Performance (Gen3 x8)

cat /sys/kernel/debug/ntb_perf/0000\:03\:00.1/run
Peer 0 test statistics:
0: copied 1073741824 bytes in 207997 usecs, 5162 MBytes/s (65% theoretical max)

DMA Performance (Gen3 x8)

cat /sys/kernel/debug/ntb_perf/0000\:09\:00.1/run
Peer 0 test statistics:
0: copied 1073741824 bytes in 156451 usecs, 6863 MBytes/s (87% theoretical max)
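For anyone checking the arithmetic, the sketch below recomputes the figures from the byte and microsecond counts. The thread does not say how the "theoretical max" was derived; the assumption here is the raw PCIe Gen3 x8 rate (8 GT/s per lane, 8 lanes, 128b/130b encoding), ignoring packet overhead. The function names are hypothetical.

```python
def throughput_mbytes_per_s(nbytes, usecs):
    """Bytes per microsecond equals 10^6 bytes per second, i.e. MBytes/s."""
    return nbytes / usecs


def gen3_link_max_mbytes_per_s(lanes=8):
    """Assumed ceiling: 8 GT/s per lane with 128b/130b encoding, no TLP overhead."""
    bits_per_s = 8e9 * lanes * (128 / 130)
    return bits_per_s / 8 / 1e6


def percent_of_max(nbytes, usecs, lanes=8):
    """Measured throughput as a percentage of the assumed link ceiling."""
    return 100 * throughput_mbytes_per_s(nbytes, usecs) / gen3_link_max_mbytes_per_s(lanes)
```

Under that assumption the ceiling is roughly 7877 MBytes/s, and both runs above come out at the reported 65% and 87% once truncated to whole numbers.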