Mellanox / scalablefunctions

All about Scalable functions

SF support for rte_flow offloading #6

Open byteocean opened 1 year ago

byteocean commented 1 year ago

Hi community and authors,

As the official document [1] states, SFs support E-Switch representation offload like existing PF and VF representors. However, a simple rte_flow rule cannot be installed in a working SF setup (with a Mellanox ConnectX-6 Dx). The flow rule being installed is simple:

flow create 1 ingress transfer pattern eth / end actions port_id id 0 / end

In this flow rule, port 1 is the representor of a PF and port 0 is the representor of a created SF. The following error was returned:

port_flow_complain(): Caught PMD error type 15 (action configuration): cause: 0x7ffc5abd74a8, failed to obtain E-Switch port id for port: Invalid argument

The testpmd app is started with the EAL parameters -a 0000:3b:00.0 -a auxiliary:mlx5_core.sf.21.
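For context, an SF such as mlx5_core.sf.21 and its switchdev representor are typically created along these lines (a sketch following the kernel devlink documentation; the pfnum, the returned port index 32768, the MAC address, and the mapping of sfnum to the auxiliary device index are placeholders, not details confirmed for this setup):

```shell
# Create an SF port on the PF; devlink prints the new port index
# (32768 below is a placeholder for that printed index).
devlink port add pci/0000:3b:00.0 flavour pcisf pfnum 0 sfnum 21

# Assign a MAC address and activate the SF; activation spawns the
# auxiliary device mlx5_core.sf.<N> that DPDK attaches to.
devlink port function set pci/0000:3b:00.0/32768 hw_addr 00:00:00:00:00:21
devlink port function set pci/0000:3b:00.0/32768 state active
```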

So, it would be appreciated if someone could give a hint on the following questions:

  1. To what extent is offloading supported with SFs?
  2. If offloading is supported as well as it is for VFs, what could be the reason such an error is returned?

Thanks in advance.

[1] https://docs.nvidia.com/doca/archive/doca-v1.1/pdf/scalable-functions.pdf

paravmellanox commented 1 year ago

@byteocean , thank you for the report. I have seen the DPDK eswitch working fine. Since this error surfaces from DPDK, please consult the DPDK community with your exact DPDK version, or reach out to the Nvidia support channel to get it resolved.

byteocean commented 1 year ago

@paravmellanox thanks a lot for your quick response. In my experiment, after configuring the SF, its representor can be seen via DPDK's telemetry user tool:

Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
{"version": "DPDK 21.11.0", "pid": 38082, "max_output_len": 16384}
Connected to application: "dp_service"
--> /ethdev/info,2
{"/ethdev/info": {"name": "mlx5_core.sf.2", "state": 1, "nb_rx_queues": 2, "nb_tx_queues": 2, "port_id": 2, "mtu": 1500, "rx_mbuf_size_min": 2176, "rx_mbuf_alloc_fail": 0, "mac_addr": "00:00", "promiscuous": 1, "scattered_rx": 0, "all_multicast": 0, "dev_started": 1, "lro": 0, "dev_configured": 1, "rxq_state": [1, 0], "txq_state": [1, 0], "numa_node": 0, "dev_flags": 75, "rx_offloads": 0, "tx_offloads": 14, "ethdev_rss_hf": 41868}}

Sending and receiving traffic via a customised DPDK application also works without hardware offloading, but enabling offloading fails as described above.
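The dump above comes from DPDK's telemetry socket; the same query can be issued non-interactively with the dpdk-telemetry.py client that ships in DPDK's usertools/ directory (the script path below is an assumption, and the default runtime directory matches the socket path shown in the session above):

```shell
# Query ethdev info for port 2 (the SF representor) over the telemetry
# socket. Assumes the DPDK application is running with telemetry enabled
# and the default runtime directory /var/run/dpdk/rte.
echo "/ethdev/info,2" | ./usertools/dpdk-telemetry.py
```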

I will investigate further on the DPDK side. However, given the lack of documentation on using SFs together with offloading, could you confirm that, in general, offloading on an SF works as well as on a VF/PF? Do you have experience performing offloading on SFs as well?

Thanks in advance.

paravmellanox commented 1 year ago

Hi @byteocean ,

> @paravmellanox thanks a lot for your quick response. In my experiment, after configuring the SF, its representor can be seen via DPDK's telemetry user tool:
>
> Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
> {"version": "DPDK 21.11.0", "pid": 38082, "max_output_len": 16384}
> Connected to application: "dp_service"
> --> /ethdev/info,2
> {"/ethdev/info": {"name": "mlx5_core.sf.2", "state": 1, "nb_rx_queues": 2, "nb_tx_queues": 2, "port_id": 2, "mtu": 1500, "rx_mbuf_size_min": 2176, "rx_mbuf_alloc_fail": 0, "mac_addr": "00:00", "promiscuous": 1, "scattered_rx": 0, "all_multicast": 0, "dev_started": 1, "lro": 0, "dev_configured": 1, "rxq_state": [1, 0], "txq_state": [1, 0], "numa_node": 0, "dev_flags": 75, "rx_offloads": 0, "tx_offloads": 14, "ethdev_rss_hf": 41868}}
>
> Sending and receiving traffic via a customised DPDK application also works without hardware offloading, but enabling offloading fails as described above.

It is strange to me.

> However, given the lack of documentation on using SFs together with offloading, could you confirm that, in general, offloading on an SF works as well as on a VF/PF? Do you have experience performing offloading on SFs as well?

Yes, it should work, and it has been tested by other customers in the field as well. DPU users also use the DPDK PMD with eswitch to use the SF representors.

> Thanks in advance.

byteocean commented 1 year ago

Hi @paravmellanox, thanks again for the reply.

I am no expert in Mellanox drivers, but I tried to dump some information from DPDK's mlx5 driver to figure out what could go wrong.

It seems that this line fails because neither the SF port's master flag nor its representor flag is set. Their values seem to be set by this function, whose comment says "no representor support".

Based on the above description, would you happen to know of any additional parameters that need to be provided as EAL starting parameters?
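For reference, the DPDK mlx5 guide documents a `representor` device argument that makes the PMD probe SF representors on the PF's PCI device, rather than attaching the SF only through the auxiliary bus. A sketch of such an invocation follows (the devarg syntax is from the mlx5 guide; whether it resolves the missing master/representor flags in this particular setup is an assumption, not something verified here):

```shell
# Probe the PF together with its SF representor via the mlx5 "representor"
# devarg. sf[21] matches the sfnum used above (an assumption); dv_esw_en=1
# enables E-Switch flow support and is the mlx5 default, stated for clarity.
dpdk-testpmd -a 0000:3b:00.0,representor=sf[21],dv_esw_en=1 -- -i
```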

Thanks in advance.

paravmellanox commented 1 year ago

@byteocean , I do not know. Please reach out to Nvidia support to resolve it. You may find something in the DPDK documentation as well.