wangjun0728 opened 4 months ago
Hi, @wangjun0728.
if_descr="DPDK 22.11.1 net_ice"
Please try a newer 22.11 point release. There are numerous driver fixes between 22.11.1 and 22.11.4.
Hi @igsilya, I attempted to update DPDK to version 22.11.4, but the same error persists.
E810:
{bus_info="bus_name=pci, vendor_id=8086, device_id=159b", driver_name=net_ice, if_descr="DPDK 22.11.4 net_ice", if_type="6", link_speed="25Gbps", max_hash_mac_addrs="0", max_mac_addrs="64", max_rx_pktlen="1618", max_rx_queues="256", max_tx_queues="256", max_vfs="0", max_vmdq_pools="0", min_rx_bufsize="1024", n_rxq="2", n_txq="5", numa_id="1", port_no="1", rx-steering=rss, rx_csum_offload="true", tx_geneve_tso_offload="false", tx_ip_csum_offload="true", tx_out_ip_csum_offload="true", tx_out_udp_csum_offload="true", tx_sctp_csum_offload="true", tx_tcp_csum_offload="true", tx_tcp_seg_offload="false", tx_udp_csum_offload="true", tx_vxlan_tso_offload="false"}
error:
2024-03-05T02:12:53.092Z|00050|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-05T02:12:55.112Z|00051|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-05T02:13:08.027Z|00052|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-05T02:13:14.458Z|00478|connmgr|INFO|br-int<->unix#3: 5 flow_mods 18 s ago (3 adds, 2 deletes)
2024-03-05T02:13:38.871Z|00053|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-05T02:14:39.946Z|00054|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-05T02:15:05.262Z|00055|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
82599:
{bus_info="bus_name=pci, vendor_id=8086, device_id=10fb", driver_name=net_ixgbe, if_descr="DPDK 22.11.4 net_ixgbe", if_type="6", link_speed="10Gbps", max_hash_mac_addrs="4096", max_mac_addrs="127", max_rx_pktlen="1618", max_rx_queues="128", max_tx_queues="64", max_vfs="0", max_vmdq_pools="64", min_rx_bufsize="1024", n_rxq="2", n_txq="5", numa_id="0", port_no="1", rx-steering=rss, rx_csum_offload="true", tx_geneve_tso_offload="false", tx_ip_csum_offload="true", tx_out_ip_csum_offload="false", tx_out_udp_csum_offload="false", tx_sctp_csum_offload="true", tx_tcp_csum_offload="true", tx_tcp_seg_offload="false", tx_udp_csum_offload="true", tx_vxlan_tso_offload="false"}
error:
2024-03-05T02:16:29.189Z|00414|netdev_dpdk|WARN|Dropped 1 log messages in last 29 seconds (most recently, 29 seconds ago) due to excessive rate
2024-03-05T02:16:29.189Z|00415|netdev_dpdk|WARN|tun_port_p1: Output batch contains invalid packets. Only 0/2 are valid: Operation not supported
2024-03-05T02:17:00.568Z|00023|netdev_dpdk(pmd-c02/id:87)|WARN|tun_port_p1: Output batch contains invalid packets. Only 0/1 are valid: Operation not supported
2024-03-05T02:17:00.568Z|00024|netdev_dpdk(pmd-c02/id:87)|WARN|tun_port_p1: Output batch contains invalid packets. Only 0/1 are valid: Operation not supported
2024-03-05T02:17:05.573Z|00025|netdev_dpdk(pmd-c02/id:87)|WARN|tun_port_p1: Output batch contains invalid packets. Only 0/1 are valid: Operation not supported
2024-03-05T02:17:05.573Z|00026|netdev_dpdk(pmd-c02/id:87)|WARN|tun_port_p1: Output batch contains invalid packets. Only 0/1 are valid: Operation not supported
2024-03-05T02:17:10.578Z|00027|netdev_dpdk(pmd-c02/id:87)|WARN|tun_port_p1: Output batch contains invalid packets. Only 0/1 are valid: Operation not supported
2024-03-05T02:17:20.589Z|00028|netdev_dpdk(pmd-c02/id:87)|WARN|Dropped 3 log messages in last 10 seconds (most recently, 5 seconds ago) due to excessive rate
2024-03-05T02:17:20.589Z|00029|netdev_dpdk(pmd-c02/id:87)|WARN|tun_port_p1: Output batch contains invalid packets. Only 0/1 are valid: Operation not supported
2024-03-05T02:17:35.604Z|00030|netdev_dpdk(pmd-c02/id:87)|WARN|Dropped 5 log messages in last 15 seconds (most recently, 5 seconds ago) due to excessive rate
2024-03-05T02:17:35.604Z|00031|netdev_dpdk(pmd-c02/id:87)|WARN|tun_port_p1: Output batch contains invalid packets. Only 0/1 are valid: Operation not supported
2024-03-05T02:17:44.560Z|00416|netdev_dpdk|WARN|Dropped 4 log messages in last 9 seconds (most recently, 3 seconds ago) due to excessive rate
2024-03-05T02:17:44.560Z|00417|netdev_dpdk|WARN|tun_port_p1: Output batch contains invalid packets. Only 0/1 are valid: Operation not supported
OK. I don't really know what could be wrong with the ice driver, and I don't have any hardware to test with. The only suggestion here is to try updating the firmware on the card, in case you're not on the latest version.
For the other driver we can try to debug that, but we need to know what these invalid packets look like.
I prepared a small patch that would dump the invalid packets to the OVS log here: https://github.com/igsilya/ovs/commit/3c34e86483941b39b64f831818d05cefd618c8a8
Could you try it in your setup? You'll need to enable debug logging for the netdev_dpdk module (e.g. `ovs-appctl vlog/set netdev_dpdk:dbg`) in order to see the dump.
The output should look something like this:
2024-03-05T14:18:48.161Z|00012|netdev_dpdk(pmd-c03/id:8)|DBG|ovs-p1: Invalid packet:
dump mbuf at 0x1180bce140, iova=0x2cb7ce400, buf_len=2176
pkt_len=90, ol_flags=0x2, nb_segs=1, port=65535, ptype=0
segment at 0x1180bce140, data=0x1180bce580, len=90, off=384, refcnt=1
Dump data at [0x1180bce580], len=64
00000000: 33 33 00 00 00 16 AA 27 91 F9 4D 96 86 DD 60 00 | 33.....'..M...`.
00000010: 00 00 00 24 00 01 00 00 00 00 00 00 00 00 00 00 | ...$............
00000020: 00 00 00 00 00 00 FF 02 00 00 00 00 00 00 00 00 | ................
00000030: 00 00 00 00 00 16 3A 00 05 02 00 00 01 00 8F 00 | ......:.........
Also, what OVS version are you using? Maybe worth trying to update to the latest stable releases if you're not using them already.
Hi @igsilya, thank you very much for your reply. The log output after applying your patch is as follows:
2024-03-05T15:42:58.817Z|00012|netdev_dpdk(pmd-c02/id:87)|DBG|tun_port_p1: Invalid packet:
dump mbuf at 0x192764ec0, iova=0x192765180, buf_len=2176
pkt_len=144, ol_flags=0x800800000000182, nb_segs=1, port=65535, ptype=0
segment at 0x192764ec0, data=0x1927651c2, len=144, off=66, refcnt=1
Dump data at [0x1927651c2], len=64
00000000: 40 A6 B7 21 92 8C 68 91 D0 65 C6 C3 81 00 00 5C | @..!..h..e.....\
00000010: 08 00 45 00 00 7E 00 00 40 00 40 11 D8 07 0A FD | ..E..~..@.@.....
00000020: 26 38 0A FD 26 36 C6 E1 17 C1 00 6A AD 0D 02 40 | &8..&6.....j...@
00000030: 65 58 00 00 32 00 01 02 80 01 00 02 00 03 0E 9C | eX..2...........
2024-03-05T15:42:58.817Z|00013|netdev_dpdk(pmd-c02/id:87)|DBG|tun_port_p1: Invalid packet:
dump mbuf at 0x192764ec0, iova=0x192765180, buf_len=2176
pkt_len=144, ol_flags=0x800800000000182, nb_segs=1, port=65535, ptype=0
segment at 0x192764ec0, data=0x1927651c2, len=144, off=66, refcnt=1
Dump data at [0x1927651c2], len=64
00000000: 40 A6 B7 21 92 8C 68 91 D0 65 C6 C3 81 00 00 5C | @..!..h..e.....\
00000010: 08 00 45 00 00 7E 00 00 40 00 40 11 D8 07 0A FD | ..E..~..@.@.....
00000020: 26 38 0A FD 26 36 C6 E1 17 C1 00 6A AD 0D 02 40 | &8..&6.....j...@
00000030: 65 58 00 00 32 00 01 02 80 01 00 02 00 03 0E 9C | eX..2...........
2024-03-05T15:43:03.823Z|00014|netdev_dpdk(pmd-c02/id:87)|DBG|tun_port_p1: Invalid packet:
dump mbuf at 0x192775d00, iova=0x192775fc0, buf_len=2176
pkt_len=144, ol_flags=0x800800000000182, nb_segs=1, port=65535, ptype=0
segment at 0x192775d00, data=0x192776002, len=144, off=66, refcnt=1
Dump data at [0x192776002], len=64
00000000: 6C FE 54 2F 0D C0 68 91 D0 65 C6 C3 81 00 00 5C | l.T/..h..e.....\
00000010: 08 00 45 00 00 7E 00 00 40 00 40 11 D8 04 0A FD | ..E..~..@.@.....
00000020: 26 38 0A FD 26 39 8A 56 17 C1 00 6A D6 C0 02 40 | &8..&9.V...j...@
00000030: 65 58 00 00 32 00 01 02 80 01 00 02 00 04 0A 35 | eX..2..........5
2024-03-05T15:43:03.823Z|00015|netdev_dpdk(pmd-c02/id:87)|DBG|tun_port_p1: Invalid packet:
dump mbuf at 0x192775d00, iova=0x192775fc0, buf_len=2176
pkt_len=144, ol_flags=0x800800000000182, nb_segs=1, port=65535, ptype=0
segment at 0x192775d00, data=0x192776002, len=144, off=66, refcnt=1
Dump data at [0x192776002], len=64
00000000: 6C FE 54 2F 0D C0 68 91 D0 65 C6 C3 81 00 00 5C | l.T/..h..e.....\
00000010: 08 00 45 00 00 7E 00 00 40 00 40 11 D8 04 0A FD | ..E..~..@.@.....
00000020: 26 38 0A FD 26 39 8A 56 17 C1 00 6A D6 C0 02 40 | &8..&9.V...j...@
00000030: 65 58 00 00 32 00 01 02 80 01 00 02 00 04 0A 35 | eX..2..........5
2024-03-05T15:43:08.828Z|00016|netdev_dpdk(pmd-c02/id:87)|WARN|Dropped 3 log messages in last 10 seconds (most recently, 5 seconds ago) due to excessive rate
2024-03-05T15:43:08.828Z|00017|netdev_dpdk(pmd-c02/id:87)|WARN|tun_port_p1: Output batch contains invalid packets. Only 0/1 are valid: Operation not supported
2024-03-05T15:43:08.828Z|00018|netdev_dpdk(pmd-c02/id:87)|DBG|tun_port_p1: Invalid packet:
dump mbuf at 0x192781900, iova=0x192781bc0, buf_len=2176
pkt_len=144, ol_flags=0x800800000000182, nb_segs=1, port=65535, ptype=0
segment at 0x192781900, data=0x192781c02, len=144, off=66, refcnt=1
Dump data at [0x192781c02], len=64
00000000: 40 A6 B7 21 92 8C 68 91 D0 65 C6 C3 81 00 00 5C | @..!..h..e.....\
00000010: 08 00 45 00 00 7E 00 00 40 00 40 11 D8 07 0A FD | ..E..~..@.@.....
00000020: 26 38 0A FD 26 36 C6 E1 17 C1 00 6A AD 0D 02 40 | &8..&6.....j...@
00000030: 65 58 00 00 32 00 01 02 80 01 00 02 00 03 0E 9C | eX..2...........
Additionally, the OVS version I'm using is 2.17.5 LTS. I only started hitting this issue after merging the checksum- and TSO-related changes; it was fine before the merge. The main changes I merged are these: https://patchwork.ozlabs.org/project/openvswitch/list/?series=&submitter=82705&state=3&q=&archive=both&delegate= However, it's not easy for me to fully upgrade OVS because I rely on the version of OVN.
However, it's not easy for me to fully upgrade OVS because I rely on the version of OVN.
This should not be a problem. You should be able to upgrade OVS and OVN should still work just fine. The version of OVS you build OVN with and the one that you're using in runtime don't need to be the same. There is a build time dependency, because OVN is using some of the OVS libraries, but there is no runtime dependency because communication between OVS and OVN is happening over OpenFlow or OVSDB, which are stable protocols. Any version of OVN should be able to work with any version of OVS in runtime.
So, you can build OVN with the version of OVS shipped in a submodule and use a separate newer version of OVS deployed on a host. Assuming you're using static linking, there should be no issues. In fact, that is a recommended way of using OVS with OVN.
The checksum offloading patches had a lot of small issues, so I would not be surprised if some of the fixes got lost in backporting. I'll try to look at the dumps, but I'd still recommend you to just upgrade OVS on the node instead.
ol_flags=0x800800000000182
So, these are Geneve packets and the offload is requested for the outer IPv4 checksum.
Tunnel offloads were introduced in OVS 3.3, meaning they were not tested with DPDK older than 23.11. I would not be surprised if drivers are missing some support or fixes. I don't think it makes sense to investigate this issue any further, and I highly recommend you just upgrade OVS and use it with a supported version of DPDK.
Hi @igsilya, I do understand the usage scenario of Geneve packets. Currently, the 82599 network card does not support offloading the outer IP checksum and outer UDP checksum. Thank you very much for your suggestion. I will try the latest OVS 3.3 release as soon as possible and report back with verification results. Thank you again for your reply.
Hi @igsilya, I have completed the upgrade to OVS 3.3 and DPDK 23.11, but the same issue still exists.
E810:
2024-03-07T07:42:56.712Z|00341|dpdk|INFO|VHOST_CONFIG: (/var/run/openvswitch/vh-userclient-8d1fca5d-dc-vhostuser.sock) read message VHOST_USER_SET_VRING_ENABLE
2024-03-07T07:42:56.712Z|00342|dpdk|INFO|VHOST_CONFIG: (/var/run/openvswitch/vh-userclient-8d1fca5d-dc-vhostuser.sock) set queue enable: 1 to qp idx: 6
2024-03-07T07:42:56.712Z|00343|dpdk|INFO|VHOST_CONFIG: (/var/run/openvswitch/vh-userclient-8d1fca5d-dc-vhostuser.sock) read message VHOST_USER_SET_VRING_ENABLE
2024-03-07T07:42:56.712Z|00344|dpdk|INFO|VHOST_CONFIG: (/var/run/openvswitch/vh-userclient-8d1fca5d-dc-vhostuser.sock) set queue enable: 1 to qp idx: 7
2024-03-07T07:42:56.722Z|00017|netdev_dpdk(ovs_vhost2)|INFO|State of queue 0 ( tx_qid 0 ) of vhost device '/var/run/openvswitch/vh-userclient-8d1fca5d-dc-vhostuser.sock' changed to 'enabled'
2024-03-07T07:42:56.722Z|00018|netdev_dpdk(ovs_vhost2)|INFO|State of queue 0 ( tx_qid 0 ) of vhost device '/var/run/openvswitch/vh-userclient-8d1fca5d-dc-vhostuser.sock' changed to 'disabled'
2024-03-07T07:42:56.722Z|00019|netdev_dpdk(ovs_vhost2)|INFO|State of queue 0 ( tx_qid 0 ) of vhost device '/var/run/openvswitch/vh-userclient-8d1fca5d-dc-vhostuser.sock' changed to 'enabled'
2024-03-07T07:42:56.722Z|00020|netdev_dpdk(ovs_vhost2)|INFO|State of queue 1 ( rx_qid 0 ) of vhost device '/var/run/openvswitch/vh-userclient-8d1fca5d-dc-vhostuser.sock' changed to 'enabled'
2024-03-07T07:42:56.722Z|00021|netdev_dpdk(ovs_vhost2)|INFO|State of queue 1 ( rx_qid 0 ) of vhost device '/var/run/openvswitch/vh-userclient-8d1fca5d-dc-vhostuser.sock' changed to 'disabled'
2024-03-07T07:42:56.722Z|00022|netdev_dpdk(ovs_vhost2)|INFO|State of queue 1 ( rx_qid 0 ) of vhost device '/var/run/openvswitch/vh-userclient-8d1fca5d-dc-vhostuser.sock' changed to 'enabled'
2024-03-07T07:42:56.722Z|00023|netdev_dpdk(ovs_vhost2)|INFO|State of queue 2 ( tx_qid 1 ) of vhost device '/var/run/openvswitch/vh-userclient-8d1fca5d-dc-vhostuser.sock' changed to 'enabled'
2024-03-07T07:42:56.722Z|00024|netdev_dpdk(ovs_vhost2)|INFO|State of queue 3 ( rx_qid 1 ) of vhost device '/var/run/openvswitch/vh-userclient-8d1fca5d-dc-vhostuser.sock' changed to 'enabled'
2024-03-07T07:42:56.722Z|00025|netdev_dpdk(ovs_vhost2)|INFO|State of queue 4 ( tx_qid 2 ) of vhost device '/var/run/openvswitch/vh-userclient-8d1fca5d-dc-vhostuser.sock' changed to 'enabled'
2024-03-07T07:42:56.722Z|00026|netdev_dpdk(ovs_vhost2)|INFO|State of queue 5 ( rx_qid 2 ) of vhost device '/var/run/openvswitch/vh-userclient-8d1fca5d-dc-vhostuser.sock' changed to 'enabled'
2024-03-07T07:42:56.722Z|00027|netdev_dpdk(ovs_vhost2)|INFO|State of queue 6 ( tx_qid 3 ) of vhost device '/var/run/openvswitch/vh-userclient-8d1fca5d-dc-vhostuser.sock' changed to 'enabled'
2024-03-07T07:42:56.722Z|00028|netdev_dpdk(ovs_vhost2)|INFO|State of queue 7 ( rx_qid 3 ) of vhost device '/var/run/openvswitch/vh-userclient-8d1fca5d-dc-vhostuser.sock' changed to 'enabled'
2024-03-07T07:42:59.383Z|00016|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-07T07:43:00.800Z|00017|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-07T07:43:00.803Z|00018|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-07T07:43:00.810Z|00019|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-07T07:43:00.970Z|00020|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-07T07:43:01.255Z|00021|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-07T07:43:01.426Z|00022|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-07T07:43:01.682Z|00023|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-07T07:43:02.810Z|00024|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-07T07:43:03.272Z|00025|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-07T07:43:04.676Z|00026|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-07T07:43:04.810Z|00027|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-07T07:43:05.291Z|00028|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-07T07:43:07.325Z|00029|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-07T07:43:09.348Z|00030|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-07T07:43:11.351Z|00031|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-07T07:43:12.414Z|00032|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-07T07:43:13.361Z|00033|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-07T07:43:15.371Z|00034|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-07T07:43:27.544Z|00035|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-07T07:43:36.076Z|00504|connmgr|INFO|br-int<->unix#2: 5 flow_mods 32 s ago (2 adds, 3 deletes)
2024-03-07T07:43:57.440Z|00036|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event

ovs-vsctl list open
_uuid               : 85f32857-8cfb-4f91-9ffe-e28acb930545
bridges             : [442c3a80-1b82-4670-aea5-e03d9d4b8b73, ffc69315-36f9-4dd3-b5f5-1dd2118aca21]
cur_cfg             : 62
datapath_types      : [netdev, system]
datapaths           : {netdev=c2425cab-fc67-47fc-96cc-17cd7675ca91, system=45cef88b-7a8d-4f23-852a-f12131577982}
db_version          : "8.5.0"
dpdk_initialized    : true
dpdk_version        : "DPDK 23.11.0"
external_ids        : {hostname=xc03-compute2, ovn-bridge-datapath-type=netdev, ovn-encap-ip="10.253.38.55", ovn-encap-type=geneve, ovn-remote="tcp:[10.253.38.10]:6642,tcp:[10.253.38.9]:6642,tcp:[10.253.38.5]:6642", rundir="/var/run/openvswitch", system-id=xc03-compute2}
iface_types         : [afxdp, afxdp-nonpmd, bareudp, dpdk, dpdkvhostuser, dpdkvhostuserclient, erspan, geneve, gre, gtpu, internal, ip6erspan, ip6gre, lisp, patch, srv6, stt, system, tap, vxlan]
manager_options     : []
next_cfg            : 62
other_config        : {bundle-idle-timeout="3600", dpdk-extra=" -a 0000:af:00.1 -a 0000:af:00.0", dpdk-init="true", dpdk-socket-mem="2048", n-handler-threads="1", pmd-cpu-mask="0xf", vlan-limit="0"}
ovs_version         : "3.3.1"
ssl                 : []
statistics          : {}
system_type         : cclinux
system_version      : "22.09.2"

ovs-vsctl get interface tun_port_p0 status
{bus_info="bus_name=pci, vendor_id=8086, device_id=159b", driver_name=net_ice, if_descr="DPDK 23.11.0 net_ice", if_type="6", link_speed="25Gbps", max_hash_mac_addrs="0", max_mac_addrs="64", max_rx_pktlen="1618", max_rx_queues="256", max_tx_queues="256", max_vfs="0", max_vmdq_pools="0", min_rx_bufsize="1024", n_rxq="2", n_txq="5", numa_id="1", port_no="1", rx-steering=rss, rx_csum_offload="true", tx_geneve_tso_offload="false", tx_ip_csum_offload="true", tx_out_ip_csum_offload="true", tx_out_udp_csum_offload="true", tx_sctp_csum_offload="true", tx_tcp_csum_offload="true", tx_tcp_seg_offload="false", tx_udp_csum_offload="true", tx_vxlan_tso_offload="false"}

ovs-vsctl get interface vh-userclient-8d1fca5d-dc status
{features="0x000000017060a783", mode=client, n_rxq="4", n_txq="4", num_of_vrings="8", numa="0", socket="/var/run/openvswitch/vh-userclient-8d1fca5d-dc-vhostuser.sock", status=connected, tx_geneve_tso_offload="false", tx_ip_csum_offload="true", tx_out_ip_csum_offload="false", tx_out_udp_csum_offload="false", tx_sctp_csum_offload="true", tx_tcp_csum_offload="true", tx_tcp_seg_offload="false", tx_udp_csum_offload="true", tx_vxlan_tso_offload="false", vring_0_size="1024", vring_1_size="1024", vring_2_size="1024", vring_3_size="1024", vring_4_size="1024", vring_5_size="1024", vring_6_size="1024", vring_7_size="1024"}
82599:
2024-03-07T07:46:37.430Z|00002|netdev_dpdk(pmd-c02/id:88)|WARN|tun_port_p1: Output batch contains invalid packets. Only 0/1 are valid: Operation not supported
2024-03-07T07:46:45.037Z|00002|netdev_dpdk(pmd-c03/id:86)|WARN|Dropped 21 log messages in last 8 seconds (most recently, 2 seconds ago) due to excessive rate
2024-03-07T07:46:45.037Z|00003|netdev_dpdk(pmd-c03/id:86)|WARN|tun_port_p1: Output batch contains invalid packets. Only 0/1 are valid: Operation not supported
2024-03-07T07:46:57.483Z|00002|netdev_dpdk(pmd-c00/id:89)|WARN|Dropped 9 log messages in last 12 seconds (most recently, 5 seconds ago) due to excessive rate
2024-03-07T07:46:57.483Z|00003|netdev_dpdk(pmd-c00/id:89)|WARN|tun_port_p1: Output batch contains invalid packets. Only 0/1 are valid: Operation not supported

ovs-vsctl list open
_uuid               : 79b87ec7-4b02-4a77-a2c1-3943a68e8f79
bridges             : [ab028efc-5f0a-48d4-a7aa-515681ba1c46, c2ecaf85-9a1b-4f9d-9a51-7e136737e3f7]
cur_cfg             : 55
datapath_types      : [netdev, system]
datapaths           : {netdev=0e62f217-661e-46e3-906d-74a2eef05a3e, system=2a42c035-41fd-4727-b487-ee290a7f7f7c}
db_version          : "8.5.0"
dpdk_initialized    : true
dpdk_version        : "DPDK 23.11.0"
external_ids        : {hostname=xc03-compute3, ovn-bridge-datapath-type=netdev, ovn-encap-ip="10.253.38.56", ovn-encap-type=geneve, ovn-remote="tcp:[10.253.38.9]:6642,tcp:[10.253.38.5]:6642,tcp:[10.253.38.10]:6642", rundir="/var/run/openvswitch", system-id=xc03-compute3}
iface_types         : [afxdp, afxdp-nonpmd, bareudp, dpdk, dpdkvhostuser, dpdkvhostuserclient, erspan, geneve, gre, gtpu, internal, ip6erspan, ip6gre, lisp, patch, srv6, stt, system, tap, vxlan]
manager_options     : []
next_cfg            : 55
other_config        : {bundle-idle-timeout="3600", dpdk-extra=" -a 0000:18:00.1 -a 0000:18:00.0", dpdk-init="true", dpdk-socket-mem="2048", n-handler-threads="1", pmd-cpu-mask="0xf", vlan-limit="0"}
ovs_version         : "3.3.1"
ssl                 : []
statistics          : {}
system_type         : cclinux
system_version      : "22.09.2"

ovs-vsctl get interface tun_port_p0 status
{bus_info="bus_name=pci, vendor_id=8086, device_id=10fb", driver_name=net_ixgbe, if_descr="DPDK 23.11.0 net_ixgbe", if_type="6", link_speed="10Gbps", max_hash_mac_addrs="4096", max_mac_addrs="127", max_rx_pktlen="1618", max_rx_queues="128", max_tx_queues="64", max_vfs="0", max_vmdq_pools="64", min_rx_bufsize="1024", n_rxq="2", n_txq="5", numa_id="0", port_no="1", rx-steering=rss, rx_csum_offload="true", tx_geneve_tso_offload="false", tx_ip_csum_offload="true", tx_out_ip_csum_offload="false", tx_out_udp_csum_offload="false", tx_sctp_csum_offload="true", tx_tcp_csum_offload="true", tx_tcp_seg_offload="false", tx_udp_csum_offload="true", tx_vxlan_tso_offload="false"}
Regarding the E810, I observed the abnormal log messages right after creating the vhost-user client port. I suspect that the 82599 network card's lack of support for tx_out_udp_csum_offload and tx_out_ip_csum_offload is what causes its issue.
@wangjun0728 thanks for the info! This looks very similar to what is supposed to be fixed in https://patchwork.ozlabs.org/project/openvswitch/patch/20240226133837.533820-1-mkp@redhat.com/ . Could you confirm that you have this patch in your version of OVS?
CC: @mkp-rh
@igsilya
The patch modification is included in my code. I've previously discussed this issue with Mike. This patch resolved the issue with my Mellanox network card, but in the case of Intel network cards (82599 and E810), there are anomalies with the Geneve overlay.
Additionally, the latest code I'm using is this one: https://github.com/openvswitch/ovs/commits/branch-3.3/
The checksum offload capability of Intel network cards indeed differs from Mellanox network cards. I believe this might be the root cause of the issue, as it seems more like a problem with the DPDK-side driver.
I think there's somewhat of a hint provided here:
Output batch contains invalid packets. Only 0/1 are valid: Operation not supported
There are very few places where DPDK will return ENOTSUP. I don't have an E810 card right now, but will try to investigate the code.
@mkp-rh note that `Operation not supported` is on the 82599 card. The E810 doesn't reject packets but throws MDD events.
For the MDD issue, I see that the E810 errata page reports:
Some of the Tx Data checks performed as part of the Malicious Driver Detection (MDD) are reported as anti-spoof failures in addition to the actual failures
So it could be the MDD anti-spoofing features, or a general tx data check failure.
In the ixgbe driver, ixgbe_prep_pkts only returns ENOTSUP if the ol_flags are incorrect.
From the log above I see ol_flags=0x800800000000182, which translates into the following tx offload flags:
RTE_MBUF_F_TX_TUNNEL_GENEVE RTE_MBUF_F_TX_OUTER_IPV4
ixgbe_rxtx.c contains the supported IXGBE_TX_OFFLOAD_MASK, which doesn't include RTE_MBUF_F_TX_TUNNEL_GENEVE. So that flag shouldn't be included when we send the frame.
RTE_MBUF_F_TX_TUNNEL_GENEVE RTE_MBUF_F_TX_OUTER_IPV4
ixgbe_rxtx.c contains the supported IXGBE_TX_OFFLOAD_MASK, which doesn't include RTE_MBUF_F_TX_TUNNEL_GENEVE. So that flag shouldn't be included when we send the frame.
So, if we do not request TSO or inner checksumming we must not specify RTE_MBUF_F_TX_TUNNEL_*
flags. Right?
IIUC, we need https://github.com/openvswitch/ovs/commit/9b7e1a75378f806fcf782e0286d529028e6d62bf but for tunnels.
@mkp-rh Hmm, also RTE_MBUF_F_TX_OUTER_IPV4 is not set, while it is required for RTE_MBUF_F_TX_OUTER_IP_CKSUM according to the API. And it seems the check in https://github.com/openvswitch/ovs/commit/9b7e1a75378f806fcf782e0286d529028e6d62bf is not really correct, as it doesn't seem to cover all the outer/inner cases.
Edit: Nevermind, wrong flag. But the existing check might still be incomplete.
Hi @igsilya @mkp-rh, if you have suggestions for modifications, I have environments with both E810 and 82599 network cards to verify them.
@wangjun0728 Could you try this one: https://github.com/igsilya/ovs/commit/00c0a91f89084bf3ac333918c729fc7274f476e4 ? It should fix the 82599 case at least, I think.
@igsilya This looks great! Applying your modifications resolved the error on the 82599 network card, and I can now communicate without issues using iperf. On the E810, however, the MDD error still persists.
I also noticed a modification in the DPDK community, but applying it didn't yield any results. I suspect there might be a flaw in the E810 driver's support for tunnel TSO.
https://patches.dpdk.org/project/dpdk/patch/20231207023051.1914021-1-kaiwenx.deng@intel.com/
After enabling DPDK's PMD logs with --log-level=pmd,debug, I captured a portion of the DPDK startup log. For now, it's unclear whether it has any definite correlation with the errors.
2024-03-11T02:38:04.088Z|00007|dpdk|INFO|Using DPDK 23.11.0
2024-03-11T02:38:04.088Z|00008|dpdk|INFO|DPDK Enabled - initializing...
2024-03-11T02:38:04.088Z|00009|dpdk|INFO|dpdk init get port_num:2
2024-03-11T02:38:04.088Z|00010|dpdk|INFO|EAL ARGS: ovs-vswitchd -a 0000:af:00.1 -a 0000:af:00.0 --log-level=pmd,debug --socket-mem 2048 -l 0.
2024-03-11T02:38:04.091Z|00011|dpdk|INFO|EAL: Detected CPU lcores: 80
2024-03-11T02:38:04.091Z|00012|dpdk|INFO|EAL: Detected NUMA nodes: 2
2024-03-11T02:38:04.091Z|00013|dpdk|INFO|EAL: Detected static linkage of DPDK
2024-03-11T02:38:04.096Z|00014|dpdk|INFO|EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
2024-03-11T02:38:04.099Z|00015|dpdk|INFO|EAL: Selected IOVA mode 'VA'
2024-03-11T02:38:04.100Z|00016|dpdk|WARN|EAL: No free 2048 kB hugepages reported on node 0
2024-03-11T02:38:04.100Z|00017|dpdk|WARN|EAL: No free 2048 kB hugepages reported on node 1
2024-03-11T02:38:04.101Z|00018|dpdk|INFO|EAL: VFIO support initialized
2024-03-11T02:38:04.839Z|00019|dpdk|INFO|EAL: Using IOMMU type 1 (Type 1)
2024-03-11T02:38:04.994Z|00020|dpdk|INFO|EAL: Ignore mapping IO port bar(1)
2024-03-11T02:38:04.994Z|00021|dpdk|INFO|EAL: Ignore mapping IO port bar(4)
2024-03-11T02:38:05.120Z|00022|dpdk|INFO|EAL: Probe PCI driver: net_ice (8086:159b) device: 0000:af:00.0 (socket 1)
2024-03-11T02:38:05.586Z|00023|dpdk|INFO|ice_load_pkg_type(): Active package is: 1.3.28.0, ICE OS Default Package (single VLAN mode)
2024-03-11T02:38:05.586Z|00024|dpdk|INFO|ice_dev_init(): FW 5.3.-1521546806 API 1.7
2024-03-11T02:38:05.608Z|00025|dpdk|INFO|ice_flow_init(): Engine 4 disabled
2024-03-11T02:38:05.608Z|00026|dpdk|INFO|ice_fdir_setup(): FDIR HW Capabilities: fd_fltr_guar = 1024, fd_fltr_best_effort = 14336.
2024-03-11T02:38:05.612Z|00027|dpdk|INFO|vsi_queues_bind_intr(): queue 0 is binding to vect 257
2024-03-11T02:38:05.612Z|00028|dpdk|INFO|ice_fdir_setup(): FDIR setup successfully, with programming queue 0.
2024-03-11T02:38:05.736Z|00029|dpdk|INFO|EAL: Ignore mapping IO port bar(1)
2024-03-11T02:38:05.736Z|00030|dpdk|INFO|EAL: Ignore mapping IO port bar(4)
2024-03-11T02:38:05.839Z|00031|dpdk|INFO|EAL: Probe PCI driver: net_ice (8086:159b) device: 0000:af:00.1 (socket 1)
2024-03-11T02:38:05.942Z|00032|dpdk|INFO|ice_load_pkg_type(): Active package is: 1.3.28.0, ICE OS Default Package (single VLAN mode)
2024-03-11T02:38:05.942Z|00033|dpdk|INFO|ice_dev_init(): FW 5.3.-1521546806 API 1.7
2024-03-11T02:38:05.965Z|00034|dpdk|INFO|ice_flow_init(): Engine 4 disabled
2024-03-11T02:38:05.965Z|00035|dpdk|INFO|ice_fdir_setup(): FDIR HW Capabilities: fd_fltr_guar = 1024, fd_fltr_best_effort = 14336.
2024-03-11T02:38:05.968Z|00036|dpdk|INFO|vsi_queues_bind_intr(): queue 0 is binding to vect 257
2024-03-11T02:38:05.968Z|00037|dpdk|INFO|ice_fdir_setup(): FDIR setup successfully, with programming queue 0.
2024-03-11T02:38:05.972Z|00038|dpdk|WARN|TELEMETRY: No legacy callbacks, legacy socket not created
2024-03-11T02:38:05.972Z|00039|dpdk|INFO|DPDK rte_pdump - initializing...
2024-03-11T02:38:05.977Z|00044|dpdk|INFO|DPDK Enabled - initialized
2024-03-11T02:38:06.223Z|00001|dpdk|INFO|ice_interrupt_handler(): OICR: link state change event
2024-03-11T02:38:06.406Z|00089|dpdk|INFO|Device with port_id=1 already stopped
2024-03-11T02:38:06.572Z|00090|dpdk|INFO|ice_set_rx_function(): Using AVX2 OFFLOAD Vector Rx (port 1).
2024-03-11T02:38:06.572Z|00091|dpdk|ERR|ice_vsi_config_outer_vlan_stripping(): Single VLAN mode (SVM) does not support qinq
2024-03-11T02:38:06.572Z|00092|dpdk|INFO|vsi_queues_bind_intr(): queue 1 is binding to vect 1
2024-03-11T02:38:06.572Z|00093|dpdk|INFO|vsi_queues_bind_intr(): queue 2 is binding to vect 1
2024-03-11T02:38:07.555Z|00002|dpdk|INFO|ice_interrupt_handler(): OICR: link state change event
2024-03-11T02:38:07.600Z|00102|dpdk|INFO|Device with port_id=0 already stopped
2024-03-11T02:38:07.623Z|00103|dpdk|INFO|ice_set_rx_function(): Using AVX2 OFFLOAD Vector Rx (port 0).
2024-03-11T02:38:07.624Z|00104|dpdk|ERR|ice_vsi_config_outer_vlan_stripping(): Single VLAN mode (SVM) does not support qinq
2024-03-11T02:38:07.624Z|00105|dpdk|INFO|vsi_queues_bind_intr(): queue 1 is binding to vect 1
2024-03-11T02:38:07.624Z|00106|dpdk|INFO|vsi_queues_bind_intr(): queue 2 is binding to vect 1
@wangjun0728 I posted the refined version of the 82599 fix here: https://patchwork.ozlabs.org/project/openvswitch/patch/20240311183231.37253-1-i.maximets@ovn.org/ Could you check with this version? It has some extra checks, but I do not expect it to behave much differently, i.e. it should fix the 82599 case but should not affect the E810 problem.
For the E810, I still don't have a lot to suggest. One thing that might help understanding the situation better is to dump some of the mbufs we're trying to send. Maybe you can capture some logs with the following change applied:
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 8c52accff..331031035 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2607,6 +2607,17 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
(char *) dp_packet_eth(pkt);
mbuf->outer_l3_len = (char *) dp_packet_l4(pkt) -
(char *) dp_packet_l3(pkt);
+ VLOG_WARN_RL(&rl, "%s: Tunnel offload:"
+ " outer_l2_len=%d"
+ " outer_l3_len=%d"
+ " l2_len=%d"
+ " l3_len=%d"
+ " l4_len=%d",
+ netdev_get_name(&dev->up),
+ mbuf->outer_l2_len, mbuf->outer_l3_len,
+ mbuf->l2_len, mbuf->l3_len, mbuf->l4_len);
+ netdev_dpdk_mbuf_dump(netdev_get_name(&dev->up),
+ "Tunneled packet", mbuf);
} else {
mbuf->l2_len = (char *) dp_packet_l3(pkt) -
(char *) dp_packet_eth(pkt);
It will spam the packets into the log, so it's definitely not recommended for a long-running test. But maybe it can shed some light on the problem.
@wangjun0728 Are you able to check if the following patch resolves your issue on E810?
diff --git a/lib/dp-packet.c b/lib/dp-packet.c
index df7bf8e6b..046acd8ba 100644
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -597,12 +597,15 @@ dp_packet_ol_send_prepare(struct dp_packet *p, uint64_t flags)
* support inner checksum offload and an outer UDP checksum is
* required, then we can't offload inner checksum either. As that would
* invalidate the outer checksum. */
- if (!(flags & NETDEV_TX_OFFLOAD_OUTER_UDP_CKSUM) &&
- dp_packet_hwol_is_outer_udp_cksum(p)) {
- flags &= ~(NETDEV_TX_OFFLOAD_TCP_CKSUM |
- NETDEV_TX_OFFLOAD_UDP_CKSUM |
- NETDEV_TX_OFFLOAD_SCTP_CKSUM |
- NETDEV_TX_OFFLOAD_IPV4_CKSUM);
+ if (!(flags & NETDEV_TX_OFFLOAD_OUTER_UDP_CKSUM)) {
+ if (dp_packet_hwol_is_outer_udp_cksum(p)) {
+ flags &= ~(NETDEV_TX_OFFLOAD_TCP_CKSUM |
+ NETDEV_TX_OFFLOAD_UDP_CKSUM |
+ NETDEV_TX_OFFLOAD_SCTP_CKSUM |
+ NETDEV_TX_OFFLOAD_IPV4_CKSUM);
+ }
+ *dp_packet_ol_flags_ptr(p) &= ~(DP_PACKET_OL_TX_TUNNEL_GENEVE |
+ DP_PACKET_OL_TX_TUNNEL_VXLAN);
}
}
@wangjun0728 I posted the refined version of the 82599 fix here: https://patchwork.ozlabs.org/project/openvswitch/patch/20240311183231.37253-1-i.maximets@ovn.org/ Could you check with this version? It has some extra checking, but I do not expect it to behave much differently, i.e. it should fix the 82599 case, but should not affect the E810 problem.
Thank you very much. I've validated this patch, and it seems everything is fine with 82599. There are no log error messages either, which is great.
For the E810, I still don't have a lot to suggest. One thing that might help understanding the situation better is to dump some of the mbufs we're trying to send. Maybe you can capture some logs with the following change applied:
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 8c52accff..331031035 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -2607,6 +2607,17 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf) (char *) dp_packet_eth(pkt); mbuf->outer_l3_len = (char *) dp_packet_l4(pkt) - (char *) dp_packet_l3(pkt); + VLOG_WARN_RL(&rl, "%s: Tunnel offload:" + " outer_l2_len=%d" + " outer_l3_len=%d" + " l2_len=%d" + " l3_len=%d" + " l4_len=%d", + netdev_get_name(&dev->up), + mbuf->outer_l2_len, mbuf->outer_l3_len, + mbuf->l2_len, mbuf->l3_len, mbuf->l4_len); + netdev_dpdk_mbuf_dump(netdev_get_name(&dev->up), + "Tunneled packet", mbuf); } else { mbuf->l2_len = (char *) dp_packet_l3(pkt) - (char *) dp_packet_eth(pkt);
It will spam packets into the log, so it's definitely not recommended for a long-running test. But maybe it can shed some light on the problem.
I applied your modification and enabled debug logging mode for netdev_dpdk. Below are some log prints; hopefully, they will be helpful to you.
2024-03-12T06:22:57.262Z|00012|netdev_dpdk(pmd-c00/id:89)|WARN|tun_port_p0: Tunnel offload: outer_l2_len=18 outer_l3_len=20 l2_len=38 l3_len=20 l4_len=32
2024-03-12T06:22:57.262Z|00013|netdev_dpdk(pmd-c00/id:89)|DBG|tun_port_p0: Tunneled packet:
dump mbuf at 0x18f34ee40, iova=0x18f34f100, buf_len=2176
pkt_len=128, ol_flags=0xc90820000000002, nb_segs=1, port=65535, ptype=0
segment at 0x18f34ee40, data=0x18f34f142, len=128, off=66, refcnt=1
Dump data at [0x18f34f142], len=128
00000000: A0 88 C2 20 00 7E B4 96 91 BC 45 7B 81 00 00 5C | ... .~....E{...\
00000010: 08 00 45 00 00 6E 00 00 40 00 40 11 00 00 0A FD | ..E..n..@.@.....
00000020: 26 37 0A FD 26 3B E7 36 17 C1 00 5A FF FF 02 40 | &7..&;.6...Z...@
00000030: 65 58 00 00 2D 00 01 02 80 01 00 08 00 04 40 FE | eX..-.........@.
00000040: 95 EF 85 2C 0A 8B BF 77 86 35 08 00 45 00 00 34 | ...,...w.5..E..4
00000050: 00 00 40 00 3F 06 59 AA 0A 00 00 0B 0A 38 CD D7 | ..@.?.Y......8..
00000060: 00 16 CB BB 0B 9A 8A AE 15 7E 36 20 80 12 FA F0 | .........~6 ....
00000070: E2 40 00 00 02 04 05 B4 01 01 04 02 01 03 03 09 | .@..............
2024-03-12T06:22:58.268Z|00014|netdev_dpdk(pmd-c00/id:89)|DBG|tun_port_p0: Tunneled packet:
dump mbuf at 0x18f34e280, iova=0x18f34e540, buf_len=2176
pkt_len=128, ol_flags=0xc90820000000002, nb_segs=1, port=65535, ptype=0
segment at 0x18f34e280, data=0x18f34e582, len=128, off=66, refcnt=1
Dump data at [0x18f34e582], len=128
00000000: A0 88 C2 20 00 7E B4 96 91 BC 45 7B 81 00 00 5C | ... .~....E{...\
00000010: 08 00 45 00 00 6E 00 00 40 00 40 11 00 00 0A FD | ..E..n..@.@.....
00000020: 26 37 0A FD 26 3B E7 36 17 C1 00 5A FF FF 02 40 | &7..&;.6...Z...@
00000030: 65 58 00 00 2D 00 01 02 80 01 00 08 00 04 40 FE | eX..-.........@.
00000040: 95 EF 85 2C 0A 8B BF 77 86 35 08 00 45 00 00 34 | ...,...w.5..E..4
00000050: 00 00 40 00 3F 06 59 AA 0A 00 00 0B 0A 38 CD D7 | ..@.?.Y......8..
00000060: 00 16 CB BB 0B 9A 8A AE 15 7E 36 20 80 12 FA F0 | .........~6 ....
00000070: E2 40 00 00 02 04 05 B4 01 01 04 02 01 03 03 09 | .@..............
2024-03-12T06:22:59.320Z|00015|netdev_dpdk(pmd-c00/id:89)|DBG|tun_port_p0: Tunneled packet:
dump mbuf at 0x18f34d6c0, iova=0x18f34d980, buf_len=2176
pkt_len=128, ol_flags=0xc90820000000002, nb_segs=1, port=65535, ptype=0
segment at 0x18f34d6c0, data=0x18f34d9c2, len=128, off=66, refcnt=1
Dump data at [0x18f34d9c2], len=128
00000000: A0 88 C2 20 00 7E B4 96 91 BC 45 7B 81 00 00 5C | ... .~....E{...\
00000010: 08 00 45 00 00 6E 00 00 40 00 40 11 00 00 0A FD | ..E..n..@.@.....
00000020: 26 37 0A FD 26 3B E7 36 17 C1 00 5A FF FF 02 40 | &7..&;.6...Z...@
00000030: 65 58 00 00 2D 00 01 02 80 01 00 08 00 04 40 FE | eX..-.........@.
00000040: 95 EF 85 2C 0A 8B BF 77 86 35 08 00 45 00 00 34 | ...,...w.5..E..4
00000050: 00 00 40 00 3F 06 59 AA 0A 00 00 0B 0A 38 CD D7 | ..@.?.Y......8..
00000060: 00 16 CB BB 0B 9A 8A AE 15 7E 36 20 80 12 FA F0 | .........~6 ....
00000070: E2 40 00 00 02 04 05 B4 01 01 04 02 01 03 03 09 | .@..............
2024-03-12T06:23:00.278Z|00016|netdev_dpdk(pmd-c00/id:89)|DBG|tun_port_p0: Tunneled packet:
dump mbuf at 0x18f34cb00, iova=0x18f34cdc0, buf_len=2176
pkt_len=128, ol_flags=0xc90820000000002, nb_segs=1, port=65535, ptype=0
segment at 0x18f34cb00, data=0x18f34ce02, len=128, off=66, refcnt=1
Dump data at [0x18f34ce02], len=128
00000000: A0 88 C2 20 00 7E B4 96 91 BC 45 7B 81 00 00 5C | ... .~....E{...\
00000010: 08 00 45 00 00 6E 00 00 40 00 40 11 00 00 0A FD | ..E..n..@.@.....
00000020: 26 37 0A FD 26 3B E7 36 17 C1 00 5A FF FF 02 40 | &7..&;.6...Z...@
00000030: 65 58 00 00 2D 00 01 02 80 01 00 08 00 04 40 FE | eX..-.........@.
00000040: 95 EF 85 2C 0A 8B BF 77 86 35 08 00 45 00 00 34 | ...,...w.5..E..4
00000050: 00 00 40 00 3F 06 59 AA 0A 00 00 0B 0A 38 CD D7 | ..@.?.Y......8..
00000060: 00 16 CB BB 0B 9A 8A AE 15 7E 36 20 80 12 FA F0 | .........~6 ....
00000070: E2 40 00 00 02 04 05 B4 01 01 04 02 01 03 03 09 | .@..............
2024-03-12T06:23:15.211Z|00031|netdev_dpdk(pmd-c02/id:88)|WARN|Dropped 3 log messages in last 17 seconds (most recently, 15 seconds ago) due to excessive rate
2024-03-12T06:23:15.211Z|00032|netdev_dpdk(pmd-c02/id:88)|WARN|tun_port_p0: Tunnel offload: outer_l2_len=18 outer_l3_len=20 l2_len=38 l3_len=20 l4_len=8
2024-03-12T06:23:15.212Z|00033|netdev_dpdk(pmd-c02/id:88)|DBG|tun_port_p0: Tunneled packet:
dump mbuf at 0x18eeda200, iova=0x18eeda4c0, buf_len=2176
pkt_len=152, ol_flags=0xcb0820000000002, nb_segs=1, port=65535, ptype=0
segment at 0x18eeda200, data=0x18eeda502, len=152, off=66, refcnt=1
Dump data at [0x18eeda502], len=152
00000000: 40 A6 B7 21 92 8C B4 96 91 BC 45 7B 81 00 00 5C | @..!......E{...\
00000010: 08 00 45 00 00 86 00 00 40 00 40 11 00 00 0A FD | ..E.....@.@.....
00000020: 26 37 0A FD 26 36 B7 8C 17 C1 00 72 FF FF 02 40 | &7..&6.....r...@
00000030: 65 58 00 00 30 00 01 02 80 01 00 02 00 04 06 75 | eX..0..........u
00000040: CA 23 3F 44 02 81 5E AC BE 89 08 00 45 00 00 4C | .#?D..^.....E..L
00000050: 98 86 00 00 3F 11 CD 00 0A 00 00 0B 0B 0B 01 05 | ....?...........
00000060: DD 51 00 7B 00 38 16 64 23 00 06 20 00 00 00 00 | .Q.{.8.d#.. ....
00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................
00000090: 41 6A 12 2A 8F 80 DD 7A | Aj.*...z
2024-03-12T06:23:36.248Z|00595|netdev_dpdk|WARN|tun_port_p0: Tunnel offload: outer_l2_len=18 outer_l3_len=20 l2_len=18 l3_len=20 l4_len=0
2024-03-12T06:23:36.248Z|00596|netdev_dpdk|DBG|tun_port_p0: Tunneled packet:
dump mbuf at 0x11d4754c40, iova=0x11d4754f00, buf_len=2176
pkt_len=132, ol_flags=0xd00820000000002, nb_segs=1, port=65535, ptype=0
segment at 0x11d4754c40, data=0x11d4754f80, len=132, off=128, refcnt=1
Dump data at [0x11d4754f80], len=132
00000000: 08 C0 EB AF 0D 3F B4 96 91 BC 45 7B 81 00 00 5C | .....?....E{...\
00000010: 08 00 45 00 00 72 00 00 40 00 40 11 00 00 0A FD | ..E..r..@.@.....
00000020: 26 37 0A FD 26 32 BB 80 17 C1 00 5E FF FF 02 40 | &7..&2.....^...@
00000030: 65 58 00 00 12 00 01 02 80 01 00 0B 80 00 33 33 | eX............33
00000040: 00 00 00 02 0A 90 F1 D7 BB A1 86 DD 60 00 00 00 | ............`...
00000050: 00 10 3A FF FE 80 00 00 00 00 00 00 08 90 F1 FF | ..:.............
00000060: FE D7 BB A1 FF 02 00 00 00 00 00 00 00 00 00 00 | ................
00000070: 00 00 00 02 85 00 0F 1B 00 00 00 00 01 01 0A 90 | ................
00000080: F1 D7 BB A1 | ....
2024-03-12T06:23:36.248Z|00597|netdev_dpdk|WARN|tun_port_p0: Tunnel offload: outer_l2_len=18 outer_l3_len=20 l2_len=18 l3_len=20 l4_len=0
2024-03-12T06:23:36.248Z|00598|netdev_dpdk|DBG|tun_port_p0: Tunneled packet:
dump mbuf at 0x11d4755800, iova=0x11d4755ac0, buf_len=2176
pkt_len=132, ol_flags=0xd00820000000002, nb_segs=1, port=65535, ptype=0
segment at 0x11d4755800, data=0x11d4755b40, len=132, off=128, refcnt=1
Dump data at [0x11d4755b40], len=132
00000000: 68 91 D0 65 C6 C3 B4 96 91 BC 45 7B 81 00 00 5C | h..e......E{...\
00000010: 08 00 45 00 00 72 00 00 40 00 40 11 00 00 0A FD | ..E..r..@.@.....
00000020: 26 37 0A FD 26 38 BB 80 17 C1 00 5E FF FF 02 40 | &7..&8.....^...@
00000030: 65 58 00 00 12 00 01 02 80 01 00 0B 80 00 33 33 | eX............33
00000040: 00 00 00 02 0A 90 F1 D7 BB A1 86 DD 60 00 00 00 | ............`...
00000050: 00 10 3A FF FE 80 00 00 00 00 00 00 08 90 F1 FF | ..:.............
00000060: FE D7 BB A1 FF 02 00 00 00 00 00 00 00 00 00 00 | ................
00000070: 00 00 00 02 85 00 0F 1B 00 00 00 00 01 01 0A 90 | ................
00000080: F1 D7 BB A1 | ....
2024-03-12T06:23:36.248Z|00010|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
@wangjun0728 Are you able to check if the following patch resolves your issue on E810?
diff --git a/lib/dp-packet.c b/lib/dp-packet.c index df7bf8e6b..046acd8ba 100644 --- a/lib/dp-packet.c +++ b/lib/dp-packet.c @@ -597,12 +597,15 @@ dp_packet_ol_send_prepare(struct dp_packet *p, uint64_t flags) * support inner checksum offload and an outer UDP checksum is * required, then we can't offload inner checksum either. As that would * invalidate the outer checksum. */ - if (!(flags & NETDEV_TX_OFFLOAD_OUTER_UDP_CKSUM) && - dp_packet_hwol_is_outer_udp_cksum(p)) { - flags &= ~(NETDEV_TX_OFFLOAD_TCP_CKSUM | - NETDEV_TX_OFFLOAD_UDP_CKSUM | - NETDEV_TX_OFFLOAD_SCTP_CKSUM | - NETDEV_TX_OFFLOAD_IPV4_CKSUM); + if (!(flags & NETDEV_TX_OFFLOAD_OUTER_UDP_CKSUM)) { + if (dp_packet_hwol_is_outer_udp_cksum(p)) { + flags &= ~(NETDEV_TX_OFFLOAD_TCP_CKSUM | + NETDEV_TX_OFFLOAD_UDP_CKSUM | + NETDEV_TX_OFFLOAD_SCTP_CKSUM | + NETDEV_TX_OFFLOAD_IPV4_CKSUM); + } + *dp_packet_ol_flags_ptr(p) &= ~(DP_PACKET_OL_TX_TUNNEL_GENEVE | + DP_PACKET_OL_TX_TUNNEL_VXLAN); } }
Hi, after applying your modification, the error logs for E810 still persist, and there are also additional error logs stating "ip packet has invalid checksum". Moreover, I've noticed this error log in other network card environments as well.
E810:
2024-03-12T05:40:14.226Z|00021|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-12T05:40:14.229Z|00022|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-12T05:40:14.257Z|00023|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-12T05:40:14.395Z|00024|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-12T05:40:14.435Z|00025|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-12T05:40:14.723Z|00026|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-12T05:40:16.257Z|00027|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-12T05:40:18.238Z|00028|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-12T05:40:18.257Z|00029|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-12T05:40:25.849Z|00030|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-12T05:40:37.289Z|00530|connmgr|INFO|br-int<->unix#2: 5 flow_mods 56 s ago (3 adds, 2 deletes)
2024-03-12T05:40:39.194Z|00008|native_tnl(pmd-c02/id:88)|WARN|ip packet has invalid checksum
2024-03-12T05:40:40.193Z|00009|native_tnl(pmd-c02/id:88)|WARN|ip packet has invalid checksum
2024-03-12T05:40:41.120Z|00031|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-12T05:40:42.194Z|00010|native_tnl(pmd-c02/id:88)|WARN|ip packet has invalid checksum
2024-03-12T05:40:46.205Z|00011|native_tnl(pmd-c02/id:88)|WARN|ip packet has invalid checksum
2024-03-12T05:40:54.210Z|00012|native_tnl(pmd-c02/id:88)|WARN|ip packet has invalid checksum
2024-03-12T05:41:10.368Z|00032|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-12T05:42:11.397Z|00033|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
82599:
2024-03-12T05:43:39.300Z|00030|native_tnl(pmd-c03/id:86)|WARN|ip packet has invalid checksum
2024-03-12T05:43:39.508Z|00001|native_tnl(pmd-c02/id:88)|WARN|ip packet has invalid checksum
2024-03-12T05:43:41.521Z|00002|native_tnl(pmd-c02/id:88)|WARN|ip packet has invalid checksum
2024-03-12T05:43:45.535Z|00003|native_tnl(pmd-c02/id:88)|WARN|ip packet has invalid checksum
2024-03-12T05:43:53.541Z|00004|native_tnl(pmd-c02/id:88)|WARN|ip packet has invalid checksum
Thanks @wangjun0728 !
The ol_flags for the packet that might be contributing to MDD failures are:
0xd00820000000002
RTE_MBUF_F_TX_OUTER_UDP_CKSUM
RTE_MBUF_F_TX_TUNNEL_GENEVE
RTE_MBUF_F_TX_IPV6
RTE_MBUF_F_TX_OUTER_IP_CKSUM
RTE_MBUF_F_TX_OUTER_IPV4
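These raw ol_flags values can be decoded mechanically. Below is a small sketch of such a decoder; the bit positions are taken from DPDK 22.11's rte_mbuf_core.h and should be treated as an assumption for other releases:

```python
# Hypothetical decoder for the Tx ol_flags values seen in the mbuf dumps.
# Bit positions assume DPDK 22.11 (rte_mbuf_core.h); verify for your release.
OL_SINGLE_BIT = {
    1 << 41: "RTE_MBUF_F_TX_OUTER_UDP_CKSUM",
    1 << 50: "RTE_MBUF_F_TX_TCP_SEG",
    1 << 54: "RTE_MBUF_F_TX_IP_CKSUM",
    1 << 55: "RTE_MBUF_F_TX_IPV4",
    1 << 56: "RTE_MBUF_F_TX_IPV6",
    1 << 58: "RTE_MBUF_F_TX_OUTER_IP_CKSUM",
    1 << 59: "RTE_MBUF_F_TX_OUTER_IPV4",
    1 << 60: "RTE_MBUF_F_TX_OUTER_IPV6",
}
# L4 checksum type is a two-bit FIELD at bits 52-53, not independent bits.
L4_FIELD = {1: "RTE_MBUF_F_TX_TCP_CKSUM",
            2: "RTE_MBUF_F_TX_SCTP_CKSUM",
            3: "RTE_MBUF_F_TX_UDP_CKSUM"}
# Tunnel type is a four-bit field at bits 45-48.
TUNNEL_FIELD = {1: "RTE_MBUF_F_TX_TUNNEL_VXLAN",
                4: "RTE_MBUF_F_TX_TUNNEL_GENEVE"}

def decode_tx_ol_flags(value):
    names = [name for bit, name in OL_SINGLE_BIT.items() if value & bit]
    l4 = (value >> 52) & 0x3
    if l4:
        names.append(L4_FIELD[l4])
    tun = (value >> 45) & 0xF
    if tun:
        names.append(TUNNEL_FIELD.get(tun, "tunnel_type=%d" % tun))
    return names
```

For example, decode_tx_ol_flags(0xd00820000000002) yields the list of flag names given above for the ICMPv6-in-Geneve packet.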
It is an ICMPv6 packet encapsulated in an IPv4 Geneve tunnel, so the flags seem correct at first glance, but I wonder if the driver gets confused by the mix of IPv6 and IPv4 flags, or simply by the presence of the inner IPv6 mark while inner offloads are not requested. The RTE_MBUF_F_TX_IPV6 flag should not technically be needed here, so we might just clear it?
Maybe something like this on top of the 82599 patch would help with E810 case:
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 8c52accff..270d3e11c 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2607,6 +2607,15 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
(char *) dp_packet_eth(pkt);
mbuf->outer_l3_len = (char *) dp_packet_l4(pkt) -
(char *) dp_packet_l3(pkt);
+
+ /* If neither inner checksums nor TSO is requested, inner marks
+ * should not be set. */
+ if (!(mbuf->ol_flags & (RTE_MBUF_F_TX_IP_CKSUM |
+ RTE_MBUF_F_TX_L4_MASK |
+ RTE_MBUF_F_TX_TCP_SEG))) {
+ mbuf->ol_flags &= ~(RTE_MBUF_F_TX_IPV4 |
+ RTE_MBUF_F_TX_IPV6);
+ }
} else {
mbuf->l2_len = (char *) dp_packet_l3(pkt) -
(char *) dp_packet_eth(pkt);
Could you try?
=============================================================
Another packet is a TCP packet in Geneve tunnel, it has:
0xc90820000000002
RTE_MBUF_F_TX_TCP_CKSUM
RTE_MBUF_F_TX_IPV4
RTE_MBUF_F_TX_OUTER_IP_CKSUM
RTE_MBUF_F_TX_OUTER_IPV4
RTE_MBUF_F_TX_OUTER_UDP_CKSUM
RTE_MBUF_F_TX_TUNNEL_GENEVE
This seems correct; it will also gain RTE_MBUF_F_TX_IP_CKSUM at the end of processing, so it should be fine. I don't see anything that can be wrong with this one.
And one more packet is a UDP (NTP) inside of the Geneve tunnel:
0xcb0820000000002
RTE_MBUF_F_TX_UDP_CKSUM
RTE_MBUF_F_TX_IPV4
RTE_MBUF_F_TX_OUTER_IP_CKSUM
RTE_MBUF_F_TX_OUTER_IPV4
RTE_MBUF_F_TX_OUTER_UDP_CKSUM
RTE_MBUF_F_TX_TUNNEL_GENEVE
This one also seems fine. However, mbuf->ol_flags & RTE_MBUF_F_TX_TCP_CKSUM is an incorrect check in the netdev_dpdk_prep_hwol_packet() function, because the L4 checksum flags are not independent bits, they are values of a bit field. RTE_MBUF_F_TX_UDP_CKSUM is a two-bit value that happens to include the single TCP checksum bit, so the packet will still gain RTE_MBUF_F_TX_IP_CKSUM correctly. However, it will also get tso_segsz initialized with some data, and some other UDP packets may get garbage set in l4_len. So, the correct check should be something like this:
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 8c52accff..4e516c3f8 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2625,7 +2634,7 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
}
}
- if (mbuf->ol_flags & RTE_MBUF_F_TX_TCP_CKSUM) {
+ if ((mbuf->ol_flags & RTE_MBUF_F_TX_L4_MASK) == RTE_MBUF_F_TX_TCP_CKSUM) {
if (!th) {
VLOG_WARN_RL(&rl, "%s: TCP offloading without L4 header"
" pkt len: %"PRIu32"", dev->up.name, mbuf->pkt_len);
@@ -2652,11 +2661,14 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
return false;
}
}
+ }
- if (mbuf->ol_flags & RTE_MBUF_F_TX_IPV4) {
- mbuf->ol_flags |= RTE_MBUF_F_TX_IP_CKSUM;
- }
+ /* If L4 checksum offload is requested, IPv4 should be requested as well. */
+ if (mbuf->ol_flags & RTE_MBUF_F_TX_L4_MASK
+ && mbuf->ol_flags & RTE_MBUF_F_TX_IPV4) {
+ mbuf->ol_flags |= RTE_MBUF_F_TX_IP_CKSUM;
}
+
return true;
}
Maybe worth trying this as well.
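The two-bit-field pitfall described above is easy to demonstrate numerically. The flag values below are taken from DPDK 22.11's rte_mbuf_core.h, so treat them as an assumption for other releases:

```python
# RTE_MBUF_F_TX_UDP_CKSUM is the value 0x3 in the two-bit L4 field at bit 52,
# so it shares bit 52 with RTE_MBUF_F_TX_TCP_CKSUM (value 0x1). Testing a
# single flag with '&' therefore misfires on UDP packets.
TCP_CKSUM = 0x1 << 52
UDP_CKSUM = 0x3 << 52
L4_MASK = 0x3 << 52

udp_flags = 0x0cb0820000000002  # the UDP (NTP) packet from the logs above

buggy_is_tcp = bool(udp_flags & TCP_CKSUM)           # True: misfires on UDP
correct_is_tcp = (udp_flags & L4_MASK) == TCP_CKSUM  # False: compare the field
```

This is exactly why the diff above switches from a bitwise test to an equality test against RTE_MBUF_F_TX_L4_MASK.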
Worth noting that these packets also have a vlan header in the set of outer headers, but this should not cause any issues as offsets seem to be correct.
Another thing that may or may not be an issue: l2_len is technically incorrect for packets that do not request inner checksum offload, for example outer_l2_len=18 outer_l3_len=20 l2_len=18 l3_len=20 l4_len=0. Here l2_len doesn't include the outer L4 length, while it should, since the packet is a tunnel packet. In fact, l2_len and l3_len look like a direct copy of the outer lengths, not the actual lengths of the inner packet. Since we do not request any offloading on the inner headers, having an incorrect l2_len might be fine, but it may as well not be if the driver sets up something weird in the hardware because of it.
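For reference, the expected lengths can be reconstructed from the packet dumps above. This is just a sanity-check sketch; the Geneve option length (two 4-byte words) is read from the 0x02 opt_len byte visible in the hex dumps:

```python
# Header sizes for the dumped Geneve-in-IPv4 packets (VLAN-tagged outer).
ETH, VLAN, IPV4_HDR, UDP_HDR, GENEVE_BASE = 14, 4, 20, 8, 8
geneve_opts = 2 * 4  # Geneve opt_len field = 2 four-byte words in these dumps

outer_l2 = ETH + VLAN               # matches outer_l2_len=18
outer_l3 = IPV4_HDR                 # matches outer_l3_len=20
inner_l2 = UDP_HDR + GENEVE_BASE + geneve_opts + ETH  # what l2_len should be
# The inner-offload dumps show l2_len=38 (correct); the questionable ones show
# l2_len=18, i.e. a copy of outer_l2 that omits the UDP + Geneve headers.
```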
@igsilya Thanks for your reply. I modified it according to your suggestion and added dump printing, but there is still a problem.
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 2ec0f6c6e1459fe3dc0614140c37fdcbdbb228ff..375eb78119c43433aa499a79fe3ff30251d48d13 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2617,6 +2617,26 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
(char *) dp_packet_eth(pkt);
mbuf->outer_l3_len = (char *) dp_packet_l4(pkt) -
(char *) dp_packet_l3(pkt);
+
+ /* If neither inner checksums nor TSO is requested, inner marks
+ * should not be set. */
+ if (!(mbuf->ol_flags & (RTE_MBUF_F_TX_IP_CKSUM |
+ RTE_MBUF_F_TX_L4_MASK |
+ RTE_MBUF_F_TX_TCP_SEG))) {
+ mbuf->ol_flags &= ~(RTE_MBUF_F_TX_IPV4 |
+ RTE_MBUF_F_TX_IPV6);
+ }
+ VLOG_WARN_RL(&rl, "%s: Tunnel offload:"
+ " outer_l2_len=%d"
+ " outer_l3_len=%d"
+ " l2_len=%d"
+ " l3_len=%d"
+ " l4_len=%d",
+ netdev_get_name(&dev->up),
+ mbuf->outer_l2_len, mbuf->outer_l3_len,
+ mbuf->l2_len, mbuf->l3_len, mbuf->l4_len);
+ netdev_dpdk_mbuf_dump(netdev_get_name(&dev->up),
+ "Tunneled packet", mbuf);
} else {
mbuf->l2_len = (char *) dp_packet_l3(pkt) -
(char *) dp_packet_eth(pkt);
@@ -2635,7 +2655,7 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
}
}
- if (mbuf->ol_flags & RTE_MBUF_F_TX_TCP_CKSUM) {
+ if ((mbuf->ol_flags & RTE_MBUF_F_TX_L4_MASK) == RTE_MBUF_F_TX_TCP_CKSUM) {
if (!th) {
VLOG_WARN_RL(&rl, "%s: TCP offloading without L4 header"
" pkt len: %"PRIu32"", dev->up.name, mbuf->pkt_len);
@@ -2662,11 +2682,14 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
return false;
}
}
+ }
- if (mbuf->ol_flags & RTE_MBUF_F_TX_IPV4) {
- mbuf->ol_flags |= RTE_MBUF_F_TX_IP_CKSUM;
- }
+ /* If L4 checksum offload is requested, IPv4 should be requested as well. */
+ if (mbuf->ol_flags & RTE_MBUF_F_TX_L4_MASK
+ && mbuf->ol_flags & RTE_MBUF_F_TX_IPV4) {
+ mbuf->ol_flags |= RTE_MBUF_F_TX_IP_CKSUM;
}
+
return true;
}
2024-03-13T06:05:52.058Z|00025|netdev_dpdk(pmd-c00/id:89)|WARN|tun_port_p0: Tunnel offload: outer_l2_len=18 outer_l3_len=20 l2_len=38 l3_len=20 l4_len=32
2024-03-13T06:05:52.058Z|00026|netdev_dpdk(pmd-c00/id:89)|DBG|tun_port_p0: Tunneled packet:
dump mbuf at 0x18eeda200, iova=0x18eeda4c0, buf_len=2176
pkt_len=128, ol_flags=0xc90820000000002, nb_segs=1, port=65535, ptype=0
segment at 0x18eeda200, data=0x18eeda502, len=128, off=66, refcnt=1
Dump data at [0x18eeda502], len=128
00000000: A0 88 C2 20 00 7E B4 96 91 BC 45 7B 81 00 00 5C | ... .~....E{...\
00000010: 08 00 45 00 00 6E 00 00 40 00 40 11 00 00 0A FD | ..E..n..@.@.....
00000020: 26 37 0A FD 26 3B BA 1F 17 C1 00 5A FF FF 02 40 | &7..&;.....Z...@
00000030: 65 58 00 00 2D 00 01 02 80 01 00 08 00 04 40 FE | eX..-.........@.
00000040: 95 EF 85 2C 0A 8B BF 77 86 35 08 00 45 00 00 34 | ...,...w.5..E..4
00000050: 00 00 40 00 3F 06 59 AA 0A 00 00 0B 0A 38 CD D7 | ..@.?.Y......8..
00000060: 00 16 E1 29 11 B7 E7 77 2C 3F 64 5F 80 12 FA F0 | ...)...w,?d....
00000070: E2 40 00 00 02 04 05 B4 01 01 04 02 01 03 03 09 | .@..............
2024-03-13T06:05:53.072Z|00027|netdev_dpdk(pmd-c00/id:89)|WARN|tun_port_p0: Tunnel offload: outer_l2_len=18 outer_l3_len=20 l2_len=38 l3_len=20 l4_len=32
2024-03-13T06:05:53.072Z|00028|netdev_dpdk(pmd-c00/id:89)|DBG|tun_port_p0: Tunneled packet:
dump mbuf at 0x18eed9640, iova=0x18eed9900, buf_len=2176
pkt_len=128, ol_flags=0xc90820000000002, nb_segs=1, port=65535, ptype=0
segment at 0x18eed9640, data=0x18eed9942, len=128, off=66, refcnt=1
Dump data at [0x18eed9942], len=128
00000000: A0 88 C2 20 00 7E B4 96 91 BC 45 7B 81 00 00 5C | ... .~....E{...\
00000010: 08 00 45 00 00 6E 00 00 40 00 40 11 00 00 0A FD | ..E..n..@.@.....
00000020: 26 37 0A FD 26 3B BA 1F 17 C1 00 5A FF FF 02 40 | &7..&;.....Z...@
00000030: 65 58 00 00 2D 00 01 02 80 01 00 08 00 04 40 FE | eX..-.........@.
00000040: 95 EF 85 2C 0A 8B BF 77 86 35 08 00 45 00 00 34 | ...,...w.5..E..4
00000050: 00 00 40 00 3F 06 59 AA 0A 00 00 0B 0A 38 CD D7 | ..@.?.Y......8..
00000060: 00 16 E1 29 11 B7 E7 77 2C 3F 64 5F 80 12 FA F0 | ...)...w,?d....
00000070: E2 40 00 00 02 04 05 B4 01 01 04 02 01 03 03 09 | .@..............
2024-03-13T06:05:54.081Z|00029|netdev_dpdk(pmd-c00/id:89)|WARN|tun_port_p0: Tunnel offload: outer_l2_len=18 outer_l3_len=20 l2_len=38 l3_len=20 l4_len=32
2024-03-13T06:05:54.081Z|00030|netdev_dpdk(pmd-c00/id:89)|DBG|tun_port_p0: Tunneled packet:
dump mbuf at 0x18eed8a80, iova=0x18eed8d40, buf_len=2176
pkt_len=128, ol_flags=0xc90820000000002, nb_segs=1, port=65535, ptype=0
segment at 0x18eed8a80, data=0x18eed8d82, len=128, off=66, refcnt=1
Dump data at [0x18eed8d82], len=128
00000000: A0 88 C2 20 00 7E B4 96 91 BC 45 7B 81 00 00 5C | ... .~....E{...\
00000010: 08 00 45 00 00 6E 00 00 40 00 40 11 00 00 0A FD | ..E..n..@.@.....
00000020: 26 37 0A FD 26 3B BA 1F 17 C1 00 5A FF FF 02 40 | &7..&;.....Z...@
00000030: 65 58 00 00 2D 00 01 02 80 01 00 08 00 04 40 FE | eX..-.........@.
00000040: 95 EF 85 2C 0A 8B BF 77 86 35 08 00 45 00 00 34 | ...,...w.5..E..4
00000050: 00 00 40 00 3F 06 59 AA 0A 00 00 0B 0A 38 CD D7 | ..@.?.Y......8..
00000060: 00 16 E1 29 11 B7 E7 77 2C 3F 64 5F 80 12 FA F0 | ...)...w,?d....
00000070: E2 40 00 00 02 04 05 B4 01 01 04 02 01 03 03 09 | .@..............
2024-03-13T06:05:55.080Z|00031|netdev_dpdk(pmd-c00/id:89)|DBG|tun_port_p0: Tunneled packet:
dump mbuf at 0x18eed7ec0, iova=0x18eed8180, buf_len=2176
pkt_len=128, ol_flags=0xc90820000000002, nb_segs=1, port=65535, ptype=0
segment at 0x18eed7ec0, data=0x18eed81c2, len=128, off=66, refcnt=1
Dump data at [0x18eed81c2], len=128
00000000: A0 88 C2 20 00 7E B4 96 91 BC 45 7B 81 00 00 5C | ... .~....E{...\
00000010: 08 00 45 00 00 6E 00 00 40 00 40 11 00 00 0A FD | ..E..n..@.@.....
00000020: 26 37 0A FD 26 3B BA 1F 17 C1 00 5A FF FF 02 40 | &7..&;.....Z...@
00000030: 65 58 00 00 2D 00 01 02 80 01 00 08 00 04 40 FE | eX..-.........@.
00000040: 95 EF 85 2C 0A 8B BF 77 86 35 08 00 45 00 00 34 | ...,...w.5..E..4
00000050: 00 00 40 00 3F 06 59 AA 0A 00 00 0B 0A 38 CD D7 | ..@.?.Y......8..
00000060: 00 16 E1 29 11 B7 E7 77 2C 3F 64 5F 80 12 FA F0 | ...)...w,?d....
00000070: E2 40 00 00 02 04 05 B4 01 01 04 02 01 03 03 09 | .@..............
2024-03-13T06:07:07.088Z|00032|netdev_dpdk(pmd-c00/id:89)|WARN|Dropped 1 log messages in last 72 seconds (most recently, 72 seconds ago) due to excessive rate
2024-03-13T06:07:07.088Z|00033|netdev_dpdk(pmd-c00/id:89)|WARN|tun_port_p0: Tunnel offload: outer_l2_len=18 outer_l3_len=20 l2_len=18 l3_len=20 l4_len=0
2024-03-13T06:07:07.088Z|00034|netdev_dpdk(pmd-c00/id:89)|DBG|tun_port_p0: Tunneled packet:
dump mbuf at 0x11d2f6ff00, iova=0x11d2f701c0, buf_len=2176
pkt_len=124, ol_flags=0xc00820000000002, nb_segs=1, port=65535, ptype=0
segment at 0x11d2f6ff00, data=0x11d2f70240, len=124, off=128, refcnt=1
Dump data at [0x11d2f70240], len=124
00000000: 40 A6 B7 21 92 8C B4 96 91 BC 45 7B 81 00 00 5C | @..!......E{...\
00000010: 08 00 45 00 00 6A 00 00 40 00 40 11 00 00 0A FD | ..E..j..@.@.....
00000020: 26 37 0A FD 26 36 AE 80 17 C1 00 56 FF FF 02 40 | &7..&6.....V...@
00000030: 65 58 00 00 31 00 01 02 80 01 00 05 80 00 33 33 | eX..1.........33
00000040: 00 00 00 02 06 D8 CE 6A 6F 48 86 DD 60 00 97 93 | .......joH.....
00000050: 00 08 3A FF FE 80 00 00 00 00 00 00 25 8F 09 39 | ..:.........%..9
00000060: 36 02 3D 47 FF 02 00 00 00 00 00 00 00 00 00 00 | 6.=G............
00000070: 00 00 00 02 85 00 DB 25 00 00 00 00 | .......%....
2024-03-13T06:07:07.088Z|00035|netdev_dpdk(pmd-c00/id:89)|WARN|tun_port_p0: Tunnel offload: outer_l2_len=18 outer_l3_len=20 l2_len=18 l3_len=20 l4_len=0
2024-03-13T06:07:07.088Z|00036|netdev_dpdk(pmd-c00/id:89)|DBG|tun_port_p0: Tunneled packet:
dump mbuf at 0x11d2f70ac0, iova=0x11d2f70d80, buf_len=2176
pkt_len=124, ol_flags=0xc00820000000002, nb_segs=1, port=65535, ptype=0
segment at 0x11d2f70ac0, data=0x11d2f70e00, len=124, off=128, refcnt=1
Dump data at [0x11d2f70e00], len=124
00000000: 68 91 D0 65 C6 C3 B4 96 91 BC 45 7B 81 00 00 5C | h..e......E{...\
00000010: 08 00 45 00 00 6A 00 00 40 00 40 11 00 00 0A FD | ..E..j..@.@.....
00000020: 26 37 0A FD 26 38 AE 80 17 C1 00 56 FF FF 02 40 | &7..&8.....V...@
00000030: 65 58 00 00 31 00 01 02 80 01 00 05 80 00 33 33 | eX..1.........33
00000040: 00 00 00 02 06 D8 CE 6A 6F 48 86 DD 60 00 97 93 | .......joH.....
00000050: 00 08 3A FF FE 80 00 00 00 00 00 00 25 8F 09 39 | ..:.........%..9
00000060: 36 02 3D 47 FF 02 00 00 00 00 00 00 00 00 00 00 | 6.=G............
00000070: 00 00 00 02 85 00 DB 25 00 00 00 00 | .......%....
2024-03-13T06:07:07.088Z|00037|netdev_dpdk(pmd-c00/id:89)|WARN|tun_port_p0: Tunnel offload: outer_l2_len=18 outer_l3_len=20 l2_len=18 l3_len=20 l4_len=0
2024-03-13T06:07:07.088Z|00038|netdev_dpdk(pmd-c00/id:89)|DBG|tun_port_p0: Tunneled packet:
dump mbuf at 0x11d2f71680, iova=0x11d2f71940, buf_len=2176
pkt_len=124, ol_flags=0xc00820000000002, nb_segs=1, port=65535, ptype=0
segment at 0x11d2f71680, data=0x11d2f719c0, len=124, off=128, refcnt=1
Dump data at [0x11d2f719c0], len=124
00000000: 08 C0 EB AF 0D 3F B4 96 91 BC 45 7B 81 00 00 5C | .....?....E{...\
00000010: 08 00 45 00 00 6A 00 00 40 00 40 11 00 00 0A FD | ..E..j..@.@.....
00000020: 26 37 0A FD 26 32 AE 80 17 C1 00 56 FF FF 02 40 | &7..&2.....V...@
00000030: 65 58 00 00 31 00 01 02 80 01 00 05 80 00 33 33 | eX..1.........33
00000040: 00 00 00 02 06 D8 CE 6A 6F 48 86 DD 60 00 97 93 | .......joH.....
00000050: 00 08 3A FF FE 80 00 00 00 00 00 00 25 8F 09 39 | ..:.........%..9
00000060: 36 02 3D 47 FF 02 00 00 00 00 00 00 00 00 00 00 | 6.=G............
00000070: 00 00 00 02 85 00 DB 25 00 00 00 00 | .......%....
2024-03-13T06:07:07.088Z|00039|netdev_dpdk(pmd-c00/id:89)|WARN|tun_port_p0: Tunnel offload: outer_l2_len=18 outer_l3_len=20 l2_len=0 l3_len=0 l4_len=0
2024-03-13T06:07:07.088Z|00040|netdev_dpdk(pmd-c00/id:89)|DBG|tun_port_p0: Tunneled packet:
dump mbuf at 0x18eed7300, iova=0x18eed75c0, buf_len=2176
pkt_len=124, ol_flags=0xc00820000000002, nb_segs=1, port=65535, ptype=0
segment at 0x18eed7300, data=0x18eed7602, len=124, off=66, refcnt=1
Dump data at [0x18eed7602], len=124
00000000: 6C FE 54 2F 0D C0 B4 96 91 BC 45 7B 81 00 00 5C | l.T/......E{...\
00000010: 08 00 45 00 00 6A 00 00 40 00 40 11 00 00 0A FD | ..E..j..@.@.....
00000020: 26 37 0A FD 26 39 AE 80 17 C1 00 56 FF FF 02 40 | &7..&9.....V...@
00000030: 65 58 00 00 31 00 01 02 80 01 00 05 80 00 33 33 | eX..1.........33
00000040: 00 00 00 02 06 D8 CE 6A 6F 48 86 DD 60 00 97 93 | .......joH.....
00000050: 00 08 3A FF FE 80 00 00 00 00 00 00 25 8F 09 39 | ..:.........%..9
00000060: 36 02 3D 47 FF 02 00 00 00 00 00 00 00 00 00 00 | 6.=G............
00000070: 00 00 00 02 85 00 DB 25 00 00 00 00 | .......%....
2024-03-13T06:07:07.088Z|00010|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
Hi, are there any other modification suggestions for the E810 network card, or do you need to modify it from the ice driver?
(jumping in the thread) An MDD event can be associated with a "wrong" (from the hw pov) Tx descriptor.
Could you please set --log-level=pmd.net.ice.*:debug ?
I captured packets on the E810 sender because it supports inner and outer layer offloading, and the packets look normal.
Then, when I captured packets on the receiving end and inspected them, I found that the outer UDP checksum was incorrect. I suspect this might be causing the issue. The network card I'm using supports outer checksum offload, but the actual packets don't seem to have undergone outer layer offloading.
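To double-check a suspicious outer checksum offline, the standard RFC 1071 one's-complement sum can be computed over the captured header bytes. A minimal sketch (the sample header below is a textbook IPv4 example, not one of the dumps above):

```python
def internet_checksum(data: bytes) -> int:
    """RFC 1071 one's-complement checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(int.from_bytes(data[i:i + 2], "big")
                for i in range(0, len(data), 2))
    while total >> 16:                      # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# Well-known IPv4 header example with the checksum field zeroed; the
# computed checksum is 0xB861. Verifying a captured header (checksum field
# left in place) should yield 0.
ipv4_hdr = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
```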
Furthermore, I applied this patch, but it had no effect; the issue still persists. @david-marchand https://git.dpdk.org/dpdk/commit?id=daac90272857812b3da1db95caf5922f03a83343
After disabling the outer UDP checksum offload of the E810 network card, I verified that network communication on my end was normal. However, the 'MDD event' still persists, although it has resolved the issue I was experiencing. I believe this might be due to the DPDK ice driver not supporting outer UDP checksum offload but still enabling the flag, causing this issue. This modification can be considered a temporary step back, awaiting resolution from the DPDK ice driver before re-enabling the feature. @igsilya CC @mkp-rh @david-marchand
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index ea18eeb2d6ee1fb8bf9d9bedb95416db4daf5b99..fa8af37cd451576060a24514506ce66a365a4be9 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1364,6 +1364,12 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_TCP_CKSUM;
}
+ if (!strcmp(info.driver_name, "net_ice")) {
+ VLOG_INFO("%s: disabled Tx outer udp checksum offloads for a net/ice port.",
+ netdev_get_name(&dev->up));
+ info.tx_offload_capa &= ~RTE_ETH_TX_OFFLOAD_OUTER_UDP_CKSUM;
+ }
+
if (info.tx_offload_capa & RTE_ETH_TX_OFFLOAD_IPV4_CKSUM) {
dev->hw_ol_features |= NETDEV_TX_IPV4_CKSUM_OFFLOAD;
} else {
@wangjun0728 thanks for extra testing!
I think it is reasonable to disable the offload for this driver for now. Do you want to send a proper patch to dev@openvswitch.org (contributing guide)? If not, I can pick up the change and send it myself.
Hi, the Geneve overlay works normally without turning on userspace-tso-enable. However, if I configure userspace-tso-enable=true, iperf TCP traffic does flow, but it cannot reach any significant throughput. I captured packets at the receiving end and checked: there was indeed an outer UDP length anomaly. The anomaly only appears when sending TCP packets; sending UDP packets works normally. @igsilya
Additionally, I applied this patch:https://patchwork.ozlabs.org/project/openvswitch/patch/20240221040855.271921-1-mkp@redhat.com/ CC @mkp-rh
TCP:
UDP:
When this patch is not applied, the packet capture phenomenon has the same effect. TCP cannot transmit a large amount of traffic.
https://patchwork.ozlabs.org/project/openvswitch/patch/20240221040855.271921-1-mkp@redhat.com/
(jumping in the thread) An MDD event can be associated with a "wrong" (from the hw pov) Tx descriptor.
- I suspect the vector tx handler does not support tunneling offload, but I am not sure (looking at the logs in this thread) which handler has been selected by the net/ice driver.
Could you please set --log-level=pmd.net.ice.*:debug ?
- If you confirm that a vector Tx handler has been selected with the debug logs, I suggest applying this DPDK fix: https://git.dpdk.org/dpdk/commit?id=daac90272857812b3da1db95caf5922f03a83343
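Since OVS owns the EAL command line here, one way to pass that log level is through the dpdk-extra option (assuming that knob is available in this build; it is only read at DPDK init, so the daemon needs a restart):

```shell
# Append the ice PMD debug log level to the EAL arguments used at DPDK init.
ovs-vsctl set Open_vSwitch . other_config:dpdk-extra="--log-level=pmd.net.ice.*:debug"
# dpdk-extra is only read at startup; restart ovs-vswitchd afterwards
# (the service name varies by distro, e.g. openvswitch or openvswitch-switch).
systemctl restart openvswitch
```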
I set --log-level=pmd.net.ice.*:debug; the output is attached as dpdk.log.
@wangjun0728 The incorrect length in the outer UDP header might be a bug in https://patchwork.ozlabs.org/project/openvswitch/patch/20240221040855.271921-1-mkp@redhat.com/ . The patch hasn't been reviewed yet, and at a quick glance it might indeed be missing the update for the outer UDP header.
You need a card capable of Tunnel TSO in order to have good performance. The userspace fallback implemented in the patch will not be very fast even if the UDP length is fixed, because it performs way too many operations including large memory copies. The case without this patch is likely just dropping large packets that iperf is trying to send, so TCP stack is trying to adjust for the maximum packet size it can actually send and that reflects in the very bad performance. TCP suffers much harder than UDP, because UDP just fragments packets on the sender as we do not advertise support for UFO.
The card capable of Tunnel TSO in your case is E810, but you disabled outer checksum offload, so Tunnel TSO will not work.
Because the network cards I'm using, E810/82599/CX5, do not support tx_out_udp_csum_offload, can I not enable userspace-tso-enable?
In the current state of OVS development all the tunneled traffic will be dropped: https://github.com/openvswitch/ovs/blob/9d0a40120f9f71ed9ddf32d37d1b03b0fd7f4703/lib/netdev.c#L917-L932
The patch from @mkp-rh that you mentioned will add support for segmenting packets in software before sending them out in this case, i.e. not just dropping them. But it is not going to be very fast, so it might be faster to just let the sender segment packets before sending them out; I didn't test that.
Hi, I use "ovs-appctl coverage/read-counter netdev_geneve_tso_drops" to check the value, which is zero, and there are no related error logs printed either. Next, I will try with X710 to see whether its DPDK driver supports the outer UDP checksum and tunnel TSO capabilities.
Hi, I use "ovs-appctl coverage/read-counter netdev_geneve_tso_drops" to check the value, which is zero, and there are no related error logs printed either.

This is on E810, right? I think we need to extend your patch to not only disable RTE_ETH_TX_OFFLOAD_OUTER_UDP_CKSUM, but also disable all the dependent offloads like RTE_ETH_TX_OFFLOAD_VXLAN_TNL_TSO and RTE_ETH_TX_OFFLOAD_GENEVE_TNL_TSO. With that you should see the log and the counter. Devices should not advertise tunnel TSO if they do not support outer checksums.
I understand your point, but currently, when TSO is enabled, the values for E810 tx_geneve_tso and tx_vxlan_tso_offload are indeed disabled. So, my validation result is based on the situation where tx_out_udp_csum_offload/tx_geneve_tso/tx_vxlan_tso_offload are disabled. However, your suggestion to explicitly disable them in the code is also reasonable, and I will make the necessary modifications accordingly.
ovs-vsctl get open . other_config
{bundle-idle-timeout="3600", dpdk-extra=" -a 0000:af:00.1 -a 0000:af:00.0", dpdk-init="true", dpdk-socket-mem="2048", n-handler-threads="1", pmd-cpu-mask="0xf", userspace-tso-enable="true", vlan-limit="0"}
ovs-vsctl get interface tun_port_p0 status
{bus_info="bus_name=pci, vendor_id=8086, device_id=159b", driver_name=net_ice, if_descr="DPDK 23.11.0 net_ice", if_type="6", link_speed="25Gbps", max_hash_mac_addrs="0", max_mac_addrs="64", max_rx_pktlen="1618", max_rx_queues="256", max_tx_queues="256", max_vfs="0", max_vmdq_pools="0", min_rx_bufsize="1024", n_rxq="2", n_txq="5", numa_id="1", port_no="1", rx-steering=rss, rx_csum_offload="true", tx_geneve_tso_offload="false", tx_ip_csum_offload="true", tx_out_ip_csum_offload="true", tx_out_udp_csum_offload="false", tx_sctp_csum_offload="true", tx_tcp_csum_offload="true", tx_tcp_seg_offload="true", tx_udp_csum_offload="true", tx_vxlan_tso_offload="false"}
For X710 there is a similar issue to E810: although tx_out_udp_csum_offload is true, I observed incorrect outer UDP checksums when inspecting packets at the receiving end. I have made some modifications and revalidated; the issue now appears to be resolved. I will submit a patch shortly.
From what I can see in the DPDK code, it appears that the i40e driver does not handle the outer UDP checksum logic. https://github.com/DPDK/dpdk/blob/main/drivers/net/i40e/i40e_rxtx.c#L301
@wangjun0728 thanks. Yeah, it looks like it just advertises the feature, but doesn't do anything about it... Could you open a bug on https://bugs.dpdk.org for both i40e and ice, if you didn't already?
@igsilya Thank you very much. I have filed a bug and will keep tracking it. https://bugs.dpdk.org/show_bug.cgi?id=1406
@wangjun0728 Thanks!
For the issue where TCP does not work well while UDP works: it's still a bit puzzling, but maybe related to this: https://mail.openvswitch.org/pipermail/ovs-discuss/2024-March/053015.html ?
Could you show the output of ovs-appctl dpctl/dump-flows while TCP traffic is (not) flowing? Specifically, I'm interested in the flow that performs the tnl_push action.
I think you are right; the discussion you linked may be related to the issue I encountered. Here is the flow information when I use iperf to send TCP traffic after enabling TSO. Additionally, I have disabled tx_geneve_tso_offload, tx_vxlan_tso_offload, and tx_out_udp_csum_offload.
[root@compute]# ovs-appctl dpctl/dump-flows
flow-dump from pmd on cpu core: 3
recirc_id(0xf0),tunnel(tun_id=0x31,src=10.253.38.54,dst=10.253.38.55,geneve({}),flags(-df+csum+key)),in_port(4),ct_state(-new+est-rel+rpl-inv+trk),ct_label(0/0x1),packet_type(ns=0,id=0),eth(dst=0e:a0:1b:9e:ca:04/01:00:00:00:00:00),eth_type(0x0800),ipv4(frag=no), packets:2390, bytes:138104, used:0.167s, flags:., actions:7
recirc_id(0),tunnel(tun_id=0x31,src=10.253.38.54,dst=10.253.38.55,geneve({class=0x102,type=0x80,len=4,0x60005}),flags(-df+csum+key)),in_port(4),skb_mark(0/0x4),ct_state(-trk),packet_type(ns=0,id=0),eth(src=0a:c8:e1:5c:84:0e,dst=0e:a0:1b:9e:ca:04/01:00:00:00:00:00),eth_type(0x0800),ipv4(src=10.0.0.5/128.0.0.0,proto=6,frag=no), packets:2390, bytes:138104, used:0.167s, flags:., actions:ct(zone=10),recirc(0xf0)
recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=0e:a0:1b:9e:ca:04/01:00:00:00:00:00,dst=0a:c8:e1:5c:84:0e),eth_type(0x0800),ipv4(src=10.0.0.3/128.0.0.0,dst=10.0.0.5/128.0.0.0,proto=6,frag=no), packets:4611, bytes:7858514, used:0.167s, flags:P., actions:ct(zone=10),recirc(0xed)
recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=40:a6:b7:21:92:8c,dst=6c:fe:54:2f:7e:b0),eth_type(0x8100),vlan(vid=92,pcp=0),encap(eth_type(0x0800),ipv4(dst=10.253.38.55,proto=17,frag=no),udp(dst=6081)), packets:2390, bytes:286284, used:0.167s, actions:pop_vlan,tnl_pop(4)
recirc_id(0xed),in_port(7),skb_mark(0),ct_state(-new+est-rel-rpl-inv+trk),ct_label(0/0x1),packet_type(ns=0,id=0),eth(src=0e:a0:1b:9e:ca:04,dst=0a:c8:e1:5c:84:0e),eth_type(0x0800),ipv4(dst=10.0.0.5/255.255.255.252,tos=0/0x3,frag=no), packets:4611, bytes:7858514, used:0.167s, flags:P., actions:set(skb_mark(0x1)),tnl_push(tnl_port(4),header(size=58,type=5,eth(dst=40:a6:b7:21:92:8c,src=6c:fe:54:2f:7e:b0,dl_type=0x0800),ipv4(src=10.253.38.55,dst=10.253.38.54,proto=17,tos=0,ttl=64,frag=0x4000),udp(src=0,dst=6081,csum=0xffff),geneve(crit,vni=0x31,options({class=0x102,type=0x80,len=4,0x50006}))),out_port(1)),push_vlan(vid=92,pcp=0),lb_output(2)
flow-dump from pmd on cpu core: 2
recirc_id(0xed),in_port(7),skb_mark(0),ct_state(-new+est-rel-rpl-inv+trk),ct_label(0/0x1),packet_type(ns=0,id=0),eth(src=0e:a0:1b:9e:ca:04,dst=0a:c8:e1:5c:84:0e),eth_type(0x0800),ipv4(dst=10.0.0.5/255.255.255.252,tos=0/0x3,frag=no), packets:118, bytes:196172, used:0.435s, flags:P., actions:set(skb_mark(0x1)),tnl_push(tnl_port(4),header(size=58,type=5,eth(dst=40:a6:b7:21:92:8c,src=6c:fe:54:2f:7e:b0,dl_type=0x0800),ipv4(src=10.253.38.55,dst=10.253.38.54,proto=17,tos=0,ttl=64,frag=0x4000),udp(src=0,dst=6081,csum=0xffff),geneve(crit,vni=0x31,options({class=0x102,type=0x80,len=4,0x50006}))),out_port(1)),push_vlan(vid=92,pcp=0),lb_output(2)
recirc_id(0),in_port(7),packet_type(ns=0,id=0),eth(src=0e:a0:1b:9e:ca:04/01:00:00:00:00:00,dst=0a:c8:e1:5c:84:0e),eth_type(0x0800),ipv4(src=10.0.0.3/128.0.0.0,dst=10.0.0.5/128.0.0.0,proto=6,frag=no), packets:118, bytes:196172, used:0.435s, flags:P., actions:ct(zone=10),recirc(0xed)
recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=40:a6:b7:23:6a:90,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8100),vlan(vid=91),encap(eth_type(0x0806),arp(sip=10.253.38.27,tip=10.253.38.27,op=1)), packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=40:a6:b7:23:6a:90,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8100),vlan(vid=91),encap(eth_type(0x0806),arp(sip=10.253.38.26,tip=10.253.38.26,op=1)), packets:0, bytes:0, used:never, actions:drop
recirc_id(0),in_port(3),packet_type(ns=0,id=0),eth(src=40:a6:b7:23:6a:90,dst=ff:ff:ff:ff:ff:ff),eth_type(0x8100),vlan(vid=91),encap(eth_type(0x0806),arp(sip=10.253.38.25,tip=10.253.38.25,op=1)), packets:0, bytes:0, used:never, actions:drop
[root@compute]# ovs-vsctl get interface tun_port_p0 status
{bus_info="bus_name=pci, vendor_id=8086, device_id=1572", driver_name=net_i40e, if_descr="DPDK 23.11.0 net_i40e", if_type="6", link_speed="10Gbps", max_hash_mac_addrs="0", max_mac_addrs="64", max_rx_pktlen="1618", max_rx_queues="320", max_tx_queues="320", max_vfs="0", max_vmdq_pools="64", min_rx_bufsize="1024", n_rxq="2", n_txq="5", numa_id="0", port_no="0", rx-steering=rss, rx_csum_offload="true", tx_geneve_tso_offload="false", tx_ip_csum_offload="true", tx_out_ip_csum_offload="true", tx_out_udp_csum_offload="false", tx_sctp_csum_offload="true", tx_tcp_csum_offload="true", tx_tcp_seg_offload="true", tx_udp_csum_offload="true", tx_vxlan_tso_offload="false"}
The DPDK version is 22.11. Currently, it appears that the DPDK errors are occurring due to the new version's checksum offload. Mellanox network cards seem to be operating normally. However, both E810 and 82599 network cards are displaying different error messages.
E810: {bus_info="bus_name=pci, vendor_id=8086, device_id=159b", driver_name=net_ice, if_descr="DPDK 22.11.1 net_ice", if_type="6", link_speed="25Gbps", max_hash_mac_addrs="0", max_mac_addrs="64", max_rx_pktlen="1618", max_rx_queues="256", max_tx_queues="256", max_vfs="0", max_vmdq_pools="0", min_rx_bufsize="1024", n_rxq="2", n_txq="5", numa_id="1", port_no="1", rx-steering=rss, rx_csum_offload="true", tx_geneve_tso_offload="false", tx_ip_csum_offload="true", tx_out_ip_csum_offload="true", tx_out_udp_csum_offload="true", tx_sctp_csum_offload="true", tx_tcp_csum_offload="true", tx_tcp_seg_offload="false", tx_udp_csum_offload="true", tx_vxlan_tso_offload="false"}
error:
2024-03-04T10:57:01.102Z|00018|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-04T10:57:01.105Z|00019|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-04T10:57:01.113Z|00020|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-04T10:57:01.167Z|00021|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-04T10:57:01.278Z|00022|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
2024-03-04T10:57:01.599Z|00023|dpdk|WARN|ice_interrupt_handler(): OICR: MDD event
82599: {bus_info="bus_name=pci, vendor_id=8086, device_id=10fb", driver_name=net_ixgbe, if_descr="DPDK 22.11.1 net_ixgbe", if_type="6", link_speed="10Gbps", max_hash_mac_addrs="4096", max_mac_addrs="127", max_rx_pktlen="1618", max_rx_queues="128", max_tx_queues="64", max_vfs="0", max_vmdq_pools="64", min_rx_bufsize="1024", n_rxq="2", n_txq="5", numa_id="0", port_no="1", rx-steering=rss, rx_csum_offload="true", tx_geneve_tso_offload="false", tx_ip_csum_offload="true", tx_out_ip_csum_offload="false", tx_out_udp_csum_offload="false", tx_sctp_csum_offload="true", tx_tcp_csum_offload="true", tx_tcp_seg_offload="false", tx_udp_csum_offload="true", tx_vxlan_tso_offload="false"}
error:
2024-03-04T11:04:52.740Z|00384|netdev_dpdk|WARN|tun_port_p1: Output batch contains invalid packets. Only 0/1 are valid: Operation not supported
2024-03-04T11:04:54.449Z|00385|netdev_dpdk|WARN|tun_port_p1: Output batch contains invalid packets. Only 0/1 are valid: Operation not supported
2024-03-04T11:04:55.492Z|00386|netdev_dpdk|WARN|tun_port_p1: Output batch contains invalid packets. Only 0/1 are valid: Operation not supported
2024-03-04T11:04:55.592Z|00387|netdev_dpdk|WARN|tun_port_p1: Output batch contains invalid packets. Only 0/1 are valid: Operation not supported
2024-03-04T11:04:56.644Z|00388|netdev_dpdk|WARN|tun_port_p1: Output batch contains invalid packets. Only 0/1 are valid: Operation not supported
mellanox: {bus_info="bus_name=pci, vendor_id=15b3, device_id=1017", driver_name=mlx5_pci, if_descr="DPDK 22.11.1 mlx5_pci", if_type="6", link_speed="25Gbps", max_hash_mac_addrs="0", max_mac_addrs="128", max_rx_pktlen="1618", max_rx_queues="1024", max_tx_queues="1024", max_vfs="0", max_vmdq_pools="0", min_rx_bufsize="32", n_rxq="2", n_txq="5", numa_id="3", port_no="1", rx-steering=rss, rx_csum_offload="true", tx_geneve_tso_offload="false", tx_ip_csum_offload="true", tx_out_ip_csum_offload="true", tx_out_udp_csum_offload="false", tx_sctp_csum_offload="false", tx_tcp_csum_offload="true", tx_tcp_seg_offload="false", tx_udp_csum_offload="true", tx_vxlan_tso_offload="false"}