Open autumn0207 opened 5 years ago
@moshe010 can you please take a look.
As I understand it, the CX5 driver doesn't yet support connection tracking and NAT, so such flows will not be offloaded to the NIC.
@autumn0207, you need connection tracking tc offload in OVS [1]; those patches are still under review.
[1] https://www.mail-archive.com/ovs-dev@openvswitch.org/msg34124.html
@moshe010 @girishmg thank you very much. But there is another error in the OVS logs:
2019-09-23T07:31:45.257Z|00006|dpif_netlink(handler66)|ERR|failed to offload flow: Invalid argument: 621b9db0ed0b6aa
2019-09-23T07:31:45.635Z|00001|dpif_netlink(handler67)|ERR|failed to offload flow: Invalid argument: 621b9db0ed0b6aa
2019-09-23T07:31:47.199Z|00005|dpif_netlink(handler71)|ERR|failed to offload flow: Invalid argument: 621b9db0ed0b6aa
2019-09-23T07:32:03.707Z|00002|dpif_netlink(handler67)|ERR|failed to offload flow: Invalid argument: enp216s0_0
2019-09-23T07:32:04.250Z|00006|dpif_netlink(handler71)|ERR|failed to offload flow: Invalid argument: enp216s0_0
2019-09-23T07:32:05.252Z|00007|dpif_netlink(handler71)|ERR|failed to offload flow: Invalid argument: enp216s0_0
enp216s0_0 and 621b9db0ed0b6aa are VF representors.
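To see at a glance which devices are failing and with which error, the log lines above can be grouped by device and reason. This is only a minimal sketch: it assumes the pipe-separated `timestamp|sequence|module(thread)|level|message` layout seen in these logs, and the names `OffloadFailure`, `parse_offload_failure` and `summarize` are made up for illustration, not part of OVS.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Optional

# Message prefix taken from the log lines quoted in this thread.
PREFIX = "failed to offload flow: "


class OffloadFailure(NamedTuple):
    timestamp: str
    thread: str
    reason: str
    device: str


def parse_offload_failure(line: str) -> Optional[OffloadFailure]:
    """Return the parsed failure, or None if the line is something else."""
    parts = line.strip().split("|", 4)
    if len(parts) != 5:
        return None
    timestamp, _seq, module, level, message = parts
    if level != "ERR" or not message.startswith(PREFIX):
        return None
    # The message body looks like "<reason>: <device>".
    reason, sep, device = message[len(PREFIX):].rpartition(": ")
    if not sep:
        return None
    # "dpif_netlink(handler66)" -> "handler66"
    thread = ""
    if "(" in module and module.endswith(")"):
        thread = module[module.find("(") + 1:-1]
    return OffloadFailure(timestamp, thread, reason, device)


def summarize(lines: Iterable[str]) -> Counter:
    """Count failures per (device, reason) pair."""
    counts: Counter = Counter()
    for line in lines:
        failure = parse_offload_failure(line)
        if failure is not None:
            counts[(failure.device, failure.reason)] += 1
    return counts


SAMPLE = """\
2019-09-23T07:31:45.257Z|00006|dpif_netlink(handler66)|ERR|failed to offload flow: Invalid argument: 621b9db0ed0b6aa
2019-09-23T07:31:45.635Z|00001|dpif_netlink(handler67)|ERR|failed to offload flow: Invalid argument: 621b9db0ed0b6aa
2019-09-23T07:32:03.707Z|00002|dpif_netlink(handler67)|ERR|failed to offload flow: Invalid argument: enp216s0_0
"""

if __name__ == "__main__":
    for (device, reason), count in summarize(SAMPLE.splitlines()).items():
        print(f"{device}: {reason} x{count}")
```

Lines that are not offload failures (for example the rate-limit "Dropped N log messages" notices) are simply skipped, so the counts are a lower bound when the log is rate limited.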
@autumn0207, it is hard to say from the log. If connectivity works but flows are not offloaded, I would prefer to wait until all the pieces in OVS and the kernel are merged, and then we can continue. Is the problem only that flows are not offloaded? Does connectivity between pods work?
@moshe010 Yes, connectivity works normally. I just don't understand why no flows can be offloaded.
Hello, I have encountered the same problem. How did you solve it?
@BntumBle, OVS needs connection tracking offload for this to work. On the OVS side it is already merged [1] and is in OVS 2.13. The kernel side is still a work in progress, so offload will work once all the patches land in the kernel.
[1] - https://github.com/openvswitch/ovs/commit/576126a931cdf96d43443916d922462c7a16e350
@moshe010 so you mean that the ovs offload has not been successful yet, and some kernel changes are needed?
@moshe010 I used OVS 2.9.5 to offload a simple forwarding flow, but it failed and the ovs-vswitchd log shows the error "failed to offload flow: Invalid argument". Does this need connection tracking offload?
@BntumBle, you need OVS 2.13, and on the kernel side I don't think all the patches are merged yet. Just to use tc in software (no offload, but traffic goes via tc) you need kernel 5.5.9, the latest upstream.
@moshe010 But I am following this guide to configure Open vSwitch hardware offload: https://help.netronome.com/support/solutions/articles/36000081172-agilio-open-vswitch-tc-user-guide#document-07_Using_openvswitch It says kernel 4.15- is OK.
It is OK for very basic offloads like VLAN push/pop or VXLAN encap/decap. ovn-kubernetes relies on the connection tracking feature, whose offload was introduced much later.
@moshe010 Thank you for your reply. Then if I just perform a basic offload (one port in and one port out), is it necessary to use OVS 2.13, or is OVS 2.9.5 OK?
Yes, but you need a basic CNI such as [1]; see [2].
[1] - https://github.com/kubevirt/ovs-cni
[2] - https://github.com/kubevirt/ovs-cni/blob/master/docs/ovs-offload.md
@autumn0207
Have you solved this problem? I built an environment following the OpenStack reference, but I hit the same error as you. My OS is CentOS 7.8, the kernel is 3.10.0-957.el7.x86_64, OVS is 2.11, and the NIC is a Mellanox CX5.
Offload reference: https://docs.openstack.org/neutron/queens/admin/config-ovs-offload.html
You need connection tracking support. It is in kernel 5.7 or above and OVS 2.13; see https://github.com/ovn-org/ovn-kubernetes/blob/master/docs/ovs_offload.md
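The version requirement above can be turned into a quick sanity check. The following is a rough sketch only: the minimums (kernel 5.7, OVS 2.13) come from this thread, the function names are hypothetical, and it compares upstream version numbers, so it will report a distribution kernel with backported patches as too old even if the feature is actually present.

```python
import re
from typing import Tuple

# Minimum versions as stated in this thread for connection tracking offload.
MIN_KERNEL = (5, 7)
MIN_OVS = (2, 13)


def version_tuple(text: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as '3.10.0-957.el7.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"cannot parse version: {text!r}")
    return int(match.group(1)), int(match.group(2))


def meets_ct_offload_minimums(kernel: str, ovs: str) -> bool:
    """True if both version strings reach the minimums named above."""
    return version_tuple(kernel) >= MIN_KERNEL and version_tuple(ovs) >= MIN_OVS


if __name__ == "__main__":
    # The setup reported in this thread: CentOS kernel 3.10 with OVS 2.11.
    print(meets_ct_offload_minimums("3.10.0-957.el7.x86_64", "2.11"))
```

On a live host the two strings could come from `uname -r` and `ovs-vswitchd --version`; they are passed in as plain arguments here so the check stays self-contained.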
@moshe010 Thanks a lot for your reply. I am not clear whether kernel >= 4.13 is necessary, because someone told me they used CentOS 7.6 with Linux kernel 3.10.0-957.el7.x86_64 and OVS 2.11 to offload VXLAN encap and decap to the Mellanox CX5 successfully. There are also Mellanox application notes that give a successful example with CentOS 7.5: https://www.mellanox.com/related-docs/prod_software/Mellanox_Support_for_TripleO_Rocky_Application_Notes_v1.1.pdf So I am a little confused by these references.
The errors in ovs-vswitchd.log:
2020-12-28T09:34:17.138Z|01842|dpif_netlink(handler131)|DBG|system@ovs-system: put[create] ufid:2da53e35-d5ad-4245-a4b7-af99af4c7f52 recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(5),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),eth(src=fa:16:3e:e3:fa:16,dst=fa:16:3e:66:c1:4f),eth_type(0x0806),arp(sip=192.168.1.4/0.0.0.0,tip=192.168.1.10/0.0.0.0,op=2/0,sha=fa:16:3e:e3:fa:16/00:00:00:00:00:00,tha=fa:16:3e:66:c1:4f/00:00:00:00:00:00), actions:set(tunnel(tun_id=0x17,src=192.168.20.242,dst=192.168.20.241,ttl=64,tp_dst=4789,flags(df|key))),4
2020-12-28T09:34:17.138Z|01843|dpif_netlink(handler131)|ERR|failed to offload flow: Invalid argument: eth0
The ct values in the flow are all zero.
Kernel 4.13 added support for plain VXLAN offload without security groups (security groups use connection tracking). ovn-kubernetes needs Geneve and connection tracking to work (there is no way to disable it as there is in OpenStack). The flow you are seeing is ARP, which is not offloaded anyway, so it is fine that it failed to offload. With connection tracking we can offload only TCP and UDP traffic; ICMP will not be offloaded either.
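The rule of thumb in the reply above (ARP and ICMP are not offloaded, only TCP and UDP are candidates when connection tracking is involved) can be applied to a datapath flow string to decide whether a "failed to offload" error is expected. This is an illustrative sketch, not OVS code: `classify_flow` is a hypothetical helper, it only looks at `eth_type(...)` and the IPv4 `proto=` field, and it says nothing about whether the NIC or driver can actually offload a given TCP or UDP flow.

```python
import re

ETH_TYPE_IPV4 = 0x0800
ETH_TYPE_ARP = 0x0806
IP_PROTO_ICMP = 1
IP_PROTO_TCP = 6
IP_PROTO_UDP = 17


def classify_flow(flow: str) -> str:
    """Return a coarse label for a datapath flow, following the rule above."""
    eth = re.search(r"eth_type\(0x([0-9a-fA-F]+)", flow)
    if eth is None:
        return "unknown"
    eth_type = int(eth.group(1), 16)
    if eth_type == ETH_TYPE_ARP:
        return "arp: not offloaded"
    if eth_type != ETH_TYPE_IPV4:
        return "unknown"
    # proto may be absent when the flow wildcards the IP protocol.
    proto = re.search(r"ipv4\([^)]*\bproto=(\d+)", flow)
    if proto is None:
        return "unknown"
    number = int(proto.group(1))
    if number == IP_PROTO_ICMP:
        return "icmp: not offloaded"
    if number == IP_PROTO_TCP:
        return "tcp: offload candidate"
    if number == IP_PROTO_UDP:
        return "udp: offload candidate"
    return "unknown"


if __name__ == "__main__":
    # Shortened from the ARP flow in the log above.
    arp_flow = "recirc_id(0),in_port(5),eth_type(0x0806),arp(sip=192.168.1.4/0.0.0.0,op=2/0)"
    print(classify_flow(arp_flow))
```

Flows that do not pin down the IP protocol are reported as "unknown" rather than guessed at.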
ovs log:
2019-09-23T03:14:48.824Z|00035|dpif_netlink(handler2)|ERR|failed to offload flow: Operation not supported: 621b9db0ed0b6aa
2019-09-23T03:14:56.222Z|00069|dpif_netlink(revalidator11)|ERR|Dropped 2 log messages in last 8 seconds (most recently, 8 seconds ago) due to excessive rate
2019-09-23T03:14:56.222Z|00070|dpif_netlink(revalidator11)|ERR|failed to offload flow: Operation not supported: 621b9db0ed0b6aa
dump-flows:
2019-09-23T03:15:37Z|00001|dpif_netlink|INFO|The kernel module does not support meters.
recirc_id(0x18),in_port(3),ct_state(-new-est-rel-rpl-inv-trk),ct_label(0/0x1),eth(dst=0a:00:00:a8:00:05),eth_type(0x0800),ipv4(dst=192.168.0.4,frag=no), packets:768, bytes:75264, used:0.046s, actions:3
recirc_id(0x1c),in_port(3),ct_state(-new+est-rel-rpl-inv+trk),ct_label(0/0x1),eth(src=0a:00:00:a8:00:05,dst=00:00:00:e2:93:e3),eth_type(0x0800),ipv4(src=192.168.0.4,dst=192.168.0.1,proto=1,ttl=64,frag=no),icmp(type=8,code=0), packets:3, bytes:294, used:0.047s, actions:userspace(pid=4294149776,slow_path(action))
recirc_id(0),in_port(3),ct_state(-new-est-rel-rpl-inv-trk),ct_label(0/0x1),eth(src=0a:00:00:a8:00:05,dst=00:00:00:e2:93:e3),eth_type(0x0806),arp(sip=192.168.0.4,tip=192.168.0.1,op=1/0xff,sha=0a:00:00:a8:00:05,tha=00:00:00:00:00:00), packets:33, bytes:1980, used:4.045s, actions:userspace(pid=4294149776,slow_path(action))
recirc_id(0x19),in_port(3),ct_state(-new+est-rel-rpl-inv+trk),ct_label(0/0x1),eth(),eth_type(0x0800),ipv4(frag=no), packets:767, bytes:75166, used:0.048s, actions:ct(zone=1,nat),recirc(0x1a)
recirc_id(0x1a),in_port(3),eth(dst=00:00:00:e2:93:e3),eth_type(0x0800),ipv4(proto=1,frag=no),icmp(type=8/0xf8), packets:3, bytes:294, used:0.048s, actions:ct(zone=1),recirc(0x1b)
recirc_id(0x1b),in_port(3),ct_state(-new+est-rel-rpl-inv+trk),ct_label(0/0x1),eth(),eth_type(0x0800),ipv4(src=192.168.0.4/255.255.255.252,frag=no), packets:767, bytes:75166, used:0.048s, actions:ct(zone=1,nat),recirc(0x1c)
recirc_id(0),in_port(3),eth(src=0a:00:00:a8:00:05),eth_type(0x0800),ipv4(src=192.168.0.4,dst=128.0.0.0/128.0.0.0,proto=1,frag=no),icmp(type=8/0xf8), packets:3, bytes:294, used:0.048s, actions:ct(zone=1),recirc(0x19)
hw: mellanox cx5
os: centos 7.6
ovs version: 2.12
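Since the replies point at connection tracking as the missing piece, it can help to pick out which entries of the dump-flows output above actually involve conntrack or the userspace slow path. The sketch below assumes the `<match>, packets:N, bytes:N, used:T, actions:<actions>` layout seen in that output; `DatapathFlow`, `parse_flow` and `describe` are hypothetical names for illustration.

```python
import re
from typing import List, NamedTuple, Optional

# Layout assumed from the dump-flows output quoted in this thread.
FLOW_RE = re.compile(
    r"^(?P<match>.*), packets:(?P<packets>\d+), bytes:(?P<bytes>\d+), "
    r"used:(?P<used>[^,]+), actions:(?P<actions>.*)$"
)


class DatapathFlow(NamedTuple):
    match: str
    packets: int
    byte_count: int
    actions: str


def parse_flow(line: str) -> Optional[DatapathFlow]:
    """Parse one dump-flows line; return None for log or info lines."""
    m = FLOW_RE.match(line.strip())
    if m is None:
        return None
    return DatapathFlow(m["match"], int(m["packets"]), int(m["bytes"]), m["actions"])


def describe(flow: DatapathFlow) -> List[str]:
    """List the conntrack and slow-path features a flow relies on."""
    notes = []
    if "ct(" in flow.actions:
        notes.append("conntrack action")
    if "+trk" in flow.match:
        notes.append("matches tracked state")
    if "userspace(" in flow.actions:
        notes.append("slow path to userspace")
    return notes


SAMPLE = [
    "2019-09-23T03:15:37Z|00001|dpif_netlink|INFO|The kernel module does not support meters.",
    "recirc_id(0x18),in_port(3),ct_state(-new-est-rel-rpl-inv-trk),ct_label(0/0x1),eth(dst=0a:00:00:a8:00:05),eth_type(0x0800),ipv4(dst=192.168.0.4,frag=no), packets:768, bytes:75264, used:0.046s, actions:3",
    "recirc_id(0x19),in_port(3),ct_state(-new+est-rel-rpl-inv+trk),ct_label(0/0x1),eth(),eth_type(0x0800),ipv4(frag=no), packets:767, bytes:75166, used:0.048s, actions:ct(zone=1,nat),recirc(0x1a)",
]

if __name__ == "__main__":
    for line in SAMPLE:
        flow = parse_flow(line)
        if flow is not None:
            print(flow.packets, flow.actions, describe(flow))
```

A flow with an empty note list, such as the first sample entry with a plain output action, is the kind of simple forwarding rule that does not depend on conntrack.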