First, thank you for providing an implementation of VXLAN for OVS.

The latest merge to the vxlan branch is not going well with my test setup. Every time I use the VXLAN tunnel, my machines go down with a kernel panic.

Before the merge, the VXLAN tunnels worked for me but had poor performance, giving me 5.65 Gbps on a 10 Gbps link with an MTU of 9000.
Is this with the latest merge I pushed today? I will look into this if so.
Actually, if it is, please revert to the pre-merge code for today, as I may not get to this in detail until tomorrow. Any additional information, especially regarding which specific version you are running, would be great. Thanks!
Yes, it does seem to be with the latest merge (I had no issues with my vxlan checkout yesterday; as I mentioned, the only problem was poor throughput with the VXLAN tunnels). I have a remote testbed, and because of its idiosyncrasies I only have a screenshot of the kernel panic messages. All I can make out is that in most cases udp_queue_rcv_skb calls vxlan_rcv in the stack trace.
I have a Fedora machine running 3.3.4-5.fc17.x86_64 and an Ubuntu machine running 3.2.0-29-generic, both of which crashed after I checked out today's code from the vxlan branch.
Do let me know if I can provide further information (I am a newbie and am still learning).
The first few entries of my git history (`git hist`) show:
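(Note for anyone reproducing this: `git hist` is typically a user-defined alias, not a built-in git command. Assuming it does the usual thing, a stock near-equivalent would be:)

```sh
# `git hist` is usually an alias; a portable near-equivalent showing the last few commits:
git log --oneline --decorate --graph -5
```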
Thanks much for the prompt responses!
I think I see what the problem is here. I'll fix this up and hopefully push the fix by tomorrow. Thank you for reporting this error!
Thanks much! Looking forward to testing the fix.
I just pushed what I believe is the fix for this issue. Please pull it down and let me know if it addresses your crash. My test rig is having trouble, but I wanted to get this out for you to try. I'll let you know if I hit other issues once my test rig is back up.
Thanks much for the quick resolution. My testbed is currently being used by someone else, but I promise to try it out as soon as it becomes available (hopefully by tomorrow). I would also be grateful if you could tell me what maximum throughput I should expect over the VXLAN tunnel. I have seen one blog post where they got 8.5 Gbps, but the maximum I have gotten so far is 5.65 Gbps. I have tried various things to raise the throughput, without success. The baseline throughput I get between two OVS bridges on two remote machines is 9.9 Gbps without VXLAN tunneling.
I understand that this might not be something that's at the top of your list; I just wanted to put the question up here in case you have some insights.
I'll get back to you with my test results. Once again, thanks much for such a quick resolution and all the help!
I have been attempting to do the very same thing recently, but I have not encountered any kernel panics so far. That may be because I only added the VXLAN support to OVS 1.7.1. I have been experimenting with a lot of tunneling protocols, such as GRE, CAPWAP, STT, and IPsec, and more recently NVGRE and GRE64. Kyle's implementation of VXLAN has been superb!
Radhika, could you point me to the blog where they got 8.5 Gbps? I plan to benchmark these protocols once I am done exploring them, so any previously generated numbers would help me.
Hi Farrukh, that's great. Here is the link: http://networkstatic.net/configuring-vxlan-and-gre-tunnels-on-openvswitch/ They get around 8.7 Gbps, using Kyle's vxlan branch.
Thanks for sharing. Yes, that is Brent's blog; I had almost forgotten about that post. I will definitely share my benchmarking results and setup once I get to that part. I have been working on STT-over-IPsec support recently. Anyway, I appreciate the reply. Regards.
Thanks much again for the fast resolution of the kernel panic. I finished testing the branch today, and sure enough, I don't see the kernel panic anymore.
The throughput is still at 6.x Gbps:
```
Client connecting to 10.0.85.3, TCP port 5001
[  3] local 10.0.101.3 port 51781 connected with 10.0.85.3 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  7.46 GBytes  6.41 Gbits/sec
```
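(The output above matches a plain iperf TCP test; assuming default options, an invocation like the following would produce it, with the server on 10.0.85.3:)

```sh
# On the server (10.0.85.3):
iperf -s
# On the client (10.0.101.3), a 10-second TCP test:
iperf -c 10.0.85.3 -t 10
```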
I am using an MTU of 9000. Do let me know if there is any other setting I can use to raise the throughput to 9.x Gbps; as I mentioned, without VXLAN the OVS bridge gives 9.89 Gbps. Please let me know if this is better discussed as a separate issue.
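(One avenue worth ruling out, as an assumption rather than anything confirmed in this thread: VXLAN encapsulation can defeat NIC offloads such as TSO/GSO and checksum offload, which often accounts for a drop from line rate. Something like this would show the offload state and whether a core saturates during a run:)

```sh
# Inspect offload settings on the physical NIC (interface name taken from host 1 below):
ethtool -k p1p1
# Watch per-CPU load while iperf runs; a single core pegged at 100% suggests
# the encapsulation path is CPU-bound:
mpstat -P ALL 1
```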
Thanks much again Kyle for the quick help and resolution!
Glad it worked! Would you mind posting your configuration for me? The output of `ovs-vsctl show` and `ifconfig -a` would be useful; I want to compare it to one of my test setups. Thanks!
Thanks much for looking into this!
My setup is as follows (PHYLINK is my 10 Gbps physical link):

```
OVS (10.0.85.3) <---VXLAN--- PHYLINK (15.0.85.3) ========== PHYLINK (15.0.101.3) ---VXLAN---> OVS (10.0.101.3)
```
Here is the OVS configuration.

Host 1: `ovs-vsctl show`

```
9399da11-e2ae-49d5-b5c8-08c6864ad7ab
    Bridge ovsbr
        Port "vx1"
            Interface "vx1"
                type: vxlan
                options: {remote_ip="15.0.101.3"}
        Port ovsbr
            Interface ovsbr
```

Host 2: `ovs-vsctl show`

```
94389297-4998-4960-b120-a83f4f2cc4d1
    Bridge ovsbr
        Port "vx1"
            Interface "vx1"
                type: vxlan
                options: {remote_ip="15.0.85.3"}
        Port ovsbr
            Interface ovsbr
```
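(For anyone following along, here is a minimal sketch of the commands that would produce a configuration like host 1's. The bridge name, port name, and addresses are taken from the output above; the `ovs-vsctl` syntax is the standard one, which I am assuming also applies on Kyle's branch:)

```sh
# Create the bridge and a VXLAN port pointing at the peer (host 1 side):
ovs-vsctl add-br ovsbr
ovs-vsctl add-port ovsbr vx1 -- set interface vx1 type=vxlan options:remote_ip=15.0.101.3
# Give the bridge's internal interface its address and jumbo MTU:
ifconfig ovsbr 10.0.85.3 netmask 255.255.0.0 mtu 9000 up
```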
I have a lot of interfaces, so I am showing only the relevant ones:
Host 1:
```
ovsbr: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9000
        inet 10.0.85.3  netmask 255.255.0.0  broadcast 10.0.255.255
        inet6 fe80::f0f4:67ff:fe89:2348  prefixlen 64  scopeid 0x20<link>
        ether f2:f4:67:89:23:48  txqueuelen 0  (Ethernet)
        RX packets 6076890  bytes 51004101244 (47.5 GiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1941275  bytes 6326190860 (5.8 GiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

p1p1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 9000
        inet 15.0.85.3  netmask 255.255.0.0  broadcast 15.0.255.255
        inet6 fe80::92e2:baff:fe26:82d4  prefixlen 64  scopeid 0x20<link>
        ether 90:e2:ba:26:82:d4  txqueuelen 1000  (Ethernet)
        RX packets 5584777  bytes 49719271982 (46.3 GiB)
        RX errors 0  dropped 906  overruns 0  frame 0
        TX packets 1729674  bytes 200678713 (191.3 MiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
```
Host 2:

```
ovsbr     Link encap:Ethernet  HWaddr 16:03:b5:06:56:40
          inet addr:10.0.101.3  Bcast:10.0.255.255  Mask:255.255.0.0
          inet6 addr: fe80::260:ddff:fe46:515f/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9000  Metric:1
          RX packets:8727829 errors:0 dropped:12 overruns:0 frame:0
          TX packets:7412586 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:14258252105 (14.2 GB)  TX bytes:156935144728 (156.9 GB)

eth5      Link encap:Ethernet  HWaddr 00:60:dd:46:51:5f
          inet addr:15.0.101.3  Bcast:15.0.255.255  Mask:255.255.0.0
          inet6 addr: fe80::260:ddff:fe46:515f/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9000  Metric:1
          RX packets:56992363 errors:0 dropped:0 overruns:0 frame:0
          TX packets:69611348 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:17908270611 (17.9 GB)  TX bytes:526649690661 (526.6 GB)
          Interrupt:77
```
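(Both ends show MTU 9000; one quick sanity check, assuming standard IPv4/ICMP header sizes, is a don't-fragment ping to confirm jumbo frames actually survive the physical path end to end:)

```sh
# 8972 = 9000 - 20 (IP header) - 8 (ICMP header); -M do sets the DF bit
# so the ping fails loudly if anything on the path has a smaller MTU:
ping -M do -s 8972 -c 3 15.0.101.3
```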
Do let me know if you need more information. And thank you so much for looking into this!
Hi Kyle, I was wondering if there was some other information I could supply here to help debug this performance issue. Do let me know.
Nothing else right now. Would you mind opening a new issue and copying the information into it? That way we can keep it separate from the issue this thread was originally about.
Also, a lot of tunneling changes are coming down from upstream OVS, so I will be focused on making those work with the VXLAN branch here early next week; I may not get to the performance issue until after that. Just an FYI.
Thanks for letting me know. I just opened another issue. Thanks much for the quick responses and help!