GNS3 / gns3-gui

GNS3 Graphical Network Simulator
http://www.gns3.com
GNU General Public License v3.0

[2.0.0b4] uBridge keeps crashing on trunk links #1867

Open bpozdena opened 7 years ago

bpozdena commented 7 years ago

Hi,

All my clouds kept randomly stopping forwarding traffic. I then discovered that it happens when the cloud's uBridge processes a packet larger than 1500 bytes.

I tested with uBridge 0.9.11, uBridge 0.9.10-test and some older versions too. None of them work.

As a workaround, I lowered the MTU to 1470 bytes on all servers passing traffic through the cloud (on ESXi vSwitches and on all VMs hosted on it). Everything has been working for over an hour now, compared to a few seconds previously.

The easiest way to reproduce it is to ping across the cloud normally and then with a packet size over 1500 bytes. You will see that the cloud stops forwarding traffic until the link is removed and added back.

Tested on Windows 10 Home and Windows Server 2016 eval. Sairus

ghost commented 7 years ago

I encountered the same problem when I sent an HTTP request from an IOSv/QEMU instance inside GNS3 to a Windows server outside the cloud. After a large packet was returned as a response, I lost communication across the link to the cloud. When I tried to delete the link to the cloud node, I got this error message in the GUI console: Error while deleting link: Could not send Dynamips command 'ethsw removenio "SW1" udp-e09ec3bd-0df1-4a66-b22a-8a9f6e9c7e1a' to 0.0.0.0:52740: Connection lost, process running: False

I have attached screenshots of my topology, switch config, and packet capture from this event.

I was not able to reproduce the problem when I replaced the built-in Ethernet switch with an IOSvL2 switch, so that is my workaround for now.

[Attached images: topology, switch_config, packet_capture]

ghost commented 7 years ago

I did one more test, which was to change the link from sw1 to r1 to an access port. I do not encounter the bug in that case, so it appears to be related to the dot1q setting.

grossmj commented 7 years ago

@sairuscz

I suspect the problem comes from the MTU on the adapter used by the GNS3 cloud. For instance, here is what I have on my system:

C:\Users\GNS3>netsh interface ipv4 show interfaces

Idx     Met         MTU          State                Name
---  ----------  ----------  ------------  ---------------------------
  1          50  4294967295  connected     Loopback Pseudo-Interface 1
 15          10        1500  connected     Ethernet0
 26          40        1500  disconnected  Bluetooth Network Connection
 19          10        1500  connected     LOOPBACK
 33          20        1500  connected     VMware Network Adapter VMnet1
 13          20        1500  connected     VMware Network Adapter VMnet8
 10           0        1400  disconnected  Ethernet
 14          10        1500  connected     VirtualBox Host-Only Network
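
To compare MTU values across many adapters, output like the table above can be parsed mechanically. The following is only a minimal sketch: the `parse_mtus` helper and the `SAMPLE` text are hypothetical names introduced here, and the parser assumes the English five-column layout shown above (Idx, Met, MTU, State, Name).

```python
from typing import Dict

# A few rows copied from the table above; a real run would capture the
# output of "netsh interface ipv4 show interfaces" instead.
SAMPLE = """\
Idx     Met         MTU          State                Name
---  ----------  ----------  ------------  ---------------------------
  1          50  4294967295  connected     Loopback Pseudo-Interface 1
 15          10        1500  connected     Ethernet0
 10           0        1400  disconnected  Ethernet
"""


def parse_mtus(table: str) -> Dict[str, int]:
    """Map interface name to MTU for every data row of the table."""
    mtus: Dict[str, int] = {}
    for line in table.splitlines():
        # Split into at most five fields so that names with spaces stay whole.
        fields = line.split(None, 4)
        # Data rows start with a numeric index; header and ruler rows do not.
        if len(fields) == 5 and fields[0].isdigit():
            mtus[fields[4].strip()] = int(fields[2])
    return mtus


if __name__ == "__main__":
    for name, mtu in parse_mtus(SAMPLE).items():
        print(f"{name}: {mtu}")
```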

> As a workaround, I lowered the MTU to 1470 bytes on all servers passing traffic through the cloud (on ESXi vSwitches and on all VMs hosted on it). Everything has been working for over an hour now, compared to a few seconds previously.

Where and how have you lowered the MTU to 1470 bytes?

Thanks,

bpozdena commented 7 years ago

Brilliant catch Jeremy.

The command to change the MTU is: netsh interface ipv4 set subinterface "Loopback Pseudo-Interface 1" mtu=1470 store=persistent

To verify the change, use: netsh interface ipv4 show subinterface

It may be enough to lower it only to 1500 bytes. Let me test it and I will get back to you.

bpozdena commented 7 years ago

Unfortunately, setting the MTU on the loopback interfaces has no effect on the traffic that goes through the cloud/uBridge.

Once a single packet exceeds 1518 bytes, uBridge either crashes completely or slows down dramatically. When it slows down, there is an enormous number of duplicate packets, retransmissions, out-of-order packets, and 99% packet loss.

For now I am just keeping the MTU lowered on all other physical and virtual devices. The clouds crash anyway after a while.

This is basically the only major issue of GNS3 now. If this is fixed, GNS3 will become an awesome simulator!

julien-duponchelle commented 7 years ago

Can you try replacing your ubridge.exe with this one: https://github.com/GNS3/ubridge/releases/tag/v0.9.11

Also replace the Cygwin DLL.

bpozdena commented 7 years ago

Hi Julien, I already tested 0.9.11 as described in the original post. Unfortunately it has the same problem.

julien-duponchelle commented 7 years ago

Oh sorry

bpozdena commented 7 years ago

Still the same on RC1

grossmj commented 7 years ago

@sairuscz how do you reproduce the issue from within a single VM? I need the simplest setup to quickly reproduce the issue.

Also, there is a new feature since RC1 that allows you to right-click on a cloud and pick "Show in File Manager". This will show you a uBridge log file, which may contain debug information that will help us.

Thanks! :)

grossmj commented 7 years ago

Never mind, I can reproduce the issue between two VMs connected via a cloud.

grossmj commented 7 years ago

The problem comes from WinPcap and the pcap_sendpacket() function, which cannot send packets larger than 1500 bytes. The function fails with "send error: PacketSendPacket failed".

grossmj commented 7 years ago

pcap_sendpacket() fails because an interface does not support a packet size larger than 1500 bytes.

For instance, MYLOOPBACK here has a maximum packet size of 1514 (including headers):

[Screenshot: maximum_packet_size]

Changing the MTU doesn't affect that limit. It seems to be a (hardcoded?) limit in the adapter driver and I didn't find a way to change this :(

[Screenshot: maximum_packet_size_2]
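
The sizes mentioned in this thread can be checked with simple frame arithmetic. The sketch below is an illustration rather than a confirmed diagnosis: it assumes a 14-byte Ethernet header, a 4-byte 802.1Q tag on trunk links, no FCS in the count, and the 1514-byte limit reported above for MYLOOPBACK. The `frame_size` and `fits` helpers are hypothetical names introduced here.

```python
# Frame-size arithmetic for Ethernet (sizes in bytes, FCS excluded).
ETH_HEADER = 14        # destination MAC + source MAC + EtherType
DOT1Q_TAG = 4          # 802.1Q VLAN tag carried on trunk links
ADAPTER_LIMIT = 1514   # maximum packet size reported above for MYLOOPBACK


def frame_size(ip_mtu: int, tagged: bool) -> int:
    """Return the size of a full-sized frame for the given IP MTU."""
    return ip_mtu + ETH_HEADER + (DOT1Q_TAG if tagged else 0)


def fits(ip_mtu: int, tagged: bool, limit: int = ADAPTER_LIMIT) -> bool:
    """Return True when a full-sized frame stays within the adapter limit."""
    return frame_size(ip_mtu, tagged) <= limit


if __name__ == "__main__":
    for mtu in (1500, 1496, 1470):
        for tagged in (False, True):
            kind = "tagged" if tagged else "untagged"
            size = frame_size(mtu, tagged)
            verdict = "fits" if fits(mtu, tagged) else "exceeds limit"
            print(f"MTU {mtu} {kind}: {size} bytes, {verdict}")
```

Under these assumptions, a full-sized tagged frame at MTU 1500 is 1518 bytes, which is over the 1514-byte limit, while an untagged one is exactly 1514 bytes. At MTU 1470 a tagged frame is 1488 bytes. That would be consistent with the access-port test and the MTU workaround reported earlier in the thread.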

The only workaround would be to split the data we want to send in uBridge into smaller pieces, but I don't really like that idea :(
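
One possible reading of "slicing" is IPv4-style fragmentation, which is an assumption here and not something uBridge is known to implement. The sketch below only plans the fragments (offset, length, more-fragments flag) for a given payload; it does not build or send packets, and it would not help with non-IP frames. `Fragment` and `plan_fragments` are hypothetical names, and the 20-byte IP header assumes no IP options.

```python
from typing import List, NamedTuple


class Fragment(NamedTuple):
    offset_units: int      # fragment offset in 8-byte units, as in the IPv4 header
    length: int            # payload bytes carried by this fragment
    more_fragments: bool   # True for every fragment except the last


def plan_fragments(payload_len: int, mtu: int, ip_header_len: int = 20) -> List[Fragment]:
    """Plan how an IPv4 payload would be split to fit the given MTU."""
    if payload_len < 0:
        raise ValueError("payload_len must be non-negative")
    # Every fragment except the last must carry a multiple of 8 bytes.
    max_data = (mtu - ip_header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("mtu too small to carry any payload")

    fragments: List[Fragment] = []
    offset = 0
    while True:
        length = min(max_data, payload_len - offset)
        last = offset + length >= payload_len
        fragments.append(Fragment(offset // 8, length, not last))
        if last:
            break
        offset += length
    return fragments


if __name__ == "__main__":
    # 1480 payload bytes is a full 1500-byte IP packet; 1496 leaves room
    # for a 4-byte 802.1Q tag within a 1514-byte frame.
    for frag in plan_fragments(1480, 1496):
        print(frag)
```

The drawback hinted at above still applies: splitting in the bridge adds per-packet work and changes what the receiving host sees on the wire.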

grossmj commented 7 years ago

Looks like we are not the only ones facing that issue: https://github.com/chmorgan/sharppcap/issues/21