GNS3 / dynamips

Dynamips development
GNU General Public License v2.0

packet loss with multicast traffic #29

Closed anubisg1 closed 10 years ago

anubisg1 commented 10 years ago

I'm studying for the CCIE, in particular to cover the multicast topics...

While studying I ran into a very weird issue where packets would be dropped "somewhere in the topology", and at that point a ping to the directly connected neighbor would time out...

After some deep investigation I found these two old topics (2008 and 2009):

http://7200emu.hacki.at/viewtopic.php?t=4537
http://ieoc.com/forums/t/3488.aspx

and this more recent one from 2012:

http://chasingmyccie.wordpress.com/2012/02/02/avoid-ip-multicast-in-your-gns3-labs

In particular, packets are lost somewhere in the topology, and once the loss is isolated to a particular link, it is impossible to ping even the directly connected router on that link.

This issue is still present with dynamips 0.2.11.


Related Topics:

http://forum.gns3.net/topic8508.html
Patch: http://forum.gns3.net/post26454.html#p26454

flaviojs commented 10 years ago

Ohoo, I see people mention the gt96k hardware. It's one of the many places that has missing hardware functionality. There were so many places that I couldn't decide where to start, guess this settles it. ^^

This is gonna be my first go at the hardware emulation code, so don't expect results. =~~

flaviojs commented 10 years ago

System controllers from dev_gt.c used by the routers:

The other Galileo controllers probably have the same missing features. No idea about the routers that don't use these controllers, feel free to try them out. ;D

anubisg1 commented 10 years ago

Hello Flavio...

I'm using a 3725 with GT96100-FE.

If you tell me what you need tested, I can give it a shot.

flaviojs commented 10 years ago

Then can you try with different routers to check if they have the same problem? Try c3660 (GT64120A), c3620 or c3640 (GT64010) and c1700 or c2600 (none).

Even more helpful would be providing me with a simple GNS3 project that demonstrates the issue (instructions are appreciated). Right now I'm just reading the specs and checking for missing stuff in the source.

anubisg1 commented 10 years ago

You can download the project from here:

https://dl.dropboxusercontent.com/u/665924/multicast.zip

Open a console connection to R6 and an aux connection to R6 as well.

From one console on R6, ping 239.1.1.1; from the other, ping 155.1.108.10.

You will see the unicast ping start dropping packets and then suddenly stop getting replies altogether.

Stop the ping to 155.1.108.10 and run a traceroute to 155.1.108.10. Wherever the traceroute stops is where the problem is. Go to the router where the trace stops and you'll see that you cannot ping the router on its directly connected interface.

Stop the multicast ping, and after a few minutes things come back to normal (until you start a multicast ping again).
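
For anyone who wants to script the traceroute step, here is a minimal Python sketch. It assumes IOS-style traceroute output pasted from the console and reports the last hop that answered. The function name `last_responding_hop` and the hop addresses in the sample are invented for illustration; only 155.1.108.10 comes from the lab.

```python
import re

# An IOS-style hop line: hop number, then an IPv4 address or '*' probes.
HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")
IPV4 = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3})\b")


def last_responding_hop(traceroute_text):
    """Return (hop_number, address) of the last hop that answered, or None."""
    last = None
    for line in traceroute_text.splitlines():
        m = HOP_LINE.match(line)
        if not m:
            continue  # banner lines such as "Tracing the route to ..."
        addr = IPV4.search(m.group(2))
        if addr:
            last = (int(m.group(1)), addr.group(1))
    return last


# Hypothetical sample output; the hop addresses are made up.
SAMPLE = """\
Type escape sequence to abort.
Tracing the route to 155.1.108.10

  1 155.1.67.7 20 msec 16 msec 20 msec
  2 155.1.37.3 40 msec 36 msec 40 msec
  3  *  *  *
  4  *  *  *
"""

if __name__ == "__main__":
    print(last_responding_hop(SAMPLE))
```

In this made-up sample the trace dies after hop 2, so the link just beyond that router would be the one to check with a direct ping.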

flaviojs commented 10 years ago

Thanks, I used and adapted your project for these tests:

Conclusion: the cause was narrowed down to the use of the FastEthernet interfaces of GT96100-FE. Good. Now I can explore in peace the 549-page datasheet of GT-96100A, knowing that I'm on the right track. =)

flaviojs commented 10 years ago

From http://forum.gns3.net/post26234.html#p26234

Theory: maybe the OS is trying to throttle the traffic (to avoid congestion) by delaying packets?

flaviojs commented 10 years ago

Having a closer look at the configs, there is too much stuff mixed in together... I can't pinpoint the real problem like this. Can you make a simpler example that has the same problem? The less stuff enabled the better. (static routes preferred)

anubisg1 commented 10 years ago

Hello Flavio,

Can you clarify what is causing you problems? The OSPF configuration or the multicast one (PIM Sparse Mode)?

flaviojs commented 10 years ago

I need to reduce the project to a point where changing 1 thing makes the problem manifest itself or not. Then I can check the technology around that particular change.

In your example I get the failed pings, but I don't get a consistent breaking point (with traceroute). Can the number of routers be reduced? Can the type of connections be reduced? (router-switch, router-serial-router, router-ethernet-router) Can the number of used features be reduced? (ospf, eigrp, loopback, other stuff that doesn't affect the outcome) There is just way too much stuff going on, and some of it I have never even used before, so I wouldn't know if it's malfunctioning. =~~
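
To illustrate the kind of reduction I mean, here is a rough Python sketch of dropping one feature at a time while the problem still reproduces. Everything in it is hypothetical: `minimize`, `fake_reproduces` and the feature names are placeholders, and in practice the "reproduces" check means rebuilding the lab and running the pings by hand.

```python
def minimize(features, reproduces):
    """Greedily drop features while the problem still reproduces.

    features: list of feature names enabled in the lab.
    reproduces: callable taking a list of features and returning True if
    the packet loss still shows up with only those features enabled.
    """
    current = list(features)
    changed = True
    while changed:
        changed = False
        for f in list(current):
            trial = [x for x in current if x != f]
            if reproduces(trial):
                current = trial  # f was not needed to trigger the problem
                changed = True
    return current


# Stand-in for the manual test. Assumption for this demo only:
# the problem needs multicast traffic plus a FastEthernet link.
def fake_reproduces(enabled):
    return "multicast" in enabled and "fastethernet" in enabled


if __name__ == "__main__":
    lab = ["ospf", "eigrp", "loopback", "serial", "multicast", "fastethernet"]
    print(minimize(lab, fake_reproduces))
```

Whatever is left at the end is the set where removing any single item makes the problem go away, which is the point where one change flips the outcome.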

anubisg1 commented 10 years ago

I can't make it any easier than this one... https://dl.dropboxusercontent.com/u/665924/simple-multicast.zip

The multicast group, 239.1.1.1, is joined on R3. There are only static routes, no dynamic routing... Again, what you are looking for is triggered by the multicast traffic!

Andrea

flaviojs commented 10 years ago

Thank you, I can reproduce it in the simple project too. =)

anubisg1 commented 10 years ago

No problem, we really hope this bug can get fixed so we can study multicast more easily :dancers:

flaviojs commented 10 years ago

Fixed by Peter Palúch in 79838a4edc7883645d7b8b57ddf98da9896d167f. =)