jagt / clumsy

clumsy makes your network condition on Windows significantly worse, but in a controlled and interactive manner.
http://jagt.github.io/clumsy/

High internal latency and strange retransmissions when using Lag function #87

Open theultramage opened 3 years ago

theultramage commented 3 years ago

I wanted to test how a piece of client-server software performs under various latency conditions. I used two idle adjacent hosts on a 1GbE network. I configured Clumsy 0.3 and Wireshark on the client (Server 2012r2), started capture and lag simulation, and performed one client action that results in 200 sequential request/response packets. Under normal conditions, these requests have <0.1ms duration and <0.1ms delay between consecutive requests.
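For anyone who wants to reproduce this kind of measurement without the original client-server software, here is a minimal sketch of a sequential request/response timer. It is not the software used in this report: the function names, the 64-byte payload, and the use of a loopback TCP echo server are all assumptions made for illustration. To match the setup described above, the echo part would need to run on a second host.

```python
import socket
import threading
import time


def _recv_exact(sock, size):
    """Read exactly `size` bytes, or fewer if the peer closes early."""
    buf = b""
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            break
        buf += chunk
    return buf


def _echo_once(listener, size):
    """Accept a single connection and echo fixed-size messages until it closes."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = _recv_exact(conn, size)
            if len(data) < size:
                break
            conn.sendall(data)


def measure_round_trips(count=200, size=64, host="127.0.0.1"):
    """Return a list of per-request round-trip times in milliseconds."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=_echo_once, args=(listener, size), daemon=True)
    server.start()

    samples = []
    payload = b"x" * size
    with socket.create_connection((host, port)) as client:
        # Disable Nagle so each small request is sent immediately.
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(count):
            start = time.perf_counter()
            client.sendall(payload)
            _recv_exact(client, size)  # wait for the echo before the next request
            samples.append((time.perf_counter() - start) * 1000.0)
    server.join(timeout=5)
    listener.close()
    return samples


if __name__ == "__main__":
    rtts = measure_round_trips()
    print(f"requests: {len(rtts)}")
    print(f"min/mean/max ms: {min(rtts):.3f} / {sum(rtts) / len(rtts):.3f} / {max(rtts):.3f}")
```

Running it once with the lag simulation stopped and once with it started gives two sets of per-request timings that can be compared directly.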

When Clumsy's Lag function is enabled, even with delay=0 to just get a baseline measurement, I'm seeing that the consecutive requests are now 80ms apart, implying a 40ms delay on the outbound and inbound direction. This makes any sort of meaningful measurement impossible.

In addition, in between my repeated tests (which involved restarting Clumsy and Wireshark), I've observed multiple times that Clumsy appears to malfunction, letting outbound packets through immediately (and instantly receiving an ACK) but then also sending a 'spurious retransmission' of the same packet 20ms later. I've also observed Clumsy switch from this malfunctioning state to normal mid-capture.

jagt commented 3 years ago

Hi, I believe it's that clumsy/WinDivert has a built-in latency overhead. The lag value is applied on top of that. We have an older issue on this: https://github.com/jagt/clumsy/issues/4

> In addition, in between my repeated tests (which involved restarting Clumsy and Wireshark), I've observed multiple times that Clumsy appears to malfunction, letting outbound packets through immediately (and instantly receiving an ACK) but then also sending a 'spurious retransmission' of the same packet 20ms later. I've also observed Clumsy switch from this malfunctioning state to normal mid-capture.

Do you mean that when lag is enabled, some packets are duplicated 20ms later? If that's easy to reproduce, I think it should be observable using the scripts in the scripts folder, the console one that's demonstrated here: <jagt.github.io/clumsy/>

Cheers.

theultramage commented 3 years ago

Regarding #4 from 2014, the author is asking why Clumsy was still causing significant delays even when no functions were enabled. In my testing of 0.3, I've observed that in this case the performance is identical, so that seems to have been fixed(?). They also wondered why a 20ms delay was causing much longer browser load times, not realizing it's a cumulative 20ms for every packet that cannot be sent in parallel / asynchronously. So if their webpage includes 10 js files, and if by default they're loaded sequentially, then that's 400ms right there. So unless I'm mistaken, that one has been resolved.
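The arithmetic behind the 400ms figure can be sketched as follows. The helper names are hypothetical, and the sketch assumes the lag is applied once in each direction and that each file costs exactly one round trip.

```python
def sequential_delay_ms(requests, one_way_lag_ms):
    """Added wall-clock time when each request waits for the previous response."""
    round_trip_penalty = 2 * one_way_lag_ms  # assumed: lag applies outbound and inbound
    return requests * round_trip_penalty


def parallel_delay_ms(requests, one_way_lag_ms):
    """Added time if all requests are in flight at once: a single round trip."""
    return 2 * one_way_lag_ms if requests else 0


print(sequential_delay_ms(10, 20))  # 400
print(parallel_delay_ms(10, 20))    # 40
```

Under these assumptions, the same 20ms setting costs ten times more when the ten requests are serialized than when they overlap.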

theultramage commented 3 years ago

Regarding the processing latency, I don't really know. It could be a combination of factors. I took a quick peek at the code and noticed that TIMER_RESOLUTION is '4'. This immediately introduces 4ms of jitter to all time-based events. In lag.c, the line currentTime > pac->timestamp + lagTime is maybe supposed to have a >= and might explain why a lag of '0' doesn't behave as passthrough. For other stuff, I don't know. Maybe the Iup UI library's main loop uses the default Windows 16ms timer precision. Maybe there are Sleep calls somewhere. Maybe it's WinDivert doing it. Hard to tell from this high level.
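To illustrate the > versus >= point, here is a toy model of a tick-driven release check. It is not taken from clumsy's source: the stamping rule, the function name, and the integer-millisecond clock are assumptions, and only the 4ms tick and the shape of the comparison come from the observations above.

```python
import math


def release_delay_ms(arrival_ms, lag_ms, tick_ms=4, strict=True):
    """Delay between a packet's arrival and its release in a tick-driven model.

    Assumed model: a loop wakes every `tick_ms`, stamps a packet on the first
    tick at or after its arrival, and releases it on the first tick where
    the comparison against stamp + lag holds.
    """
    first_tick = math.ceil(arrival_ms / tick_ms) * tick_ms
    stamp = first_tick
    now = first_tick
    while True:
        due = now > stamp + lag_ms if strict else now >= stamp + lag_ms
        if due:
            return now - arrival_ms
        now += tick_ms


for arrival in (0, 1, 3):
    strict = release_delay_ms(arrival, lag_ms=0, strict=True)
    relaxed = release_delay_ms(arrival, lag_ms=0, strict=False)
    print(f"arrival={arrival} ms  '>' delay={strict} ms  '>=' delay={relaxed} ms")
```

In this model the strict comparison costs one extra tick when lag is 0, so by itself it would account for a few milliseconds, not for the roughly 40ms per direction reported above.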

NiftyliuS commented 2 years ago

Any update on this? Using clumsy 0.3 RC4 - when setting LAG to 1ms I'm getting spikes of up to 50ms.

yonojoy commented 8 months ago

@jagt I can still reproduce this issue in clumsy 0.3 as well. We have fast internal Ethernet with ping times <1 ms (measured with ping localhost -4):

Reply from 127.0.0.1: bytes=32 time<1ms TTL=128
Reply from 127.0.0.1: bytes=32 time=1ms TTL=128
Reply from 127.0.0.1: bytes=32 time<1ms TTL=128
Reply from 127.0.0.1: bytes=32 time<1ms TTL=128

I would like to debug slow internal LAN conditions with ping times in the range of 3...4 ms. So I used clumsy and set a lag time of 2 ms, expecting ping times of about 4 ms (2+2). But what I got is this:

Reply from 127.0.0.1: bytes=32 time=67ms TTL=128
Reply from 127.0.0.1: bytes=32 time=65ms TTL=128
Reply from 127.0.0.1: bytes=32 time=58ms TTL=128
Reply from 127.0.0.1: bytes=32 time=49ms TTL=128

For 100 ms (expecting 200 ms ping times) I got this:

Reply from 127.0.0.1: bytes=32 time=245ms TTL=128
Reply from 127.0.0.1: bytes=32 time=253ms TTL=128
Reply from 127.0.0.1: bytes=32 time=258ms TTL=128
Reply from 127.0.0.1: bytes=32 time=257ms TTL=128

and for 1000 ms lag I got this:

Reply from 127.0.0.1: bytes=32 time=2028ms TTL=128
Reply from 127.0.0.1: bytes=32 time=2030ms TTL=128
Reply from 127.0.0.1: bytes=32 time=2026ms TTL=128
Reply from 127.0.0.1: bytes=32 time=2020ms TTL=128

So it seems that a fixed amount of about 25...30 ms is added to the entered lag time. I understand that there is a built-in overhead, but more than 20 ms? Why?
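The extra delay can be worked out from the ping samples above with a short script. The helper name is hypothetical, the sub-millisecond baseline is ignored, and the sketch assumes the configured lag applies once in each direction, which is the same assumption behind the expected 2+2 ms.

```python
from statistics import mean

# Ping times in ms copied from the measurements above, keyed by configured lag.
samples = {
    2: [67, 65, 58, 49],
    100: [245, 253, 258, 257],
    1000: [2028, 2030, 2026, 2020],
}


def round_trip_overhead_ms(lag_ms, rtts_ms):
    """Mean RTT minus the expected 2 * lag (lag assumed to apply in each direction)."""
    return mean(rtts_ms) - 2 * lag_ms


for lag_ms, rtts in samples.items():
    extra = round_trip_overhead_ms(lag_ms, rtts)
    print(
        f"lag={lag_ms:>4} ms  mean RTT={mean(rtts):7.2f} ms  "
        f"extra={extra:6.2f} ms  per direction={extra / 2:5.2f} ms"
    )
```

With these four samples per setting, the extra round-trip time comes out at 55.75 ms for a 2 ms lag, 53.25 ms for 100 ms, and 26 ms for 1000 ms, so the overhead per direction is roughly 13 to 28 ms depending on the setting.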

Falkon009 commented 4 months ago

Is there a tool similar to clumsy that can give me the effects I would like from clumsy, but isn't clumsy?