tohojo / flent

The FLExible Network Tester.
https://flent.org

packet loss stats #106

Closed heistp closed 6 years ago

heistp commented 7 years ago

Loss stats and jitter are listed in the RRUL spec (https://www.bufferbloat.net/projects/bloat/wiki/RRUL_Spec/) but not available in Flent. I'd like to be able to see packet loss to compare the drop decisions made by different qdiscs, particularly on UDP flows.

It looks like this is actually a limitation of netperf, as I don't see packet loss available in the UDP_RR test results. And I understand that getting TCP loss could be challenging as we have to get it from the OS somehow or capture packets, but isn't there a different utility that could be used for the UDP flows that would measure both RTT and packet loss? If not, perhaps one could be written. :) I remember now that Dave was starting a twd project a while back, but it ended up being a bridge too far. What I'm thinking of is probably simpler, but I don't know if it's enough. UDP packets could be sent from each end (of a certain size, at a certain rate, TBD) with sequence numbers and timestamps, and the receiver could both count how many it didn't receive and send a response packet back to the client, so you have both requests and responses being sent and received from each end. One and two way delay could potentially be measured.
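
To make that concrete, here is a minimal sketch of such a probe (hypothetical layout and names, not any existing tool): each UDP payload carries a sequence number and the send time, the far end echoes it back unchanged, and the sender computes RTT from the echo and loss from the sequence gaps.

package main

import (
	"encoding/binary"
	"fmt"
	"time"
)

// probe is a hypothetical 12-byte payload: a sequence number plus the
// sender's send time in nanoseconds. The receiver echoes it unchanged,
// so the sender can compute RTT and spot missing sequence numbers.
type probe struct {
	Seq      uint32
	SentNano int64
}

func (p probe) marshal() []byte {
	b := make([]byte, 12)
	binary.BigEndian.PutUint32(b[0:4], p.Seq)
	binary.BigEndian.PutUint64(b[4:12], uint64(p.SentNano))
	return b
}

func unmarshal(b []byte) probe {
	return probe{
		Seq:      binary.BigEndian.Uint32(b[0:4]),
		SentNano: int64(binary.BigEndian.Uint64(b[4:12])),
	}
}

func main() {
	// Pretend we sent seq 0..4 and got back everything except seq 2.
	sent := 5
	received := 0
	var rtts []time.Duration
	for _, seq := range []uint32{0, 1, 3, 4} {
		echoed := unmarshal(probe{Seq: seq, SentNano: time.Now().UnixNano()}.marshal())
		received++
		rtts = append(rtts, time.Duration(time.Now().UnixNano()-echoed.SentNano))
	}
	loss := float64(sent-received) / float64(sent) * 100
	fmt.Printf("received %d/%d (%.1f%% loss), %d RTT samples\n", received, sent, loss, len(rtts))
}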

Suggestions?

heistp commented 7 years ago

Actually one suggestion is D-ITG, which is already used for the VoIP tests. I just noticed that it can do both one-way and RTT tests, and return packet loss. That could possibly be used instead of netperf for the UDP flows.

dtaht commented 7 years ago

Yes, twd was a bridge too far at the time. I wanted something that was reliably realtime and capable of doing gigE well, in particular, and writing twd in C was too hard. Now things have evolved a bit in these worlds (netmap for example), and perhaps me taking another tack entirely ( https://github.com/dtaht/libv6/blob/master/erm/doc/philosophy.org ) will yield results... eventually.

It still seems like punting the whole tcp (or quic?) stack to userspace, using raw sockets, or leveraging bpf (the last seems to have promise)... would get me to the extreme I wanted.

heistp commented 7 years ago

Wow, that sounds very high end. That (especially your comment about a userspace TCP stack) triggered a couple of thoughts I've been having as I'm completing my second round of point-to-point WiFi tests:

1) With my Flent runs, I'm testing what would happen if clients were connected directly to the backhaul at the backhaul's maximum rate, but it's really that the client connections are the bottleneck and several slower, varying rate clients come together to eventually, and sometimes, saturate the backhaul links. I think simulating this may produce a different response from AQM than straight-on saturating flows (and perhaps make AQM look even better). A more accurate test would respect this, but then, as you suggested, either the test itself would need a TCP stack in it or the client flows would need to go through virtual interfaces (maybe with netem? I don't know how else) to simulate changing rates, latency or maybe loss for the individual flows that feed the RRUL test. This is, for me, a bridge too far right now, but it makes me hope that the results I'm producing are still useful somehow!

2) The original spirit (and spec) of the RRUL test sounded like it envisioned some sort of "metric" summarizing the response of the system being tested under load. After spending many hours putting together my second round of Flent results, I yearn for a mostly automatic test that would produce relevant results without having to configure enumerations of individual tests. It doesn't need to be a single metric, but a minimal set of metrics representing response under load. I'm not sure if this can be done, but...

For one, I do know that rig setups can be extremely specific and variable (for me, it's all about Wi-Fi backhaul, which is vastly different from other setups), so I'm not proposing something that varies rig setups automatically, but maybe there could be an automatic test of sorts, after rig setup has occurred somehow, that runs in phases and ramps up until some limit. Two possible phases could be:

A) "no load" stuff that summarizes physical link characteristics (useful both for CPE devices with low numbers of users, or for just understanding the basics of backhaul and router links):

B) response under load (useful for loaded CPE devices, and at higher connection counts for backhaul and routers):

I don't mean to start summarizing the RRUL spec! So I'll stop, I only meant that maybe the process for testing RRUL could only have a minimal set of parameters, and the test program could automate the process of producing relevant results.

The only parameters to specify, in case the user wants to, might be "how far and how fast" (meaning for example how many simultaneous flows to go up to and how quickly to get there), which could be calculated based on either the results from phase A, "what it's probably capable of" or estimated during phase B based on "how it's going". The "how far to take the test" could be specified in case someone wants to push a link well beyond its limits, or wants to stop well short of them to get something quick.

Maybe this could just be a Flent "automatic" test?

So I know there's a place for a highly configurable "hard-core" tester that can hit 10 GigE and produce microsecond-level accuracy, but I think there may also be a place for such an automated, "good enough for many" test.

PS- Go 1.8 was released with GC improvements that bring pauses to "usually under 100 microseconds and often as low as 10 microseconds". I know that still might not be good enough for some tests, especially 10 GigE or microsecond-sensitive results (I'm starting to look at microseconds for the VoIP tests as well), but it's getting better.

tohojo commented 7 years ago

Pete Heist notifications@github.com writes:

Actually one suggestion is D-ITG, which is already used for the VoIP tests. I just noticed that it can do both one-way and RTT tests, and return packet loss. That could possibly be used instead of netperf for the UDP flows.

Yeah, we're missing tools to do UDP packet loss statistics, unfortunately. D-ITG is a possibility, but it is a PITA to set up, and not suitable to run over the public internet...

-Toke

heistp commented 7 years ago

I noticed that. The VoIP tests were a little painful to get working. My Mac Mini G4 also had really bad clock drift, which at first produced some beautiful but useless delay curves. "adjtimex --tick 10029" got the system clock close enough so that ntp would agree to do the rest. Still, one-way delay can be off by up to a millisecond or so, depending on how ntp is feeling at the moment.

tohojo commented 7 years ago

Pete Heist notifications@github.com writes:

I noticed that. The VoIP tests were a little painful to get working. My Mac Mini G4 also had really bad clock drift, which at first produced some beautiful but useless delay curves. "adjtimex --tick 10029" got the system clock close enough so that ntp would agree to do the rest. Still, one-way delay can be off by up to a millisecond or so, depending on how ntp is feeling at the moment.

Yeah, I run PTP in my testbed to get around that.

For most cases, though, having a simple ping-like (i.e. isochronous) back-and-forth UDP RTT measurement would be fine... Unfortunately, I haven't yet found a tool that will do that...

-Toke

heistp commented 7 years ago

Thanks, I might try PTP, didn't know about it.

Anything depending on the old xinetd echo service is probably out, right? I guess you'd want a small, native standalone client and server? It's not as easy to find as I thought it would be.

tohojo commented 7 years ago

Pete Heist notifications@github.com writes:

Thanks, I might try PTP, didn't know about it.

Anything depending on the old xinetd echo service is probably out, right? I guess you'd want a small, native standalone client and server? It's not as easy to find as I thought it would be.

Main requirement is that the client can be convinced to timestamp its output and run for a pre-defined time interval. Accuracy in timing is a bonus... ;)

-Toke

heistp commented 7 years ago

Ok, and near as I can tell netperf UDP_RR sends a packet, waits for a response then sends another without any delay. If packets are lost, the test apparently stops, although it at least resumes after I built and installed 2.7.0 from source (after your tip in an email a while back).

I would think that, instead of stopping after not receiving a response, it should send another packet after some delay so the test doesn't stop. Perhaps the delay could be around 5x current mean RTT (maybe within the last 5x mean RTT window of time also, so it adapts to changes). That would need testing.

I'm surprised that the UDP_RR test is actually that aggressive, that it sends continuously instead of at a fixed rate. It means that your UDP flows are in continuous competition with one another, as well as with the TCP flows, whereas something like VoIP sends at a fixed rate. Perhaps that's what you want for the benchmark.

So I'll write if I find anything...

tohojo commented 7 years ago

Pete Heist notifications@github.com writes:

Ok, and near as I can tell netperf UDP_RR sends a packet, waits for a response then sends another without any delay. If packets are lost, the test apparently stops, although it at least resumes after I built and installed 2.7.0 from source (after your tip in an email a while back).

I would think that, instead of stopping after not receiving a response, it should send another packet after some delay so the test doesn't stop. Perhaps the delay could be around 5x current mean RTT (maybe within the last 5x mean RTT window of time also, so it adapts to changes). That would need testing.

I'm surprised that the UDP_RR test is actually that aggressive, that it sends continuously instead of at a fixed rate. It means that your UDP flows are in continuous competition with one another, as well as with the TCP flows, whereas something like VoIP sends at a fixed rate. Perhaps that's what you want for the benchmark.

Exactly. The ping-pong means that the rate consumed by the measurement flow varies with the RTT (so if you fix bufferbloat, you'll lose bandwidth as far as that test is concerned). Also, since netperf only reports the number of successful back-and-forth transactions (which Flent then turns into an RTT measure), a hiccup turns into a very high RTT value, even with the restart behaviour.
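
As a rough illustration of that conversion (my sketch, not Flent's actual parsing code), the implied RTT is simply the sample interval divided by the number of completed transactions, which is why one stalled transaction dominates the estimate:

package main

import (
	"fmt"
	"time"
)

// impliedRTT converts a count of completed request/response transactions
// observed over a sample interval into an average round-trip estimate.
// With a ping-pong test, one transaction takes roughly one RTT, so a
// stalled transaction (lost packet, long retry) shows up as one huge RTT
// rather than as reported loss.
func impliedRTT(transactions int, interval time.Duration) time.Duration {
	if transactions == 0 {
		return 0 // no data; a real parser would mark the sample invalid
	}
	return interval / time.Duration(transactions)
}

func main() {
	fmt.Println(impliedRTT(50, time.Second)) // 20ms: healthy sample
	fmt.Println(impliedRTT(1, time.Second))  // 1s: a single hiccup dominates
}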

So yeah, fixed (isochronous) rate, similar to 'ping' is what we need.

-Toke

heistp commented 7 years ago

Thanks, now I see where twd was headed and why. :)

dtaht commented 7 years ago

one thing that I increasingly think is worth doing is adding an ipv6 timestamp header type. there are internet drafts on this, and it wouldn't break on local connections.

On 3/30/17 3:47 AM, Toke Høiland-Jørgensen wrote:

Pete Heist notifications@github.com writes:

Thanks, I might try PTP, didn't know about it.

Anything depending on the old xinetd echo service is probably out, right? I guess you'd want a small, native standalone client and server? It's not as easy to find as I thought it would be.

Main requirement is that the client can be convinced to timestamp its output and run for a pre-defined time interval. Accuracy in timing is a bonus... ;)

-Toke


heistp commented 7 years ago

I wrote a quick mockup in Go to see what's possible. Here's pinging localhost for 200 packets with standard 'ping':

200 packets transmitted, 200 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.044/0.150/0.224/0.026 ms

And using my 'rrperf' mockup, sending and echoing 200 UDP packets with nanosecond timestamps to localhost:

Iter 199, avg 0.388392 ms, min 0.096244 ms, max 0.527164 ms, stddev 0.075881 ms

Summary:

  • mean RTT is 150 microseconds for ping/ICMP, 388 microseconds for rrperf/UDP
  • stddev is 26 microseconds for ping/ICMP, 76 microseconds for rrperf/UDP

Do you think these stats are within the realm of acceptability for local traffic, and would you use this at all from Flent? If so, I could complete a latency (and throughput for that matter) tester pretty quickly in Go, that outputs results say, to JSON.

Basically it could just run multiple isochronous RTT tests simultaneously, specifying packet size, spacing and diffserv marking for each, along with multiple TCP flows, specifying direction and diffserv marking. As for results, I suppose it would have periodic samples from each flow and totals at the end. For the UDP flows, I could have packet loss and RTT, but not OWD, for now (maybe later).
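
A minimal sketch of the isochronous send loop being proposed (interval and duration values are illustrative only): packets go out on a fixed timer regardless of whether earlier replies have arrived, unlike netperf's ping-pong.

package main

import (
	"fmt"
	"time"
)

// isochronousSend fires at a fixed interval for the test duration,
// independent of replies, so the offered rate does not vary with RTT.
func isochronousSend(interval, duration time.Duration, send func(seq uint32)) {
	t := time.NewTicker(interval)
	defer t.Stop()
	deadline := time.Now().Add(duration)
	var seq uint32
	for now := range t.C {
		if now.After(deadline) {
			return
		}
		send(seq)
		seq++
	}
}

func main() {
	isochronousSend(100*time.Millisecond, time.Second, func(seq uint32) {
		fmt.Println("send seq", seq, "at", time.Now().Format("15:04:05.000"))
	})
}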

I don't know what extra features are needed from netperf, but I suspect there can be more detail. :)

Notes / Caveats:

tohojo commented 7 years ago

Pete Heist notifications@github.com writes:

I wrote a quick mockup in Go to see what's possible. Here's pinging localhost for 200 packets with standard 'ping':


200 packets transmitted, 200 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.044/0.150/0.224/0.026 ms

And using my 'rrperf' mockup, sending and echoing 200 UDP packets with nanosecond timestamps to localhost:


Iter 199, avg 0.388392 ms, min 0.096244 ms, max 0.527164 ms, stddev 0.075881 ms

Summary:

  • mean RTT is 150 microseconds for ping/ICMP, 388 microseconds for rrperf/UDP
  • stddev is 26 microseconds for ping/ICMP, 76 microseconds for rrperf/UDP

Do you think these stats are within the realm of acceptability for local traffic, and would you use this at all from Flent? If so, I could complete a latency (and throughput for that matter) tester pretty quickly in Go, that outputs results say, to JSON.

Sure! The latency test is the most pressing, so far netperf works quite well for throughput (and has a ton of features that it would take some time to replicate fully).

What does the server look like, and is it safe to expose to the internet? :)

Basically it could just run multiple isochronous RTT tests simultaneously, specifying packet size, spacing and diffserv marking for each, along with multiple TCP flows, specifying direction and diffserv marking. As for results, I suppose it would have periodic samples from each flow and totals at the end.

Actually, having Flent run multiple instances of the tool is probably easier than having to split the output of one instance into multiple data series.

-Toke

dtaht commented 7 years ago

Toke Høiland-Jørgensen notifications@github.com writes:

It's a three way handshake that's rather needed before inflicting it on the internet.

heistp commented 7 years ago

Ok, so I'll start with a latency only, single RTT test then. Keep things simple!

The client and server are separate, like netperf, as the server might end up a little smaller.

As for safe to expose, it's a given that it should be safe from buffer overflow problems (I'll avoid Go's 'unsafe' package). But what else of these is important:

1) Challenge/response test with a fixed shared key to smoke test for legitimate clients. (I take this as the "three-way handshake".)
2) Configurable limits on server for length of test, send interval, etc (basic DoS protection).
3) Accounts / permissions with "request and grant" (I want to do this test, will you let me?)
4) Invisibility to unauthorized clients (requires #3).

I think #1 and #2 make sense to me and are "easy", but as for #3 and #4, I assume they're not needed now. This is something that might run on public servers and you want the server to be safe, but it's not something that needs to be run securely between trusted parties across the open Internet, right?

There's no way to prevent someone from writing a rogue client and hogging up resources, but we could stop random probes with #1, reduce the impact of any attacks with #2, and obviously lock things down more with #3 and #4, with more effort.

If all this sounds reasonable, I'll just put something together and welcome any critique...

dtaht commented 7 years ago

Pete Heist notifications@github.com writes:

Ok, so I'll start with a latency only, single RTT test then. Keep things simple!

The client and server are separate, like netperf, as the server might end up a little smaller.

As for safe to expose, it's a given that it should be safe from buffer overflow problems (I'll avoid Go's 'unsafe' package). But what else of these is important:

1 Challenge/response test with a fixed shared key to smoke test for legitimate clients. (I take this as the "three-way handshake".)

Basically anything that would cause a test like this to be an inadvertent amplifier. A forged src address that just starts a test with no confirmation is bad.

So long as there is a challenge response phase (and a strict upper limit on the duration and number of tests), I can sleep at night.

I'm one of the guys that predicted the ntp amplification attacks...

2 Configurable limits on server for length of test, send interval, etc (basic DoS protection).

3 Accounts / permissions with "request and grant" (I want to do this test, will you let me?)

Tis much harder and not strictly necessary. A reply of "I'm busy now, please try again later" seems much simpler.

For how deep that rathole can go, see owamp for inspiration.

4 Invisibility to unauthorized clients (requires #3).

I think #1 and #2 make sense to me and are "easy", but as for #3 and #4, I assume they're not needed now. This is something that might run on public servers and you want the server to be safe, but it's not something that needs to be run securely between trusted parties across the open Internet, right?

There's no way to prevent someone from writing a rogue client and hogging up resources, but we could stop random probes with #1, reduce the impact of any attacks with #2, and obviously lock things down more with #3 and #4, with more effort.

If all this sounds reasonable, I'll just put something together and welcome any critique...


heistp commented 7 years ago

Ok, understood.

Also, a compiled-in pre-shared key for the handshake that can be overridden from the command line is still easy to implement and would allow for restricted tests, if needed.
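
As a sketch of the pre-shared-key idea (key, tag length and payload here are placeholders, not an actual wire format): append an HMAC to every packet and silently drop anything that fails to verify, which also keeps the server effectively invisible to unauthenticated probes.

package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
)

const tagLen = 16 // truncated HMAC-SHA256 tag

// seal appends a truncated HMAC of payload under key.
func seal(key, payload []byte) []byte {
	m := hmac.New(sha256.New, key)
	m.Write(payload)
	return append(payload, m.Sum(nil)[:tagLen]...)
}

// open verifies and strips the tag; packets that fail are silently dropped.
func open(key, packet []byte) ([]byte, bool) {
	if len(packet) < tagLen {
		return nil, false
	}
	payload, tag := packet[:len(packet)-tagLen], packet[len(packet)-tagLen:]
	m := hmac.New(sha256.New, key)
	m.Write(payload)
	return payload, hmac.Equal(tag, m.Sum(nil)[:tagLen])
}

func main() {
	key := []byte("compiled-in default, overridable")
	pkt := seal(key, []byte("probe seq=0"))
	if payload, ok := open(key, pkt); ok {
		fmt.Printf("accepted: %s\n", payload)
	}
	if _, ok := open([]byte("wrong key"), pkt); !ok {
		fmt.Println("dropped: bad HMAC")
	}
}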

More later when something's ready...

tohojo commented 7 years ago

Pete Heist notifications@github.com writes:

Ok, understood.

Also, a compiled-in pre-shared key for the handshake that can be overridden from the command line is still easy to implement and would allow for restricted tests, if needed.

Sure, that is useful (netperf has a similar feature), but it is also important that one can have a "public" server that will not send large amounts of unsolicited traffic to random addresses given a spoofed source IP.

I'm not too worried about this as long as the reply is no bigger than the request; then it will be no worse than normal ping (well, rate limiting may be necessary).
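
The rate-limiting aside could be handled with something as small as a per-source token bucket on the server; a rough sketch with made-up limits, not code from netperf or any existing tool:

package main

import (
	"fmt"
	"net"
	"sync"
	"time"
)

// bucket holds the current token count for one source address.
type bucket struct {
	tokens float64
	last   time.Time
}

// limiter is a tiny per-source-IP token bucket: capacity tokens per
// source, refilled at rate tokens per second.
type limiter struct {
	mu       sync.Mutex
	buckets  map[string]*bucket
	rate     float64
	capacity float64
}

func newLimiter(rate, capacity float64) *limiter {
	return &limiter{buckets: make(map[string]*bucket), rate: rate, capacity: capacity}
}

// allow reports whether a request from addr should be served right now.
func (l *limiter) allow(addr net.IP) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	now := time.Now()
	b, ok := l.buckets[addr.String()]
	if !ok {
		b = &bucket{tokens: l.capacity, last: now}
		l.buckets[addr.String()] = b
	}
	b.tokens += now.Sub(b.last).Seconds() * l.rate
	if b.tokens > l.capacity {
		b.tokens = l.capacity
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	l := newLimiter(10, 10) // assumed: 10 requests/sec, burst of 10
	src := net.ParseIP("192.0.2.1")
	granted := 0
	for i := 0; i < 25; i++ {
		if l.allow(src) {
			granted++
		}
	}
	fmt.Printf("granted %d of 25 back-to-back requests\n", granted)
}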

Also, playing nice with firewalls (the server should be able to listen on a configurable port number, and not use a large random port number space).

-Toke

heistp commented 7 years ago

Yep, I'll try to keep it to a single UDP port on the server (random on client).

There might be a bit of a delay as I still have to complete tests of Ubiquiti's stuff for FreeNet and finish the report and presentation by 5/1 (along with the day job :)

One of my main motivations for getting this test done asap though is WMM. After Dave's tip I did tests with WMM on and off (or at least avoided) and the results were really surprising. With WMM on, even when you do the default rrul (not rrul_be) test, latencies are 5-10x what they probably should be. Disable or bypass WMM and things look much, much better. So either:

1) WMM is "bad" and to be avoided, particularly for higher numbers of diffserv marked flows, OR

2) The netperf UDP_RR test, with its zero-delay back and forth (which arguably doesn't represent what you see in the real world) when marked with higher priority diffserv markings like EF or others, doesn't play well with WMM

Or something in between. Hopefully we can determine that soon.

PS- I finally did Chaos Calmer tests. LEDE's latency improvements under load are pretty staggering, particularly when it comes to dynamic rate drops. So hopefully the report I produce helps highlight the good work you guys are doing. :)

tohojo commented 7 years ago

Pete Heist notifications@github.com writes:

Yep, I'll try to keep it to a single UDP port on the server (random on client).

There might be a bit of a delay as I still have to complete tests of Ubiquiti's stuff for FreeNet and finish the report and presentation by 5/1 (along with the day job :)

One of my main motivations for getting this test done asap though is WMM. After Dave's tip I did tests with WMM on and off (or at least avoided) and the results were really surprising. With WMM on, even when you do the default rrul (not rrul_be) test, latencies are 5-10x what they probably should be. Disable or bypass WMM and things look much, much better. So either:

1 WMM is "bad" and to be avoided, particularly for higher numbers of diffserv marked flows, OR

2 The netperf UDP_RR test, with its zero-delay back and forth (which arguably doesn't represent what you see in the real world) when marked with higher priority diffserv markings like EF or others, doesn't play well with WMM

Well, the plain RRUL test will put a TCP flow in each WMM priority bin, which will hammer them pretty hard. This tends to break things, since there's no admission control on the different queues.

For instance, according to the standard, the VI queue should never send aggregates with durations longer than 1ms. The code checks this, but only for the first rate in the configured rate chain for the transmission. So if that fails, and subsequent retries drop down to a lower rate, the aggregate can suddenly be longer than 1ms, in violation of the standard.

And also, since the queues are served in strict priority order, hammering the VO and VI queues tends to lock out the others.

-Toke

heistp commented 7 years ago

On Apr 10, 2017, at 11:59 AM, Toke Høiland-Jørgensen notifications@github.com wrote:

Well, the plain RRUL test will put a TCP flow in each WMM priority bin, which will hammer them pretty hard. This tends to break things, since there's no admission control on the different queues.

For instance, according to the standard, the VI queue should never send aggregates with durations longer than 1ms. The code checks this, but only for the first rate in the configured rate chain for the transmission. So if that fails, and subsequent retries drop down to a lower rate, the aggregate can suddenly be longer than 1ms, in violation of the standard.

And also, since the queues are served in strict priority order, hammering the VO and VI queues tends to lock out the others.

Interesting, so it sounds like even just the TCP flows by themselves are already giving WMM problems...

I'm trying to answer the question of whether or not FreeNet should be disabling or bypassing WMM in their backhaul. Since we can’t control what goes into the backhaul, it seems like either the diffserv markings need to be removed or WMM needs to be disabled or bypassed.

As for the bypassing option, I can make RRUL results look way better by sending backhaul traffic through an IPIP tunnel with TOS 0. Yes, your MTU goes down a bit, but you don’t have problems with WMM and that may be better than removing diffserv markings entirely as they’re preserved in the encapsulated packet for any routers upstream. RRUL results for rate limited cake with ‘diffserv4’ looks far better with IPIP tunneling than not. I’m asking myself why NOT to do this in production, and one main component of that is determining whether the RRUL test is an approximation of reality, at least some of the time, or not.

heistp commented 7 years ago

Pete Heist peteheist@gmail.com writes:

On Apr 10, 2017, at 11:59 AM, Toke Høiland-Jørgensen notifications@github.com wrote:

Well, the plain RRUL test will put a TCP flow in each WMM priority bin, which will hammer them pretty hard. This tends to break things, since there's no admission control on the different queues.

For instance, according to the standard, the VI queue should never send aggregates with durations longer than 1ms. The code checks this, but only for the first rate in the configured rate chain for the transmission. So if that fails, and subsequent retries drop down to a lower rate, the aggregate can suddenly be longer than 1ms, in violation of the standard.

And also, since the queues are served in strict priority order, hammering the VO and VI queues tends to lock out the others.

Interesting, so it sounds like even just the TCP flows by themselves are already giving WMM problems...

I'm trying to answer the question of whether or not FreeNet should be disabling or bypassing WMM in their backhaul. Since we can’t control what goes into the backhaul, it seems like either the diffserv markings need to be removed or WMM needs to be disabled or bypassed.

The question is what you gain by having it on. If you're running FQ-CoDel-enabled WiFi nodes you get almost equivalent behaviour for voice traffic without using the VO queue (or at least that's what I've seen in the scenarios I've been testing). And the efficiency of the network is higher if you don't use the VO and VI queues (since they can't aggregate as much (or at all in the case of VO)). For a backhaul link (which I assume is point-to-point?) you are not going to have a lot of multi-node contention (which is where the VO queue might help as that also affects contention parameters). So a lot of the potential benefits of different 802.11 priorities are probably not relevant for this case...

As for the bypassing option, I can make RRUL results look way better by sending backhaul traffic through an IPIP tunnel with TOS 0. Yes, your MTU goes down a bit, but you don’t have problems with WMM and that may be better than removing diffserv markings entirely as they’re preserved in the encapsulated packet for any routers upstream. RRUL results for rate limited cake with ‘diffserv4’ looks far better with IPIP tunneling than not.

'Far better' how? I'm guessing we're down to the micro-optimisation level here?

I’m asking myself why NOT to do this in production, and one main component of that is determining whether the RRUL test is an approximation of reality, at least some of the time, or not.

The RRUL test is designed to be a representation of the worst case, rather than typical traffic most applications generate. So probably, in most cases you will be fine with leaving it on, since no applications will generate high volumes of VO or VI traffic. It's more of a security issue, really, making it easier to DOS your network...

-Toke

heistp commented 7 years ago

On Apr 10, 2017, at 12:37 PM, Toke Høiland-Jørgensen toke@toke.dk wrote:

Pete Heist <peteheist@gmail.com> writes:

Interesting, so it sounds like even just the TCP flows by themselves are already giving WMM problems...

I'm trying to answer the question of whether or not FreeNet should be disabling or bypassing WMM in their backhaul. Since we can’t control what goes into the backhaul, it seems like either the diffserv markings need to be removed or WMM needs to be disabled or bypassed.

The question is what you gain by having it on. If you're running FQ-CoDel-enabled WiFi nodes you get almost equivalent behaviour for voice traffic without using the VO queue (or at least that's what I've seen in the scenarios I've been testing). And the efficiency of the network is higher if you don't use the VO and VI queues (since they can't aggregate as much (or at all in the case of VO)). For a backhaul link (which I assume is point-to-point?) you are not going to have a lot of multi-node contention (which is where the VO queue might help as that also affects contention parameters). So a lot of the potential benefits of different 802.11 priorities are probably not relevant for this case…

Exactly, that’s what I’m thinking (for the backhaul). But WMM is required in 802.11n and later as I understand it. If you disable it in LEDE, you fall back to 802.11g speeds. You can’t disable it in Ubiquiti’s UI, but it looks like it can be done manually in their config files, another thing I need to test.

As for the bypassing option, I can make RRUL results look way better by sending backhaul traffic through an IPIP tunnel with TOS 0. Yes, your MTU goes down a bit, but you don’t have problems with WMM and that may be better than removing diffserv markings entirely as they’re preserved in the encapsulated packet for any routers upstream. RRUL results for rate limited cake with ‘diffserv4’ looks far better with IPIP tunneling than not.

'Far better' how? I'm guessing we're down to the micro-optimisation level here?

Here are three rrul results using the default LEDE config, one with no IPIP tunnel, one with an IPIP tunnel with TOS 0, and one with TOS ‘inherit’, which takes the TOS value of the encapsulated packet:

http://www.drhleny.cz/bufferbloat_tmp/diffserv_rrul_default_noipip/index.html

http://www.drhleny.cz/bufferbloat_tmp/diffserv_rrul_default_ipip_tos_0/index.html

http://www.drhleny.cz/bufferbloat_tmp/diffserv_rrul_default_ipip_tos_inh/index.html

The TOS 0 version basically looks like the rrul_be test, because all traffic going over the P2P link is effectively best effort. TOS ‘inherit’ basically looks the same as having no IPIP tunnel, because the packets have the same diffserv markings, just with a little bit of overhead. It makes sense, other than the fundamental point of why latencies are so much higher with the rrul test and rrul_be.

Now, I can use an IPIP tunnel with TOS 0, rate limit and shape that with Cake diffserv4, for example, and get much better results than with no IPIP tunnel:

http://www.drhleny.cz/bufferbloat_tmp/diffserv_rrul_eg_cake_ds4_36mbit_noipip/index.html

http://www.drhleny.cz/bufferbloat_tmp/diffserv_rrul_eg_cake_ds4_36mbit_ipip_tos_0/index.html

By adding the IPIP tunnel with TOS 0, average latency goes from ~50 ms to ~10 ms, and to summarize:

ICMP: 21.2 -> 7.0 ms
UDP BE: 20.8 -> 7.3 ms
UDP BK: 84.3 -> 14.2 ms
UDP EF: 9.0 -> 8.1 ms

All the latencies go down considerably, so why not do this? Unless the test just isn’t close to reality.

Now, and this is also important, I’m making an assumption here that this is demonstrating a problem with WMM, but maybe it’s not. Is LEDE’s new driver also prioritizing based on diffserv markings, or what exactly am I seeing?

I’m asking myself why NOT to do this in production, and one main component of that is determining whether the RRUL test is an approximation of reality, at least some of the time, or not.

The RRUL test is designed to be a representation of the worst case, rather than typical traffic most applications generate. So probably, in most cases you will be fine with leaving it on, since no applications will generate high volumes of VO or VI traffic. It's more of a security issue, really, making it easier to DOS your network...

That’s what I’d like to also confirm with the tests. :) Thanks Toke for your advice...

heistp commented 7 years ago

On Apr 10, 2017, at 2:28 PM, Toke Høiland-Jørgensen toke@toke.dk wrote:

Pete Heist <peteheist@gmail.com> writes:

Now, I can use an IPIP tunnel with TOS 0, rate limit and shape that with Cake diffserv4, for example, and get much better results than with no IPIP tunnel:

http://www.drhleny.cz/bufferbloat_tmp/diffserv_rrul_eg_cake_ds4_36mbit_noipip/index.html

http://www.drhleny.cz/bufferbloat_tmp/diffserv_rrul_eg_cake_ds4_36mbit_ipip_tos_0/index.html

By adding the IPIP tunnel with TOS 0, average latency goes from ~50 ms to ~10 ms, and to summarize:

ICMP: 21.2 -> 7.0 ms
UDP BE: 20.8 -> 7.3 ms
UDP BK: 84.3 -> 14.2 ms
UDP EF: 9.0 -> 8.1 ms

All the latencies go down considerably, so why not do this? Unless the test just isn’t close to reality.

Now, and this is also important, I’m making an assumption here that this is demonstrating a problem with WMM, but maybe it’s not. Is LEDE’s new driver also prioritizing based on diffserv markings, or what exactly am I seeing?

The Linux WiFi stack will serve the different priority queues in strict priority order. So if you have a long backlog on the VO queue, that will basically lock out the others; which is why you see the other traffic classes worsen in this case. In addition, the VO queue can't aggregate packets, so when it is busy, the effective bandwidth of the link drops. Which I believe is the reason you're seeing Cake behave better when you're using tunnelling that hides the diffserv markings from the WiFi stack: when the tunnelling is turned off, Cake is simply no longer shaping at less than the effective bandwidth of the link.

Ok, but of the two possibilities, I think your first assessment is more likely (that the VO queue is taking priority over the others). If Cake were not in control of the queue for the no IPIP tunnel case because the effective bandwidth of the link were too low, I would expect the average bandwidth for upload and download to not be as flat as it appears.

I would like to prove this with another test limited to lower rates, but my LEDE hardware has gone back to its home as of today (springtime at the camp) running Open Mesh’s firmware. I’ll be testing Ubiquiti hardware soon. I expect its WiFi stack is the same though(?), and it will be interesting to see how it compares. Now that LEDE has a stable release build, I’m also considering trying that at the camp, but that’s another step, and maybe more tests, for later.

I’ll also have to do a real test of disabling WMM, to sort out what if any difference here is coming from WMM, or elsewhere. It was a “bridge too far” for me to try that in LEDE thus far.

I do believe the 'wash' keyword made it back into cake, btw. If you enable that, cake will zero out the diffserv marks (after acting on them); so you could get the behaviour you want without using tunneling, as long as you don't have any other bottlenecks further along the path where you need the markings...

I noticed it in the source. I’m not sure yet that removing the markings entirely is better than tunneling through the WiFi links, as they may be useful elsewhere upstream. I’m also not sure they’d make much difference on the downstream in consumer hardware though, when that’s probably not usually where the bottleneck is. Overall, in order to add the complexity of tunneling, I need to show it makes a real difference in the real world.

So maybe I’ll have more on this after the Ubiquiti results…

dtaht commented 7 years ago

Pete Heist notifications@github.com writes:

The "rightest" answer for backhaul wifi is to leave wmm enabled, but stop classifying anything into anything other than the BE queue.

That was a one line patch at the time. There is a new abstraction that I don't know how to get to, that allows setting a qos map for wifi, that can be made to do the same thing.


heistp commented 7 years ago

On Apr 10, 2017, at 7:03 PM, Dave Täht notifications@github.com wrote:

The "rightest" answer for backhaul wifi is to leave wmm enabled, but stop classifying anything into anything other than the BE queue.

That was a one line patch at the time. There is a new abstraction that I don't know how to get to, that allows setting a qos map for wifi, that can be made to do the same thing.

Oh yeah, that's “righter” for sure. Then you’d preserve diffserv markings without tunneling. I just need to find out if this is a problem on UBNT, and fix it there if so.

Even still, I’m not done experimenting with my “one weird trick” just yet. I tested a “poor man’s full duplex WiFi” by using the fact that the transmit and receive sides of IPIP tunnels are separate, and can travel over separate routes. Unfortunately I didn’t have four WiFi devices at the time I tried it, so I used Ethernet for one direction in the link. But the Flent results were rather beautiful, and one-way latency under load for the WiFi link was ~2.5ms with Cake limiting. I just got four UBNT NanoBridge M5’s to test this fully, so will include it in my results. If it works well, then in theory you could have a full-duplex WiFi setup, with failover to half-duplex, for much less than what it usually costs. This might be useful for backhaul. (BTW, one might be able to do this with straight asymmetric routing and no tunnel also, but as I understand it, asymmetric routing is generally not something you want to do on purpose. I don’t understand all of the reasons why, beyond breaking NAT setups.)

tohojo commented 6 years ago

So any updates on any of this work? :)

-Toke

heistp commented 6 years ago

Funny you should ask...it was impossible to do anything over the summer, but in the last couple of weeks I've gotten close on the new latency tester. It took some time playing around with timer error, system vs monotonic clock values, and socket options, among other things (Windows might be mostly a lost cause on that). A few more things left to do, and I hope to update more soon...

tron:~/src/github.com/peteheist/irtt:% ./irtt -fill rand -fillall -i 10ms -l 160 -d 5s -timer comp -ts a.b.c.d
IRTT to a.b.c.d (a.b.c.d:2112)

                  RTT: mean=19.7584ms min=12.4221ms max=64.2016ms
   one-way send delay: mean=9.1482ms min=3.8003ms max=44.6247ms
one-way receive delay: mean=10.6096ms min=8.0872ms max=42.1626ms
packets received/sent: 498/499 (0.20% loss)
  bytes received/sent: 79680/79840
    receive/send rate: 127.8 Kbps / 127.7 Kbps
             duration: 5.209s (wait 193ms)
         timer misses: 1/500 (0.20% missed)
          timer error: mean=-997ns (-0.01%) min=-3.492683ms max=2.55169ms
       send call time: mean=84.8µs min=13.2µs max=180.4µs

tohojo commented 6 years ago

Pete Heist notifications@github.com writes:

Funny you should ask...it was impossible to do anything over the summer, but in the last couple of weeks I've gotten close on the new latency tester. It took some time playing around with timer error, system vs monotonic clock values, and socket options, among other things (Windows might be mostly a lost cause on that). A few more things left to do, and I hope to update more soon...

Neat! Does it do machine parsable output, and can it output stats at intervals during the test? :)

-Toke

heistp commented 6 years ago

Now's a good time to ask about the intervals for stats. :) What interval would you send packets on during an RRUL test? And do you just want the result for every packet, or would you rather it send with a shorter interval than the interval results are reported on?

I pictured you'd run at an interval of say 100ms - 200ms and just use every result, not that you'd run at a shorter interval of say 10ms and get results every 200ms. Much shorter intervals are possible, just then there's more output if the results for all round trips are returned.

For output while the test is running there's a verbose mode (sample output below), but I think you're more interested in what will be in the JSON, which would be written all at once at the end of the test.

The usage is below for reference. There's no option there for output to JSON because it's not implemented yet. :) Things are in structs; I just need to add the tags and write it. Everything that's in the usage is working, as it were. Still need to finish some things with stats and the handshaking...

tron:~/src/github.com/peteheist/irtt:% ./irtt -ts -v localhost
IRTT to localhost (127.0.0.1:2112)
seq=0 len=44 rtt=169.6µs sd=125.9µs rd=43.8µs
seq=1 len=44 rtt=110.6µs sd=75.5µs rd=35.2µs
seq=2 len=44 rtt=132.9µs sd=79.5µs rd=53.4µs
seq=3 len=44 rtt=118.5µs sd=84.4µs rd=34.1µs
seq=4 len=44 rtt=100µs sd=63µs rd=37µs

                  RTT: mean=126.3µs min=100µs max=169.6µs
   one-way send delay: mean=85.6µs min=63µs max=125.9µs
one-way receive delay: mean=40.7µs min=34.1µs max=53.4µs
packets received/sent: 5/5 (0.00% loss)
  bytes received/sent: 220/220
    receive/send rate: 2.2 Kbps / 2.2 Kbps
             duration: 1s (wait 0s)
         timer misses: 0/5 (0.00% missed)
          timer error: mean=523.772µs (0.26%) min=73.792µs max=1.797681ms
       send call time: mean=29.3µs min=21.3µs max=46.2µs
irtt: measures round-trip time with isochronous UDP packets

Usage: irtt [options] host|host:port (client)
       irtt [options] -s             (server)
       irtt -h | -version            (help or version)

Client options:
---------------

-d duration    total time to send (default 1s, see Duration units below)
-i interval    send interval (default 200ms, see Duration units below)
-l length      target length of packet (including irtt headers, default 0)
               increased as necessary for irtt headers, common values:
               1472 (max unfragmented size of IPv4 datagram for 1500 byte MTU)
               1452 (max unfragmented size of IPv6 datagram for 1500 byte MTU)
-ts            request a timestamp from the server, which is used to estimate
               one-way delay (clocks must be externally synchronized)
-q             quiet, suppress all output
-v             verbose, show received packets
-dscp dscp     dscp value (default 0, 0x prefix for hex), common values:
               0 (Best effort)
               8 (Bulk)
               40 (CS5)
               46 (Expedited forwarding)
-df string     setting for do not fragment (DF) bit in all packets:
               default: OS default
               false: DF bit not set
               true: DF bit set
-wait wait     wait time at end of test for unreceived replies (default 3x4s)
               - Valid formats -
               #xduration: # times max RTT, or duration if no response
               #rduration: # times RTT, or duration if no response
               duration: fixed duration (see Duration units below)
               - Examples -
               3x4s: 3 times max RTT, or 4 seconds if no response
               1500ms: fixed 1500 milliseconds
-timer timer   timer for waiting to send packets (default simple)
               simple: Go's standard time.Timer
               comp: simple timer with error compensation (see -tcomp)
               hybrid:#: comp timer for sleep factor, busy for remainder
                   (see -tcomp, default sleep factor 0.95)
               busy: busy wait loop (high precision and CPU, blasphemy)
-tcomp alg     timer compensation averaging algorithm (default exp:0.10)
               avg: cumulative average error
               win:#: moving average error with window # (default 5)
               exp:#: exponential average with alpha # (default 0.10)
-fill fill     fill payload with given data (default none)
               none: leave payload as all zeroes
               pattern:XX: use repeating pattern of hex (default fe01)
               rand: use random bytes from Go's math.rand
-fillall       if true, fill all packets instead of repeating the first
               (makes rand unique per packet and pattern continue)
-local addr    local address (default from OS), valid formats:
               :port (all IPv4/IPv6 addresses with port)
               host (IPv4 addr or hostname with dynamic port)
               host:port (IPv4 addr or hostname with port)
               [ipv6-host%zone] (IPv6 addr or hostname with dynamic port)
               [ipv6-host%zone]:port (IPv6 addr or hostname with port)

Server options:
---------------

-b addresses   bind addresses (default :2112), comma separated list of:
               :port (all IPv4/IPv6 addresses with port)
               host (IPv4 addr or hostname with default port 2112)
               host:port (IPv4 addr or hostname with port)
               [ipv6-host%zone] (IPv6 addr or hostname with default port 2112)
               [ipv6-host%zone]:port (IPv6 addr or hostname with port)
-max-length #  max packet length (default 0)
               0 means calculate from max MTU of all interfaces
-goroutines #  number of goroutines to serve requests with (default 1)
               0 means use the number of CPUs reported by Go (8)
               increasing this adds both concurrency and lock contention
-no-timestamp  do not add timestamps when requested by client
-no-handshake  disable three-way handshake
               vulnerable to reply redirection, do not use on open Internet

Common options:
---------------

-hmac key      add HMAC with specified key to all packets, provides:
               dropping of all packets without a correct HMAC
               protection for server against unauthorized discovery and use
-4             IPv4 only
-6             IPv6 only
-ttl ttl       time to live (default 0, meaning use OS default)

Duration units:
---------------

Durations are a sequence of decimal numbers, each with optional fraction, and
unit suffix, such as: "300ms", "1m30s" or "2.5m". Sanity not enforced.

h              hours
m              minutes
s              seconds
ms             milliseconds
ns             nanoseconds

tohojo commented 6 years ago

Pete Heist notifications@github.com writes:

Now's a good time to ask about the intervals for stats. :) What interval would you send packets on during an RRUL test? And do you just want the result for every packet, or would you rather it send with a shorter interval than the interval results are reported on?

Make it configurable and default to 1 second, like normal ping? Results should be per packet, definitely.

I pictured you'd run at an interval of say 100ms - 200ms and just use every result, not that you'd run at a shorter interval of say 10ms and get results every 200ms. Much shorter intervals are possible, just then there's more output if the results for all round trips are returned.

Sure, a shorter interval will produce more data; that is sorta the expectation :)

For output while the test is running there's a verbose mode (sample output below), but I think you're more interested in what will be in the JSON, which would be written all at once at the end of the test.

Make sure you timestamp each sample (with the time the reply was received), and add that to the output. For the time being, Flent parses everything after the fact anyway, but being able to get the JSON output one sample at a time might also be useful...


tron:~/src/github.com/peteheist/irtt:% ./irtt -ts -v localhost
IRTT to localhost (127.0.0.1:2112)
seq=0 len=44 rtt=169.6µs sd=125.9µs rd=43.8µs

Unicode? Brave of you ;) I foresee that this will break in some weird terminals somewhere :P

Also, having the per-packet output on by default would match the behaviour of the normal ping tool...

seq=1 len=44 rtt=110.6µs sd=75.5µs rd=35.2µs
seq=2 len=44 rtt=132.9µs sd=79.5µs rd=53.4µs
seq=3 len=44 rtt=118.5µs sd=84.4µs rd=34.1µs
seq=4 len=44 rtt=100µs sd=63µs rd=37µs

                  RTT: mean=126.3µs min=100µs max=169.6µs
   one-way send delay: mean=85.6µs min=63µs max=125.9µs
one-way receive delay: mean=40.7µs min=34.1µs max=53.4µs
packets received/sent: 5/5 (0.00% loss)
  bytes received/sent: 220/220
    receive/send rate: 2.2 Kbps / 2.2 Kbps
             duration: 1s (wait 0s)
         timer misses: 0/5 (0.00% missed)
          timer error: mean=523.772µs (0.26%) min=73.792µs max=1.797681ms
       send call time: mean=29.3µs min=21.3µs max=46.2µs

irtt: measures round-trip time with isochronous UDP packets

Usage: irtt [options] host|host:port (client)
       irtt [options] -s             (server)
       irtt -h | -version            (help or version)

Might I suggest you consider using https://github.com/spf13/pflag ? Gets you normal GNU style long options, instead of the weird stuff that Go defaults to...

Client options:

-d duration    total time to send (default 1s, see Duration units below)
-i interval    send interval (default 200ms, see Duration units below)
-l length      target length of packet (including irtt headers, default 0)

Most other tools call this size and not length. I would expect -l to set the test length.

           increased as necessary for irtt headers, common values:
           1472 (max unfragmented size of IPv4 datagram for 1500 byte MTU)
           1452 (max unfragmented size of IPv6 datagram for 1500 byte MTU)

-ts            request a timestamp from the server, which is used to estimate
               one-way delay (clocks must be externally synchronized)
-q             quiet, suppress all output
-v             verbose, show received packets
-dscp dscp     dscp value (default 0, 0x prefix for hex), common values:
               0 (Best effort)
               8 (Bulk)
               40 (CS5)
               46 (Expedited forwarding)
-df string     setting for do not fragment (DF) bit in all packets:
               default: OS default
               false: DF bit not set
               true: DF bit set
-wait wait     wait time at end of test for unreceived replies (default 3x4s)

               - Valid formats -
               #xduration: # times max RTT, or duration if no response
               #rduration: # times RTT, or duration if no response
               duration: fixed duration (see Duration units below)
               - Examples -
               3x4s: 3 times max RTT, or 4 seconds if no response
               1500ms: fixed 1500 milliseconds
-timer timer   timer for waiting to send packets (default simple)
               simple: Go's standard time.Timer
               comp: simple timer with error compensation (see -tcomp)
               hybrid:#: comp timer for sleep factor, busy for remainder
                   (see -tcomp, default sleep factor 0.95)
               busy: busy wait loop (high precision and CPU, blasphemy)

Have you looked into timerfd? Not sure if you can use that from Go, but it gives quite good precision...

-tcomp alg     timer compensation averaging algorithm (default exp:0.10)
               avg: cumulative average error
               win:#: moving average error with window # (default 5)
               exp:#: exponential average with alpha # (default 0.10)

What does this do?

-fill fill     fill payload with given data (default none)
               none: leave payload as all zeroes
               pattern:XX: use repeating pattern of hex (default fe01)
               rand: use random bytes from Go's math.rand
-fillall       if true, fill all packets instead of repeating the first
               (makes rand unique per packet and pattern continue)

What's the performance impact of fillall?

-Toke

heistp commented 6 years ago

On Sep 18, 2017, at 7:01 PM, Toke Høiland-Jørgensen notifications@github.com wrote:

I pictured you'd run at an interval of say 100ms - 200ms and just use every result, not that you'd run at a shorter interval of say 10ms and get results every 200ms. Much shorter intervals are possible, just then there's more output if the results for all round trips are returned.

Sure, a shorter interval will produce more data; that is sorta the expectation :)

Ok, we’re on the same page with results recording. :)

For output while the test is running there's a verbose mode (sample output below), but I think you're more interested in what will be in the JSON, which would be written all at once at the end of the test.

Make sure you timestamp each sample (with the time the reply was received), and add that to the output. For the time being, Flent parses everything after the fact anyway (for now), but being able to get the JSON output one sample at a time might also be useful…

Yes, each sample has nanosecond timestamps from both the wall clock and monotonic clock (relative to start of test, useful for more accurately calculating packet delay variation, he said hopefully).


tron:~/src/github.com/peteheist/irtt:% ./irtt -ts -v localhost
IRTT to localhost (127.0.0.1:2112)
seq=0 len=44 rtt=169.6µs sd=125.9µs rd=43.8µs

Unicode? Brave of you ;) I foresee that this will break in some weird terminals somewhere :P

Brave of them, that’s just Go’s default output for time.Duration. And I figured hey, it’s 2017, right? :)

Also, having the per-packet output on by default would match the behaviour of the normal ping tool...

seq=1 len=44 rtt=110.6µs sd=75.5µs rd=35.2µs
seq=2 len=44 rtt=132.9µs sd=79.5µs rd=53.4µs
seq=3 len=44 rtt=118.5µs sd=84.4µs rd=34.1µs
seq=4 len=44 rtt=100µs sd=63µs rd=37µs

Ok, I can consider that. I defaulted interval to 200ms and duration to 1s. But I often find myself testing with intervals of 100us or less to check the behavior in extreme conditions, but then output for every packet is unwanted. Hmm, maybe I could output summary stats at regular intervals in that case. Added that to the list. So this might not behave exactly like ‘ping’, we’ll see.

irtt: measures round-trip time with isochronous UDP packets

Usage: irtt [options] host|host:port (client) irtt [options] -s (server) irtt -h | -version (help or version)

Might I suggest you consider using https://github.com/spf13/pflag ? Gets you normal GNU style long options, instead of the weird stuff that Go defaults to...

Good idea, added it to the list. I’m also watching out for executable size increases from dependencies.

Client options:

-d duration    total time to send (default 1s, see Duration units below)
-i interval    send interval (default 200ms, see Duration units below)
-l length      target length of packet (including irtt headers, default 0)

Most other tools call this size and not length. I would expect -l to set the test length.

I did this because I use -s for the server, and then when I saw the UDP field in RFC 768 is called ‘length’ I justified it for myself. :) Point taken though about what people might expect, I’ll think about “-server” instead. I could also split the client and server into separate executables, but so far since there wasn’t much of a size savings from that I haven’t bothered. This thing is mostly Go standard libs and my 2 Kloc mosquito on top.

-timer timer   timer for waiting to send packets (default simple)
               simple: Go's standard time.Timer
               comp: simple timer with error compensation (see -tcomp)
               hybrid:#: comp timer for sleep factor, busy for remainder
                   (see -tcomp, default sleep factor 0.95)
               busy: busy wait loop (high precision and CPU, blasphemy)

Have you looked into timerfd? Not sure if you can use that from Go, but it gives quite good precision…

I haven’t. By the way, the entire thing is split into an API and the irtt command line app. So one (or me) can just implement the irtt.Timer interface with something better.

So far, the Linux timer (Debian 8 on a raspi 2 so far) is already quite good. The OS X timer seems to have much higher error, which increases at smaller intervals (not going to even start on Windows), which led me to...

-tcomp alg     timer compensation averaging algorithm (default exp:0.10)
               avg: cumulative average error
               win:#: moving average error with window # (default 5)
               exp:#: exponential average with alpha # (default 0.10)

What does this do?

“-timer comp” monitors the timer error and adds a correction factor. “-tcomp” selects the moving average algorithm to use to select the correction factor. On OS X exponential average seems to work well, but a simple moving average with window works well too, better than a cumulative average. I added the option so I could play with it more. Here are two runs on OS X at a 200us interval and you can see the improvement of timer compensation (true, when it comes to error, ‘min’ and ‘max’ should probably be by absolute value...):

tron:~/src/github.com/peteheist/irtt:% ./irtt -i 200us localhost
IRTT to localhost (127.0.0.1:2112)
timer error: mean=92.062µs (46.03%) min=8.427µs max=607.725µs

tron:~/src/github.com/peteheist/irtt:% ./irtt -timer comp -tcomp exp:0.1 -i 200us localhost
IRTT to localhost (127.0.0.1:2112)
timer error: mean=135ns (0.07%) min=-71.77µs max=628.592µs
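
For reference, a rough reconstruction of what the 'comp' timer with an exponential average might look like (my own sketch of the idea, not the actual implementation): measure how much each sleep overshoots, keep an exponentially weighted average of that error, and subtract it from the next sleep request.

package main

import (
	"fmt"
	"time"
)

// compSleep sleeps for roughly interval, subtracting an exponentially
// averaged estimate of past OS oversleep (alpha = 0.10, as in exp:0.10).
type compSleep struct {
	avgErr time.Duration
	alpha  float64
}

func (c *compSleep) sleep(interval time.Duration) {
	d := interval - c.avgErr // ask for less, expecting the OS to oversleep
	if d < 0 {
		d = 0
	}
	start := time.Now()
	time.Sleep(d)
	overshoot := time.Since(start) - d // how much this sleep overshot the request
	c.avgErr += time.Duration(c.alpha * float64(overshoot-c.avgErr))
}

func main() {
	c := &compSleep{alpha: 0.10}
	for i := 0; i < 5; i++ {
		t0 := time.Now()
		c.sleep(10 * time.Millisecond)
		fmt.Printf("tick %d: slept %v (avg error %v)\n", i, time.Since(t0), c.avgErr)
	}
}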

-fill fill     fill payload with given data (default none)
               none: leave payload as all zeroes
               pattern:XX: use repeating pattern of hex (default fe01)
               rand: use random bytes from Go's math.rand
-fillall       if true, fill all packets instead of repeating the first
               (makes rand unique per packet and pattern continue)

What's the performance impact of fillall?

Depends on which fill is used. Here’s an extreme case of blasting 9000 byte packets as fast as possible:

tron:~/src/github.com/peteheist/irtt:% ./irtt -l 9000 -i 1ns localhost
RTT: mean=335.2µs min=25.8µs max=3.3848ms
receive/send rate: 4.417 Gbps / 5.539 Gbps

tron:~/src/github.com/peteheist/irtt:% ./irtt -l 9000 -i 1ns -fill pattern:cafebabe -fillall localhost IRTT to localhost (127.0.0.1:2112) RTT: mean=42.5µs min=30.8µs max=693.5µs receive/send rate: 1.737 Gbps / 1.737 Gbps

tron:~/src/github.com/peteheist/irtt:% ./irtt -l 9000 -i 1ns -fill rand -fillall localhost IRTT to localhost (127.0.0.1:2112) RTT: mean=43.9µs min=28.9µs max=1.3064ms receive/send rate: 1.995 Gbps / 1.994 Gbps

“-fill pattern" is just my simple ‘for' loop for now that copies the pattern byte by byte, so the longer the pattern, the faster it is as you’re not restarting the loop counter as often. I could improve its performance later by building a larger buffer first and using Go’s copy builtin, which I’ll do if “fill” is something people need more.
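
The copy-based version would look something like this (a sketch of that approach, not what’s in there now):

package main

import "fmt"

// fillPattern fills payload with a repeating pattern using the copy builtin,
// doubling the filled region each pass instead of copying byte by byte.
func fillPattern(payload, pattern []byte) {
	if len(pattern) == 0 || len(payload) == 0 {
		return
	}
	n := copy(payload, pattern)
	for n < len(payload) {
		n += copy(payload[n:], payload[:n])
	}
}

func main() {
	buf := make([]byte, 16)
	fillPattern(buf, []byte{0xca, 0xfe, 0xba, 0xbe})
	fmt.Printf("% x\n", buf) // ca fe ba be repeated to fill the buffer
}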

“-fill rand” is Go’s math.rand, standard non-crypto-quality randomness. I don’t see that secure random data is necessary to where I need to take the hit of importing that, but it’s easy to add.

But since the packet is prepared before it’s time to send and latency is measured from right before calling into the OS to send, I don’t suppose “-fill" should affect latency measurements, other than how it would affect any other things in the stack, like if someone’s testing over a link with compression or something.

Latency actually drops in the case above when adding -fill, but I suspect that’s simply because the throughput has gone down considerably and you’re not filling up the stack with packets as quickly.

dtaht commented 6 years ago

OWD would be marvelous. The only other tool we have that does that is owamp which has some nervous-making overly portable timing code in it that I've never got around to improving.

I am increasingly interested in sub 10ms measurements as basically 3-4ms is the current noise floor where 100usec or so would be rather helpful.

I am a fan of timerfds (in select loops) but perhaps go's timer implementation is sane. Otherwise there is an example of use in misc/tc_iterate.c

and...

go pete!

heistp commented 6 years ago

On Sep 19, 2017, at 1:26 AM, Dave Täht notifications@github.com wrote:

OWD would be marvelous. The only other tool we have that does that is owamp which has some nervous-making overly portable timing code in it that I've never got around to improving.

OWD has been working and is fun to play with, only it's subject to wall clock synchronization of course. My MBP clock tends to speed up when it's sleeping (~4 sec overnight) so I have to wait a while in the morning until NTP fixes it before OWD looks correct. Even then, how can I know how accurate it is?

Last night I got packet delay variation (jitter) calculations working with the monotonic clock and am playing with that. Those calculations work fine now regardless of wall clock synchronization because they use the monotonic clock, which also prevents NTP from disrupting those results if it corrects in the middle of the test. Here’s one minute of something like a G.711 VoIP call between two sites that access the Internet with point-to-point WiFi:

tron:~/src/github.com/peteheist/irtt:% ./irtt -d 1m -i 10ms -l 160 -timer hybrid -ts a.b.c.d
IRTT to a.b.c.d (a.b.c.d:2112)

              RTT: mean=16.5ms min=11.5ms max=94.1ms
    RTT variation: mean=2.85ms min=12ns max=79.5ms
       send delay: mean=7.05ms min=3.62ms max=61.8ms

 send delay variation: mean=2.6ms min=220ns max=57ms
        receive delay: mean=9.42ms min=7.54ms max=71ms
 recv delay variation: mean=1.18ms min=252ns max=61.5ms
packets received/sent: 5999/6000 (0.02% loss)
  bytes received/sent: 959840/960000
    receive/send rate: 128.0 Kbps / 128.0 Kbps
             duration: 1m0s (wait 282.3ms)
         timer misses: 0/6000 (0.00% missed)
          timer error: mean=27.8µs (0.28%) min=0s max=1.48ms
       send call time: mean=73.8µs min=12.6µs max=195µs

I changed timer error to the absolute value, otherwise when I add a timer correction with averaging, the mean error tends to average out to around zero over time now doesn’t it? :) Now the differences in error between timer options (simple, comp, hybrid and busy) are clearer:

tron:~/src/github.com/peteheist/irtt:% ./irtt -d 5s -i 10ms -l 160 -timer simple -ts localhost
timer error: mean=1.49ms (14.92%) min=19µs max=2.54ms

tron:~/src/github.com/peteheist/irtt:% ./irtt -d 5s -i 10ms -l 160 -timer comp -ts localhost
timer error: mean=394µs (3.94%) min=77ns max=2.15ms

tron:~/src/github.com/peteheist/irtt:% ./irtt -d 5s -i 10ms -l 160 -timer hybrid -ts localhost
timer error: mean=37µs (0.37%) min=0s max=1.99ms

tron:~/src/github.com/peteheist/irtt:% ./irtt -d 5s -i 10ms -l 160 -timer busy -ts localhost
timer error: mean=75ns (0.00%) min=0s max=10.3µs

I am increasingly interested in sub 10ms measurements as basically 3-4ms is the current noise floor where 100usec or so would be rather helpful.

That’s also why I made the server actually put four timestamps total in the packet- receive wall & monotonic plus send wall & monotonic. It may be obsessive, but the mean processing (turnaround) time on the server is up to 5us on my raspi, 2-3us on the MBP, and after thousands of packets I can see outliers up to 200us. I just don’t want to be part of the problem, so I subtract that processing time out from the RTT, and the one-way calculations are a bit “accurate-er” also since they used the times right after receive and right before send.
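
In other words, roughly (a sketch with made-up names, not the actual code):

package main

import (
	"fmt"
	"time"
)

// Sketch of the four-timestamp arithmetic described above: t1/t4 are the
// client's send/receive wall times, t2/t3 the server's receive/send wall
// times carried in the reply.
func rttAndOWD(t1, t2, t3, t4 time.Time) (rtt, sendDelay, recvDelay time.Duration) {
	server := t3.Sub(t2)      // server turnaround (processing) time
	rtt = t4.Sub(t1) - server // RTT with server processing subtracted out
	sendDelay = t2.Sub(t1)    // one-way delay client -> server (needs clock sync)
	recvDelay = t4.Sub(t3)    // one-way delay server -> client (needs clock sync)
	return rtt, sendDelay, recvDelay
}

func main() {
	now := time.Now()
	t1, t2 := now, now.Add(7*time.Millisecond)
	t3, t4 := t2.Add(5*time.Microsecond), t1.Add(16*time.Millisecond)
	rtt, sd, rd := rttAndOWD(t1, t2, t3, t4)
	fmt.Println(rtt, sd, rd)
}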

But even if I call ‘write’ to send the UDP packet with excellent precision, there’s still a lot that goes on before the packet is released on the network to where I’m not sure how precise it is when the bits actually go out the wire. I may be scraping for a few microseconds only to suffer tens or hundreds through the stack. Oh well.

I am a fan of timerfds (in select loops) but perhaps go's timer implementation is sane. Otherwise there is an example of use in misc/tc_iterate.c

and...

go pete!

Thanks! I’ll try to wrap this up soon so it can be put to use. I suspect it will be much more interesting to look at in flent than my dry text results.

Not sure how Go’s timer is implemented, but I see a constant ‘Timerfd’ in a comment in sys_linux.go in the ‘Unimplemented’ section. And there seem to be more Go timer projects on GitHub, so there might be more to do with that...

tohojo commented 6 years ago

-tcomp alg    timer compensation averaging algorithm (default exp:0.10)
              avg: cumulative average error
              win:#: moving average error with window # (default 5)
              exp:#: exponential average with alpha # (default 0.10)

What does this do?

“-timer comp” monitors the timer error and adds a correction factor. “-tcomp” selects the moving average algorithm to use to select the correction factor. On OS X exponential average seems to work well, but a simple moving average with window works well too, better than a cumulative average. I added the option so I could play with it more. Here are two runs on OS X at a 200us interval and you can see the improvement of timer compensation (true, when it comes to error, ‘min’ and ‘max’ should probably be by absolute value...):

So by "compensate" you mean "adjust the interval if the timer runs late"?

tron:~/src/github.com/peteheist/irtt:% ./irtt -i 200us localhost
IRTT to localhost (127.0.0.1:2112)
timer error: mean=92.062µs (46.03%) min=8.427µs max=607.725µs

tron:~/src/github.com/peteheist/irtt:% ./irtt -timer comp -tcomp exp:0.1 -i 200us localhost
IRTT to localhost (127.0.0.1:2112)
timer error: mean=135ns (0.07%) min=-71.77µs max=628.592µs

Median values might be useful here to get an idea of the distribution. The span between min and max seems to increase from your compensation?

But since the packet is prepared before it’s time to send and latency is measured from right before calling into the OS to send, I don’t suppose “-fill" should affect latency measurements, other than how it would affect any other things in the stack, like if someone’s testing over a link with compression or something.

Ah good, well as long as you don't include the fill time in the latency calculation, this should be fine in most use cases :)

-Toke

tohojo commented 6 years ago

Pete Heist notifications@github.com writes:

On Sep 19, 2017, at 1:26 AM, Dave Täht notifications@github.com wrote:

OWD would be marvelous. The only other tool we have that does that is owamp which has some nervous-making overly portable timing code in it that I've never got around to improving.

OWD has been working and is fun to play with, only it's subject to wall clock synchronization of course. My MBP clock tends to speed up when it's sleeping (~4 sec overnight) so I have to wait a while in the morning until NTP fixes it before OWD looks correct. Even then, how can I know how accurate it is?

If you want microsecond precision, you'll generally need to run PTP between the nodes; NTP is not good enough for that (in my experience).

Last night I got packet delay variation (jitter) calculations working with the monotonic clock and am playing with that.

Measured how? The term 'jitter' is not strictly well-defined. See https://en.wikipedia.org/wiki/Packet_delay_variation

Those calculations work fine now regardless of wall clock synchronization because they use the monotonic clock, which also prevents NTP from disrupting those results if it corrects in the middle of the test. Here’s one minute of something like a G.711 VoIP call between two sites that access the Internet with point-to-point WiFi:

tron:~/src/github.com/peteheist/irtt:% ./irtt -d 1m -i 10ms -l 160 -timer hybrid -ts a.b.c.d
IRTT to a.b.c.d (a.b.c.d:2112)

              RTT: mean=16.5ms min=11.5ms max=94.1ms
    RTT variation: mean=2.85ms min=12ns max=79.5ms

Is this what you referred to as jitter above? I'm generally wary of the term 'variation' as it's often confused with 'variance'. You may want to consider picking a different term. Also, adding median values to these statistics would be good :)

But even if I call ‘write’ to send the UDP packet with excellent precision, there’s still a lot that goes on before the packet is released on the network to where I’m not sure how precise it is when the bits actually go out the wire. I may be scraping for a few microseconds only to suffer tens or hundreds through the stack. Oh well.

Well, the stack can often simply be considered part of the network that you want to measure, so this is not too big of a problem, I'd say...

-Toke

heistp commented 6 years ago

On Sep 19, 2017, at 10:35 AM, Toke Høiland-Jørgensen notifications@github.com wrote:

“-timer comp” monitors the timer error and adds a correction factor. “-tcomp” selects the moving average algorithm to use to select the correction factor. On OS X exponential average seems to work well, but a simple moving average with window works well too, better than a cumulative average. I added the option so I could play with it more. Here are two runs on OS X at a 200us interval and you can see the improvement of timer compensation (true, when it comes to error, ‘min’ and ‘max’ should probably be by absolute value...):

So by "compensate" you mean "adjust the interval if the timer runs late”?

It means to adjust the sleep time if the timer runs late. It makes every effort to keep the interval (meaning the period on which send is called) constant relative to the start of the test. To get there, it checks the time before and after each sleep, then on each subsequent call to sleep applies the current correction factor to the sleep duration that’s actually requested.
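
In rough Go, the loop looks something like this (a simplified sketch of the idea, not the actual code):

package main

import (
	"fmt"
	"time"
)

// Compensated scheduling sketch: each send is scheduled at an absolute time
// relative to the test start (so errors don't accumulate), and the requested
// sleep is shortened by a running exponential average of the timer error.
func main() {
	const (
		interval = 10 * time.Millisecond
		count    = 10
		alpha    = 0.10
	)
	start := time.Now()
	var correction time.Duration
	for i := 1; i <= count; i++ {
		target := start.Add(time.Duration(i) * interval) // absolute schedule, no drift
		time.Sleep(time.Until(target) - correction)      // shorten sleep by the estimate
		lateness := time.Since(target)                   // timer error for this wakeup
		correction = time.Duration(alpha*float64(lateness) + (1-alpha)*float64(correction))
		fmt.Printf("seq=%d error=%v correction=%v\n", i, lateness, correction)
		// a real client would send the packet here
	}
}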

tron:~/src/github.com/peteheist/irtt:% ./irtt -i 200us localhost
IRTT to localhost (127.0.0.1:2112)
timer error: mean=92.062µs (46.03%) min=8.427µs max=607.725µs

tron:~/src/github.com/peteheist/irtt:% ./irtt -timer comp -tcomp exp:0.1 -i 200us localhost
IRTT to localhost (127.0.0.1:2112)
timer error: mean=135ns (0.07%) min=-71.77µs max=628.592µs

Median values might be useful here to get an idea of the distribution. The span between min and max seems to increase from your compensation?

Median could be nice, but I currently don’t store the timer error per-packet, and doing a running calculation of median is a harder problem. I did add standard deviation today with Welford’s method and need to get that in the output.
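
For reference, Welford’s method is just a few lines. A sketch (not the actual code):

package main

import (
	"fmt"
	"math"
	"time"
)

// welford accumulates mean and variance online (Welford's method), so
// per-packet timer errors don't need to be stored.
type welford struct {
	n    int
	mean float64
	m2   float64
}

func (w *welford) push(x float64) {
	w.n++
	d := x - w.mean
	w.mean += d / float64(w.n)
	w.m2 += d * (x - w.mean)
}

func (w *welford) stddev() float64 {
	if w.n < 2 {
		return 0
	}
	return math.Sqrt(w.m2 / float64(w.n-1)) // sample standard deviation
}

func main() {
	var w welford
	for _, e := range []time.Duration{90 * time.Microsecond, 10 * time.Microsecond, 40 * time.Microsecond} {
		w.push(float64(e))
	}
	fmt.Println(time.Duration(w.mean), time.Duration(w.stddev()))
}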

That brings up a point, I don’t store either timer error or send call times per-packet (attempt to keep memory usage down). Would you like to have that info per-packet? It’s certainly possible. Each result currently takes 64 bytes in memory, and adding those would take it to 80 bytes per result. That only starts to matter more at lower intervals.

I definitely plan to do a median calculation for the packet timings at the end of the test. To do that calculation I do have results for each packet.

I’m not sure the min and max increase is statistically significant. Here are two runs for ten seconds instead of one second that show the opposite result (also note that the error is reflected more accurately than my previous runs since I’m using the absolute value of the error instead), and a third run with the hybrid timer (hybrid 80% sleep / 20% busy in this case):

tron:~/src/github.com/peteheist/irtt:% ./irtt -i 200us -d 10s localhost
IRTT to localhost (127.0.0.1:2112)
timer error: mean=89.4µs (44.72%) min=6.36µs max=544µs

tron:~/src/github.com/peteheist/irtt:% ./irtt -i 200us -d 10s -timer comp localhost
IRTT to localhost (127.0.0.1:2112)
timer error: mean=7.29µs (3.65%) min=0s max=521µs

tron:~/src/github.com/peteheist/irtt:% ./irtt -i 200us -d 10s -timer hybrid:0.8 localhost
IRTT to localhost (127.0.0.1:2112)
timer error: mean=447ns (0.22%) min=0s max=574µs

heistp commented 6 years ago

On Sep 19, 2017, at 10:45 AM, Toke Høiland-Jørgensen notifications@github.com wrote:

If you want microsecond precision, you'll generally need to run PTP between the nodes; NTP is not good enough for that (in my experience).

Yeah, I’m seeing error that looks like it’s within a few milliseconds (when it’s not wildly off after startup), but it’s pretty good. Thanks for that tip, I’ll check it out.

Last night I got packet delay variation (jitter) calculations working with the monotonic clock and am playing with that.

Measured how? The term 'jitter' is not strictly well-defined. See https://en.wikipedia.org/wiki/Packet_delay_variation https://en.wikipedia.org/wiki/Packet_delay_variation

I read that page too as I was struggling with this. :) So far, I have just min, mean, max and standard deviation (median to come) of the absolute value of the packet delay variation (as defined in that Wikipedia article, and apparently also defined in some ITU-T recommendation). I’m open, and could probably use some advice, on what’s statistically best, but I definitely want to make sure the raw results are right before statistical analysis.

tron:~/src/github.com/peteheist/irtt:% ./irtt -d 1m -i 10ms -l 160 -timer hybrid -ts a.b.c.d
IRTT to a.b.c.d (a.b.c.d:2112)

          RTT: mean=16.5ms min=11.5ms max=94.1ms
RTT variation: mean=2.85ms min=12ns max=79.5ms

Is this what you referred to as jitter above? I'm generally wary of the term 'variation' as it's often confused with 'variance'. You may want to consider picking a different term. Also, adding median values to these statistics would be good :)

You’re right, I’m conflating the term from packet delay variation, which that Wikipedia article makes clearer also. Any ideas? PDV? Just jitter and an arm wave? :)

But even if I call ‘write’ to send the UDP packet with excellent precision, there’s still a lot that goes on before the packet is released on the network to where I’m not sure how precise it is when the bits actually go out the wire. I may be scraping for a few microseconds only to suffer tens or hundreds through the stack. Oh well.

Well, the stack can often simply be considered part of the network that you want to measure, so this is not too big of a problem, I'd say...

Great point, it’s actually the very part you want to measure if bloat is happening on the test client machine before the packet is released.

tohojo commented 6 years ago

Pete Heist notifications@github.com writes:

On Sep 19, 2017, at 10:35 AM, Toke Høiland-Jørgensen notifications@github.com wrote:

“-timer comp” monitors the timer error and adds a correction factor. “-tcomp” selects the moving average algorithm to use to select the correction factor. On OS X exponential average seems to work well, but a simple moving average with window works well too, better than a cumulative average. I added the option so I could play with it more. Here are two runs on OS X at a 200us interval and you can see the improvement of timer compensation (true, when it comes to error, ‘min’ and ‘max’ should probably be by absolute value...):

So by "compensate" you mean "adjust the interval if the timer runs late”?

It means to adjust the sleep time if the timer runs late. It makes every effort to keep the interval (meaning the period on which send is called) constant relative to the start of the test. To get there, it checks the time before and after each sleep, then on each subsequent call to sleep applies the current correction factor to the sleep duration that’s actually requested.

Right, that's what I thought; but explicitly putting that into the docs would probably be good. And maybe turning it on by default if it does more good than harm? :)

tron:~/src/github.com/peteheist/irtt:% ./irtt -i 200us localhost
IRTT to localhost (127.0.0.1:2112)
timer error: mean=92.062µs (46.03%) min=8.427µs max=607.725µs

tron:~/src/github.com/peteheist/irtt:% ./irtt -timer comp -tcomp exp:0.1 -i 200us localhost
IRTT to localhost (127.0.0.1:2112)
timer error: mean=135ns (0.07%) min=-71.77µs max=628.592µs

Median values might be useful here to get an idea of the distribution. The span between min and max seems to increase from your compensation?

Median could be nice, but I currently don’t store the timer error per-packet, and doing a running calculation of median is a harder problem. I did add standard deviation today with Welford’s method and need to get that in the output.

Fair enough. I don't think this will be something that most people will be interested in anyway...

That brings up a point, I don’t store either timer error or send call times per-packet (attempt to keep memory usage down). Would you like to have that info per-packet? It’s certainly possible. Each result currently takes 64 bytes in memory, and adding those would take it to 80 bytes per result. That only starts to matter more at lower intervals.

I think that in most cases this will not be terribly useful, and would just take up space in the Flent data files. So nah, leave it out for now.

BTW, it occurs to me that you may want to set a lower bound on the interval that a non-privileged user can pick. Ping sets this at 200 ms, which is probably too high, but maybe 1ms? The risk is mitigated somewhat by the initial handshake, but we don't want this to become a DoS tool. Also, maybe more importantly, the server should be able to enforce a lower bound on the interval that it will reply to.

-Toke

tohojo commented 6 years ago

Pete Heist notifications@github.com writes:

On Sep 19, 2017, at 10:45 AM, Toke Høiland-Jørgensen notifications@github.com wrote:

If you want microsecond precision, you'll generally need to run PTP between the nodes; NTP is not good enough for that (in my experience).

Yeah, I’m seeing error that looks like it’s within a few milliseconds (when it’s not wildly off after startup), but it’s pretty good. Thanks for that tip, I’ll check it out.

Last night I got packet delay variation (jitter) calculations working with the monotonic clock and am playing with that.

Measured how? The term 'jitter' is not strictly well-defined. See https://en.wikipedia.org/wiki/Packet_delay_variation https://en.wikipedia.org/wiki/Packet_delay_variation

I read that page too as I was struggling with this. :) So far, I have just min, mean, max and standard deviation (median to come) of the absolute value of the packet delay variation (as defined in that Wikipedia article, and apparently also defined in some ITU-T recommendation).

So that's the instantaneous PDV? I agree that is probably the most reasonable measure to use in this case. Only drawback is that the term is not that well-known; if you just write "IPDV" no one is going to understand what you mean, and "instantaneous packet delay variation" is awfully long...

-Toke

heistp commented 6 years ago

On Sep 19, 2017, at 11:43 AM, Toke Høiland-Jørgensen notifications@github.com wrote:

Measured how? The term 'jitter' is not strictly well-defined. See https://en.wikipedia.org/wiki/Packet_delay_variation https://en.wikipedia.org/wiki/Packet_delay_variation

I read that page too as I was struggling with this. :) So far, I have just min, mean, max and standard deviation (median to come) of the absolute value of the packet delay variation (as defined in that Wikipedia article, and apparently also defined in some ITU-T recommendation).

So that's the instantaneous PDV? I agree that is probably the most reasonable measure to use in this case. Only drawback is that the term is not that well-known; if you just write "IPDV" no one is going to understand what you mean, and "instantaneous packet delay variation" is awfully long...

Yes, the mean/min/max/stddev of all of the instantaneous PDVs. Maybe “IPDV (jitter)” would work along with an explanation in the docs.

If a packet is dropped then you lose two values, the ones before and after the drop. That reminds me, I should take that into account so I don’t include zeroes in the stats for drops (making IPDV look better erroneously), but I suppose I’ll leave an empty string in the raw data for ease of consumption.
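
Something like this is what I mean by skipping the pairs around a drop (a sketch, not the actual code; a negative delay stands in for a lost packet here):

package main

import (
	"fmt"
	"math"
	"time"
)

// ipdv returns |delay[n] - delay[n-1]| for consecutive sequence numbers,
// skipping any pair where either packet was lost, so drops don't inject
// bogus zero values into the stats.
func ipdv(delays []time.Duration) []time.Duration {
	var out []time.Duration
	for i := 1; i < len(delays); i++ {
		if delays[i-1] < 0 || delays[i] < 0 {
			continue // pair spans a lost packet: no IPDV value
		}
		d := delays[i] - delays[i-1]
		out = append(out, time.Duration(math.Abs(float64(d))))
	}
	return out
}

func main() {
	// seq 2 lost (-1); the pairs (1,2) and (2,3) produce no value.
	delays := []time.Duration{16 * time.Millisecond, 18 * time.Millisecond, -1, 17 * time.Millisecond}
	fmt.Println(ipdv(delays))
}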

heistp commented 6 years ago

On Sep 19, 2017, at 11:34 AM, Toke Høiland-Jørgensen notifications@github.com wrote:

BTW, it occurs to me that you may want to set a lower bound on the interval that a non-privileged user can pick. Ping sets this at 200 ms, which is probably too high, but maybe 1ms? The risk is mitigated somewhat by the initial handshake, but we don't want this to become a DoS tool. Also, maybe more importantly, the server should be able to enforce a lower bound on the interval that it will reply to.

Ok, I’ll try getting that info from Go. Sometimes you can touch something in the standard library (like home directory) and lose the ability to cross-compile without a fuss. I’ll see.

I still have more work to do to make it safe for public use. HMAC-MD5 works well and doesn’t affect results at sane intervals, but that’s not how it will be used on public servers.

For now, I’m picturing just a per-client-IP limit on the number of requests per second that are responded to. It could have a maximum bitrate as well, but I want to start somewhere because one can really get into the weeds. It could block IPs associated with malformed packets, try to identify DDoS attacks, have a global rate for handshakes allowed, etc. Will try to get something going and we’ll see then...
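
Roughly what I have in mind, as an untested sketch keyed by client IP (not the actual server code; numbers and names are illustrative):

package main

import (
	"fmt"
	"time"
)

// limiter is a minimal per-client-IP token bucket: each IP may send up to
// burst packets immediately and then rate packets per second.
type limiter struct {
	rate    float64 // tokens added per second
	burst   float64 // bucket capacity
	buckets map[string]*bucket
}

type bucket struct {
	tokens float64
	last   time.Time
}

func newLimiter(rate, burst float64) *limiter {
	return &limiter{rate: rate, burst: burst, buckets: make(map[string]*bucket)}
}

// allow reports whether a packet from ip should be answered now.
func (l *limiter) allow(ip string, now time.Time) bool {
	b, ok := l.buckets[ip]
	if !ok {
		b = &bucket{tokens: l.burst, last: now}
		l.buckets[ip] = b
	}
	b.tokens += now.Sub(b.last).Seconds() * l.rate
	if b.tokens > l.burst {
		b.tokens = l.burst
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

func main() {
	l := newLimiter(100, 10) // 100 pkts/s with a burst of 10, per client IP
	fmt.Println(l.allow("192.0.2.1", time.Now()))
}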

tohojo commented 6 years ago

Pete Heist notifications@github.com writes:

On Sep 19, 2017, at 11:43 AM, Toke Høiland-Jørgensen notifications@github.com wrote:

Measured how? The term 'jitter' is not strictly well-defined. See https://en.wikipedia.org/wiki/Packet_delay_variation https://en.wikipedia.org/wiki/Packet_delay_variation

I read that page too as I was struggling with this. :) So far, I have just min, mean, max and standard deviation (median to come) of the absolute value of the packet delay variation (as defined in that Wikipedia article, and apparently also defined in some ITU-T recommendation).

So that's the instantaneous PDV? I agree that is probably the most reasonable measure to use in this case. Only drawback is that the term is not that well-known; if you just write "IPDV" no one is going to understand what you mean, and "instantaneous packet delay variation" is awfully long...

Yes, the mean/min/max/stddev of all of the instantaneous PDVs. Maybe “IPDV (jitter)” would work along with an explanation in the docs.

Yup, think that would work..

If a packet is dropped then you lose two values, the ones before and after the drop. That reminds me, I should take that into account so I don’t include zeroes in the stats for drops (making IPDV look better erroneously), but I suppose I’ll leave an empty string in the raw data for ease of consumption.

Yeah, there's some issue with the measure being undefined for the first packet, and for packets that never make it through.

What happens if replies arrive out of order?

-Toke

tohojo commented 6 years ago

Pete Heist notifications@github.com writes:

On Sep 19, 2017, at 11:34 AM, Toke Høiland-Jørgensen notifications@github.com wrote:

BTW, it occurs to me that you may want to set a lower bound on the interval that a non-privileged user can pick. Ping sets this at 200 ms, which is probably too high, but maybe 1ms? The risk is mitigated somewhat by the initial handshake, but we don't want this to become a DoS tool. Also, maybe more importantly, the server should be able to enforce a lower bound on the interval that it will reply to.

Ok, I’ll try getting that info from Go. Sometimes you can touch something in the standard library (like home directory) and lose the ability to cross-compile without a fuss. I’ll see.

I still have more work to do to make it safe for public use. HMAC-MD5 works well and doesn’t affect results at sane intervals, but that’s not how it will be used on public servers.

For now, I’m picturing just a per-client-IP limit on the number of requests per second that are responded to. It could have a maximum bitrate as well, but I want to start somewhere because one can really get into the weeds. It could block IPs associated with malformed packets, try to identify DDoS attacks, have a global rate for handshakes allowed, etc. Will try to get something going and we’ll see then...

Sure, doesn't have to be anything fancy.

That reminds me, maybe it would be useful to be able to set a target bitrate? I.e., if you set interval and bitrate, the packet size will be calculated, and if you set packet size and bitrate the interval will be?

-Toke

heistp commented 6 years ago

On Sep 19, 2017, at 12:51 PM, Toke Høiland-Jørgensen notifications@github.com wrote:

If a packet is dropped then you lose two values, the ones before and after the drop. That reminds me, I should take that into account so I don’t include zeroes in the stats for drops (making IPDV look better erroneously), but I suppose I’ll leave an empty string in the raw data for ease of consumption.

Yeah, there's some issue with the measure being undefined for the first packet, and for packets that never make it through.

What happens if replies arrive out of order?

There’s no problem with out of order packets as far as recording packet loss and timing goes. A RoundTrip in the results is identified by its sequence number, not the receive order. In OS X I can just blast packets at a really high rate to the local adapter, then they often come back out of order, which is useful for testing.

Honestly I’m not sure what to do with out of order packets as far as IPDV is concerned. Should my measurements of successive variation be in the order the packets were received, or as it is now, which is sequence number order (order of send)? Section 3.6 of RFC 3393 makes what to do with dups clear, but unless I’m missing something it doesn’t seem to spell out what to do for out of order packets. So I’ll punt for now, put it in the list and leave it, until I know more...

heistp commented 6 years ago

On Sep 19, 2017, at 12:52 PM, Toke Høiland-Jørgensen notifications@github.com wrote:

That reminds me, maybe it would be useful to be able to set a target bitrate? I.e., if you set interval and bitrate, the packet size will be calculated, and if you set packet size and bitrate the interval will be?

That could also be useful, added to the list as well (not promising everything day one just keeping track).

By the way, bit rates are currently calculated for the UDP payload only, without IP, UDP or hardware headers. Accounting for those can also be added later if it’s needed.
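
The arithmetic would be simple either way. A sketch (payload bits only, as noted; names are illustrative):

package main

import (
	"fmt"
	"time"
)

// Fix any two of bitrate, interval and packet length, and the third follows.
func lengthFor(bitrate float64, interval time.Duration) int {
	return int(bitrate * interval.Seconds() / 8) // payload bytes per packet
}

func intervalFor(bitrate float64, length int) time.Duration {
	return time.Duration(float64(length*8) / bitrate * float64(time.Second))
}

func main() {
	// e.g. 64 Kbps at a 20ms interval needs 160-byte payloads (G.711-style).
	fmt.Println(lengthFor(64000, 20*time.Millisecond)) // 160
	fmt.Println(intervalFor(64000, 160))               // 20ms
}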

dtaht commented 6 years ago

Pete Heist notifications@github.com writes:

On Sep 19, 2017, at 12:51 PM, Toke Høiland-Jørgensen notifications@github.com wrote:

If a packet is dropped then you lose two values, the ones before and after the drop. That reminds me, I should take that into account so I don’t include zeroes in the stats for drops (making IPDV look better erroneously), but I suppose I’ll leave an empty string in the raw data for ease of consumption.

Yeah, there's some issue with the measure being undefined for the first packet, and for packets that never make it through.

What happens if replies arrive out of order?

There’s no problem with out of order packets as far as recording packet loss and timing goes. A RoundTrip in the results is identified by its sequence number, not the receive order. In OS X I can just blast packets at a really high rate to the local adapter, then they often come back out of order, which is useful for testing.

Honestly I’m not sure what to do with out of order packets as far as IPDV is concerned. Should my measurements of successive variation be in the order the packets were received, or as it is now, which is sequence number order (order of send)? Section 3.6 of RFC 3393 makes what to do with dups clear, but unless I’m missing something it doesn’t seem to spell out what to do for out of order packets. So I’ll punt for now, put it in the list and leave it, until I know more...

I am all in favor of running code whilst trying to get a real picture of something.

That said, if you want your eyes to bleed a little more, see:

https://tools.ietf.org/html/rfc4656


tohojo commented 6 years ago

Pete Heist notifications@github.com writes:

On Sep 19, 2017, at 12:51 PM, Toke Høiland-Jørgensen notifications@github.com wrote:

If a packet is dropped then you lose two values, the ones before and after the drop. That reminds me, I should take that into account so I don’t include zeroes in the stats for drops (making IPDV look better erroneously), but I suppose I’ll leave an empty string in the raw data for ease of consumption.

Yeah, there's some issue with the measure being undefined for the first packet, and for packets that never make it through.

What happens if replies arrive out of order?

There’s no problem with out of order packets as far as recording packet loss and timing goes. A RoundTrip in the results is identified by its sequence number, not the receive order. In OS X I can just blast packets at a really high rate to the local adapter, then they often come back out of order, which is useful for testing.

It does pose a potential problem for reporting, though, in that you can get multiple data points for the same time. I think it will mostly make the graphs look odd, though, so not a huge issue.

Honestly I’m not sure what to do with out of order packets as far as IPDV is concerned. Should my measurements of successive variation be in the order the packets were received, or as it is now, which is sequence number order (order of send)? Section 3.6 of RFC 3393 makes what to do with dups clear, but unless I’m missing something it doesn’t seem to spell out what to do for out of order packets. So I’ll punt for now, put it in the list and leave it, until I know more...

Hmm, I'd say either report it in sequence number order no matter what, or consider the packet lost as far as IPDV is concerned once you get the next one... Suddenly using a different order is bound to lead to confusion.

-Toke

heistp commented 6 years ago

Making my way through the laundry list.

Regarding security, I suspect that allowing timestamps from the server could aid an attacker in OS fingerprinting. One might be able to compare the change in monotonic vs wall clock and learn something. Any opinion on whether this is actually a concern?

So far, I'm just adding a flag to set the timestamp mode that the server allows (none, single or dual timestamps, dual being stamps on both receive and send). I could also have a flag to disallow the monotonic portion of the timestamps, but I don't want this to get too complicated. I'd rather hear that there isn't any significant risk to exposing both wall and monotonic values and leave it be... :)