Zoxc / crusader

A network throughput and latency tester.
Apache License 2.0

Need help explaining Crusader plots #40

Closed by richb-hanover 1 month ago

richb-hanover commented 2 months ago

@Zoxc @dtaht - I have a bunch of questions about what Crusader plots actually show, and how to explain lousy tests.

This is a first cut for a possible page for the Crusader repo. I would love to get your comments on the following - I'll keep tweaking this article until it's "good enough". Thanks.


Comparison of Crusader Plots

A Crusader test packs a lot of data into a simple display. Here's how to understand what it's showing:

Over Ethernet

[Plot: Mac mini M2 Ethernet to RPi4, 2024-08-18 21:28:06]

This is a plot of a pretty good test. It was performed between two devices on Ethernet - an M2 Mac mini running Crusader GUI connected to a Crusader Server running on a Raspberry Pi 4. Here is what it shows:

Over Wi-Fi from Living Room

[Plot: MacBook in Living Room to RPi4, 2024-08-19 16:48:24]

This is a mess. That is, the Crusader test shows my Wi-Fi network is a wreck.

It's a test run from an Intel MacBook with Crusader GUI over Wi-Fi, tested against the same Raspberry Pi 4 running the server. (My router is a Belkin RT3200 with OpenWrt 22.03.5. I am currently afraid to upgrade to 23.05.4 - search the forums for "OKD".) I see:

Over Wi-Fi from Dining Room

[Plot: MacBook in Dining Room to RPi4, 2024-08-19 11:26:04]

This looks a little better...

Zoxc commented 2 months ago

Over Ethernet

  • The "Both" shows transfers in both directions running at nearly their full rate. (Is it the sum of the two directions?)

It is the sum.

  • The Download latency is a bit spiky, with most values ranging between 5 and 25 msec. (Can anyone say why?)

Possibly some bufferbloat on your Pi. Is it running fq_codel?

  • The "Both" latency is ???? (not sure how to describe it - is it the sum of the up and down latency? Something else?)

There's no "Both" latency. It's just the latency measured during the "Both" load test. "Total latency" would be the sum of up and down.
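To make the two sums above concrete, here is a minimal sketch. The function names and all sample values are hypothetical and are not taken from Crusader's code or from any of the plots in this thread; it only illustrates that the "Both" throughput is download plus upload, and that "Total latency" is up plus down, sample by sample.

```python
def combined_throughput(download_mbps, upload_mbps):
    """Per-sample sum of the two directions, as described for the "Both" line."""
    return [d + u for d, u in zip(download_mbps, upload_mbps)]


def total_latency(up_ms, down_ms):
    """Per-sample sum of up and down latency, as described for "Total latency"."""
    return [u + d for u, d in zip(up_ms, down_ms)]


# Hypothetical sample values, for illustration only.
both = combined_throughput([900.0, 910.0], [800.0, 790.0])
total = total_latency([4.0, 6.5], [3.0, 12.5])
print(both, total)
```

Note that the latency measured during the "Both" load test is still reported per direction; only the throughput line is a sum across directions.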

Over Wi-Fi from Living Room

  • Why is the "Both" throughput pulled down to close to the Upload?

The packet loss and massive bufferbloat are probably interfering with the download TCP connections.

  • What other observations are there from this plot?

There are some regular latency spikes, which look like the station being busy scanning or unable to send data for some reason.

Over Wi-Fi from Dining Room

There are still some regular latency spikes here, not sure about those.

Other avenues for explanation

Can I run the Crusader Server on my router?

That's reasonable if you're testing another device, say an AP, and the router can provide enough bandwidth. MikroTik's RouterOS has container support, and the MikroTik RB4011 seems to be able to send around 5 Gbps.

richb-hanover commented 2 months ago

@Zoxc Thanks for your comments. I have updated the info above with them. MORE COMMENTS, PLEASE :-)

You asked: Is the Pi running fq_codel?

I don't know for sure. Here's the output of the tc command:

deploy@rpi4:~$ sudo tc qdisc show dev eth0
[sudo] password for deploy:
qdisc mq 0: root
qdisc fq_codel 0: parent :5 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :4 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :3 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :1 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64

I don't really understand this stuff, but it appears that the root qdisc is "mq", with five child qdiscs of fq_codel. (This is the default on Ubuntu 22.04 that's running on the RPi.) Would it make a difference to remove all the child qdiscs and just install fq_codel on the root? Thanks again
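As a rough way to double-check that reading of the tc output, here is a small sketch that counts the root and child qdiscs in pasted `tc qdisc show` text. The function name `summarize_qdiscs` is made up for this example, and it assumes the simple "qdisc KIND HANDLE root|parent ..." line layout seen in the output above; other layouts are not handled.

```python
from collections import Counter

# The qdisc lines from the output above, copied verbatim.
SAMPLE = """\
qdisc mq 0: root
qdisc fq_codel 0: parent :5 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :4 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :3 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
qdisc fq_codel 0: parent :1 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64
"""


def summarize_qdiscs(tc_output):
    """Return the root qdisc kind and a count of child qdiscs by kind."""
    root = None
    children = Counter()
    for line in tc_output.splitlines():
        fields = line.split()
        # Skip shell prompts, sudo prompts, and anything that is not a qdisc line.
        if len(fields) < 4 or fields[0] != "qdisc":
            continue
        kind, position = fields[1], fields[3]
        if position == "root":
            root = kind
        elif position == "parent":
            children[kind] += 1
    return {"root": root, "children": dict(children)}


print(summarize_qdiscs(SAMPLE))
```

On the sample above this reports "mq" as the root with five fq_codel children, which matches the description: one fq_codel instance per hardware transmit queue under the mq root.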

Zoxc commented 2 months ago

Here's the output of the tc command:

That looks reasonable. It's possible the Pi's network drivers don't use BQL so there could be some bufferbloat in them still.

richb-hanover commented 2 months ago

Hmmm... That doesn't seem likely:

  1. It's Ubuntu 22.04 - pretty modern
  2. ll /sys/class/net/eth0/queues/tx-0/ shows:
    deploy@rpi4:~$ ll /sys/class/net/eth0/queues/tx-0/
    total 0
    drwxr-xr-x 3 root root    0 Aug 20 14:04 ./
    drwxr-xr-x 8 root root    0 Aug 20 14:04 ../
    drwxr-xr-x 2 root root    0 Aug 20 14:04 byte_queue_limits/
    -r--r--r-- 1 root root 4096 Aug 20 14:05 traffic_class
    -rw-r--r-- 1 root root 4096 Aug 20 14:05 tx_maxrate
    -r--r--r-- 1 root root 4096 Aug 20 14:05 tx_timeout
    -rw-r--r-- 1 root root 4096 Aug 20 14:05 xps_cpus
    -rw-r--r-- 1 root root 4096 Aug 20 14:05 xps_rxqs
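For anyone who wants to look a bit further than the directory listing, here is a hedged sketch that reads the `limit` and `inflight` files under each tx queue's `byte_queue_limits/` directory. The function name `bql_status` and the `sysfs_root` parameter are invented for this example. One assumption worth stating loudly: as far as I know, the `byte_queue_limits` directory can be present whenever the kernel is built with BQL support, so its presence alone may not prove that the driver actually uses BQL; a nonzero `inflight` value while a test is running would be a stronger hint.

```python
from pathlib import Path


def bql_status(dev, sysfs_root="/sys/class/net"):
    """Map each tx queue of `dev` to its BQL 'limit' and 'inflight' values.

    A queue without a byte_queue_limits directory maps to None.
    """
    queues = Path(sysfs_root) / dev / "queues"
    status = {}
    for txq in sorted(queues.glob("tx-*")):
        bql = txq / "byte_queue_limits"
        if not bql.is_dir():
            status[txq.name] = None
            continue
        values = {}
        for name in ("limit", "inflight"):
            try:
                values[name] = int((bql / name).read_text().strip())
            except (OSError, ValueError):
                # File missing, unreadable, or not a plain integer.
                values[name] = None
        status[txq.name] = values
    return status
```

Calling `bql_status("eth0")` on the Pi while a Crusader upload is in progress, and again when idle, should show whether `inflight` ever moves off zero.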

Maybe @dtaht has some thoughts...

richb-hanover commented 2 months ago

PS My new router arrived today (GL.iNet MT6000) and I'll run the tests against its stock firmware and current OpenWrt soon.

Zoxc commented 2 months ago

I added https://github.com/Zoxc/crusader/discussions/categories/wi-fi-routers-and-access-points so we can collect results for specific devices there.

dtaht commented 2 months ago

I am so looking forward to a comparison with the MT6000!

richb-hanover commented 2 months ago

@dtaht - I ran a test with the Docker container running on the MT6000. Performance was pretty bad. See https://github.com/Zoxc/crusader/discussions/43

But I now realize that I can run the pre-built binary on that device. That experiment is queued up for this weekend.

richb-hanover commented 1 month ago

Let's see if the 0.3 docs are good enough