dtaht / sch_cake

Out of tree build for the new cake qdisc

Per host fairness not working with ipv4 and lede #46

Closed dtaht closed 6 years ago

dtaht commented 7 years ago

IPv6 per host fairness works:

http://blog.cerowrt.org/flent/fq_host/ipv6_per_host_fairness_working.png

IPv4 per host fairness does not (this is with the nat option and triple-isolate). I will retest, but can someone else give it a go?

(also sqm-scripts and luci-app-sqm have no means to set the "nat" option)

dtaht commented 7 years ago

I redid the tests to confirm. Tests are now in git and in: http://blog.cerowrt.org/flent/fq_host2/

[images: perhostfairness_up, perhostfairness_down]

My sqm config:

config queue 'eth0'
        option enabled '1'
        option interface 'eth0'
        option download '85000'
        option upload '5500'
        option qdisc 'cake'
        option qdisc_advanced '1'
        option squash_dscp '1'
        option debug_logging '0'
        option verbosity '5'
        option squash_ingress '0'
        option ingress_ecn 'ECN'
        option egress_ecn 'ECN'
        option qdisc_really_really_advanced '1'
        option linklayer 'none'
        option script 'layer_cake.qos'
        option iqdisc_opts 'nat'
        option eqdisc_opts 'nat'
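
If I read the stats below right, the egress half of that comes out as roughly this tc invocation (a sketch; sqm-scripts assembles the real one):

tc qdisc replace dev eth0 root cake bandwidth 5500kbit diffserv4 nat wash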

Some stats:

root@lorna-gw:/etc/config# tc -s qdisc show dev eth0
qdisc cake 801c: root refcnt 2 bandwidth 5500Kbit diffserv4 triple-isolate nat wash rtt 100.0ms noatm overhead 14 
 Sent 137404332 bytes 587345 pkt (dropped 5, overlimits 664214 requeues 0) 
 backlog 0b 0p requeues 0 
 memory used: 199584b of 4Mb
 capacity estimate: 5500Kbit
                 Bulk   Best Effort      Video       Voice
  thresh     343744bit    5500Kbit    2750Kbit    1375Kbit
  target        52.9ms       5.0ms       6.6ms      13.2ms
  interval     147.9ms     100.0ms     101.6ms     108.2ms
  pk_delay         0us       4.4ms       393us       647us
  av_delay         0us       485us         7us        37us
  sp_delay         0us         5us         7us         6us
  pkts               0      587185          15         150
  bytes              0   137388314        1350       20724
  way_inds           0       24249           0           0
  way_miss           0        3388          15          26
  way_cols           0           0           0           0
  drops              0           5           0           0
  marks              0       45947           0           0
  sp_flows           0           8           0           0
  bk_flows           0           1           0           0
  un_flows           0           0           0           0
  max_len            0        1514          90         590

dtaht commented 7 years ago

Also, the bulk target and interval appear to be wacky?

moeller0 commented 7 years ago

Mmmh, could you try adding dual-srchost to "option eqdisc_opts 'nat'" and dual-dsthost to "option iqdisc_opts 'nat'"? For all its undirectional niceness, triple-isolate has no human-language description that makes it easy to predict how it will behave. The dual-XXX options seem to be clearer in the promises they make...
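
I.e., in the sqm config above that would look like this (sketching it with the option names Dave already uses):

        option iqdisc_opts 'nat dual-dsthost'
        option eqdisc_opts 'nat dual-srchost'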

gamanakis commented 7 years ago

I tested this on a router running Archlinux with similar settings to the sqm scripts. Cake was compiled from the master branch.

Dave's setup as above:

qdisc cake 8007: dev enp0s13 root refcnt 9 bandwidth 2200Kbit diffserv3 triple-isolate nat rtt 100.0ms noatm overhead 18 via-ethernet 
qdisc cake 8008: dev ifb0 root refcnt 2 bandwidth 10200Kbit diffserv3 triple-isolate nat rtt 100.0ms noatm overhead 18 via-ethernet 

[images: upload, download]

dual-dsthost on WAN egress, and dual-dsthost nat on WAN ingress:

qdisc cake 800a: dev enp0s13 root refcnt 9 bandwidth 2200Kbit diffserv3 dual-dsthost rtt 100.0ms noatm overhead 18 via-ethernet 
qdisc cake 800c: dev ifb0 root refcnt 2 bandwidth 10200Kbit diffserv3 dual-dsthost nat rtt 100.0ms noatm overhead 18 via-ethernet 

[images: upload, download]

dual-srchost nat on WAN egress, and dual-dsthost nat on WAN ingress (i.e. per-LAN-host fairness):

qdisc cake 800d: dev enp0s13 root refcnt 9 bandwidth 2200Kbit diffserv3 dual-srchost nat rtt 100.0ms noatm overhead 18 via-ethernet 
qdisc cake 800c: dev ifb0 root refcnt 2 bandwidth 10200Kbit diffserv3 dual-dsthost nat rtt 100.0ms noatm overhead 18 via-ethernet

[images: upload, download]

As you can see, only the last one worked as expected.
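
For the record, qdiscs like that third pair can be set up with commands along these lines (a sketch; it assumes the usual ifb0 redirect for ingress, with bandwidths and overheads taken from the dumps above, though the exact overhead keyword may differ):

tc qdisc replace dev enp0s13 root cake bandwidth 2200kbit diffserv3 dual-srchost nat overhead 18
tc qdisc replace dev ifb0 root cake bandwidth 10200kbit diffserv3 dual-dsthost nat overhead 18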

dtaht commented 7 years ago

cool that the last worked! Thanks so much for confirming... I really don't grok what the heck triple isolate is supposed to do?!

(BTW - my original testbed boxes were named "rudolf, dancer, prancer, and vixen". Someone thought "vixen" showing up in my test results was not politically correct, so it's now apu2.)

Another test worth doing is emulating bittorrent or steam (lots of different sources going to one internal IP). I'd use the rtt_fair_var tests for these and use flent-fremont, flent-dallas, flent-eu, flent-newark as sources (20 freaking times), and then see what happens with another box with a single flow.....

https://plus.google.com/u/0/107942175615993706558/posts/3kLiVAMd1cE?sfc=false
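
Something along these lines would do it (just a sketch of the swarm I have in mind; run the single flow from the other box while this is going):

for i in $(seq 1 20); do
    flent -4 -H flent-fremont.bufferbloat.net -H flent-dallas.bufferbloat.net \
          -H flent-eu.bufferbloat.net -H flent-newark.bufferbloat.net \
          rtt_fair_var &>/dev/null &
done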

gamanakis commented 7 years ago

I tested with rtt_fair, and the third configuration of my last post:

qdisc cake 800d: dev enp0s13 root refcnt 9 bandwidth 2200Kbit diffserv3 dual-srchost nat rtt 100.0ms noatm overhead 18 via-ethernet 
qdisc cake 800c: dev ifb0 root refcnt 2 bandwidth 10200Kbit diffserv3 dual-dsthost nat rtt 100.0ms noatm overhead 18 via-ethernet

LAN-host downloading and uploading to flent-fremont, flent-dallas, flent-eu, flent-newark, flent-tokyo.bufferbloat.net

4 streams for each WAN-host, i.e. I ran the following command simultaneously 4 times on the LAN-host, creating 20 streams to 5 different hosts:

flent -4 -H flent-fremont.bufferbloat.net -H flent-dallas.bufferbloat.net -H flent-eu.bufferbloat.net -H flent-newark.bufferbloat.net -H flent-tokyo.bufferbloat.net rtt_fair

[image: 20 torr]

LAN-host downloading from a single WAN-host, parallel to the above test

[image: apu2.sh single down]

LAN-host uploading to a single WAN-host, parallel to the above test

[image: apu2.sh single up]

It seems to be working. If you have any other suggestions for testing this, please let me know.

gamanakis commented 7 years ago

This is a test similar to my previous post, with 25 streams/host to 5 individual WAN-hosts:

for i in {1..25}; do flent -4 -H flent-fremont.bufferbloat.net -H flent-dallas.bufferbloat.net -H flent-eu.bufferbloat.net -H flent-newark.bufferbloat.net -H flent-tokyo.bufferbloat.net rtt_fair &>/dev/null & done

WAN-ingress set at 90% of the max achievable download speed (Comcast cable, advertised as 10mbps/2mbps; max achievable ingress throughput 12700kbps):

Torrenting host: [image: 90perc torr]

Single flow host: [image: 90perc single]

WAN-ingress set at 80% of the max achievable download speed:

Torrenting host: [image: 80perc torr]

Single flow host: [image: 80perc single]

Ping times are over 100ms even with the bandwidth set at 80% of the max.

dtaht commented 7 years ago

2mbits up and that many flows.... ugh. It's almost hopeless to get ping times lower on a test this stressful. It would be "interesting" for you to repeat that with sqm off.

dtaht commented 7 years ago

I can now confirm that dual-dsthost nat diffserv3 / dual-srchost nat diffserv3 works beautifully in my tests (at higher bandwidths) on lede head. You do have to set these correctly in the advanced settings.

chromi commented 7 years ago

On 19 Jan, 2017, at 19:53, Dave Täht notifications@github.com wrote:

I can now confirm that dual-dsthost nat diffserv3 / dual-srchost nat diffserv3 works beautifully in my tests (at higher bandwidths)

I’m glad to hear it. But the apparent lack of success with triple-isolate is odd.

Both of the “dual” modes are actually implemented using the triple-isolate mechanism. This assigns a weight to each flow, which is the reciprocal of the number of flows associated with the host at the selected end(s) of the flow. The DRR quantum is simply scaled by this weight (with dithering, because we’re working with integers of similar magnitude). In the case of triple-isolate, the weight is derived from the higher of the two flow counts.

/* triple isolation (modified DRR++) */
srchost = &(b->hosts[flow->srchost]);
dsthost = &(b->hosts[flow->dsthost]);
host_load = 1;

/* in triple-isolate mode both flags are set, so host_load becomes the
 * higher of the two per-host flow counts; in the dual modes only the
 * selected end is consulted */
if ((q->flow_mode & CAKE_FLOW_DUAL_SRC) == CAKE_FLOW_DUAL_SRC)
    host_load = max(host_load, srchost->srchost_refcnt);

if ((q->flow_mode & CAKE_FLOW_DUAL_DST) == CAKE_FLOW_DUAL_DST)
    host_load = max(host_load, dsthost->dsthost_refcnt);

WARN_ON(host_load > CAKE_QUEUES);

/* flow isolation (DRR++) */
if (flow->deficit <= 0) {
    /* quantum_div[h] is a precomputed reciprocal (~65535/h), so this
     * grants roughly flow_quantum/host_load per round; the random
     * 16-bit addend dithers away the integer rounding error */
    flow->deficit += (b->flow_quantum * quantum_div[host_load] +
                      (prandom_u32() >> 16)) >> 16;
    list_move_tail(&flow->flowchain, &b->old_flows);

In the case of a torrent+single test in triple-isolate mode, the torrenting host should get a weight of 1/N for each of the N flows (regardless of whether they are to N remote hosts or not), while the single host should get a weight of 1 for its flow (as long as it’s not to the same remote host as any of the torrent flows). In theory this should equalise their throughputs, though the presence of sparse measuring flows to either of the local hosts might perturb this.

The results might point to a mistake in the test setup. If the torrenting host is dividing its traffic between a small number M of remote hosts, which set includes the remote hosts used by the single flow, the latter will get a weight of 1/(1+N/M) instead of 1, and a correspondingly lower throughput. This is unlikely for real swarm-type traffic, but it’s entirely possible in a synthetic test.
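For instance, if the torrent host runs N = 8 flows that all share the single flow's remote host (M = 1), the single flow's weight comes out to 1/(1 + 8/1) = 1/9 of the link, rather than the 1/2 that per-host fairness would suggest.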

dtaht commented 7 years ago

I did not retest the triple-isolate feature yet. The test scripts I wrote for testing from two hosts over ipv6 and ipv4 are in this dir (and also in the github repo for the blog). I create an artificial swarm of 6 servers...

http://blog.cerowrt.org/flent/steam/up_working.svg

http://blog.cerowrt.org/flent/steam/steam1.sh and steam6.sh were the tests.

I have to go fix dns for this at some point for the ipv6 tests to work...

2600:3c00::f03c:91ff:fe89:d57 flent-dallas.bufferbloat.net

dtaht commented 7 years ago

hopefully after just having merged everything up (with the exception of the tc fix), I can repeat these tests against triple-isolate.

dtaht commented 7 years ago

at least my version of tc is now consistent. But what went into lede is not, and perhaps hasn't been.

gamanakis commented 7 years ago

Even after the latest commits in sch_cake and tc-adv I am not getting the expected results with triple-isolate on IPv4. Setup:

qdisc cake 8005: dev enp0s14 root refcnt 2 bandwidth 11200Kbit diffserv3 triple-isolate rtt 100.0ms noatm overhead 38 via-ethernet mpu 84 (ethernet)
qdisc cake 8006: dev enp0s13 root refcnt 9 bandwidth 2200Kbit diffserv3 triple-isolate nat rtt 100.0ms noatm overhead 18 via-ethernet mpu 64 (docsis)

[image: triple download]

[image: triple upload]

chromi commented 7 years ago

On 31 Jan, 2017, at 06:09, George Amanakis notifications@github.com wrote:

Even after the latest commits in sch_cake and tc-adv I am not getting the expected results with triple-isolate on IPv4.

To troubleshoot this, I really need unambiguous information about your test setup.

A configuration which should definitely work is:

If your setup differs at all from the above, in particular if the "single" flow uses a common host with any of the "torrent" flows at either end, then you have a faulty test and are likely to get confusing results.

I can calculate the expected results, according to my explanation earlier in this thread, if I know the full details of the actual test.

ldir-EDB0 commented 7 years ago

'nat' and 'triple-isolate' have always worked fine for me. I've done simple IPv4 tests in both up & download directions, e.g. a test that effectively occurs nightly: Router M with nat & triple-isolate on the wan interface. Host A, doing a backup, generates 6 flows, usually 4 to server X and 2 to server Y, and consumes all link capacity. Host B does a speedtest and obtains 1/2 the egress bandwidth rather than 1/7th. If either nat or the host-flow fairness of triple-isolate were not working, host A would get 6/7ths of the bandwidth and host B 1/7th. Add a host C and everyone gets 1/3 of the egress bandwidth.

It is actually quite beautiful to watch the smoothness of the DRR+ and host fairness algorithms as they rein in the 'greedier' hosts/flows to a fair share of bandwidth.

LEDE & Cake has always worked like that for me and been wonderfully fair.

moeller0 commented 7 years ago

Hi Kevin,

On Jan 31, 2017, at 11:44, Kevin Darbyshire-Bryant notifications@github.com wrote:

'nat' and 'triple-isolate' have always worked fine for me. I've done simple IPv4 tests in both up & download directions, e.g. a test that effectively occurs nightly: Router M with nat & triple-isolate on the wan interface. Host A, doing a backup, generates 6 flows, usually 4 to server X and 2 to server Y, and consumes all link capacity. Host B does a speedtest and obtains 1/2 the egress bandwidth rather than 1/7th. If either nat or the host-flow fairness of triple-isolate were not working, host A would get 6/7ths of the bandwidth and host B 1/7th. Add a host C and everyone gets 1/3 of the egress bandwidth.

I guess the biggest challenge is to understand which fairness guarantees triple-isolate actually gives. I believe users naively expect strict per-internal-host IP fairness, which as we know can be achieved either with a combo of dual-dsthost and dual-srchost, or approximately with triple-isolate (if the internal and external sets of IPs have no overlap and are mostly 1:1). If the IP sets start to overlap, triple-isolate will closely approximate neither per-internal nor per-external-IP fairness, but something in the middle between the two.

And all of that is good engineering, quite fine, and a decent price to pay for not having to specify the directionality. What is lacking is a bit more information, so that cake users can intuitively understand the differences between the options.

If we could find a simple way to explain the principle so that people can easily make good predictions about how triple-isolate will behave, we would help a lot of users. That way the strict internal-fairness camp should quickly realize that triple-isolate is not the keyword they are looking for, while the as-long-as-no-side-can-hog-the-link-I-am-happy camp should see that triple-isolate pretty much guarantees that.

Best Regards Sebastian


ldir-EDB0 commented 7 years ago

"That way the strict internal-fairness camp should quickly realize that triple-isolate is not the keyword they are looking for, while the as-long-as-no-side-can-hog-the-link-I-am-happy camp should see that triple-isolate pretty much gurantees that." is the best description I've seen!

And just to clear up any doubt.... I've no idea how it bloody well works either ;-)

gamanakis commented 7 years ago

  • Router M, running both NAT and Cake, situated between the two above groups. Both ingress and egress Cake instances have the “nat” mode turned on.

@chromi I missed this part, my LAN-egress side on enp0s14 doesn't have "nat" mode turned on. I will retest tonight, and let you know.

gamanakis commented 7 years ago

@chromi I tested with triple-isolate and "nat" mode on both WAN and LAN interfaces with the same results, i.e. the machine creating only 1 stream gets 1/9 of the bandwidth, the machine creating 8 streams gets 8/9 of the bandwidth. Ping times remain very acceptable for both machines during the test.

I used the scripts apu2.sh and dancer.sh @dtaht posted above.

"apu2" connects to flent-fremont.bufferbloat.net, and downloads 1 tcp-stream. flent -4 --te=download_streams=1 -H flent-fremont.bufferbloat.net -t ipv4-flows_1-apu2-nat tcp_ndown

"dancer" connects to flent-fremont.bufferbloat.net, and downloads 8 tcp-streams simultaneously. flent -4 --te=download_streams=8 -H flent-fremont.bufferbloat.net -t ipv4-flows_8-dancer-nat tcp_ndown

The "router" is positioned between the "apu2, dancer" group (LAN) and "flent-fremont.bufferbloat.net" (WAN). It is configured with: WAN-interface: tc qdisc add dev enp0s13 root cake bandwidth 2200kbit diffserv3 triple-isolate nat noatm docsis LAN-interface: tc qdisc add dev enp0s14 root cake bandwidth 11200kbit triple-isolate nat noatm ethernet

There is a notable difference between your setup and mine: both "apu2" and "dancer" connect to the same remote host, "flent-fremont.bufferbloat.net". Would this explain the results I am getting?

chromi commented 7 years ago

On 1 Feb, 2017, at 18:59, George Amanakis notifications@github.com wrote:

"apu2" connects to flent-fremont.bufferbloat.net, and downloads 1 tcp-stream. flent -4 --te=download_streams=1 -H flent-fremont.bufferbloat.net -t ipv4-flows_1-apu2-nat tcp_ndown

"dancer" connects to flent-fremont.bufferbloat.net, and downloads 8 tcp-streams simultaneously. flent -4 --te=download_streams=8 -H flent-fremont.bufferbloat.net -t ipv4-flows_8-dancer-nat tcp_ndown

the machine creating only 1 stream gets 1/9 of the bandwidth, the machine creating 8 streams gets 8/9 of the bandwidth.

Then that is expected behaviour for triple-isolate, because all of the flows have the same remote endpoint (X instead of P-and-X or P-to-X). This is not typical of real Internet traffic.

You would need to use the “dual” modes to ignore the remote host identity and consider only the local host, in order to get the behaviour you’re looking for with this test.
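
For this topology that would mean something along these lines (a sketch of George's commands above with the dual keywords swapped in, fairness by internal source on egress and internal destination on ingress):

WAN-interface: tc qdisc add dev enp0s13 root cake bandwidth 2200kbit diffserv3 dual-srchost nat noatm docsis
LAN-interface: tc qdisc add dev enp0s14 root cake bandwidth 11200kbit dual-dsthost nat noatm ethernet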