dtaht / sch_cake

Out of tree build for the new cake qdisc

Cake is performing terribly #147

Closed: jaswor closed this issue 1 year ago

jaswor commented 3 years ago

See this discussion over at https://forum.openwrt.org/t/cake-x86-memory-usage/82032/17. Gaming performance is erratic, and streaming has buffering issues.

Router = Qotom J1900, 4-port x64 with 8GB RAM and 64GB SSD, without Wi-Fi (Archer C7 v4 as a dumb AP)
ISP = AT&T (VDSL2, no PPPoE)
Rates = CIR 50Mb down and 10Mb up

root@OpenWrt:~# tc -s qdisc show dev eth0
qdisc cake 809a: root refcnt 9 bandwidth 10Mbit besteffort dual-srchost nat nowash no-ack-filter split-gso rtt 100ms noatm overhead 42 mpu 64
 Sent 42187227 bytes 88848 pkt (dropped 262, overlimits 88744 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 688896b of 4Mb
 capacity estimate: 10Mbit
 min/max network layer size:           28 /    1500
 min/max overhead-adjusted size:       70 /    1542
 average network hdr offset:           14

                  Tin 0
  thresh         10Mbit
  target            5ms
  interval        100ms
  pk_delay        145ms
  av_delay       34.1ms
  sp_delay          2us
  backlog            0b
  pkts            89110
  bytes        42579936
  way_inds            0
  way_miss          212
  way_cols            0
  drops             262
  marks              14
  ack_drop            0
  sp_flows            0
  bk_flows            1
  un_flows            0
  max_len          1514
  quantum           305
chromi commented 3 years ago

The forum thread seems to be private, so I can't readily see it. It would be helpful if you could explain the problem more completely here.

To get the full benefit, you need Cake applied for both ingress and egress. You should have an ifb device where the ingress instance is attached, probably set to a bit under 50Mbit instead of 10Mbit as above.
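
For reference, a minimal sketch of such a two-instance setup. The interface name `eth0`, the ifb name `ifb4eth0`, the rates, and the `RUN` dry-run switch are assumptions for illustration, not settings taken from the reporter's router. On OpenWrt the SQM scripts normally create the equivalent for you.

```shell
#!/bin/sh
# Sketch: shape both directions with cake. Egress is shaped on the WAN
# interface itself; ingress is redirected to an ifb device and shaped there.
# All names and rates below are assumptions; adjust them to your setup.
WAN=${WAN:-eth0}          # WAN-facing interface (assumed)
IFB=${IFB:-ifb4eth0}      # ifb device for ingress (the name is arbitrary)
UP=${UP:-10Mbit}          # egress rate
DOWN=${DOWN:-45Mbit}      # ingress rate, a bit under the 50Mbit line rate

# Dry run by default: commands are only printed.
# Start the script with RUN= (empty) as root to actually execute them.
RUN=${RUN-echo}

setup_cake() {
    # Egress instance on the WAN interface.
    $RUN tc qdisc replace dev "$WAN" root cake bandwidth "$UP" besteffort dual-srchost nat

    # Create the ifb device and bring it up.
    $RUN ip link add name "$IFB" type ifb
    $RUN ip link set dev "$IFB" up

    # Redirect everything arriving on the WAN interface to the ifb device.
    $RUN tc qdisc add dev "$WAN" handle ffff: ingress
    $RUN tc filter add dev "$WAN" parent ffff: protocol all prio 10 u32 match u32 0 0 flowid 1:1 action mirred egress redirect dev "$IFB"

    # Ingress instance on the ifb device.
    $RUN tc qdisc replace dev "$IFB" root cake bandwidth "$DOWN" besteffort dual-dsthost nat wash ingress
}

setup_cake
```

With both instances in place, `tc -s qdisc show dev eth0` reports the egress statistics and `tc -s qdisc show dev ifb4eth0` the ingress ones.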

Note also that congestion within your ISP's backhaul networks may occur at speeds below that of your last-mile connection. If that happens, setting Cake to your last-mile link's parameters might not be sufficient to gain control of the bottleneck.

moeller0 commented 3 years ago

Hi Jonathan,

On Dec 20, 2020, at 14:33, Jonathan Morton notifications@github.com wrote:

If that happens, setting Cake to your last-mile link's parameters might not be sufficient to gain control of the bottleneck.

But that would not really explain a pk_delay of 145ms on ingress?

Best Regards, Sebastian

chromi commented 3 years ago

On 20 Dec, 2020, at 3:46 pm, moeller0 notifications@github.com wrote:

But that would not really explain a pk_delay of 145ms on ingress?

We're looking at the egress stats here. They look reasonable enough.

yutayu commented 3 years ago

                  Tin 0
  thresh        809Kbit
  target         22.6ms
  interval       51.1ms
  pk_delay       16.1ms
  av_delay        1.5ms
  sp_delay         12us

These are mine, and they are for a low-bandwidth link; your delays should be lower.

yutayu commented 3 years ago

@jaswor try ack-filter.

xnoreq commented 3 years ago

Ack-filter may or may not help. As suggested in #145, certain traffic patterns could starve cake. (Didn't have time to follow up on that, but I may have more time in the upcoming weeks.)

@jaswor Are you running BitTorrent using uTP (its UDP-based transport protocol), an OpenVPN tunnel in UDP mode, a WireGuard tunnel (also UDP), or anything like that when you're seeing those problems?

jaswor commented 3 years ago

@xnoreq

No, I'm not using any of those. I installed OpenWrt v19.07.5.

Here's a fresh sqm restart & nperf.com speed test:

root@OpenWrt:~# tc -s qdisc show dev eth0
qdisc cake 801d: root refcnt 9 bandwidth 9750Kbit besteffort dual-srchost nat nowash ack-filter-aggressive split-gso rtt 100.0ms noatm overhead 18 mpu 64
 Sent 22793971 bytes 44292 pkt (dropped 6, overlimits 43132 requeues 1)
 backlog 0b 0p requeues 1
 memory used: 1494Kb of 4Mb
 capacity estimate: 9750Kbit
 min/max network layer size:           28 /    1500
 min/max overhead-adjusted size:       64 /    1518
 average network hdr offset:           14

                  Tin 0
  thresh       9750Kbit
  target          5.0ms
  interval      100.0ms
  pk_delay      488.7ms
  av_delay      483.1ms
  sp_delay          2us
  backlog            0b
  pkts            44298
  bytes        22795815
  way_inds            0
  way_miss          108
  way_cols            0
  drops               4
  marks           13114
  ack_drop            2
  sp_flows           19
  bk_flows            1
  un_flows            0
  max_len         17198
  quantum           300

qdisc ingress ffff: parent ffff:fff1 ----------------
 Sent 88340207 bytes 66870 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
root@OpenWrt:~# tc -s qdisc show dev ifb4eth0
qdisc cake 801e: root refcnt 2 bandwidth 45Mbit besteffort dual-dsthost nat wash ingress no-ack-filter split-gso rtt 100.0ms noatm overhead 18 mpu 64
 Sent 89636452 bytes 67021 pkt (dropped 0, overlimits 103391 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 405608b of 4Mb
 capacity estimate: 45Mbit
 min/max network layer size:           46 /    1500
 min/max overhead-adjusted size:       64 /    1518
 average network hdr offset:           14

                  Tin 0
  thresh         45Mbit
  target          5.0ms
  interval      100.0ms
  pk_delay        519us
  av_delay         26us
  sp_delay          3us
  backlog            0b
  pkts            67021
  bytes        89636452
  way_inds            0
  way_miss          145
  way_cols            0
  drops               0
  marks             409
  ack_drop            0
  sp_flows            1
  bk_flows            1
  un_flows            0
  max_len          6056
  quantum          1373
xnoreq commented 3 years ago

About 20 MB of upload on the speedtest seems reasonable, but over 13000 marks? That looks like almost all packets were marked. Could it be that the OS you're using has a broken ECN implementation? Or the speedtest server itself? Try a different client or OS, or disable ECN on the machine where you run the speedtest. Also make sure that the rest of the network is reasonably idle during your testing.
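
To put a number on the marking rate, here is a small sketch that reads `tc -s qdisc` output and prints the marks as a share of all packets, summed over the tins. The helper name `mark_ratio` is made up for this example. Note that the packet count includes small packets such as ACKs, so the share among full-size data packets will be higher than the figure printed.

```shell
#!/bin/sh
# mark_ratio: read "tc -s qdisc show dev <if>" output for one cake instance
# on stdin and print ECN marks as a percentage of packets, summed across tins.
mark_ratio() {
    awk '
        $1 == "pkts"  { for (i = 2; i <= NF; i++) pkts  += $i }
        $1 == "marks" { for (i = 2; i <= NF; i++) marks += $i }
        END { if (pkts > 0) printf "%.1f%%\n", 100 * marks / pkts }
    '
}

# Usage on the router (needs a cake qdisc on eth0):
#   tc -s qdisc show dev eth0 | mark_ratio
#
# To rule out ECN on a Linux test client (assumption: the client runs Linux),
# as root:
#   sysctl -w net.ipv4.tcp_ecn=0
```

Fed the `pkts` (44298) and `marks` (13114) values from the egress output above, this prints `29.6%`.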

chromi commented 3 years ago

On 22 Dec, 2020, at 1:11 am, Jason Woringen notifications@github.com wrote:

thresh 9750Kbit target 5.0ms interval 100.0ms pk_delay 488.7ms av_delay 483.1ms

That is definitely not normal, and suggests to me that there is a fault on the Ethernet port or the cable that is preventing a steady delivery of packets from it. My first instinct here would be to try a different cable.

yutayu commented 3 years ago

@jaswor Alternatively, check which bandwidth you have configured. My setting is 95% of the upload link speed.
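
As a worked example of that percentage rule, with rates in kbit/s (the helper name `shaper_rate` and the default of 95 are just for illustration):

```shell
#!/bin/sh
# shaper_rate LINK_KBIT [PERCENT]
# Print PERCENT (default 95) of the link rate, using integer arithmetic.
shaper_rate() {
    echo $(( $1 * ${2:-95} / 100 ))
}

shaper_rate 10000       # 95% of a 10 Mbit/s uplink
shaper_rate 50000 90    # 90% of a 50 Mbit/s downlink
```

For the 10 Mbit/s uplink this gives 9500 kbit/s, and 90% of the 50 Mbit/s downlink gives 45000 kbit/s.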

jaswor commented 3 years ago

This doesn't seem too good as I'm not seeing an improvement. The cable has been swapped with a shielded cable (my preference). Even a barebones and very basic setup of openwrt/sqm and fq/simplest.qos doesn't change things...

chromi commented 3 years ago

On 23 Dec, 2020, at 3:13 pm, Jason Woringen notifications@github.com wrote:

This doesn't seem too good as I'm not seeing an improvement. The cable has been swapped with a shielded cable (my preference). Even a barebones and very basic setup of openwrt/sqm and fq/simplest.qos doesn't change things...

For the sake of troubleshooting, try significantly reducing the bandwidth settings, to 12Mbit down and 4Mbit up. That should ensure that we really do have control of the bottleneck, while still giving enough bandwidth for most uses.

If you still have trouble with those settings, it'll be clear that the problem is not in Cake, but somewhere in either your hardware or your ISP.
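
A sketch of applying those test rates by hand, assuming cake is already installed as the root qdisc on `eth0` (egress) and `ifb4eth0` (ingress) as in the output earlier in this thread. The `RUN` dry-run switch is an addition for this example. If SQM manages the qdiscs, it will restore its own configured rates the next time it restarts, so for a lasting change set the rates in the SQM configuration instead.

```shell
#!/bin/sh
# Temporarily lower the cake rates on existing instances.
WAN=${WAN:-eth0}         # egress interface (assumed)
IFB=${IFB:-ifb4eth0}     # ingress ifb device (assumed)

# Dry run by default: commands are only printed.
# Start the script with RUN= (empty) as root to actually execute them.
RUN=${RUN-echo}

# set_rates UP DOWN
set_rates() {
    $RUN tc qdisc change dev "$WAN" root cake bandwidth "$1"
    $RUN tc qdisc change dev "$IFB" root cake bandwidth "$2"
}

set_rates 4Mbit 12Mbit
```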

jaswor commented 3 years ago

I will try that in a moment if this post doesn't resolve anything. I have tried disabling Energy Efficient Ethernet via ethtool.

root@OpenWrt:~# tc -s qd show dev eth0
qdisc cake 8019: root refcnt 9 bandwidth 10Mbit diffserv3 triple-isolate nat nowash no-ack-filter split-gso rtt 100.0ms noatm overhead 18 mpu 64
 Sent 42031940 bytes 92779 pkt (dropped 3649, overlimits 100705 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 628992b of 4Mb
 capacity estimate: 10Mbit
 min/max network layer size:           28 /    1500
 min/max overhead-adjusted size:       64 /    1518
 average network hdr offset:           14

                   Bulk  Best Effort        Voice
  thresh        625Kbit       10Mbit     2500Kbit
  target         29.1ms        5.0ms        7.3ms
  interval      124.1ms      100.0ms      102.3ms
  pk_delay          0us       60.2ms        1.6ms
  av_delay          0us       26.2ms        157us
  sp_delay          0us          2us         20us
  backlog            0b           0b           0b
  pkts                0        96335           93
  bytes               0     47534710        13914
  way_inds            0            0            0
  way_miss            0          132            4
  way_cols            0            0            0
  drops               0         3649            0
  marks               0           36            0
  ack_drop            0            0            0
  sp_flows            0            7            0
  bk_flows            0            0            1
  un_flows            0            0            0
  max_len             0         1514          155
  quantum           300          305          300

qdisc ingress ffff: parent ffff:fff1 ----------------
 Sent 185244027 bytes 146373 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
jaswor commented 3 years ago

The above was completed at fast.com 30/30 testing.

jaswor commented 3 years ago

I'm clueless

root@OpenWrt:~# tc -s qd show dev eth0
qdisc cake 802d: root refcnt 9 bandwidth 5Mbit diffserv3 dual-srchost nat nowash no-ack-filter split-gso rtt 100.0ms noatm overhead 24 mpu 64
 Sent 21776584 bytes 46957 pkt (dropped 0, overlimits 41322 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 3969Kb of 4Mb
 capacity estimate: 5Mbit
 min/max network layer size:           28 /    1500
 min/max overhead-adjusted size:       64 /    1524
 average network hdr offset:           14

                   Bulk  Best Effort        Voice
  thresh      312496bit        5Mbit     1250Kbit
  target         58.1ms        5.0ms       14.5ms
  interval      153.1ms      100.0ms      109.5ms
  pk_delay          0us      667.4ms          1us
  av_delay          0us      572.2ms          0us
  sp_delay          0us          9us          0us
  backlog            0b           0b           0b
  pkts                0        46954            3
  bytes               0     21776458          126
  way_inds            0            0            0
  way_miss            0          128            1
  way_cols            0            0            0
  drops               0            0            0
  marks               0        11931            0
  ack_drop            0            0            0
  sp_flows            0            8            0
  bk_flows            0            1            0
  un_flows            0            0            0
  max_len             0         1514           42
  quantum           300          300          300

qdisc ingress ffff: parent ffff:fff1 ----------------
 Sent 95639677 bytes 72740 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
yutayu commented 3 years ago

@jaswor Can you ping 8.8.8.8 both while you are measuring and while you are not, and show us the results?

jaswor commented 3 years ago

BEFORE

root@OpenWrt:~# tc -s qdisc show dev eth0
qdisc cake 803d: root refcnt 9 bandwidth 10Mbit diffserv3 dual-srchost nat nowash no-ack-filter split-gso rtt 100.0ms noatm overhead 24 mpu 64
 Sent 666 bytes 7 pkt (dropped 0, overlimits 2 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 2304b of 4Mb
 capacity estimate: 10Mbit
 min/max network layer size:           52 /     144
 min/max overhead-adjusted size:       76 /     168
 average network hdr offset:            0

                   Bulk  Best Effort        Voice
  thresh        625Kbit       10Mbit     2500Kbit
  target         29.1ms        5.0ms        7.3ms
  interval      124.1ms      100.0ms      102.3ms
  pk_delay          0us          1us         25us
  av_delay          0us          0us          0us
  sp_delay          0us          0us          0us
  backlog            0b           0b           0b
  pkts                0            4            3
  bytes               0          264          402
  way_inds            0            0            0
  way_miss            0            2            2
  way_cols            0            0            0
  drops               0            0            0
  marks               0            0            0
  ack_drop            0            0            0
  sp_flows            0            1            1
  bk_flows            0            0            0
  un_flows            0            0            0
  max_len             0           66          158
  quantum           300          305          300

qdisc ingress ffff: parent ffff:fff1 ----------------
 Sent 668 bytes 9 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

AFTER

root@OpenWrt:~# tc -s qdisc show dev eth0
qdisc cake 803d: root refcnt 9 bandwidth 10Mbit diffserv3 dual-srchost nat nowash no-ack-filter split-gso rtt 100.0ms noatm overhead 24 mpu 64
 Sent 22893887 bytes 47608 pkt (dropped 0, overlimits 53646 requeues 0)
 backlog 0b 0p requeues 0
 memory used: 1143Kb of 4Mb
 capacity estimate: 10Mbit
 min/max network layer size:           28 /    1500
 min/max overhead-adjusted size:       64 /    1524
 average network hdr offset:           14

                   Bulk  Best Effort        Voice
  thresh        625Kbit       10Mbit     2500Kbit
  target         29.1ms        5.0ms        7.3ms
  interval      124.1ms      100.0ms      102.3ms
  pk_delay          0us        4.2ms        462us
  av_delay          0us        4.1ms         29us
  sp_delay          0us          6us         15us
  backlog            0b           0b           0b
  pkts                0        47564           44
  bytes               0     22888627         5260
  way_inds            0            2            0
  way_miss            0           79           14
  way_cols            0            0            0
  drops               0            0            0
  marks               0        12992            0
  ack_drop            0            0            0
  sp_flows            0           12            1
  bk_flows            0            1            0
  un_flows            0            0            0
  max_len             0         1514          158
  quantum           300          305          300

qdisc ingress ffff: parent ffff:fff1 ----------------
 Sent 92724648 bytes 71000 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

Windows 10 MTR RESULTS

Host             Loss%  Sent  Recv  Best  Avrg  Wrst  Last
192.168.2.1          9    83    76     0     0     9     0
192.168.1.254        9    83    76     0     0    13     0
99.113.132.1         0   143   143    22    27    62    24
99.55.24.120         0   143   143    27    29    56    28
12.123.239.98        0   143   143    37    42    47    41
12.122.28.29         0   143   143    38    43    71    43
12.122.141.221       0   143   143    37    39    52    38
12.247.147.22        0   143   143    37    38    50    38
209.85.252.161       0   143   143    37    39    44    38
108.170.225.127      0   143   143    37    39    66    38
8.8.8.8              0   143   143    37    39    67    38
yutayu commented 3 years ago

@jaswor I meant: please show us the ping round-trip times in two cases:

  1. with no traffic
  2. under load, such as during a speed test

That shows whether there is bufferbloat. Sorry for the confusion.
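
One way to collect that comparison from a Linux or OpenWrt shell is sketched below. The helper name `avg_rtt` is made up for this example; it only extracts the average from the summary line that `ping` prints at the end.

```shell
#!/bin/sh
# avg_rtt: read ping output on stdin and print the average RTT in ms.
# Handles both the iputils summary ("rtt min/avg/max/mdev = ...") and the
# BusyBox summary ("round-trip min/avg/max = ...").
avg_rtt() {
    sed -n 's#.*= [0-9.]*/\([0-9.]*\)/.*#\1#p'
}

# Usage (commented out because it needs network access):
#   idle=$(ping -c 20 8.8.8.8 | avg_rtt)      # run with the link idle
#   loaded=$(ping -c 20 8.8.8.8 | avg_rtt)    # run while a speed test is active
#   echo "idle ${idle} ms, loaded ${loaded} ms"
```

A loaded average far above the idle one points to a bloated queue somewhere on the path.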

jaswor commented 3 years ago

I switched to my Archer C7 v4, and the problem still persists in v19.07.05.
It may be either the modem or the firmware. I'll try 19.07.02 and see if that makes a difference. I'm leaning towards the modem or something further down the line.

dtaht commented 2 years ago

This has been a while. Any status?

jaswor commented 2 years ago

I’ve since switched from AT&T U-verse to CableOne/Sparklight DOCSIS 3.1, which uses fq-pie modified for cable internet. It seems to handle all our streaming and gaming needs quite well while keeping RTT very low. I’ll give cake another go once I’ve become more settled into the new home.

dtaht commented 2 years ago

Cool. I'd not heard of Sparklight before now. DOCSIS 3.1 contains "pie" as part of the standard, while other AQMs are optional. I think highly of fq-pie, but I was not aware anyone had tried to deploy it. Are you sure fq-pie is in there? What modem is it exactly?

jaswor commented 2 years ago

I’m assuming it uses DOCSIS-PIE. It’s an Arris DG3450.

dtaht commented 1 year ago

I am curious if you gave it a try again, otherwise, I'm closing this bug.