oxidecomputer / omicron

Omicron: Oxide control plane
Mozilla Public License 2.0
252 stars 40 forks source link

Substantial workload performance degradation #7189

Open askfongjojo opened 4 days ago

askfongjojo commented 4 days ago

A certain workload I've been using for release-to-release performance comparison shows major degradation. The workload comprises a load generator running YCSB and a MongoDB clusters with 3 nodes. They are located on 4 different sleds and the traffic among them is confined to the VPC they are on:

root@oxz_switch0:~# omdb db instance list | egrep 'mongo|loadgen'
6b252c74-548c-4f1b-a98a-6c0e2da02c09 running 26ff55b2-cd2d-4772-84b5-35738b014bfb 5f6720b8-8a31-45f8-8c94-8e699218f28b BRM42220017 mongodb-xfs-bs4096-primary          
9e2451c0-6eec-4b90-b49a-6a9a2e02f35f running 38206026-242c-48a0-9fd3-c570d75785e6 a2adea92-b56e-44fc-8a0d-7d63b5fd3b93 BRM42220031 loadgen                             
e97e9fb9-62c3-4745-9d91-b0b6fa2baeba running 0bd90ecf-b1dc-47f1-a527-728aaf560992 7b862eb6-7f50-4c2f-b9a6-0d12ac913d3c BRM44220010 mongodb-xfs-bs4096-secondary2       
fa658925-3e4d-4e76-b839-53a62beb4f5d running 30cb91f3-6b2c-4afb-86b2-70cb24608c75 f15774c1-b8e5-434f-a493-ec43f96cba06 BRM44220005 mongodb-xfs-bs4096-secondary1

These were the typical rates of INSERT previously, on omicron commit e7d32ae2375b0231193f1dc84271f900915b2d6b (the workload took no more than 3 mins to complete):

2024-11-25 22:54:19:811 40 sec: 168557 operations; 4429.2 current ops/sec; est completion in 3 minutes [INSERT: Count=44290, Max=114751, Min=281, Avg=449.16, 90=547, 99=753, 99.9=2283, 99.99=18431] 
2024-11-25 22:54:29:811 50 sec: 210618 operations; 4206.1 current ops/sec; est completion in 3 minutes [INSERT: Count=42062, Max=280063, Min=269, Avg=473.32, 90=580, 99=821, 99.9=1341, 99.99=8823] 
2024-11-25 22:54:39:811 60 sec: 255041 operations; 4442.3 current ops/sec; est completion in 2 minutes [INSERT: Count=44422, Max=11159, Min=266, Avg=448.4, 90=555, 99=750, 99.9=1313, 99.99=4255] 
2024-11-25 22:54:49:811 70 sec: 297718 operations; 4267.7 current ops/sec; est completion in 2 minutes [INSERT: Count=42676, Max=143487, Min=268, Avg=466.57, 90=569, 99=794, 99.9=2255, 99.99=50847] 
2024-11-25 22:54:59:811 80 sec: 340315 operations; 4259.7 current ops/sec; est completion in 2 minutes [INSERT: Count=42598, Max=12839, Min=280, Avg=467.39, 90=575, 99=759, 99.9=1898, 99.99=5727] 
2024-11-25 22:55:09:811 90 sec: 385175 operations; 4486 current ops/sec; est completion in 2 minutes [INSERT: Count=44862, Max=15335, Min=268, Avg=443.99, 90=553, 99=740, 99.9=2113, 99.99=10047] 
2024-11-25 22:55:19:811 100 sec: 430592 operations; 4541.7 current ops/sec; est completion in 2 minutes [INSERT: Count=45416, Max=11783, Min=260, Avg=438.5, 90=546, 99=743, 99.9=1599, 99.99=10327] 
2024-11-25 22:55:29:811 110 sec: 473296 operations; 4270.4 current ops/sec; est completion in 2 minutes [INSERT: Count=42703, Max=106175, Min=272, Avg=466.14, 90=577, 99=790, 99.9=1572, 99.99=10311] 
2024-11-25 22:55:39:811 120 sec: 517959 operations; 4466.3 current ops/sec; est completion in 1 minute [INSERT: Count=44663, Max=238207, Min=275, Avg=445.7, 90=538, 99=731, 99.9=1776, 99.99=11447] 
2024-11-25 22:55:49:811 130 sec: 561913 operations; 4395.4 current ops/sec; est completion in 1 minute [INSERT: Count=43954, Max=113727, Min=264, Avg=453.1, 90=552, 99=731, 99.9=1430, 99.99=19023] 
2024-11-25 22:5

The same workload on omicron commit 41d7c9b0c110e6d3690bf96bb969b74f8c385bf6 runs more than 40 times slower (it's been running for two hours and still hasn't completed):

2024-11-28 03:08:55:912 100 sec: 6182 operations; 53.6 current ops/sec; est completion in 4 hours 27 minutes [INSERT: Count=536, Max=50207, Min=461, Avg=37272.15, 90=40159, 99=40415, 99.9=41119, 99.99=50207] 
2024-11-28 03:09:05:912 110 sec: 7202 operations; 102 current ops/sec; est completion in 4 hours 12 minutes [INSERT: Count=1020, Max=40607, Min=441, Avg=19651.31, 90=40063, 99=40255, 99.9=40447, 99.99=40607] 
2024-11-28 03:09:15:912 120 sec: 9066 operations; 186.4 current ops/sec; est completion in 3 hours 38 minutes [INSERT: Count=1864, Max=40383, Min=366, Avg=10713.42, 90=39903, 99=40159, 99.9=40319, 99.99=40383] 
2024-11-28 03:09:25:912 130 sec: 11088 operations; 202.2 current ops/sec; est completion in 3 hours 13 minutes [INSERT: Count=2022, Max=40383, Min=385, Avg=9881, 90=39839, 99=40159, 99.9=40319, 99.99=40383] 
2024-11-28 03:09:35:912 140 sec: 13059 operations; 197.1 current ops/sec; est completion in 2 hours 56 minutes [INSERT: Count=1971, Max=40255, Min=372, Avg=10131.55, 90=39871, 99=40191, 99.9=40255, 99.99=40255] 
2024-11-28 03:09:45:912 150 sec: 15081 operations; 202.2 current ops/sec; est completion in 2 hours 43 minutes [INSERT: Count=2022, Max=43967, Min=383, Avg=9880.58, 90=39871, 99=40159, 99.9=40351, 99.99=43967] 
2024-11-28 03:09:55:912 160 sec: 17069 operations; 198.8 current ops/sec; est completion in 2 hours 33 minutes [INSERT: Count=1988, Max=43551, Min=368, Avg=10050.21, 90=39839, 99=40191, 99.9=40447, 99.99=43551] 
2024-11-28 03:10:05:912 170 sec: 19017 operations; 194.8 current ops/sec; est completion in 2 hours 26 minutes [INSERT: Count=1948, Max=44767, Min=395, Avg=10253.35, 90=39871, 99=40191, 99.9=40511, 99.99=44767] 
2024-11-28 03:10:15:912 180 sec: 21036 operations; 201.9 current ops/sec; est completion in 2 hours 19 minutes [INSERT: Count=2020, Max=40383, Min=370, Avg=9909.67, 90=39839, 99=40159, 99.9=40351, 99.99=40383] 
2024-11-28 03:10:25:913 190 sec: 23076 operations; 204 current ops/sec; est completion in 2 hours 14 minutes [INSERT: Count=2039, Max=43007, Min=382, Avg=9797.07, 90=39871, 99=40159, 99.9=40287, 99.99=43007] 
2024-11-28 03:10:35:912 200 sec: 25061 operations; 198.5 current ops/sec; est completion in 2 hours 9 minutes [INSERT: Count=1985, Max=40287, Min=358, Avg=10064.11, 90=39871, 99=40191, 99.9=40287, 99.99=40287] 
2024-11-28 03:10:45:912 210 sec: 27041 operations; 198 current ops/sec; est completion in 2 hours 5 minutes [INSERT: Count=1980, Max=40575, Min=391, Avg=10088.1, 90=39871, 99=40159, 99.9=40319, 99.99=40575] 
2024-11-28 03:10:55:913 220 sec: 29075 operations; 203.4 current ops/sec; est completion in 2 hours 2 minutes [INSERT: Count=2035, Max=41311, Min=379, Avg=9818.11, 90=39839, 99=40159, 99.9=40415, 99.99=41311] 

I ran a fio regression test and the disk I/O numbers are roughly the same between the two commits. I'll check the VPC network throughput next to see if there is any change .

askfongjojo commented 3 days ago

Intra-VPC single-thread iperf3 throughput and number of retransmits are still on par with the previous build on rack2. Another disk I/O SQL server workload (with sysbench as load generator co-located with the database in the same VM) also hasn't shown perf degradation.

I also checked the TCP session queues on the load generator and MongoDB primary. The loadgen send queue length for each of the threads stays between 0-1400 requests and those numbers haven't increased whereas the DB primary has a very small queue length:

ubuntu@loadgen:/opt/local/ycsb/ycsb-mongodb-binding-0.17.0$ netstat -an | egrep 'Address|ESTABLISH'
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0     36 172.30.0.6:22           172.20.17.42:51281      ESTABLISHED
tcp6       0   1393 172.30.0.6:51728        172.30.0.7:27017        ESTABLISHED
tcp6       0      0 172.30.0.6:51712        172.30.0.7:27017        ESTABLISHED
tcp6       0   1393 172.30.0.6:51724        172.30.0.7:27017        ESTABLISHED

ubuntu@primary:~$ netstat -an | egrep 'Address|ESTABLISH'
Proto Recv-Q Send-Q Local Address           Foreign Address         State      
tcp        0      0 127.0.1.1:27017         127.0.0.1:54468         ESTABLISHED
tcp        0      0 172.30.0.7:27017        172.30.0.6:42206        ESTABLISHED
tcp        0      0 172.30.0.7:27017        172.30.0.51:59938       ESTABLISHED
tcp        0      0 172.30.0.7:43630        172.30.0.50:27017       ESTABLISHED
tcp        0      0 172.30.0.7:27017        172.30.0.50:50604       ESTABLISHED
tcp        0      0 172.30.0.7:27017        172.30.0.50:50590       ESTABLISHED
tcp        0      0 172.30.0.7:27017        172.30.0.51:59916       ESTABLISHED
tcp        0      0 172.30.0.7:27017        172.30.0.50:50566       ESTABLISHED
tcp        0      0 172.30.0.7:44978        172.30.0.51:27017       ESTABLISHED
tcp        0      0 172.30.0.7:36776        172.30.0.50:27017       ESTABLISHED
tcp        0      0 127.0.1.1:27017         127.0.0.1:54452         ESTABLISHED
tcp        0      0 172.30.0.7:27017        172.30.0.51:59922       ESTABLISHED
tcp        0      0 172.30.0.7:27017        172.30.0.50:50558       ESTABLISHED
tcp        0      0 127.0.0.1:54468         127.0.1.1:27017         ESTABLISHED
tcp        0      0 127.0.0.1:40132         127.0.1.1:27017         ESTABLISHED
tcp        0     36 172.30.0.7:22           172.20.17.42:52164      ESTABLISHED
tcp        0      0 172.30.0.7:27017        172.30.0.51:59910       ESTABLISHED
tcp        0      0 127.0.0.1:54452         127.0.1.1:27017         ESTABLISHED
tcp        0      0 172.30.0.7:44992        172.30.0.51:27017       ESTABLISHED
tcp        0      0 172.30.0.7:43638        172.30.0.50:27017       ESTABLISHED
tcp        0      0 172.30.0.7:43618        172.30.0.50:27017       ESTABLISHED
tcp        0      0 172.30.0.7:27017        172.30.0.6:42204        ESTABLISHED
tcp        0      0 172.30.0.7:27017        172.30.0.6:42216        ESTABLISHED
tcp        0      0 172.30.0.7:27017        172.30.0.50:50620       ESTABLISHED
tcp        0      0 172.30.0.7:27017        172.30.0.51:50834       ESTABLISHED
tcp        0      0 172.30.0.7:27017        172.30.0.50:50582       ESTABLISHED
tcp        0      0 172.30.0.7:45008        172.30.0.51:27017       ESTABLISHED
tcp        0      0 172.30.0.7:27017        172.30.0.50:50586       ESTABLISHED
tcp        0      0 172.30.0.7:27017        172.30.0.51:59896       ESTABLISHED
tcp        0      0 172.30.0.7:45022        172.30.0.51:27017       ESTABLISHED
tcp        0      0 127.0.1.1:27017         127.0.0.1:40132         ESTABLISHED

One thing of interest is that the READ and UPDATE latency/IOPS of the MongoDB workload are still on par with the previous runs. INSERT is the only type of transaction that has degraded. I think crucible is mostly cleared as the source of issue so I'm moving this issue into omicron.

askfongjojo commented 3 days ago

Also dumping the otpe stats here although I haven't observed anything out of the ordinary.

BRM42220031 # opteadm dump-layer -p opte7 nat
Port opte7 - Layer nat
======================================================================
Inbound Flows
----------------------------------------------------------------------
PROTO  SRC IP  SPORT  DST IP  DPORT  HITS  ACTION

Outbound Flows
----------------------------------------------------------------------
PROTO  SRC IP  SPORT  DST IP  DPORT  HITS  ACTION

Inbound Rules
----------------------------------------------------------------------
ID   PRI  HITS  PREDICATES                 ACTION
6    10   13    inner.ip.dst=172.20.26.17  "Stateful: 172.30.0.6 <=> (external)"
DEF  --   414   --                         "allow"

Outbound Rules
----------------------------------------------------------------------
ID   PRI  HITS  PREDICATES                                                   ACTION
18   10   34    inner.ether.ether_type=IPv4                                  "Stateful: 172.30.0.6 <=> 172.20.26.17"
                meta: router-target=ig=a4361b0b-b461-4674-a9ea-80296755f302  

19   100  0     inner.ether.ether_type=IPv4                                  "Stateful: 172.20.26.13:16384-32767"
                meta: router-target=ig=a4361b0b-b461-4674-a9ea-80296755f302  

20   255  0     meta: router-target-class=ig                                 "Deny"
DEF  --   477   --                                                           "allow"

BRM42220031 # opteadm dump-layer -p opte7 router
Port opte7 - Layer router
======================================================================
Inbound Flows
----------------------------------------------------------------------
PROTO  SRC IP  SPORT  DST IP  DPORT  HITS  ACTION

Outbound Flows
----------------------------------------------------------------------
PROTO  SRC IP  SPORT  DST IP  DPORT  HITS  ACTION

Inbound Rules
----------------------------------------------------------------------
ID   PRI  HITS  PREDICATES  ACTION
DEF  --   534   --          "allow"

Outbound Rules
----------------------------------------------------------------------
ID   PRI  HITS  PREDICATES                              ACTION
4    27   0     inner.ip.dst=192.168.96.0/24            "Meta: Target = Subnet: 192.168.96.0/24"
2    27   0     inner.ip.dst=192.168.64.0/24            "Meta: Target = Subnet: 192.168.64.0/24"
0    27   0     inner.ip.dst=192.168.32.0/24            "Meta: Target = Subnet: 192.168.32.0/24"
3    31   477   inner.ip.dst=172.30.0.0/22              "Meta: Target = Subnet: 172.30.0.0/22"
5    75   120   inner.ip.dst=0.0.0.0/0                  "Meta: Target = IG(Some(a4361b0b-b461-4674-a9ea-80296755f302))"
9    139  0     inner.ip6.dst=fddb:bb4e:6c24:62cc::/64  "Meta: Target = Subnet: fddb:bb4e:6c24:62cc::/64"
7    139  0     inner.ip6.dst=fddb:bb4e:6c24:7631::/64  "Meta: Target = Subnet: fddb:bb4e:6c24:7631::/64"
6    139  0     inner.ip6.dst=fddb:bb4e:6c24:168c::/64  "Meta: Target = Subnet: fddb:bb4e:6c24:168c::/64"
1    139  0     inner.ip6.dst=fddb:bb4e:6c24::/64       "Meta: Target = Subnet: fddb:bb4e:6c24::/64"
8    267  0     inner.ip6.dst=::/0                      "Meta: Target = IG(Some(a4361b0b-b461-4674-a9ea-80296755f302))"
DEF  --   0     --                                      "deny"
BRM42220017 #  opteadm dump-layer -p opte8 nat
Port opte8 - Layer nat
======================================================================
Inbound Flows
----------------------------------------------------------------------
PROTO  SRC IP  SPORT  DST IP  DPORT  HITS  ACTION

Outbound Flows
----------------------------------------------------------------------
PROTO  SRC IP  SPORT  DST IP  DPORT  HITS  ACTION

Inbound Rules
----------------------------------------------------------------------
ID   PRI  HITS   PREDICATES                  ACTION
13   10   1      inner.ip.dst=172.20.26.195  "Stateful: 172.30.0.7 <=> (external)"
DEF  --   12221  --                          "allow"

Outbound Rules
----------------------------------------------------------------------
ID   PRI  HITS   PREDICATES                                                   ACTION
39   10   10     inner.ether.ether_type=IPv4                                  "Stateful: 172.30.0.7 <=> 172.20.26.195"
                 meta: router-target=ig=a4361b0b-b461-4674-a9ea-80296755f302  

40   100  0      inner.ether.ether_type=IPv4                                  "Stateful: 172.20.26.192:32768-49151"
                 meta: router-target=ig=a4361b0b-b461-4674-a9ea-80296755f302  

41   255  0      meta: router-target-class=ig                                 "Deny"
DEF  --   12237  --                                                           "allow"

BRM42220017 #  opteadm dump-layer -p opte8 router 
Port opte8 - Layer router
======================================================================
Inbound Flows
----------------------------------------------------------------------
PROTO  SRC IP  SPORT  DST IP  DPORT  HITS  ACTION

Outbound Flows
----------------------------------------------------------------------
PROTO  SRC IP  SPORT  DST IP  DPORT  HITS  ACTION

Inbound Rules
----------------------------------------------------------------------
ID   PRI  HITS   PREDICATES  ACTION
DEF  --   12362  --          "allow"

Outbound Rules
----------------------------------------------------------------------
ID   PRI  HITS   PREDICATES                              ACTION
8    27   0      inner.ip.dst=192.168.96.0/24            "Meta: Target = Subnet: 192.168.96.0/24"
4    27   0      inner.ip.dst=192.168.64.0/24            "Meta: Target = Subnet: 192.168.64.0/24"
2    27   0      inner.ip.dst=192.168.32.0/24            "Meta: Target = Subnet: 192.168.32.0/24"
0    31   12257  inner.ip.dst=172.30.0.0/22              "Meta: Target = Subnet: 172.30.0.0/22"
3    75   122    inner.ip.dst=0.0.0.0/0                  "Meta: Target = IG(Some(a4361b0b-b461-4674-a9ea-80296755f302))"
7    139  0      inner.ip6.dst=fddb:bb4e:6c24:62cc::/64  "Meta: Target = Subnet: fddb:bb4e:6c24:62cc::/64"
6    139  0      inner.ip6.dst=fddb:bb4e:6c24:168c::/64  "Meta: Target = Subnet: fddb:bb4e:6c24:168c::/64"
5    139  0      inner.ip6.dst=fddb:bb4e:6c24::/64       "Meta: Target = Subnet: fddb:bb4e:6c24::/64"
1    139  0      inner.ip6.dst=fddb:bb4e:6c24:7631::/64  "Meta: Target = Subnet: fddb:bb4e:6c24:7631::/64"
9    267  0      inner.ip6.dst=::/0                      "Meta: Target = IG(Some(a4361b0b-b461-4674-a9ea-80296755f302))"
DEF  --   0      --                                      "deny"