sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC

[warmboot] warmboot LACP downtime variation #17581

Open stepanblyschak opened 9 months ago

stepanblyschak commented 9 months ago

Description

Steps to reproduce the issue:

  1. Run the sonic-mgmt upgrade_path test in warm-reboot mode, upgrading from 202205 to 202305.

Describe the results you received:

Depending on whether the LAG is created on the first ports (e.g. Ethernet0, Ethernet2) or on the last ones, its restoration time differs, because ports are initialized one after another. For example:

Dec 20 18:01:42.785263 sonic NOTICE swss#orchagent: :- initPort: Initialized port Ethernet0
Dec 20 18:01:47.467496 sonic NOTICE swss#orchagent: :- initPort: Initialized port Ethernet126

It takes ~5 sec to init all 56 ports, roughly 0.1 sec per port. initPort() in orchagent queries many attributes and port capabilities, which take time to serialize and deserialize back and forth, and this does not scale as more features are added.

E.g.: https://github.com/sonic-net/sonic-swss/blob/202305/orchagent/portsorch.cpp#L4959
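
The timing estimate above can be re-derived from the two quoted log lines. This is a minimal sketch: the year 2023 is taken from the `Date:` line of the version output, and the span is divided by 55 rather than 56 on the assumption that the first log line marks the end of the first port's initialization.

```python
from datetime import datetime

# First and last initPort messages, copied from the log excerpt in this issue.
FIRST = "Dec 20 18:01:42.785263 sonic NOTICE swss#orchagent: :- initPort: Initialized port Ethernet0"
LAST = "Dec 20 18:01:47.467496 sonic NOTICE swss#orchagent: :- initPort: Initialized port Ethernet126"

PORT_COUNT = 56  # port count stated in the issue
LOG_YEAR = 2023  # syslog lines carry no year; taken from the "Date:" line


def log_time(line: str) -> datetime:
    """Parse the leading 'Mon DD HH:MM:SS.ffffff' syslog timestamp."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{LOG_YEAR} {stamp}", "%Y %b %d %H:%M:%S.%f")


elapsed = (log_time(LAST) - log_time(FIRST)).total_seconds()
# Assumption: the first line is logged after Ethernet0 is done,
# so the measured span covers the remaining PORT_COUNT - 1 ports.
per_port = elapsed / (PORT_COUNT - 1)

print(f"elapsed: {elapsed:.3f} s, per port: {per_port:.3f} s")
```

This gives about 4.68 s between the first and last port, or roughly 0.085 s per port, consistent with the ~5 sec and ~0.1 sec figures above.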

We can optimize this by creating the host interfaces as early as possible and querying the rest later.
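
To illustrate why deferring the queries would reduce the variation between the first and last ports, here is a toy timing model. It is not orchagent code: `hostif_ready_times` is a hypothetical helper, and the split of the per-port cost into `CREATE_COST` and `QUERY_COST` is an assumed example, not a measurement.

```python
def hostif_ready_times(port_count, create_cost, query_cost, deferred):
    """Return the time (in seconds) at which each port's host interface exists.

    deferred=False: each port creates its host interface and then runs its
                    queries before the next port starts.
    deferred=True:  all host interfaces are created back to back; the
                    queries run afterwards and do not delay any host interface.
    """
    step = create_cost if deferred else create_cost + query_cost
    return [i * step + create_cost for i in range(port_count)]


PORT_COUNT = 56      # port count stated in the issue
CREATE_COST = 0.020  # hypothetical cost of creating one host interface
QUERY_COST = 0.065   # hypothetical cost of the attribute/capability queries

for deferred in (False, True):
    times = hostif_ready_times(PORT_COUNT, CREATE_COST, QUERY_COST, deferred)
    spread = times[-1] - times[0]
    print(f"deferred={deferred}: last host interface at {times[-1]:.2f} s, "
          f"first-to-last spread {spread:.2f} s")
```

In this model the spread between the first and last host interface shrinks from 55 x (create + query) to 55 x create, so the gain depends entirely on how much of the per-port time is spent in queries, which would need to be measured.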

Describe the results you expected:

LACP downtime stable under 90 sec, with ~20 sec headroom.

Output of show version:

SONiC Software Version: SONiC.202305_RC.55-16d7da84c_Internal
SONiC OS Version: 11
Distribution: Debian 11.8
Kernel: 5.10.0-23-2-amd64
Build commit: 16d7da84c
Build date: Wed Dec 20 01:04:07 UTC 2023
Built by: sw-r2d2-bot@r-build-sonic-ci02-244

Platform: x86_64-mlnx_msn2700-r0
HwSKU: Mellanox-SN2700-D40C8S8
ASIC: mellanox
ASIC Count: 1
Serial Number: MT1805K20439
Model Number: MSN2700-CS2F
Hardware Revision: A2
Uptime: 18:20:06 up 19 min,  1 user,  load average: 1.15, 1.19, 1.26
Date: Wed 20 Dec 2023 18:20:06

Docker images:
REPOSITORY                                         TAG                               IMAGE ID       SIZE
docker-orchagent                                   202305_RC.55-16d7da84c_Internal   3f3b4e8d6a83   330MB
docker-orchagent                                   latest                            3f3b4e8d6a83   330MB
docker-fpm-frr                                     202305_RC.55-16d7da84c_Internal   28bf35dd59af   350MB
docker-fpm-frr                                     latest                            28bf35dd59af   350MB
docker-nat                                         202305_RC.55-16d7da84c_Internal   40fb6b1c2de0   321MB
docker-nat                                         latest                            40fb6b1c2de0   321MB
docker-sflow                                       202305_RC.55-16d7da84c_Internal   4aeff566c75d   320MB
docker-sflow                                       latest                            4aeff566c75d   320MB
docker-teamd                                       202305_RC.55-16d7da84c_Internal   f7a9e70960fb   318MB
docker-teamd                                       latest                            f7a9e70960fb   318MB
docker-macsec                                      latest                            17367fbba2ad   320MB
docker-syncd-mlnx                                  202305_RC.55-16d7da84c_Internal   ecffee80c40e   844MB
docker-syncd-mlnx                                  latest                            ecffee80c40e   844MB
docker-platform-monitor                            202305_RC.55-16d7da84c_Internal   64dd2ddb15e1   829MB
docker-platform-monitor                            latest                            64dd2ddb15e1   829MB
docker-dhcp-relay                                  latest                            c82052f8856f   308MB
docker-eventd                                      202305_RC.55-16d7da84c_Internal   08450f0634c6   300MB
docker-eventd                                      latest                            08450f0634c6   300MB
docker-sonic-telemetry                             202305_RC.55-16d7da84c_Internal   62f8c86af715   387MB
docker-sonic-telemetry                             latest                            62f8c86af715   387MB
docker-snmp                                        202305_RC.55-16d7da84c_Internal   2280ad37ce04   340MB
docker-snmp                                        latest                            2280ad37ce04   340MB
docker-lldp                                        202305_RC.55-16d7da84c_Internal   8f33b3da4f82   343MB
docker-lldp                                        latest                            8f33b3da4f82   343MB
docker-router-advertiser                           202305_RC.55-16d7da84c_Internal   c905506bff31   301MB
docker-router-advertiser                           latest                            c905506bff31   301MB
docker-mux                                         202305_RC.55-16d7da84c_Internal   8c92a2ffe0c3   349MB
docker-mux                                         latest                            8c92a2ffe0c3   349MB
docker-database                                    202305_RC.55-16d7da84c_Internal   bb154d317a72   301MB
docker-database                                    latest                            bb154d317a72   301MB
docker-sonic-mgmt-framework                        202305_RC.55-16d7da84c_Internal   634e58c34140   416MB
docker-sonic-mgmt-framework                        latest                            634e58c34140   416MB

Additional information you deem important (e.g. issue happens only occasionally):

Also, there were other warm boot related issues reported:

judyjoseph commented 9 months ago

@saiarcot895 please sync up with @stepanblyschak