sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC

[202205][warm-boot][coppmgrd]: Orchagent crashed after policer configuration #11823

Closed nazariig closed 2 years ago

nazariig commented 2 years ago

Description

The issue is caused by a policer configuration that includes a create-only attribute (https://github.com/opencomputeproject/SAI/blob/master/inc/saipolicer.h#L103):

/**
 * @brief Policer Meter Type
 *
 * @type sai_meter_type_t
 * @flags MANDATORY_ON_CREATE | CREATE_ONLY
 */
SAI_POLICER_ATTR_METER_TYPE = SAI_POLICER_ATTR_START,

which caused orchagent to crash:

May 14 01:42:54.239780 sonic NOTICE swss#orchagent: :- setWarmStartState: orchagent warm start state changed to reconciled
May 14 01:42:54.292313 sonic NOTICE syncd#SDK: [SAI_UTILS.NOTICE] mlnx_sai_utils.c[1833]- set_dispatch_attrib_handler: Set QUEUE, key:Trap group 0, val:0
May 14 01:42:54.294803 sonic NOTICE swss#orchagent: :- processCoppRule: Set trap group default to host interface
May 14 01:42:54.295349 sonic NOTICE syncd#SDK: [SAI_UTILS.NOTICE] mlnx_sai_utils.c[1833]- set_dispatch_attrib_handler: Set CBS, key:policer key:0, val:600
May 14 01:42:54.296851 sonic NOTICE syncd#SDK: [SAI_UTILS.NOTICE] mlnx_sai_utils.c[1833]- set_dispatch_attrib_handler: Set CIR, key:policer key:0, val:600
May 14 01:42:54.297769 sonic ERR swss#orchagent: :- meta_generic_validation_set: SAI_POLICER_ATTR_METER_TYPE:SAI_ATTR_VALUE_TYPE_INT32 attr is create only and cannot be modified
May 14 01:42:54.297769 sonic ERR swss#orchagent: :- trapGroupUpdatePolicer: Failed to apply attribute[2].id=0 to policer for trap group:default, error:-5
May 14 01:42:54.297769 sonic ERR swss#orchagent: :- handleSaiSetStatus: Encountered failure in set operation, exiting orchagent, SAI API: SAI_API_POLICER, status: SAI_STATUS_INVALID_PARAMETER
May 14 01:42:54.762469 sonic INFO swss#supervisord 2022-05-14 01:42:54,761 INFO exited: orchagent (terminated by SIGABRT (core dumped); not expected)

The logs show that the policer is configured twice. The second attempt fails because the update (set) operation includes an attribute that may only be supplied at creation time.
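To illustrate the failure mode described above, here is a minimal sketch of separating create-only attributes from those that can be updated on an existing policer. It is not the orchagent implementation: the function name `split_policer_attrs` is hypothetical, and the `CREATE_ONLY` set contains only the one attribute that the quoted header marks as `CREATE_ONLY`.

```python
from typing import Dict, Tuple

# Assumption: only the attribute quoted from saipolicer.h in this report is
# listed here. Other create-only attributes may exist in the real header.
CREATE_ONLY = {"SAI_POLICER_ATTR_METER_TYPE"}


def split_policer_attrs(attrs: Dict[str, object]) -> Tuple[Dict[str, object], Dict[str, object]]:
    """Split attributes into (settable, skipped) for an update of an existing policer.

    'settable' holds attributes that are not marked create-only;
    'skipped' holds the create-only ones that a set operation would reject.
    """
    settable = {name: value for name, value in attrs.items() if name not in CREATE_ONLY}
    skipped = {name: value for name, value in attrs.items() if name in CREATE_ONLY}
    return settable, skipped
```

With the values seen in the crash log (CBS and CIR of 600 plus a meter type), only CBS and CIR would be passed to the set operation, and the meter type would be reported as skipped instead of triggering `SAI_STATUS_INVALID_PARAMETER`.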

Details:

root@sonic:/home/admin# zgrep -a 'trapGroupUpdatePolicer\|Copp\|Policer\|trap group\|APPLY_VIEW' /var/log/syslog
May 14 01:42:37.380827 sonic NOTICE swss#coppmgrd: :- setCoppTrapStateOk: Publish arp(ok) to state db
May 14 01:42:37.380827 sonic NOTICE swss#coppmgrd: :- setCoppTrapStateOk: Publish bgp(ok) to state db
May 14 01:42:37.381007 sonic NOTICE swss#coppmgrd: :- setCoppTrapStateOk: Publish dhcp_relay(ok) to state db
May 14 01:42:37.381155 sonic NOTICE swss#coppmgrd: :- setCoppTrapStateOk: Publish ip2me(ok) to state db
May 14 01:42:37.381155 sonic NOTICE swss#coppmgrd: :- setCoppTrapStateOk: Publish lacp(ok) to state db
May 14 01:42:37.381202 sonic NOTICE swss#coppmgrd: :- setCoppTrapStateOk: Publish lldp(ok) to state db
May 14 01:42:37.381264 sonic NOTICE swss#coppmgrd: :- setCoppTrapStateOk: Publish udld(ok) to state db
May 14 01:42:37.381508 sonic NOTICE swss#coppmgrd: :- setCoppGroupStateOk: Publish default(ok) to state db
May 14 01:42:37.381693 sonic NOTICE swss#coppmgrd: :- setCoppGroupStateOk: Publish queue1_group1(ok) to state db
May 14 01:42:37.381854 sonic NOTICE swss#coppmgrd: :- setCoppGroupStateOk: Publish queue4_group1(ok) to state db
May 14 01:42:37.382013 sonic NOTICE swss#coppmgrd: :- setCoppGroupStateOk: Publish queue4_group2(ok) to state db
May 14 01:42:37.382157 sonic NOTICE swss#coppmgrd: :- setCoppGroupStateOk: Publish queue4_group3(ok) to state db
May 14 01:42:40.235310 sonic INFO swss#orchagent: :- initDefaultTrapGroup: Get default trap group
May 14 01:42:42.136959 sonic NOTICE swss#orchagent: :- processCoppRule: Set trap group default to host interface
May 14 01:42:42.136959 sonic WARNING swss#orchagent: :- trapGroupUpdatePolicer: Creating policer for existing Trap group: 11000000000003 (name:default).
May 14 01:42:42.137697 sonic NOTICE swss#orchagent: :- createPolicer: Create policer for trap group default
May 14 01:42:42.139375 sonic NOTICE swss#orchagent: :- createPolicer: Bind policer to trap group default:
May 14 01:42:42.139647 sonic NOTICE syncd#SDK: [SAI_HOST_INTERFACE.NOTICE] mlnx_sai_host_interface.c[2291]- mlnx_create_hostif_trap_group: Create trap group, #0 QUEUE=1
May 14 01:42:42.139908 sonic NOTICE syncd#SDK: [SAI_HOST_INTERFACE.NOTICE] mlnx_sai_host_interface.c[2338]- mlnx_create_hostif_trap_group: Created trap group Trap group 1
May 14 01:42:42.140328 sonic NOTICE swss#orchagent: :- processCoppRule: Create host interface trap group queue1_group1
May 14 01:42:42.140328 sonic WARNING swss#orchagent: :- trapGroupUpdatePolicer: Creating policer for existing Trap group: 11000000000d00 (name:queue1_group1).
May 14 01:42:42.141028 sonic NOTICE swss#orchagent: :- createPolicer: Create policer for trap group queue1_group1
May 14 01:42:42.142286 sonic NOTICE swss#orchagent: :- createPolicer: Bind policer to trap group queue1_group1:
May 14 01:42:42.143662 sonic NOTICE syncd#SDK: [SAI_HOST_INTERFACE.NOTICE] mlnx_sai_host_interface.c[2291]- mlnx_create_hostif_trap_group: Create trap group, #0 QUEUE=4
May 14 01:42:42.143871 sonic NOTICE syncd#SDK: [SAI_HOST_INTERFACE.NOTICE] mlnx_sai_host_interface.c[2338]- mlnx_create_hostif_trap_group: Created trap group Trap group 2
May 14 01:42:42.144187 sonic NOTICE swss#orchagent: :- processCoppRule: Create host interface trap group queue4_group1
May 14 01:42:42.147654 sonic NOTICE syncd#SDK: [SAI_HOST_INTERFACE.NOTICE] mlnx_sai_host_interface.c[2291]- mlnx_create_hostif_trap_group: Create trap group, #0 QUEUE=4
May 14 01:42:42.147921 sonic NOTICE syncd#SDK: [SAI_HOST_INTERFACE.NOTICE] mlnx_sai_host_interface.c[2338]- mlnx_create_hostif_trap_group: Created trap group Trap group 3
May 14 01:42:42.148200 sonic NOTICE swss#orchagent: :- processCoppRule: Create host interface trap group queue4_group2
May 14 01:42:42.148200 sonic WARNING swss#orchagent: :- trapGroupUpdatePolicer: Creating policer for existing Trap group: 11000000000d07 (name:queue4_group2).
May 14 01:42:42.149082 sonic NOTICE swss#orchagent: :- createPolicer: Create policer for trap group queue4_group2
May 14 01:42:42.150100 sonic NOTICE swss#orchagent: :- createPolicer: Bind policer to trap group queue4_group2:
May 14 01:42:42.153972 sonic NOTICE syncd#SDK: [SAI_HOST_INTERFACE.NOTICE] mlnx_sai_host_interface.c[2291]- mlnx_create_hostif_trap_group: Create trap group, #0 QUEUE=4
May 14 01:42:42.154185 sonic NOTICE syncd#SDK: [SAI_HOST_INTERFACE.NOTICE] mlnx_sai_host_interface.c[2338]- mlnx_create_hostif_trap_group: Created trap group Trap group 4
May 14 01:42:42.154460 sonic NOTICE swss#orchagent: :- processCoppRule: Create host interface trap group queue4_group3
May 14 01:42:44.708714 sonic NOTICE swss#orchagent: :- syncd_apply_view: Notify syncd APPLY_VIEW
May 14 01:42:44.708714 sonic NOTICE swss#orchagent: :- notifySyncd: sending syncd: APPLY_VIEW
May 14 01:42:44.964963 sonic NOTICE syncd#SDK: :- processNotifySyncd: very first run is TRUE, op = APPLY_VIEW
May 14 01:42:45.395831 sonic NOTICE syncd#SDK: :- threadFunction: time span 430 ms for 'notify:APPLY_VIEW'
May 14 01:42:46.395942 sonic NOTICE syncd#SDK: :- threadFunction: time span 1430 ms for 'notify:APPLY_VIEW'
May 14 01:42:47.396107 sonic NOTICE syncd#SDK: :- threadFunction: time span 2431 ms for 'notify:APPLY_VIEW'
May 14 01:42:48.396215 sonic NOTICE syncd#SDK: :- threadFunction: time span 3431 ms for 'notify:APPLY_VIEW'
May 14 01:42:49.396336 sonic NOTICE syncd#SDK: :- threadFunction: time span 4431 ms for 'notify:APPLY_VIEW'
May 14 01:42:50.396486 sonic NOTICE syncd#SDK: :- threadFunction: time span 5431 ms for 'notify:APPLY_VIEW'
May 14 01:42:51.396619 sonic NOTICE syncd#SDK: :- threadFunction: time span 6431 ms for 'notify:APPLY_VIEW'
May 14 01:42:52.396722 sonic NOTICE syncd#SDK: :- threadFunction: time span 7431 ms for 'notify:APPLY_VIEW'
May 14 01:42:53.217153 sonic NOTICE syncd#SDK: :- processNotifySyncd: setting very first run to FALSE, op = APPLY_VIEW
May 14 01:42:54.294803 sonic NOTICE swss#orchagent: :- processCoppRule: Set trap group default to host interface
May 14 01:42:54.297769 sonic ERR swss#orchagent: :- trapGroupUpdatePolicer: Failed to apply attribute[2].id=0 to policer for trap group:default, error:-5
May 14 01:43:40.861974 sonic INFO swss#orchagent: :- initDefaultTrapGroup: Get default trap group
May 14 01:43:40.931765 sonic NOTICE swss#orchagent: :- syncd_apply_view: Notify syncd APPLY_VIEW
May 14 01:43:40.931765 sonic NOTICE swss#orchagent: :- notifySyncd: sending syncd: APPLY_VIEW
May 14 01:43:40.933737 sonic ERR swss#orchagent: :- syncd_apply_view: Failed to notify syncd APPLY_VIEW -1
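The double configuration can also be spotted mechanically. The sketch below scans syslog lines for a trap group whose policer was created and later failed on update. The regular expressions are derived only from the message formats shown in this report, and the function name `find_double_configured` is hypothetical.

```python
import re
from typing import Iterable, List

# Patterns assumed from the orchagent messages quoted in this report.
CREATE_RE = re.compile(r"createPolicer: Create policer for trap group (\S+)")
FAIL_RE = re.compile(
    r"trapGroupUpdatePolicer: Failed to apply .* for trap group:([^,\s]+)"
)


def find_double_configured(lines: Iterable[str]) -> List[str]:
    """Return trap groups whose policer was created and later failed on update."""
    created = set()
    failed = []
    for line in lines:
        match = CREATE_RE.search(line)
        if match:
            created.add(match.group(1))
            continue
        match = FAIL_RE.search(line)
        if match and match.group(1) in created:
            failed.append(match.group(1))
    return failed
```

Applied to the log above, this would flag the `default` trap group, whose policer is created at 01:42:42 and fails on update at 01:42:54.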

SONiC behaviour after warm reboot when coppmgrd is disabled:

  1. Valid DB version (version_3_0_5):

    root@sonic:/home/admin# zgrep -a 'trapGroupUpdatePolicer\|Copp\|Policer\|trap group\|APPLY_VIEW' /var/log/syslog
    May 14 03:40:58.885877 sonic INFO swss#orchagent: :- initDefaultTrapGroup: Get default trap group
    May 14 03:41:02.639813 sonic NOTICE swss#orchagent: :- syncd_apply_view: Notify syncd APPLY_VIEW
    May 14 03:41:02.639813 sonic NOTICE swss#orchagent: :- notifySyncd: sending syncd: APPLY_VIEW
    May 14 03:41:02.643691 sonic NOTICE syncd#SDK: :- processNotifySyncd: very first run is TRUE, op = APPLY_VIEW
    May 14 03:41:03.153945 sonic NOTICE syncd#SDK: :- threadFunction: time span 510 ms for 'notify:APPLY_VIEW'
    May 14 03:41:04.154060 sonic NOTICE syncd#SDK: :- threadFunction: time span 1510 ms for 'notify:APPLY_VIEW'
    May 14 03:41:05.154167 sonic NOTICE syncd#SDK: :- threadFunction: time span 2510 ms for 'notify:APPLY_VIEW'
    May 14 03:41:06.154331 sonic NOTICE syncd#SDK: :- threadFunction: time span 3510 ms for 'notify:APPLY_VIEW'
    May 14 03:41:07.154448 sonic NOTICE syncd#SDK: :- threadFunction: time span 4510 ms for 'notify:APPLY_VIEW'
    May 14 03:41:08.154613 sonic NOTICE syncd#SDK: :- threadFunction: time span 5510 ms for 'notify:APPLY_VIEW'
    May 14 03:41:09.154727 sonic NOTICE syncd#SDK: :- threadFunction: time span 6510 ms for 'notify:APPLY_VIEW'
    May 14 03:41:10.154862 sonic NOTICE syncd#SDK: :- threadFunction: time span 7511 ms for 'notify:APPLY_VIEW'
    May 14 03:41:10.346559 sonic NOTICE syncd#SDK: :- processNotifySyncd: setting very first run to FALSE, op = APPLY_VIEW
  2. Invalid DB version (version_2_0_4):

    root@sonic:/home/admin# zgrep -a 'trapGroupUpdatePolicer\|Copp\|Policer\|trap group\|APPLY_VIEW' /var/log/syslog
    May 14 03:52:18.075310 sonic INFO swss#orchagent: :- initDefaultTrapGroup: Get default trap group
    May 14 03:52:19.888716 sonic NOTICE swss#orchagent: :- processCoppRule: Set trap group default to host interface
    May 14 03:52:19.888716 sonic WARNING swss#orchagent: :- trapGroupUpdatePolicer: Creating policer for existing Trap group: 11000000000003 (name:default).
    May 14 03:52:19.889413 sonic NOTICE swss#orchagent: :- createPolicer: Create policer for trap group default
    May 14 03:52:19.890973 sonic NOTICE swss#orchagent: :- createPolicer: Bind policer to trap group default:
    May 14 03:52:19.891274 sonic NOTICE syncd#SDK: [SAI_HOST_INTERFACE.NOTICE] mlnx_sai_host_interface.c[2291]- mlnx_create_hostif_trap_group: Create trap group, #0 QUEUE=1
    May 14 03:52:19.891490 sonic NOTICE syncd#SDK: [SAI_HOST_INTERFACE.NOTICE] mlnx_sai_host_interface.c[2338]- mlnx_create_hostif_trap_group: Created trap group Trap group 1
    May 14 03:52:19.891774 sonic NOTICE swss#orchagent: :- processCoppRule: Create host interface trap group queue1_group1
    May 14 03:52:19.891785 sonic WARNING swss#orchagent: :- trapGroupUpdatePolicer: Creating policer for existing Trap group: 11000000000d00 (name:queue1_group1).
    May 14 03:52:19.892540 sonic NOTICE swss#orchagent: :- createPolicer: Create policer for trap group queue1_group1
    May 14 03:52:19.893659 sonic NOTICE swss#orchagent: :- createPolicer: Bind policer to trap group queue1_group1:
    May 14 03:52:19.895142 sonic NOTICE syncd#SDK: [SAI_HOST_INTERFACE.NOTICE] mlnx_sai_host_interface.c[2291]- mlnx_create_hostif_trap_group: Create trap group, #0 QUEUE=4
    May 14 03:52:19.895450 sonic NOTICE syncd#SDK: [SAI_HOST_INTERFACE.NOTICE] mlnx_sai_host_interface.c[2338]- mlnx_create_hostif_trap_group: Created trap group Trap group 2
    May 14 03:52:19.895734 sonic NOTICE swss#orchagent: :- processCoppRule: Create host interface trap group queue4_group1
    May 14 03:52:19.899267 sonic NOTICE syncd#SDK: [SAI_HOST_INTERFACE.NOTICE] mlnx_sai_host_interface.c[2291]- mlnx_create_hostif_trap_group: Create trap group, #0 QUEUE=4
    May 14 03:52:19.899518 sonic NOTICE syncd#SDK: [SAI_HOST_INTERFACE.NOTICE] mlnx_sai_host_interface.c[2338]- mlnx_create_hostif_trap_group: Created trap group Trap group 3
    May 14 03:52:19.899857 sonic NOTICE swss#orchagent: :- processCoppRule: Create host interface trap group queue4_group2
    May 14 03:52:19.899870 sonic WARNING swss#orchagent: :- trapGroupUpdatePolicer: Creating policer for existing Trap group: 11000000000d07 (name:queue4_group2).
    May 14 03:52:19.900559 sonic NOTICE swss#orchagent: :- createPolicer: Create policer for trap group queue4_group2
    May 14 03:52:19.901713 sonic NOTICE swss#orchagent: :- createPolicer: Bind policer to trap group queue4_group2:
    May 14 03:52:19.905812 sonic NOTICE syncd#SDK: [SAI_HOST_INTERFACE.NOTICE] mlnx_sai_host_interface.c[2291]- mlnx_create_hostif_trap_group: Create trap group, #0 QUEUE=4
    May 14 03:52:19.906062 sonic NOTICE syncd#SDK: [SAI_HOST_INTERFACE.NOTICE] mlnx_sai_host_interface.c[2338]- mlnx_create_hostif_trap_group: Created trap group Trap group 4
    May 14 03:52:19.906355 sonic NOTICE swss#orchagent: :- processCoppRule: Create host interface trap group queue4_group3
    May 14 03:52:22.196450 sonic NOTICE swss#orchagent: :- syncd_apply_view: Notify syncd APPLY_VIEW
    May 14 03:52:22.196472 sonic NOTICE swss#orchagent: :- notifySyncd: sending syncd: APPLY_VIEW
    May 14 03:52:22.479931 sonic NOTICE syncd#SDK: :- processNotifySyncd: very first run is TRUE, op = APPLY_VIEW
    May 14 03:52:23.178438 sonic NOTICE syncd#SDK: :- threadFunction: time span 698 ms for 'notify:APPLY_VIEW'
    May 14 03:52:24.178609 sonic NOTICE syncd#SDK: :- threadFunction: time span 1698 ms for 'notify:APPLY_VIEW'
    May 14 03:52:25.178749 sonic NOTICE syncd#SDK: :- threadFunction: time span 2698 ms for 'notify:APPLY_VIEW'
    May 14 03:52:26.178856 sonic NOTICE syncd#SDK: :- threadFunction: time span 3698 ms for 'notify:APPLY_VIEW'
    May 14 03:52:27.178987 sonic NOTICE syncd#SDK: :- threadFunction: time span 4698 ms for 'notify:APPLY_VIEW'
    May 14 03:52:28.179150 sonic NOTICE syncd#SDK: :- threadFunction: time span 5699 ms for 'notify:APPLY_VIEW'
    May 14 03:52:29.179263 sonic NOTICE syncd#SDK: :- threadFunction: time span 6699 ms for 'notify:APPLY_VIEW'
    May 14 03:52:30.179359 sonic NOTICE syncd#SDK: :- threadFunction: time span 7699 ms for 'notify:APPLY_VIEW'
    May 14 03:52:30.696185 sonic NOTICE syncd#SDK: :- processNotifySyncd: setting very first run to FALSE, op = APPLY_VIEW

The issue appears to be a side effect of having an invalid DB version: in the logs above, the policer creation messages appear only with version_2_0_4, not with version_3_0_5.
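As a small aid for spotting a stale DB version, the sketch below parses the `version_X_Y_Z` strings that appear in this report and compares them numerically. The helper names are hypothetical, and the comparison is only an illustration: how SONiC itself orders or migrates DB versions is not shown in this report.

```python
from typing import Tuple


def parse_db_version(version: str) -> Tuple[int, ...]:
    """Turn a string such as 'version_3_0_5' into a tuple such as (3, 0, 5)."""
    prefix, *parts = version.split("_")
    if prefix != "version" or not parts:
        raise ValueError(f"unexpected DB version format: {version!r}")
    return tuple(int(part) for part in parts)


def is_older(current: str, expected: str) -> bool:
    """Return True if 'current' compares lower than 'expected'."""
    return parse_db_version(current) < parse_db_version(expected)
```

For the two versions used in this report, `is_older("version_2_0_4", "version_3_0_5")` evaluates to `True`.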

Simple steps to reproduce:

  1. Install 202205 image
    root@sonic:/home/admin# sonic-installer list
    Current: SONiC-OS-202205.21-5c306cc2e_Internal
    Next: SONiC-OS-202205.21-5c306cc2e_Internal
    Available:
    SONiC-OS-202205.21-5c306cc2e_Internal
    SONiC-OS-202111.93-003c8dfde_Internal
  2. Set invalid DB version
    root@sonic:/home/admin# redis-cli -n 4 HGETALL "VERSIONS|DATABASE"
    1) "VERSION"
    2) "version_3_0_5"
    root@sonic:/home/admin# redis-cli -n 4 HSET "VERSIONS|DATABASE" "VERSION" "version_2_0_4"
    (integer) 0
    root@sonic:/home/admin# redis-cli -n 4 HGETALL "VERSIONS|DATABASE"
    1) "VERSION"
    2) "version_2_0_4"
  3. Run warm-reboot
    root@sonic:/home/admin# warm-reboot -v
    Sat 14 May 2022 01:41:34 AM UTC Prepare MLNX ASIC to fastfast-reboot: install new FW if required
    Sat 14 May 2022 01:41:35 AM UTC Pausing orchagent ...
    Sat 14 May 2022 01:41:35 AM UTC Collecting logs to check ssd health before fastfast-reboot...
    Sat 14 May 2022 01:41:35 AM UTC Stopping lldp.timer ...
    Sat 14 May 2022 01:41:35 AM UTC Stopped lldp.timer ...
    Sat 14 May 2022 01:41:35 AM UTC Stopping mgmt-framework.timer ...
    Sat 14 May 2022 01:41:35 AM UTC Stopped mgmt-framework.timer ...
    Sat 14 May 2022 01:41:35 AM UTC Stopping pmon.timer ...
    Sat 14 May 2022 01:41:35 AM UTC Stopped pmon.timer ...
    Sat 14 May 2022 01:41:35 AM UTC Stopping snmp.timer ...
    Sat 14 May 2022 01:41:35 AM UTC Stopped snmp.timer ...
    Sat 14 May 2022 01:41:35 AM UTC Stopping telemetry.timer ...
    Sat 14 May 2022 01:41:35 AM UTC Stopped telemetry.timer ...
    Sat 14 May 2022 01:41:35 AM UTC Stopping lldp ...
    Sat 14 May 2022 01:41:36 AM UTC Stopped lldp
    Sat 14 May 2022 01:41:36 AM UTC Stopping mux ...
    Warning: The unit file, source configuration file or drop-ins of mux.service changed on disk. Run 'systemctl daemon-reload' to reload units.
    Sat 14 May 2022 01:41:36 AM UTC Stopped mux
    Sat 14 May 2022 01:41:36 AM UTC Stopping nat ...
    Dumping conntrack entries failed
    Warning: The unit file, source configuration file or drop-ins of nat.service changed on disk. Run 'systemctl daemon-reload' to reload units.
    Sat 14 May 2022 01:41:36 AM UTC Stopped nat
    Sat 14 May 2022 01:41:36 AM UTC Stopping pmon ...
    Sat 14 May 2022 01:41:39 AM UTC Stopped pmon
    Sat 14 May 2022 01:41:39 AM UTC Stopping radv ...
    Sat 14 May 2022 01:41:39 AM UTC Stopped radv
    Sat 14 May 2022 01:41:39 AM UTC Stopping sflow ...
    Warning: The unit file, source configuration file or drop-ins of sflow.service changed on disk. Run 'systemctl daemon-reload' to reload units.
    Sat 14 May 2022 01:41:40 AM UTC Stopped sflow
    Sat 14 May 2022 01:41:40 AM UTC Stopping bgp ...
    Sat 14 May 2022 01:41:44 AM UTC Stopped bgp
    Sat 14 May 2022 01:41:44 AM UTC Stopping swss ...
    Sat 14 May 2022 01:41:53 AM UTC Stopped swss
    Sat 14 May 2022 01:41:53 AM UTC Initialize pre-shutdown ...
    Sat 14 May 2022 01:41:53 AM UTC Requesting pre-shutdown ...
    Sat 14 May 2022 01:41:54 AM UTC Waiting for pre-shutdown ...
    Sat 14 May 2022 01:42:00 AM UTC Pre-shutdown succeeded, state: pre-shutdown-succeeded ...
    Sat 14 May 2022 01:42:00 AM UTC Backing up database ...
    Sat 14 May 2022 01:42:01 AM UTC Stopping teamd ...
    Sat 14 May 2022 01:42:01 AM UTC Stopped teamd
    Sat 14 May 2022 01:42:01 AM UTC Stopping syncd ...
    Sat 14 May 2022 01:42:04 AM UTC Stopped syncd
    Sat 14 May 2022 01:42:04 AM UTC Stopping all remaining containers ...
    Sat 14 May 2022 01:42:06 AM UTC Stopped all remaining containers ...
    Warning: Stopping docker.service, but it can still be activated by:
    docker.socket
    Sat 14 May 2022 01:42:08 AM UTC Enabling Watchdog before fastfast-reboot
    Watchdog armed for 180 seconds
    Sat 14 May 2022 01:42:08 AM UTC Rebooting with /sbin/kexec -e to SONiC-OS-202205.21-5c306cc2e_Internal ...

Steps to reproduce the issue:

  1. Install 202111 image
  2. Make sure config_db.json doesn't have version defined:
    cat /etc/sonic/config_db.json | jq .VERSIONS
  3. Reload configuration
    config reload -y
  4. Install 202205 image
  5. Run warm-reboot:
    warm-reboot -v
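Step 2 above can also be checked programmatically. The sketch below reads the DB version from the text of a config_db.json file and returns `None` when it is absent. The nested `VERSIONS` / `DATABASE` / `VERSION` layout is an assumption inferred from the `VERSIONS|DATABASE` key shown in the redis-cli output, and `get_db_version` is a hypothetical helper name.

```python
import json
from typing import Optional


def get_db_version(config_text: str) -> Optional[str]:
    """Return the DB version recorded in a config_db.json document, or None.

    Assumed layout: {"VERSIONS": {"DATABASE": {"VERSION": "version_X_Y_Z"}}}
    """
    config = json.loads(config_text)
    return config.get("VERSIONS", {}).get("DATABASE", {}).get("VERSION")
```

On a device, the function could be fed the contents of /etc/sonic/config_db.json. A result of `None` corresponds to the precondition in step 2, where no version is defined.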

Describe the results you received:

warm-boot fails: orchagent crashes after policer configuration

Describe the results you expected:

warm-boot succeeds: no orchagent crash is observed

Output of show version:

SONiC Software Version: SONiC.202205.21-5c306cc2e_Internal
Distribution: Debian 11.4
Kernel: 5.10.0-12-2-amd64
Build commit: 5c306cc2e
Build date: Tue Aug 16 17:28:01 UTC 2022
Built by: sw-r2d2-bot@r-build-sonic-ci02-244

Platform: x86_64-mlnx_msn4600c-r0
HwSKU: ACS-MSN4600C
ASIC: mellanox
ASIC Count: 1
Serial Number: MT2053X21259
Model Number: MSN4600-CS2FO
Hardware Revision: A1
Uptime: 04:17:25 up 25 min,  4 users,  load average: 0.61, 0.64, 0.56
Date: Sat 14 May 2022 04:17:25

Docker images:
REPOSITORY                                         TAG                            IMAGE ID       SIZE
docker-orchagent                                   202205.21-5c306cc2e_Internal   7b830bf4ac69   471MB
docker-orchagent                                   latest                         7b830bf4ac69   471MB
docker-teamd                                       202205.21-5c306cc2e_Internal   720cf72527fc   453MB
docker-teamd                                       latest                         720cf72527fc   453MB
docker-macsec                                      latest                         959fe4dc45af   455MB
docker-syncd-mlnx                                  202205.21-5c306cc2e_Internal   cf601dc58937   852MB
docker-syncd-mlnx                                  latest                         cf601dc58937   852MB
docker-platform-monitor                            202205.21-5c306cc2e_Internal   bd92acc74a9b   855MB
docker-platform-monitor                            latest                         bd92acc74a9b   855MB
docker-dhcp-relay                                  latest                         a1354d11d617   446MB
docker-sonic-telemetry                             202205.21-5c306cc2e_Internal   c02a9b7c90f3   517MB
docker-sonic-telemetry                             latest                         c02a9b7c90f3   517MB
docker-lldp                                        202205.21-5c306cc2e_Internal   489006bb88af   479MB
docker-lldp                                        latest                         489006bb88af   479MB
docker-router-advertiser                           202205.21-5c306cc2e_Internal   66ce07fe6902   437MB
docker-router-advertiser                           latest                         66ce07fe6902   437MB
docker-mux                                         202205.21-5c306cc2e_Internal   452a17f01c75   485MB
docker-mux                                         latest                         452a17f01c75   485MB
docker-database                                    202205.21-5c306cc2e_Internal   9c904fb2b204   437MB
docker-database                                    latest                         9c904fb2b204   437MB
docker-fpm-frr                                     202205.21-5c306cc2e_Internal   ee1e4edb0cc2   454MB
docker-fpm-frr                                     latest                         ee1e4edb0cc2   454MB
docker-nat                                         202205.21-5c306cc2e_Internal   c13577bb50e4   428MB
docker-nat                                         latest                         c13577bb50e4   428MB
docker-snmp                                        202205.21-5c306cc2e_Internal   62f8db365873   454MB
docker-snmp                                        latest                         62f8db365873   454MB
docker-sflow                                       202205.21-5c306cc2e_Internal   a999881d642f   426MB
docker-sflow                                       latest                         a999881d642f   426MB
docker-sonic-mgmt-framework                        202205.21-5c306cc2e_Internal   0732de8d491e   554MB
docker-sonic-mgmt-framework                        latest                         0732de8d491e   554MB

zhangyanzhao commented 2 years ago

Known limitation; will update the doc to highlight this issue.

zhangyanzhao commented 2 years ago

https://github.com/sonic-net/sonic-buildimage/issues/11824