sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC

warm-reboot throws many errors before going for a reboot #6840

Open nazeerhussainf opened 3 years ago

nazeerhussainf commented 3 years ago

Description

Running warm-reboot on the latest master prints many errors before the reboot proceeds. The show techsupport dump and the syslog from the problem state are attached.

root@sonic-z9332-10429:~# warm-reboot
Dumping conntrack entries failed
Error response from daemon: Cannot kill container: nat: No such container: nat
An exception of type AttributeError occurred. Arguments: ("type object 'SonicDBConfig' has no attribute 'get_port'",)
Error: No such container:path: database:/var/lib//dump.rdb
rm: cannot remove '/var/lib//dump.rdb': No such file or directory
Warning: Stopping mgmt-framework.service, but it can still be activated by: mgmt-framework.timer
Warning: Stopping telemetry.service, but it can still be activated by: telemetry.timer
Failed to arm Watchdog for 180 seconds
[ 381.116005] kexec_core: Starting new kernel
[ 0.315665] Base address is zero, assuming no IPMI interface
[ 4.092828] rc.local[474]: + sed -e s/build_version: //g;s/'//g
[ 4.170029] rc.local[474]: + grep build_version
[ 4.233134] rc.local[474]: + cat /etc/sonic/sonic_version.yml
[ 4.312772] rc.local[474]: + SONIC_VERSION=master.585-f6bee730
[ 4.387485] rc.local[474]: + FIRST_BOOT_FILE=/host/image-master.585-f6bee730/platform/firsttime
[ 4.495503] rc.local[474]: + SONIC_CONFIG_DIR=/host/image-master.585-f6bee730/sonic-config
[ 4.595410] rc.local[474]: + SONIC_ENV_FILE=/host/image-master.585-f6bee730/sonic-config/sonic-environment
[ 4.719427] rc.local[474]: + [ -d /host/image-master.585-f6bee730/sonic-config -a -f /host/image-master.585-f6bee730/sonic-config/sonic-environment ]
[ 4.883437] rc.local[474]: + logger SONiC version master.585-f6bee730 starting up...
[ 4.987437] rc.local[474]: + grub_installation_needed=
[ 5.059428] rc.local[474]: + [ ! -e /host/machine.conf ]
[ 5.123978] rc.local[474]: + migrate_nos_configuration
[ 5.195558] rc.local[474]: + rm -rf /host/migration
[ 5.255390] rc.local[474]: + mkdir -p /host/migration
[ 5.358975] kdump-tools[466]: /etc/init.d/kdump-tools: 117: /etc/default/kdump-tools: KDUMP_CMDLINE_APPEND+= panic=10 debug hpet=disable pcie_port=compat pci=nommconf sonic_platform=x86_64-dellemc_z9332f_d1508-r0: not found
[ 5.604038] rc.local[474]: + cat /proc/cmdline
[ 5.674210] rc.local[474]: + set -- BOOT_IMAGE=/image-master.585-f6bee730/boot/vmlinuz-4.19.0-12-2-amd64 root=UUID=f7f6d384-e3b7-497c-980b-2bf36d91cbd0 rw console=tty0 console=ttyS0,9600n8 quiet net.ifnames=0 biosdevname=0 loop=image-master.585-f6bee730/fs.squashfs loopfstype=squashfs apparmor=1 security=apparmor varlog_size=4096 usbcore.autosuspend=-1 SONIC_BOOT_TYPE=warm
[ 6.079571] rc.local[474]: + [ -n ]
[ 6.124074] rc.local[474]: + . /host/machine.conf
[ 6.183582] rc.local[474]: + onie_arch=x86_64
[ 6.243414] rc.local[474]: + onie_bin=
[ 6.299465] rc.local[474]: + onie_boot_reason=install
[ 6.363534] rc.local[474]: + onie_build_date=2020-12-14T22:56+08:00
[ 6.439481] rc.local[474]: + onie_build_machine=dellemc_z9332f_d1508
[ 6.527450] rc.local[474]: + onie_build_platform=x86_64-dellemc_z9332f_d1508-r0
[ 6.619389] rc.local[474]: + onie_cli_static_parms=
[ 6.679419] rc.local[474]: + onie_cli_static_url=sonic-broadcom.bin
[ 6.755459] rc.local[474]: + onie_config_version=1
[ 6.815436] rc.local[474]: + onie_dev=/dev/sda2
[ 6.875456] rc.local[474]: + onie_exec_url=sonic-broadcom.bin
[ 6.951417] rc.local[474]: + onie_firmware=auto
[ 7.011450] rc.local[474]: + onie_grub_image_name=grubx64.efi
[ 7.087443] rc.local[474]: + onie_initrd_tmp=/
[ 7.148042] rc.local[474]: + onie_installer=/var/tmp/installer
[ 7.223503] rc.local[474]: + onie_kernel_version=4.9.95
[ 7.295666] rc.local[474]: + onie_machine=dellemc_z9332f_d1508
[ 7.371628] rc.local[474]: + onie_machine_rev=0
[ 7.431617] rc.local[474]: + onie_partition_type=gpt
[ 7.491626] rc.local[474]: + onie_platform=x86_64-dellemc_z9332f_d1508-r0
[ 7.579626] rc.local[474]: + onie_root_dir=/mnt/onie-boot/onie
[ 7.655647] rc.local[474]: + onie_skip_ethmgmt_macs=no
[ 7.727638] rc.local[474]: + onie_switch_asic=bcm
[ 7.787624] rc.local[474]: + onie_uefi_arch=x64
[ 7.847625] rc.local[474]: + onie_uefi_boot_loader=grubx64.efi
[ 7.923600] rc.local[474]: + onie_vendor_id=12244
[ 7.983640] rc.local[474]: + onie_version=2020.11.06.0.0.4
[ 8.055627] rc.local[474]: + program_console_speed
[ 8.125397] rc.local[474]: + cat /proc/cmdline
[ 8.193403] rc.local[474]: + grep -Eo console=ttyS[0-9]+,[0-9]+
[ 8.272780] rc.local[474]: + cut -d , -f2
[ 8.329377] rc.local[474]: + speed=9600
[ 8.383694] rc.local[474]: + [ -z 9600 ]
[ 8.439419] rc.local[474]: + CONSOLE_SPEED=9600
[ 8.499516] rc.local[474]: + sed -i s|--keep-baud .* %I| 9600 %I|g /lib/systemd/system/serial-getty@.service
[ 8.627458] rc.local[474]: + systemctl daemon-reload
[ 8.687653] rc.local[474]: + [ -f /host/image-master.585-f6bee730/platform/firsttime ]
[ 8.783463] rc.local[474]: + [ -f /var/log/fsck.log.gz ]
[ 8.848713] rc.local[474]: + gunzip -d -c /var/log/fsck.log.gz
[ 8.928938] rc.local[474]: + logger -t FSCK
[ 8.984514] rc.local[474]: + rm -f /var/log/fsck.log.gz
[ 9.059651] rc.local[474]: + exit 0

Debian GNU/Linux 10 sonic-z9332-10429 ttyS0

sonic-z9332-10429 login: root
Password:
Last login: Mon Feb 22 06:33:45 UTC 2021 on ttyS0
Linux sonic-z9332-10429 4.19.0-12-2-amd64 #1 SMP Debian 4.19.152-1 (2020-10-18) x86_64
You are on

[SONiC ASCII-art logo]

-- Software for Open Networking in the Cloud --

Unauthorized access and/or use are prohibited. All access and/or use are subject to monitoring.

Help: http://azure.github.io/SONiC/

root@sonic-z9332-10429:~#
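
The AttributeError in the output above is the kind Python raises when code calls a method name that the class does not define. The sketch below reproduces that message shape with a stand-in class. It is an illustration only: the stand-in `SonicDBConfig`, its `lookup_port` method, the port value, and the helper names are all hypothetical and are not the real SONiC library or the warm-reboot script.

```python
class SonicDBConfig:
    """Stand-in ONLY (hypothetical): not the real SONiC class.

    It deliberately has no get_port(); it exposes a made-up lookup_port().
    """
    _PORTS = {"APPL_DB": 6379}  # illustrative value, not read from a device

    @classmethod
    def lookup_port(cls, db_name):
        return cls._PORTS[db_name]


def describe_failure(exc):
    # Mirrors the message shape seen in the console output of this report.
    return "An exception of type {} occurred. Arguments: {!r}".format(
        type(exc).__name__, exc.args)


def get_port_or_none(config_cls, db_name):
    """Call get_port() if the class provides it; otherwise report and return None."""
    try:
        return config_cls.get_port(db_name)
    except AttributeError as exc:
        print(describe_failure(exc))
        return None


if __name__ == "__main__":
    get_port_or_none(SonicDBConfig, "APPL_DB")
```

A mismatch like this typically appears when the caller and the library it imports are at different revisions, which would be consistent with the later comments about a fix in sonic-utilities followed by a submodule update.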

Steps to reproduce the issue:

  1. Load the latest master image (build 585).
  2. Execute the command warm-reboot.
  3. The command prints many errors before the reboot proceeds.
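
To check step 3 against a captured console log, the error lines printed before the kernel hand-off can be picked out with a small script. This is a minimal sketch: the `kexec_core: Starting new kernel` marker and the sample lines are copied from this report, while the pattern list and the function name `pre_reboot_errors` are assumptions chosen to match these particular messages, not part of any SONiC tool.

```python
import re

# Marker printed when kexec hands over to the new kernel (taken from this report).
KEXEC_MARKER = "kexec_core: Starting new kernel"

# Hypothetical patterns, chosen only to match the messages seen in this report.
ERROR_PATTERNS = [
    re.compile(r"\bfailed\b", re.IGNORECASE),
    re.compile(r"^Error\b"),
    re.compile(r"\bexception\b", re.IGNORECASE),
    re.compile(r"cannot remove"),
]


def pre_reboot_errors(console_lines):
    """Return the error-looking lines printed before the kexec hand-off."""
    errors = []
    for line in console_lines:
        if KEXEC_MARKER in line:
            break  # everything after this belongs to the newly booted kernel
        stripped = line.strip()
        if any(p.search(stripped) for p in ERROR_PATTERNS):
            errors.append(stripped)
    return errors


SAMPLE = """\
root@sonic-z9332-10429:~# warm-reboot
Dumping conntrack entries failed
Error response from daemon: Cannot kill container: nat: No such container: nat
An exception of type AttributeError occurred. Arguments: ("type object 'SonicDBConfig' has no attribute 'get_port'",)
Error: No such container:path: database:/var/lib//dump.rdb
rm: cannot remove '/var/lib//dump.rdb': No such file or directory
Warning: Stopping telemetry.service, but it can still be activated by: telemetry.timer
Failed to arm Watchdog for 180 seconds
[ 381.116005] kexec_core: Starting new kernel
[ 0.315665] Base address is zero, assuming no IPMI interface
""".splitlines()

if __name__ == "__main__":
    for entry in pre_reboot_errors(SAMPLE):
        print(entry)
```

The systemd "Warning: Stopping ..." lines are left out on purpose, since they are warnings rather than failures. Widen the pattern list if those should be reported too.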

Describe the results you received:

Describe the results you expected:

Output of show version:

root@sonic-z9332-10429:~# show version

SONiC Software Version: SONiC.master.585-f6bee730
Distribution: Debian 10.8
Kernel: 4.19.0-12-2-amd64
Build commit: f6bee730
Build date: Tue Feb 16 06:47:21 UTC 2021
Built by: johnar@jenkins-worker-8

Platform: x86_64-dellemc_z9332f_d1508-r0
HwSKU: DellEMC-Z9332f-M-O16C64
ASIC: broadcom
ASIC Count: 1
Serial Number: TH04CN21CET009BR0023
Uptime: 07:16:03 up 41 min,  2 users,  load average: 2.38, 2.13, 2.00

Docker images:
REPOSITORY                    TAG                   IMAGE ID            SIZE
docker-syncd-brcm             latest                799f53f118bc        679MB
docker-syncd-brcm             master.585-f6bee730   799f53f118bc        679MB
docker-snmp                   latest                eea7e1051de8        438MB
docker-snmp                   master.585-f6bee730   eea7e1051de8        438MB
docker-teamd                  latest                1cdcf3651761        407MB
docker-teamd                  master.585-f6bee730   1cdcf3651761        407MB
docker-router-advertiser      latest                d715fb030745        397MB
docker-router-advertiser      master.585-f6bee730   d715fb030745        397MB
docker-platform-monitor       latest                cc69f7f73912        605MB
docker-platform-monitor       master.585-f6bee730   cc69f7f73912        605MB
docker-macsec                 latest                24aeb8e4c910        411MB
docker-macsec                 master.585-f6bee730   24aeb8e4c910        411MB
docker-lldp                   latest                4ab7a4b1aee9        437MB
docker-lldp                   master.585-f6bee730   4ab7a4b1aee9        437MB
docker-dhcp-relay             latest                fe4ea21f347b        404MB
docker-dhcp-relay             master.585-f6bee730   fe4ea21f347b        404MB
docker-database               latest                55627c613648        397MB
docker-database               master.585-f6bee730   55627c613648        397MB
docker-sonic-mgmt-framework   latest                d4e7daf27a14        615MB
docker-sonic-mgmt-framework   master.585-f6bee730   d4e7daf27a14        615MB
docker-orchagent              latest                22fc1e0b3a62        426MB
docker-orchagent              master.585-f6bee730   22fc1e0b3a62        426MB
docker-nat                    latest                3812c1c21b6b        410MB
docker-nat                    master.585-f6bee730   3812c1c21b6b        410MB
docker-sonic-telemetry        latest                6d69122c5fc3        471MB
docker-sonic-telemetry        master.585-f6bee730   6d69122c5fc3        471MB
docker-fpm-frr                latest                11769c6a5ffc        426MB
docker-fpm-frr                master.585-f6bee730   11769c6a5ffc        426MB
docker-sflow                  latest                a165aece361e        408MB
docker-sflow                  master.585-f6bee730   a165aece361e        408MB

root@sonic-z9332-10429:~#

Additional information you deem important (e.g. issue happens only occasionally):

show techsupport : sonic_dump_sonic-z9332-10429_20210222_070441.tar.gz

Syslog: syslog.txt

vaibhavhd commented 3 years ago

This is a duplicate of issue #6811 (https://github.com/Azure/sonic-buildimage/issues/6811). The fix has been added to sonic-utilities, and the submodule is now being updated.

anshuv-mfst commented 3 years ago

@vaibhavhd - could you please link the PR to this issue, thanks.

vaibhavhd commented 3 years ago

Fix in sonic-utilities: https://github.com/Azure/sonic-utilities/pull/1441
Submodule update: https://github.com/Azure/sonic-buildimage/pull/6831