sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC

telemetry docker Exited after virtual switch start #13172

Open ljyfree opened 1 year ago

ljyfree commented 1 year ago

Description

Steps to reproduce the issue:

1. Download sonic-vs.img
2. Start KVM
3. Start the telemetry docker

admin@sonic:~$ telemetry.sh start
Starting existing telemetry container with HWSKU Force10-S6000
admin@sonic:~$ 
4. Before long, the telemetry docker exits.

Describe the results you received:

admin@sonic:~$ docker ps -a | grep tele
9b1acee2a1d3   docker-sonic-telemetry:latest        "/usr/local/bin/supe…"   13 minutes ago   Exited (0) 5 minutes ago             telemetry
admin@sonic:~$ 
admin@sonic:~$ docker logs telemetry 
/usr/local/lib/python3.9/dist-packages/supervisor/options.py:473: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (including its current working directory); you probably want to specify a "-c" argument specifying an absolute path to a configuration file for improved security.
  self.warnings.warn(
2022-12-26 01:22:11,020 INFO Included extra file "/etc/supervisor/conf.d/containercfgd.conf" during parsing
2022-12-26 01:22:11,020 INFO Included extra file "/etc/supervisor/conf.d/supervisord.conf" during parsing
2022-12-26 01:22:11,021 INFO Set uid to user 0 succeeded
2022-12-26 01:22:11,049 INFO RPC interface 'supervisor' initialized
2022-12-26 01:22:11,049 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2022-12-26 01:22:11,050 INFO supervisord started with pid 1
2022-12-26 01:22:12,063 INFO spawned: 'dependent-startup' with pid 7
2022-12-26 01:22:12,079 INFO spawned: 'supervisor-proc-exit-listener' with pid 8
2022-12-26 01:22:14,000 INFO success: dependent-startup entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-12-26 01:22:14,002 INFO success: supervisor-proc-exit-listener entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-12-26 01:22:14,046 INFO spawned: 'rsyslogd' with pid 11
2022-12-26 01:22:15,120 INFO success: rsyslogd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-12-26 01:22:16,221 INFO spawned: 'start' with pid 15
2022-12-26 01:22:16,227 INFO success: start entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2022-12-26 01:22:16,261 INFO spawned: 'containercfgd' with pid 16
2022-12-26 01:22:16,269 INFO exited: start (exit status 0; expected)
2022-12-26 01:22:16,297 INFO spawned: 'start' with pid 18
2022-12-26 01:22:16,299 INFO success: start entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2022-12-26 01:22:16,338 INFO exited: start (exit status 0; expected)
2022-12-26 01:22:16,418 INFO spawned: 'telemetry' with pid 20
2022-12-26 01:22:17,567 INFO success: telemetry entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-12-26 01:22:17,571 INFO success: containercfgd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-12-26 01:22:18,622 INFO spawned: 'dialout' with pid 22
2022-12-26 01:22:19,713 INFO success: dialout entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-12-26 01:22:19,907 INFO exited: dependent-startup (exit status 0; expected)
2022-12-26 01:22:20,273 INFO exited: telemetry (exit status 1; not expected)
2022-12-26 01:22:21,298 WARN received SIGTERM indicating exit request
2022-12-26 01:22:21,299 INFO waiting for supervisor-proc-exit-listener, rsyslogd, dialout, containercfgd to die
2022-12-26 01:22:21,328 INFO stopped: containercfgd (exit status 143)
2022-12-26 01:22:21,333 INFO stopped: dialout (terminated by SIGTERM)
2022-12-26 01:22:23,365 INFO stopped: rsyslogd (exit status 0)
2022-12-26 01:22:23,367 INFO stopped: supervisor-proc-exit-listener (terminated by SIGTERM)
/usr/local/lib/python3.9/dist-packages/supervisor/options.py:473: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (including its current working directory); you probably want to specify a "-c" argument specifying an absolute path to a configuration file for improved security.
  self.warnings.warn(
2022-12-26 01:22:55,297 INFO Included extra file "/etc/supervisor/conf.d/containercfgd.conf" during parsing
2022-12-26 01:22:55,297 INFO Included extra file "/etc/supervisor/conf.d/supervisord.conf" during parsing
2022-12-26 01:22:55,297 INFO Set uid to user 0 succeeded
2022-12-26 01:22:55,302 INFO RPC interface 'supervisor' initialized
2022-12-26 01:22:55,302 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2022-12-26 01:22:55,302 INFO supervisord started with pid 1
2022-12-26 01:22:56,307 INFO spawned: 'dependent-startup' with pid 7
2022-12-26 01:22:56,310 INFO spawned: 'supervisor-proc-exit-listener' with pid 8
2022-12-26 01:22:57,583 INFO success: dependent-startup entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-12-26 01:22:57,584 INFO success: supervisor-proc-exit-listener entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-12-26 01:22:57,595 INFO spawned: 'rsyslogd' with pid 11
2022-12-26 01:22:58,645 INFO success: rsyslogd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-12-26 01:22:59,661 INFO spawned: 'start' with pid 15
2022-12-26 01:22:59,664 INFO success: start entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2022-12-26 01:22:59,676 INFO spawned: 'containercfgd' with pid 16
2022-12-26 01:22:59,685 INFO exited: start (exit status 0; expected)
2022-12-26 01:22:59,698 INFO spawned: 'start' with pid 18
2022-12-26 01:22:59,701 INFO success: start entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2022-12-26 01:22:59,718 INFO exited: start (exit status 0; expected)
2022-12-26 01:22:59,744 INFO spawned: 'telemetry' with pid 20
2022-12-26 01:23:00,725 INFO success: containercfgd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-12-26 01:23:00,743 INFO success: telemetry entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-12-26 01:23:00,776 INFO spawned: 'dialout' with pid 58
2022-12-26 01:23:01,138 INFO exited: telemetry (exit status 1; not expected)
2022-12-26 01:23:01,155 WARN received SIGTERM indicating exit request
2022-12-26 01:23:01,156 INFO waiting for dependent-startup, supervisor-proc-exit-listener, rsyslogd, dialout, containercfgd to die
2022-12-26 01:23:01,169 INFO stopped: containercfgd (exit status 143)
2022-12-26 01:23:01,173 INFO stopped: dialout (terminated by SIGTERM)
2022-12-26 01:23:01,183 INFO stopped: rsyslogd (exit status 0)
2022-12-26 01:23:01,186 INFO stopped: supervisor-proc-exit-listener (terminated by SIGTERM)
2022-12-26 01:23:01,189 INFO stopped: dependent-startup (terminated by SIGTERM)
/usr/local/lib/python3.9/dist-packages/supervisor/options.py:473: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (including its current working directory); you probably want to specify a "-c" argument specifying an absolute path to a configuration file for improved security.
  self.warnings.warn(
2022-12-26 01:23:32,792 INFO Included extra file "/etc/supervisor/conf.d/containercfgd.conf" during parsing
2022-12-26 01:23:32,793 INFO Included extra file "/etc/supervisor/conf.d/supervisord.conf" during parsing
2022-12-26 01:23:32,793 INFO Set uid to user 0 succeeded
2022-12-26 01:23:32,797 INFO RPC interface 'supervisor' initialized
2022-12-26 01:23:32,797 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2022-12-26 01:23:32,797 INFO supervisord started with pid 1
2022-12-26 01:23:33,801 INFO spawned: 'dependent-startup' with pid 8
2022-12-26 01:23:33,804 INFO spawned: 'supervisor-proc-exit-listener' with pid 9
2022-12-26 01:23:35,053 INFO success: dependent-startup entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-12-26 01:23:35,053 INFO success: supervisor-proc-exit-listener entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-12-26 01:23:35,061 INFO spawned: 'rsyslogd' with pid 12
2022-12-26 01:23:36,108 INFO success: rsyslogd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-12-26 01:23:37,123 INFO spawned: 'start' with pid 16
2022-12-26 01:23:37,125 INFO success: start entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2022-12-26 01:23:37,133 INFO spawned: 'containercfgd' with pid 17
2022-12-26 01:23:37,140 INFO exited: start (exit status 0; expected)
2022-12-26 01:23:37,152 INFO spawned: 'start' with pid 19
2022-12-26 01:23:37,154 INFO success: start entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2022-12-26 01:23:37,167 INFO exited: start (exit status 0; expected)
2022-12-26 01:23:37,196 INFO spawned: 'telemetry' with pid 21
2022-12-26 01:23:38,135 INFO success: containercfgd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-12-26 01:23:38,195 INFO success: telemetry entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-12-26 01:23:38,211 INFO exited: telemetry (exit status 1; not expected)
2022-12-26 01:23:38,217 WARN received SIGTERM indicating exit request
2022-12-26 01:23:38,218 INFO waiting for dependent-startup, supervisor-proc-exit-listener, rsyslogd, containercfgd to die
2022-12-26 01:23:38,230 INFO stopped: containercfgd (exit status 143)
2022-12-26 01:23:38,243 INFO exited: dependent-startup (exit status 3; expected)
2022-12-26 01:23:39,263 INFO stopped: rsyslogd (exit status 0)
2022-12-26 01:23:39,266 INFO stopped: supervisor-proc-exit-listener (terminated by SIGTERM)
/usr/local/lib/python3.9/dist-packages/supervisor/options.py:473: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (including its current working directory); you probably want to specify a "-c" argument specifying an absolute path to a configuration file for improved security.
  self.warnings.warn(
2022-12-26 01:30:04,765 INFO Included extra file "/etc/supervisor/conf.d/containercfgd.conf" during parsing
2022-12-26 01:30:04,766 INFO Included extra file "/etc/supervisor/conf.d/supervisord.conf" during parsing
2022-12-26 01:30:04,767 INFO Set uid to user 0 succeeded
2022-12-26 01:30:04,772 INFO RPC interface 'supervisor' initialized
2022-12-26 01:30:04,773 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2022-12-26 01:30:04,774 INFO supervisord started with pid 1
2022-12-26 01:30:05,778 INFO spawned: 'dependent-startup' with pid 8
2022-12-26 01:30:05,781 INFO spawned: 'supervisor-proc-exit-listener' with pid 9
2022-12-26 01:30:06,996 INFO success: dependent-startup entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-12-26 01:30:06,997 INFO success: supervisor-proc-exit-listener entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-12-26 01:30:07,004 INFO spawned: 'rsyslogd' with pid 12
2022-12-26 01:30:08,054 INFO success: rsyslogd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-12-26 01:30:09,071 INFO spawned: 'start' with pid 16
2022-12-26 01:30:09,072 INFO success: start entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2022-12-26 01:30:09,082 INFO spawned: 'containercfgd' with pid 17
2022-12-26 01:30:09,094 INFO exited: start (exit status 0; expected)
2022-12-26 01:30:09,149 INFO spawned: 'telemetry' with pid 19
2022-12-26 01:30:10,119 INFO success: containercfgd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-12-26 01:30:10,151 INFO success: telemetry entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-12-26 01:30:10,523 INFO spawned: 'dialout' with pid 58
2022-12-26 01:30:10,566 INFO exited: telemetry (exit status 1; not expected)
2022-12-26 01:30:10,605 WARN received SIGTERM indicating exit request
2022-12-26 01:30:10,605 INFO waiting for dependent-startup, supervisor-proc-exit-listener, rsyslogd, dialout, containercfgd to die
2022-12-26 01:30:10,624 INFO stopped: containercfgd (exit status 143)
2022-12-26 01:30:10,627 INFO stopped: dialout (terminated by SIGTERM)
2022-12-26 01:30:10,640 INFO exited: dependent-startup (exit status 3; expected)
2022-12-26 01:30:10,644 WARN received SIGTERM indicating exit request
2022-12-26 01:30:10,659 INFO stopped: rsyslogd (exit status 0)
2022-12-26 01:30:10,662 INFO stopped: supervisor-proc-exit-listener (terminated by SIGTERM)
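
For anyone triaging similar restart loops, here is a minimal Python sketch that pulls the unexpected exits out of a supervisord log. It assumes the exact line format shown in the log above; `unexpected_exits` and `SAMPLE` are hypothetical names used for illustration and are not part of SONiC or supervisord. It only locates the failing process and does not explain why `telemetry` returns exit status 1.

```python
import re
from collections import Counter

# Matches supervisord lines such as:
#   2022-12-26 01:22:20,273 INFO exited: telemetry (exit status 1; not expected)
# The format is an assumption taken from the log pasted in this issue.
EXIT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) INFO exited: "
    r"(?P<proc>\S+) \(exit status (?P<status>\d+); "
    r"(?P<verdict>not expected|expected)\)"
)


def unexpected_exits(log_text):
    """Return (timestamp, process, exit_status) for every unexpected exit."""
    hits = []
    for line in log_text.splitlines():
        match = EXIT_RE.match(line.strip())
        if match and match.group("verdict") == "not expected":
            hits.append(
                (match.group("ts"), match.group("proc"), int(match.group("status")))
            )
    return hits


# A few lines copied from the log above, used as sample input.
SAMPLE = """\
2022-12-26 01:22:19,907 INFO exited: dependent-startup (exit status 0; expected)
2022-12-26 01:22:20,273 INFO exited: telemetry (exit status 1; not expected)
2022-12-26 01:23:01,138 INFO exited: telemetry (exit status 1; not expected)
2022-12-26 01:23:38,243 INFO exited: dependent-startup (exit status 3; expected)
"""

if __name__ == "__main__":
    hits = unexpected_exits(SAMPLE)
    for ts, proc, status in hits:
        print(ts, proc, status)
    # Count unexpected exits per process to see which one is failing.
    print(Counter(proc for _, proc, _ in hits))
```

To run it against a live container, the output of `docker logs telemetry` could be saved to a file and its contents passed to `unexpected_exits`. In the four runs pasted above, the only unexpected exit is `telemetry` with exit status 1, and each one is followed within about a second by the SIGTERM that shuts down the rest of the container.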

Describe the results you expected:

Output of show version:

admin@sonic:~$ show version

SONiC Software Version: SONiC.master.193799-948ce3fe0
Distribution: Debian 11.6
Kernel: 5.10.0-18-2-amd64
Build commit: 948ce3fe0
Build date: Sun Dec 25 17:17:39 UTC 2022
Built by: AzDevOps@vmss-soni00075O

Platform: x86_64-kvm_x86_64-r0
HwSKU: Force10-S6000
ASIC: vs
ASIC Count: 1
Serial Number: N/A
Model Number: N/A
Hardware Revision: N/A
Uptime: 01:35:56 up 17 min,  3 users,  load average: 0.09, 0.15, 0.28
Date: Mon 26 Dec 2022 01:35:56

Docker images:
REPOSITORY                    TAG                       IMAGE ID       SIZE
docker-orchagent              latest                    b9bcaec8e8f0   385MB
docker-orchagent              master.193799-948ce3fe0   b9bcaec8e8f0   385MB
docker-fpm-frr                latest                    705e6eea01f3   402MB
docker-fpm-frr                master.193799-948ce3fe0   705e6eea01f3   402MB
docker-teamd                  latest                    eb8e744f0cb6   373MB
docker-teamd                  master.193799-948ce3fe0   eb8e744f0cb6   373MB
docker-macsec                 latest                    4b7f6eb66e5d   375MB
docker-dhcp-relay             latest                    2029acaf4a52   366MB
docker-eventd                 latest                    a9cf7630ed43   356MB
docker-eventd                 master.193799-948ce3fe0   a9cf7630ed43   356MB
docker-gbsyncd-vs             latest                    00733ec1c558   365MB
docker-gbsyncd-vs             master.193799-948ce3fe0   00733ec1c558   365MB
docker-sonic-p4rt             latest                    a89c30cb918b   927MB
docker-sonic-p4rt             master.193799-948ce3fe0   a89c30cb918b   927MB
docker-snmp                   latest                    628a94415933   396MB
docker-snmp                   master.193799-948ce3fe0   628a94415933   396MB
docker-platform-monitor       latest                    83a3e797848d   477MB
docker-platform-monitor       master.193799-948ce3fe0   83a3e797848d   477MB
docker-database               latest                    1001cbc516da   356MB
docker-database               master.193799-948ce3fe0   1001cbc516da   356MB
docker-sonic-telemetry        latest                    720f638a6692   655MB
docker-sonic-telemetry        master.193799-948ce3fe0   720f638a6692   655MB
docker-router-advertiser      latest                    016fa806c42c   356MB
docker-router-advertiser      master.193799-948ce3fe0   016fa806c42c   356MB
docker-mux                    latest                    975259242080   405MB
docker-mux                    master.193799-948ce3fe0   975259242080   405MB
docker-lldp                   latest                    4832ebfb3ef7   398MB
docker-lldp                   master.193799-948ce3fe0   4832ebfb3ef7   398MB
docker-nat                    latest                    f482dc0084d7   350MB
docker-nat                    master.193799-948ce3fe0   f482dc0084d7   350MB
docker-sflow                  latest                    353abc9a2e47   348MB
docker-sflow                  master.193799-948ce3fe0   353abc9a2e47   348MB
docker-sonic-mgmt-framework   latest                    747fe54b0227   476MB
docker-sonic-mgmt-framework   master.193799-948ce3fe0   747fe54b0227   476MB
docker-syncd-vs               latest                    539c9b27e6ca   345MB
docker-syncd-vs               master.193799-948ce3fe0   539c9b27e6ca   345MB

Output of show techsupport:


Additional information you deem important (e.g. issue happens only occasionally):

akarneliuk commented 1 year ago

I'm having exactly the same issue with a newer build:

admin@sonic:~$ show version

SONiC Software Version: SONiC.master.196462-38c5d7fce
Distribution: Debian 11.6
Kernel: 5.10.0-18-2-amd64
Build commit: 38c5d7fce
Build date: Sat Dec 31 17:45:02 UTC 2022
Built by: AzDevOps@vmss-soni000897

Platform: x86_64-kvm_x86_64-r0
HwSKU: Force10-S6000
ASIC: vs
ASIC Count: 1
Serial Number: N/A
Model Number: N/A
Hardware Revision: N/A
Uptime: 10:39:02 up 8 min,  1 user,  load average: 0.40, 0.57, 0.39
Date: Mon 02 Jan 2023 10:39:02

Docker images:
REPOSITORY                    TAG                       IMAGE ID       SIZE
docker-orchagent              latest                    38d3e1af6a37   385MB
docker-orchagent              master.196462-38c5d7fce   38d3e1af6a37   385MB
docker-fpm-frr                latest                    64241521adf8   402MB
docker-fpm-frr                master.196462-38c5d7fce   64241521adf8   402MB
docker-teamd                  latest                    24e661668d90   373MB
docker-teamd                  master.196462-38c5d7fce   24e661668d90   373MB
docker-macsec                 latest                    8faa5031262d   375MB
docker-dhcp-relay             latest                    9aad1b431469   366MB
docker-eventd                 latest                    30c7c8ed9f8c   356MB
docker-eventd                 master.196462-38c5d7fce   30c7c8ed9f8c   356MB
docker-gbsyncd-vs             latest                    6c2bdaf7a0ab   365MB
docker-gbsyncd-vs             master.196462-38c5d7fce   6c2bdaf7a0ab   365MB
docker-snmp                   latest                    c43a8bd76af4   396MB
docker-snmp                   master.196462-38c5d7fce   c43a8bd76af4   396MB
docker-sonic-p4rt             latest                    493f1c8f75bd   927MB
docker-sonic-p4rt             master.196462-38c5d7fce   493f1c8f75bd   927MB
docker-platform-monitor       latest                    c7c86a7cdfe4   477MB
docker-platform-monitor       master.196462-38c5d7fce   c7c86a7cdfe4   477MB
docker-database               latest                    de2b4884ffaa   356MB
docker-database               master.196462-38c5d7fce   de2b4884ffaa   356MB
docker-sonic-telemetry        latest                    520d13de4503   655MB
docker-sonic-telemetry        master.196462-38c5d7fce   520d13de4503   655MB
docker-router-advertiser      latest                    51f97badd14c   356MB
docker-router-advertiser      master.196462-38c5d7fce   51f97badd14c   356MB
docker-mux                    latest                    562d2d94a9e0   405MB
docker-mux                    master.196462-38c5d7fce   562d2d94a9e0   405MB
docker-lldp                   latest                    e1c574d618dc   398MB
docker-lldp                   master.196462-38c5d7fce   e1c574d618dc   398MB
docker-nat                    latest                    b442df6a3bbf   350MB
docker-nat                    master.196462-38c5d7fce   b442df6a3bbf   350MB
docker-sflow                  latest                    de1aff2548bc   348MB
docker-sflow                  master.196462-38c5d7fce   de1aff2548bc   348MB
docker-sonic-mgmt-framework   latest                    013b84c5885a   476MB
docker-sonic-mgmt-framework   master.196462-38c5d7fce   013b84c5885a   476MB
docker-syncd-vs               latest                    319ee112eaa2   345MB
docker-syncd-vs               master.196462-38c5d7fce   319ee112eaa2   345MB

show techsupport

gechiang commented 1 year ago

@qiluo-msft please help assign someone to investigate this. Thanks!

akarneliuk commented 1 year ago

Figured it out. This issue comes up regularly. I described the solution here: https://karneliuk.com/2023/01/automation-19-enabling-ocp-sonic-to-be-managed-via-gnmi-with-pygnmi/