sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC

S6100-T1-'telemetry' process is not running #4076

Open mini-nair-dell opened 4 years ago

mini-nair-dell commented 4 years ago

The following logs appear on the switch while running the regression tests on the S6100-T1-lag-64 topology:

"Jan 27 21:14:25.500634 sonic-s6100-10 ERR monit[536]: 'telemetry' process is not running",
"Jan 27 21:14:25.522567 sonic-s6100-10 ERR monit[536]: 'dialout_client' process is not running",
"Jan 27 21:14:25.544090 sonic-s6100-10 ERR monit[536]: 'sflowmgrd' process is not running",
"Jan 27 21:15:25.612909 sonic-s6100-10 ERR monit[536]: 'telemetry' process is not running",
"Jan 27 21:15:25.635344 sonic-s6100-10 ERR monit[536]: 'dialout_client' process is not running",
"Jan 27 21:15:25.657245 sonic-s6100-10 ERR monit[536]: 'sflowmgrd' process is not running",
"Jan 27 21:16:25.700901 sonic-s6100-10 ERR monit[536]: 'telemetry' process is not running",
"Jan 27 21:16:25.723837 sonic-s6100-10 ERR monit[536]: 'dialout_client' process is not running",
"Jan 27 21:16:25.746280 sonic-s6100-10 ERR monit[536]: 'sflowmgrd' process is not running",

This issue has been seen since build 172. In build 168, we were seeing issue #3986.

The syslogs are attached.

Thanks, Mini

xinliu-seattle commented 4 years ago

@padmanarayana to take a look at the sflow part. @Pradiya to take a look at the telemetry part.

mini-nair-dell commented 4 years ago

We don't see the log message for sflow. Here is the output:

root@sonic-s6100-10:/var/log# cat syslog | grep "sflowmgrd"
Apr 15 10:56:33.510557 sonic-s6100-10 NOTICE sflow#sflowmgrd: :- main: --- Starting sflowmgrd ---
Apr 15 10:56:33.519021 sonic-s6100-10 NOTICE sflow#sflowmgrd: :- loadRedisScript: lua script loaded, sha: 88270a7c5c90583e56425aca8af8a4b8c39fe757
Apr 15 10:56:33.967162 sonic-s6100-10 INFO sflow#supervisord: start.sh sflowmgrd: started
Apr 15 10:56:40.357940 sonic-s6100-10 INFO sflow#supervisord 2020-04-15 10:56:32,848 INFO spawned: 'sflowmgrd' with pid 19
Apr 15 10:56:40.357940 sonic-s6100-10 INFO sflow#supervisord 2020-04-15 10:56:33,960 INFO success: sflowmgrd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
Apr 15 10:57:55.636954 sonic-s6100-10 INFO sflow#supervisord 2020-04-15 10:56:51,978 INFO waiting for sflowmgrd, rsyslogd, supervisor-proc-exit-listener to die
Apr 15 10:57:55.636978 sonic-s6100-10 INFO sflow#supervisord 2020-04-15 10:56:51,978 INFO stopped: sflowmgrd (terminated by SIGTERM)
Apr 15 10:57:57.392254 sonic-s6100-10 NOTICE sflow#sflowmgrd: :- main: --- Starting sflowmgrd ---
Apr 15 10:57:57.392485 sonic-s6100-10 NOTICE sflow#sflowmgrd: :- loadRedisScript: lua script loaded, sha: 88270a7c5c90583e56425aca8af8a4b8c39fe757
Apr 15 10:57:58.389432 sonic-s6100-10 INFO sflow#supervisord: start.sh sflowmgrd: started
Apr 15 10:58:05.645852 sonic-s6100-10 INFO sflow#supervisord 2020-04-15 10:57:57,374 INFO spawned: 'sflowmgrd' with pid 18
Apr 15 10:58:05.645852 sonic-s6100-10 INFO sflow#supervisord 2020-04-15 10:57:58,377 INFO success: sflowmgrd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)

root@sonic-s6100-10:/var/log# config sflow enable
Created symlink /etc/systemd/system/multi-user.target.wants/sflow.service → /etc/systemd/system/sflow.service.
root@sonic-s6100-10:/var/log# cat syslog | grep "process is not running"
root@sonic-s6100-10:/var/log# cat syslog | grep "process is not running"
root@sonic-s6100-10:/var/log# cat syslog | grep "process is not running"
root@sonic-s6100-10:/var/log# cat syslog | grep "process is not running"
Apr 15 11:00:49.613490 sonic-s6100-10 ERR monit[514]: 'telemetry' process is not running
root@sonic-s6100-10:/var/log# cat syslog | grep "process is not running"
Apr 15 11:00:49.613490 sonic-s6100-10 ERR monit[514]: 'telemetry' process is not running
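For anyone triaging similar reports, the repeated `grep "process is not running"` checks can be condensed into a per-process tally. The following is only a rough sketch: the function name `count_not_running` and the regular expression are assumptions based on the shape of the monit lines quoted in this thread, not part of any SONiC tooling, and the sample lines are copied from the logs above.

```python
import re
from collections import Counter

# Assumed line shape, taken from the monit messages quoted in this issue:
#   <timestamp> <host> ERR monit[<pid>]: '<process>' process is not running
MONIT_RE = re.compile(r"monit\[\d+\]: '([^']+)' process is not running")


def count_not_running(lines):
    """Return a Counter mapping process name to the number of monit alerts."""
    counts = Counter()
    for line in lines:
        match = MONIT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample lines copied from the logs in this thread.
sample = [
    "Jan 27 21:14:25.500634 sonic-s6100-10 ERR monit[536]: 'telemetry' process is not running",
    "Jan 27 21:14:25.522567 sonic-s6100-10 ERR monit[536]: 'dialout_client' process is not running",
    "Jan 27 21:14:25.544090 sonic-s6100-10 ERR monit[536]: 'sflowmgrd' process is not running",
    "Apr 15 11:00:49.613490 sonic-s6100-10 ERR monit[514]: 'telemetry' process is not running",
    "Apr 15 10:56:33.967162 sonic-s6100-10 INFO sflow#supervisord: start.sh sflowmgrd: started",
]

for name, count in count_not_running(sample).most_common():
    print(name, count)
```

To run it against a real log, pass an open file object (for example the syslog file on the switch) instead of `sample`; the function accepts any iterable of lines.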

root@sonic-s6100-10:/var/log# cat syslog | grep "sflowmgrd"
Apr 15 11:00:12.123202 sonic-s6100-10 NOTICE sflow#sflowmgrd: :- sflowHandleService: Starting hsflowd service
root@sonic-s6100-10:/var/log# cat syslog | grep "sflowmgrd"
Apr 15 11:00:12.123202 sonic-s6100-10 NOTICE sflow#sflowmgrd: :- sflowHandleService: Starting hsflowd service
root@sonic-s6100-10:/var/log# show ip bgp summ

IPv4 Unicast Summary:
BGP router identifier 10.1.0.32, local AS number 65100 vrf-id 0
BGP table version 12847
RIB entries 13051, using 2345 KiB of memory
Peers 24, using 490 KiB of memory
Peer groups 2, using 128 bytes of memory

Neighbor    V     AS  MsgRcvd  MsgSent  TblVer  InQ  OutQ   Up/Down  State/PfxRcd  NeighborName
10.0.0.1    4  65200     3250     3294       0    0     0  00:02:13          6402  ARISTA01T2
10.0.0.5    4  65200     3249     3293       0    0     0  00:02:12          6402  ARISTA03T2
10.0.0.9    4  65200     3248     3293       0    0     0  00:02:10          6402  ARISTA05T2
10.0.0.13   4  65200     3250     3297       0    0     0  00:02:13          6402  ARISTA07T2
10.0.0.33   4  64001       49     3293       0    0     0  00:02:10             6  ARISTA01T0
10.0.0.35   4  64002       50     3294       0    0     0  00:02:13             6  ARISTA02T0
10.0.0.37   4  64003       51     3294       0    0     0  00:02:13             7  ARISTA03T0
10.0.0.39   4  64004       50     3294       0    0     0  00:02:13             6  ARISTA04T0
10.0.0.41   4  64005       51     3294       0    0     0  00:02:13             7  ARISTA05T0
10.0.0.43   4  64006       50     3294       0    0     0  00:02:13             6  ARISTA06T0
10.0.0.45   4  64007       50     3294       0    0     0  00:02:13             6  ARISTA07T0
10.0.0.47   4  64008       50     3294       0    0     0  00:02:13             6  ARISTA08T0
10.0.0.49   4  64009       49     3293       0    0     0  00:02:10             6  ARISTA09T0
10.0.0.51   4  64010       49     3293       0    0     0  00:02:10             6  ARISTA10T0
10.0.0.53   4  64011       50     3294       0    0     0  00:02:13             6  ARISTA11T0
10.0.0.55   4  64012       49     3293       0    0     0  00:02:10             6  ARISTA12T0
10.0.0.57   4  64013       49     3293       0    0     0  00:02:10             6  ARISTA13T0
10.0.0.59   4  64014       50     3294       0    0     0  00:02:13             6  ARISTA14T0
10.0.0.61   4  64015       50     3294       0    0     0  00:02:13             6  ARISTA15T0
10.0.0.63   4  64016       50     3294       0    0     0  00:02:13             6  ARISTA16T0
10.0.0.65   4  64017       49     3293       0    0     0  00:02:10             6  ARISTA17T0
10.0.0.67   4  64018       50     3294       0    0     0  00:02:13             6  ARISTA18T0
10.0.0.69   4  64019       49     3293       0    0     0  00:02:10             6  ARISTA19T0
10.0.0.71   4  64020       50     3294       0    0     0  00:02:13             6  ARISTA20T0

Total number of neighbors 24
root@sonic-s6100-10:/var/log# show ver

SONiC Software Version: SONiC.HEAD.253-2872d802
Distribution: Debian 9.12
Kernel: 4.9.0-11-2-amd64
Build commit: 2872d802
Build date: Tue Apr 14 09:17:49 UTC 2020
Built by: johnar@jenkins-worker-8

Platform: x86_64-dell_s6100_c2538-r0
HwSKU: Force10-S6100-T1
ASIC: broadcom
Serial Number: FXGSG02
Uptime: 11:00:42 up 4 min, 1 user, load average: 2.77, 2.65, 1.24

Docker images:
REPOSITORY                   TAG                IMAGE ID      SIZE
docker-syncd-brcm            HEAD.253-2872d802  5d55b19fa1e6  437MB
docker-syncd-brcm            latest             5d55b19fa1e6  437MB
docker-sonic-mgmt-framework  HEAD.253-2872d802  96fecadbb456  428MB
docker-sonic-mgmt-framework  latest             96fecadbb456  428MB
docker-router-advertiser     HEAD.253-2872d802  ea8439a76c74  289MB
docker-router-advertiser     latest             ea8439a76c74  289MB
docker-platform-monitor      HEAD.253-2872d802  a7ccd7a90189  336MB
docker-platform-monitor      latest             a7ccd7a90189  336MB
docker-lldp-sv2              HEAD.253-2872d802  ba1bdd631cfd  306MB
docker-lldp-sv2              latest             ba1bdd631cfd  306MB
docker-dhcp-relay            HEAD.253-2872d802  c1b5f3423c9f  299MB
docker-dhcp-relay            latest             c1b5f3423c9f  299MB
docker-database              HEAD.253-2872d802  5f0d473b140c  289MB
docker-database              latest             5f0d473b140c  289MB
docker-orchagent             HEAD.253-2872d802  71fbc3a0d895  328MB
docker-orchagent             latest             71fbc3a0d895  328MB
docker-nat                   HEAD.253-2872d802  489b4c166fcc  309MB
docker-nat                   latest             489b4c166fcc  309MB
docker-sonic-telemetry       HEAD.253-2872d802  5905b3b6ba14  348MB
docker-sonic-telemetry       latest             5905b3b6ba14  348MB
docker-fpm-frr               HEAD.253-2872d802  171dc5ebe1db  328MB
docker-fpm-frr               latest             171dc5ebe1db  328MB
docker-sflow                 HEAD.253-2872d802  94f4ed150557  309MB
docker-sflow                 latest             94f4ed150557  309MB
docker-iccpd                 HEAD.253-2872d802  34a47e64d087  309MB
docker-iccpd                 latest             34a47e64d087  309MB
docker-snmp-sv2              HEAD.253-2872d802  2357c4e16738  345MB
docker-snmp-sv2              latest             2357c4e16738  345MB
docker-teamd                 HEAD.253-2872d802  b63c1a64547d  308MB
docker-teamd                 latest             b63c1a64547d  308MB

Thanks