sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC

[Telemetry]Sometimes there are no logs from telemetry container after boot #17172

Open dgsudharsan opened 10 months ago

dgsudharsan commented 10 months ago

Description

This has been seen a few times when running test_critical_process_monitoring: after the telemetry process is killed, there are no logs from the proc exit listener.

Steps to reproduce the issue:

  1. Kill the telemetry process in the telemetry container.
  2. Expect a log such as: Nov 9 22:44:40.002236 sonic ERR telemetry#supervisor-proc-exit-listener: Process 'telemetry' is not running in namespace 'host' (1.0 minutes).
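
A minimal sketch of how the check in step 2 could be automated, assuming the syslog lines are already available as an iterable of strings (for example, read from /var/log/syslog, which is an assumed location). The regex is derived only from the sample line quoted above; the function name `find_exit_alerts` is hypothetical and not part of the SONiC test suite.

```python
import re

# Pattern built from the sample alert quoted in step 2. The container prefix
# ("telemetry#") and the process name come from that sample line.
ALERT_RE = re.compile(
    r"(?P<container>[\w-]+)#supervisor-proc-exit-listener: "
    r"Process '(?P<process>[^']+)' is not running in namespace "
    r"'(?P<namespace>[^']+)' \((?P<minutes>[\d.]+) minutes\)"
)


def find_exit_alerts(lines, container="telemetry", process="telemetry"):
    """Return the log lines that report `process` in `container` as not running."""
    hits = []
    for line in lines:
        match = ALERT_RE.search(line)
        if match and match["container"] == container and match["process"] == process:
            hits.append(line.rstrip("\n"))
    return hits
```

An empty result after killing the process and waiting longer than the alert interval would correspond to the failure described in this issue.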

Describe the results you received:

No log is seen. Not even the "terminated by SIGKILL; not expected" message is seen.

This issue is the same as https://github.com/sonic-net/sonic-buildimage/issues/15152 but seen on the telemetry container.

Describe the results you expected:

See the above logs.

Output of show version:

(paste your output here)

Output of show techsupport:

SONiC Software Version: SONiC.202305_RC.21-cbe2c014b_Internal
SONiC OS Version: 11
Distribution: Debian 11.8
Kernel: 5.10.0-23-2-amd64
Build commit: de69d986c
Build date: Thu Nov  9 14:06:00 UTC 2023
Built by: sw-r2d2-bot@r-build-sonic-ci03-242

Platform: x86_64-mlnx_msn2010-r0
HwSKU: ACS-MSN2010
ASIC: mellanox
ASIC Count: 1
Serial Number: MT1749X10061
Model Number: MSN2010-CB2F
Hardware Revision: A1
Uptime: 22:52:30 up  2:03,  2 users,  load average: 1.45, 2.68, 3.34
Date: Thu 09 Nov 2023 22:52:30

Docker images:
REPOSITORY                                         TAG                               IMAGE ID       SIZE
docker-syncd-mlnx                                  202305_RC.21-cbe2c014b_Internal   306ff884607c   836MB
docker-syncd-mlnx                                  latest                            306ff884607c   836MB
docker-platform-monitor                            202305_RC.21-cbe2c014b_Internal   9f4aecb32eb6   828MB
docker-platform-monitor                            latest                            9f4aecb32eb6   828MB
urm.nvidia.com/sw-nbu-sws-sonic-docker/sonic-wjh   1.6.0-202305-7                    0f4dca74c945   433MB
docker-orchagent                                   202305_RC.21-cbe2c014b_Internal   b85f8e9d2567   328MB
docker-orchagent                                   latest                            b85f8e9d2567   328MB
docker-fpm-frr                                     202305_RC.21-cbe2c014b_Internal   5c43c09c8ee9   348MB
docker-fpm-frr                                     latest                            5c43c09c8ee9   348MB
docker-nat                                         202305_RC.21-cbe2c014b_Internal   711cc90a809b   320MB
docker-nat                                         latest                            711cc90a809b   320MB
docker-sflow                                       202305_RC.21-cbe2c014b_Internal   892f60ec53a8   318MB
docker-sflow                                       latest                            892f60ec53a8   318MB
docker-teamd                                       202305_RC.21-cbe2c014b_Internal   ea8cc2c42c33   317MB
docker-teamd                                       latest                            ea8cc2c42c33   317MB
docker-macsec                                      latest                            97c101398305   319MB
docker-dhcp-relay                                  latest                            ce1df3511f73   307MB
docker-eventd                                      202305_RC.21-cbe2c014b_Internal   81fb3bce71f9   299MB
docker-eventd                                      latest                            81fb3bce71f9   299MB
docker-sonic-telemetry                             202305_RC.21-cbe2c014b_Internal   faa54c27d516   386MB
docker-sonic-telemetry                             latest                            faa54c27d516   386MB
docker-snmp                                        202305_RC.21-cbe2c014b_Internal   7911150e4f0a   338MB
docker-snmp                                        latest                            7911150e4f0a   338MB
docker-lldp                                        202305_RC.21-cbe2c014b_Internal   8682586a3df2   341MB
docker-lldp                                        latest                            8682586a3df2   341MB
docker-database                                    202305_RC.21-cbe2c014b_Internal   41fd7b88a05b   299MB
docker-database                                    latest                            41fd7b88a05b   299MB
docker-router-advertiser                           202305_RC.21-cbe2c014b_Internal   f3509bfdcdec   299MB
docker-router-advertiser                           latest                            f3509bfdcdec   299MB
docker-mux                                         202305_RC.21-cbe2c014b_Internal   654d73279524   348MB
docker-mux                                         latest                            654d73279524   348MB
docker-sonic-mgmt-framework                        202305_RC.21-cbe2c014b_Internal   ec4f523b0fea   415MB
docker-sonic-mgmt-framework                        latest                            ec4f523b0fea   415MB

Additional information you deem important (e.g. issue happens only occasionally):

sysdump_sonic_dump_r-boxer-sw01_20231109_225158.tar.gz test_critical_process_logs.txt

arlakshm commented 9 months ago

https://github.com/sonic-net/sonic-utilities/pull/3039 will enable us to get the rsyslog.conf when this issue is seen.

If this issue is seen again, it can be updated with findings about the contents of rsyslog.conf in the container.
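
Until that change is available, the file could be captured by hand when the problem reproduces. The following is a rough sketch, assuming the container is named `telemetry` and the file lives at `/etc/rsyslog.conf` inside it (both are assumptions, not confirmed in this thread). The helper names are hypothetical. Whether an empty or incomplete rsyslog.conf is actually the cause is not established here; the summary is only a quick indicator to attach to the issue.

```python
import subprocess


def read_container_file(container, path, runner=subprocess.run):
    """Return the text of `path` inside `container`, read via `docker exec ... cat`."""
    result = runner(
        ["docker", "exec", container, "cat", path],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"could not read {path} in {container}: {result.stderr.strip()}"
        )
    return result.stdout


def summarize_conf(text):
    """Count total lines and active (non-blank, non-comment) lines in a config body."""
    lines = text.splitlines()
    active = [ln for ln in lines if ln.strip() and not ln.lstrip().startswith("#")]
    return {"total_lines": len(lines), "active_lines": len(active)}
```

For example, `summarize_conf(read_container_file("telemetry", "/etc/rsyslog.conf"))` returning zero active lines would be worth reporting alongside the raw file.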

keboliu commented 6 months ago

Hi @qiluo-msft, do you have a plan to fix this issue?