sonic-net / sonic-platform-daemons

Platform module daemons for SONiC

Bug fix: Sensormond crashes on elapsed time limit warning log #434

Closed gregoryboudreau closed 9 months ago

gregoryboudreau commented 9 months ago

Description

Changes the elapsed-time warning in the sensormond daemon so that log_warning is called directly on the daemon object instead of through self.logger. The daemon object has no logger attribute, so the old call raised an AttributeError and crashed the daemon.
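
The failure mode and the fix can be illustrated with a minimal, self-contained sketch. This is not the sensormond source: `DaemonBase`, `BrokenDaemon`, `FixedDaemon`, `UPDATE_ELAPSED_THRESHOLD` and `demo` are hypothetical stand-ins, and the only assumption carried over from the PR is that `log_warning` is available on the daemon object itself while no `logger` attribute exists.

```python
class DaemonBase:
    """Hypothetical stand-in for the daemon base class.

    Assumption: logging helpers such as log_warning live on the object itself.
    """

    def __init__(self):
        self.messages = []

    def log_warning(self, msg):
        # A real daemon would write to syslog; this sketch just records the message.
        self.messages.append(("WARNING", msg))


class BrokenDaemon(DaemonBase):
    UPDATE_ELAPSED_THRESHOLD = 1.0  # hypothetical limit, in seconds

    def run(self, elapsed):
        if elapsed > self.UPDATE_ELAPSED_THRESHOLD:
            # Bug: the object has no 'logger' attribute, so this raises AttributeError.
            self.logger.log_warning(
                'Sensors update took a long time : {} seconds'.format(elapsed))
        return True


class FixedDaemon(DaemonBase):
    UPDATE_ELAPSED_THRESHOLD = 1.0  # hypothetical limit, in seconds

    def run(self, elapsed):
        if elapsed > self.UPDATE_ELAPSED_THRESHOLD:
            # Fix: call log_warning directly on the daemon object.
            self.log_warning(
                'Sensors update took a long time : {} seconds'.format(elapsed))
        return True


def demo():
    """Run both variants with an elapsed time above the threshold."""
    try:
        BrokenDaemon().run(1.5)
        broken_error = None
    except AttributeError as exc:
        broken_error = str(exc)

    fixed = FixedDaemon()
    fixed.run(1.5)
    return broken_error, fixed.messages
```

Note that the broken call sits on a branch that only runs when the update is slow, which is presumably why the missing attribute went unnoticed until the warning path was actually exercised.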

Motivation and Context

In its current form, whenever the elapsed-time warning is triggered, the sensormond daemon hits an error, crashes, and is restarted. This repeats in an infinite loop.

How Has This Been Tested?

This change has been tested on Cisco 8201-32fh and 8111-32eh, where the sensormond process stays up and the warning is written to syslog correctly.

Before change:

Jul 17 09:31:24.572615 sonic INFO pmon#supervisord: sensormond Traceback (most recent call last):
Jul 17 09:31:24.572615 sonic INFO pmon#supervisord: sensormond   File "/usr/local/bin/sensormond", line 529, in <module>
Jul 17 09:31:24.572884 sonic INFO pmon#supervisord: sensormond     sys.exit(main())
Jul 17 09:31:24.572884 sonic INFO pmon#supervisord: sensormond   File "/usr/local/bin/sensormond", line 520, in main
Jul 17 09:31:24.573154 sonic INFO pmon#supervisord: sensormond     while sensor_control.run():
Jul 17 09:31:24.573154 sonic INFO pmon#supervisord: sensormond   File "/usr/local/bin/sensormond", line 507, in run
Jul 17 09:31:24.573395 sonic INFO pmon#supervisord: sensormond     self.logger.log_warning('Sensors update took a long time : '
Jul 17 09:31:24.573395 sonic INFO pmon#supervisord: sensormond AttributeError: 'SensorMonitorDaemon' object has no attribute 'logger'
Jul 17 09:31:24.670676 sonic INFO pmon#supervisord 2023-07-17 09:31:24,670 INFO exited: sensormond (exit status 1; not expected)
Jul 17 09:31:25.674833 sonic INFO pmon#supervisord 2023-07-17 09:31:25,673 INFO spawned: 'sensormond' with pid 3463

After change:

Jul 16 16:35:16.115889 sonic WARNING pmon#sensormond: Sensors update took a long time : 115.69240474700928 seconds

gregoryboudreau commented 9 months ago

Addressed as part of: https://github.com/sonic-net/sonic-platform-daemons/pull/438