sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC

[ntpsec] The restart of ntpsec service on config reload causes dhclient ntpsec exit hooks to fail #19905

Open gpunathilell opened 2 months ago

gpunathilell commented 2 months ago

Description

The log error shown below appears because of an interaction between ntpsec and dhclient at specific points during config reload or reboot. During reload/reboot the ntpsec service is restarted. Separately, the dhclient exit hooks restart the ntpsec service whenever the set of NTP servers changes (a server is added or removed).

ERR root: /etc/dhcp/dhclient-exit-hooks.d/ntpsec returned non-zero exit status 1
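The wording of the error suggests it is emitted by whatever runs the dhclient exit hooks when a hook finishes with a non-zero status. The sketch below illustrates that mechanism only. It assumes the runner sources each hook and logs on failure; `run_hook_sketch` and the demo hook are hypothetical names, and the real runner logs through syslog rather than `echo`.

```shell
#!/bin/bash
# Hypothetical stand-in for a hook runner; not the actual dhclient-script code.
run_hook_sketch() {
    local script="$1"
    local exit_status=0

    if [ -f "$script" ]; then
        # Source the hook in a subshell so a failing hook cannot terminate the caller.
        ( . "$script" ) || exit_status=$?
    fi

    if [ "$exit_status" -ne 0 ]; then
        # Assumption: the real runner writes this via syslog; echo keeps the sketch self-contained.
        echo "ERR root: $script returned non-zero exit status $exit_status"
    fi
    return "$exit_status"
}

# Hypothetical hook whose last command fails, standing in for a service
# restart that is interrupted by a concurrent stop of the same service.
demo_hook="$(mktemp)"
echo 'false' > "$demo_hook"
demo_status=0
run_hook_sketch "$demo_hook" || demo_status=$?
rm -f "$demo_hook"
```

Under these assumptions, any failing last command in the hook (for example a restart request that loses a race with a stop) is enough to produce one error line per dhclient event.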

Steps to reproduce the issue:

The issue can be reproduced by running dhclient (assuming this results in a change in NTP servers) and then performing a config reload as soon as "Starting ntpsec.service - Network Time Service" appears in the logs. That service start is triggered by dhclient. The config reload then stops the ntpsec service, so the service restart in the dhclient exit hooks fails and the error is logged. A simple script to reproduce the issue:

#!/bin/bash
command2="config reload -y"
# dhclient command to introduce change in NTP servers, replace with any other command if required
command1="dhclient -r eth0"
log_f="/var/log/syslog"
msg="Starting ntpsec.service - Network Time Service"

run_first_command(){
    echo "Running first"
    eval "$command1" | sed 's/^/dhcl: /'
}

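monitor_syslog_and_run_second_command(){
    echo "Monitoring syslog for message $msg"
    # This function runs in a background subshell, so tail is a child of
    # that subshell ($BASHPID here), not of the main script ($$)
    local monitor_pid=$BASHPID
    tail -f "$log_f" | grep --line-buffered "$msg" | while read -r line; do
        echo "Msg detected in syslog $line"
        eval "$command2"
        pkill -P "$monitor_pid" tail
        break
    done
}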
#Wait for message to appear first in background
monitor_syslog_and_run_second_command &
#Run dhclient command
run_first_command
#Sleep until config reload is complete, change if necessary
sleep 100

Describe the results you received:

ERR root: /etc/dhcp/dhclient-exit-hooks.d/ntpsec returned non-zero exit status 1 in syslog

Describe the results you expected:

No log errors related to ntpsec exit hooks

Additional Information:

ntpsec is restarted at 3 different locations:

1. During config reload/reboot, from systemd: here
2. During start of ntp-config.service: here
3. When NTP server information is received from dhclient (assuming DHCP is enabled); a similar exit hook is generated in SONiC, like here

If a start and a stop of ntpsec are triggered from any two of these locations (the dhclient-based restart together with either the ntp-config.service restart or the config reload/reboot restart of ntpsec), the log error appears. ntp-config.service and ntpsec.service are bound to sonic.target.
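To see which restart paths overlapped in a given run, it can help to pull the ntpsec service events and the exit-hook errors out of syslog in log order. A minimal sketch, assuming the message texts quoted in this report; the function names are hypothetical and the log path defaults to the one used in the reproduction script.

```shell
#!/bin/bash
# Print ntpsec service events and exit-hook errors from a log file, in log order.
# The patterns are assumptions based on the messages quoted in this report.
ntpsec_timeline() {
    local log_file="${1:-/var/log/syslog}"
    grep -E 'ntpsec\.service|dhclient-exit-hooks\.d/ntpsec returned non-zero' "$log_file"
}

# Count only the exit-hook failures; prints 0 when there are none.
count_hook_errors() {
    local log_file="${1:-/var/log/syslog}"
    grep -c 'dhclient-exit-hooks\.d/ntpsec returned non-zero exit status' "$log_file" || true
}

# Usage (on a device, after running the reproduction script):
#   ntpsec_timeline /var/log/syslog
#   count_hook_errors /var/log/syslog
```

A non-zero count after a reload, with a service start immediately before the error in the timeline, would be consistent with the race described above.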

zjswhhh commented 2 months ago

This seems to be related to the upgrade to bookworm. @saiarcot895, can you take a quick look first and triage?