sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC

[master][voq][chassis] swss.sh "systemctl start" dhcp_relay failed and log error which causes logAnalyze failure in OC test #18822

Open mlok-nokia opened 4 months ago

mlok-nokia commented 4 months ago

Description

On the master branch, dhcp_relay is not supported on VOQ chassis, so it is disabled in the FEATURE table. However, because of the service dependency, swss.sh always calls "systemctl start" on it, even though its service file has been masked/disabled. The following errors are logged in syslog, which causes logAnalyze to fail in some of the OC tests.

Apr 27 23:15:14.472425 ixre-egl-board7 ERR systemctl[7299]: Failed to start dhcp_relay.service: Unit dhcp_relay.service is masked.
Apr 27 23:15:14.476897 ixre-egl-board7 ERR systemctl[7298]: Failed to start dhcp_relay.service: Unit dhcp_relay.service is masked.

These messages are logged as ERR on the master branch, which uses a newer kernel version, whereas they are logged as INFO on the 202205 branch.
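
One possible way to avoid the failed start is to check whether a dependent unit is masked before calling "systemctl start" on it. The sketch below is only an illustration: the function names `is_masked_state` and `start_unless_masked` are hypothetical, they are not taken from swss.sh, and this is not necessarily how the linked PR addresses the problem. It relies on `systemctl is-enabled`, which prints `masked` (or `masked-runtime`) for a masked unit.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical and do not come from swss.sh.

# Return 0 if the given `systemctl is-enabled` state means the unit is masked.
is_masked_state() {
    case "$1" in
        masked|masked-runtime) return 0 ;;
        *) return 1 ;;
    esac
}

# Start a unit unless it is masked, so no "Unit ... is masked" error is logged.
start_unless_masked() {
    unit="$1"
    # is-enabled exits non-zero for masked units, so its exit status is ignored
    # and only the printed state is inspected.
    state="$(systemctl is-enabled "$unit" 2>/dev/null || true)"
    if is_masked_state "$state"; then
        echo "Skipping $unit: unit is masked"
        return 0
    fi
    systemctl start "$unit"
}
```

With these helpers, a call such as `start_unless_masked dhcp_relay.service` would skip the start on a VOQ chassis where the unit is masked and behave like a plain "systemctl start" elsewhere.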

Steps to reproduce the issue:

  1. Using a master branch image, reboot the system or run config reload, then check syslog. The following messages appear:
    Apr 27 23:15:14.472425 ixre-egl-board7 ERR systemctl[7299]: Failed to start dhcp_relay.service: Unit dhcp_relay.service is masked.
    Apr 27 23:15:14.476897 ixre-egl-board7 ERR systemctl[7298]: Failed to start dhcp_relay.service: Unit dhcp_relay.service is masked.
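
The syslog check in the step above can be scripted. This is a minimal sketch under stated assumptions: the function name `count_masked_start_errors` is hypothetical, and the default path `/var/log/syslog` is assumed to be where the messages land on the device under test.

```shell
#!/bin/sh
# Sketch only: count ERR-level "unit is masked" start failures for a unit.
# The function name and the default log path are assumptions.
count_masked_start_errors() {
    unit="$1"
    logfile="${2:-/var/log/syslog}"
    # First match the fixed message text, then keep only ERR-severity lines.
    # grep -c prints the count; "|| true" hides its non-zero exit on 0 matches.
    grep -F "Failed to start ${unit}: Unit ${unit} is masked." "$logfile" \
        | grep -c ' ERR ' || true
}
```

Running `count_masked_start_errors dhcp_relay.service` after a reboot or config reload prints the number of matching ERR lines; a non-zero count indicates the issue reproduced.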

Describe the results you received:

The syslog contains the following error messages when the system boots up or config reload is executed.

Apr 27 23:15:14.472425 ixre-egl-board7 ERR systemctl[7299]: Failed to start dhcp_relay.service: Unit dhcp_relay.service is masked.
Apr 27 23:15:14.476897 ixre-egl-board7 ERR systemctl[7298]: Failed to start dhcp_relay.service: Unit dhcp_relay.service is masked.

Describe the results you expected:

There should be no such error messages in syslog.

Output of show version:

Latest image.

rlhui commented 4 months ago

@kellyyeh would you please check this?

rlhui commented 4 months ago

was this syslog INFO in 202205 but now it's ERR, and it triggered sonic-mgmt errors?

neethajohn commented 4 months ago

Issue seen only on master branch

mlok-nokia commented 4 months ago

> was this syslog INFO in 202205 but now it's ERR, and it triggered sonic-mgmt errors?

Yes. This message is logged as INFO in the 202205 branch, as shown below:

May  7 23:10:38.902883 ixre-egl-board4 INFO swss.sh[6407]: Failed to start dhcp_relay.service: Unit dhcp_relay.service is masked.
May  7 23:10:38.907071 ixre-egl-board4 INFO swss.sh[6406]: Failed to start dhcp_relay.service: Unit dhcp_relay.service is masked.

But the same messages are logged as ERR on the master branch. This causes logAnalyze to fail the OC tests in the sonic-mgmt run.

wenyiz2021 commented 3 months ago

@mlok-nokia your PR https://github.com/sonic-net/sonic-buildimage/pull/18829#pullrequestreview-2092712267 should fix this issue, is that correct?

mlok-nokia commented 2 months ago

> @mlok-nokia your PR #18829 (review) should fix this issue, is that correct?

Yes, correct.

abdosi commented 4 weeks ago

@StormLiangMS: Do we see this issue on T1 platforms?