sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC

[CMIS manager] CMIS state machine flap once due to apply SI parameter #17303

Open Junchao-Mellanox opened 10 months ago

Junchao-Mellanox commented 10 months ago

Description

I found an unnecessary CMIS state machine flap after inserting a cable. I assume this is a known community issue that should be fixed by the HLD and related PRs: https://github.com/sonic-net/SONiC/pull/1432 . The flap adds a significant delay (~10 seconds) to the port link-up time.

Details from the log:

a.  Xcvrd received the SFP inserted event:

Nov 22 05:35:17.618109 sonic NOTICE pmon#xcvrd: Ethernet144: Got SFP inserted event

b.  The cable became ready before the SI parameters were applied (an unnecessary flap, because the SI parameters had not been applied yet):

Nov 22 05:35:19.445697 sonic NOTICE pmon#xcvrd: CMIS: Ethernet144: 400G, lanemask=0xff, state=DP_ACTIVATION, appl 1 host_lane_count 8 retries=0
Nov 22 05:35:19.448715 sonic NOTICE pmon#xcvrd: CMIS: Ethernet144: READY

c.  The SI parameters were applied about 2 seconds after the inserted event:

Nov 22 05:35:19.924017 sonic NOTICE swss#orchagent: :- doPortTask: Set port Ethernet144 SI settings is successful

e.  The CMIS manager appears to have restarted the CMIS state machine after the SI parameters were applied, and it took about 9 seconds to become ready again:

Nov 22 05:35:20.185803 sonic NOTICE pmon#xcvrd: CMIS: Ethernet144: 400G, lanemask=0xff, state=INSERTED, appl 1 host_lane_count 8 retries=0
…
Nov 22 05:35:29.128229 sonic NOTICE pmon#xcvrd: CMIS: Ethernet144: READY

f.  After the CMIS state machine became ready, the port oper status changed to up:

Nov 22 05:35:45.369482 sonic NOTICE swss#orchagent: :- updatePortOperStatus: Port Ethernet144 oper state set from down to up

Steps to reproduce the issue:

Insert a CMIS optical cable.

Describe the results you received:

The CMIS state machine reaches READY twice.

Describe the results you expected:

The CMIS state machine reaches READY once.
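
To check a syslog capture for this symptom, something like the following could be used. It is a minimal sketch that assumes the xcvrd READY message always has the form seen in the log lines quoted in this issue; the regex, the function name `count_ready`, and the sample list are illustrative and not part of xcvrd.

```python
import re
from collections import Counter

# Pattern based only on the READY lines quoted in this issue (assumption:
# other platforms or versions may format the message differently).
READY_RE = re.compile(r"pmon#xcvrd: CMIS: (?P<port>\S+): READY\s*$")


def count_ready(lines):
    """Count how many times each port's CMIS state machine logged READY."""
    counts = Counter()
    for line in lines:
        match = READY_RE.search(line)
        if match:
            counts[match.group("port")] += 1
    return counts


# Sample input: xcvrd lines copied from the description above.
SAMPLE = [
    "Nov 22 05:35:17.618109 sonic NOTICE pmon#xcvrd: Ethernet144: Got SFP inserted event",
    "Nov 22 05:35:19.445697 sonic NOTICE pmon#xcvrd: CMIS: Ethernet144: 400G, lanemask=0xff, state=DP_ACTIVATION, appl 1 host_lane_count 8 retries=0",
    "Nov 22 05:35:19.448715 sonic NOTICE pmon#xcvrd: CMIS: Ethernet144: READY",
    "Nov 22 05:35:20.185803 sonic NOTICE pmon#xcvrd: CMIS: Ethernet144: 400G, lanemask=0xff, state=INSERTED, appl 1 host_lane_count 8 retries=0",
    "Nov 22 05:35:29.128229 sonic NOTICE pmon#xcvrd: CMIS: Ethernet144: READY",
]

ready_counts = count_ready(SAMPLE)
# Ports that reached READY more than once after a single insertion.
flapped = [port for port, n in ready_counts.items() if n > 1]
print(ready_counts, flapped)
```

With the lines from this issue, Ethernet144 is reported as reaching READY twice; with the expected behavior it would appear once and `flapped` would be empty.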

liat-grozovik commented 7 months ago

@mihirpat1 what is the ETA for having this issue fixed? @prgeor FYI

liat-grozovik commented 6 months ago

Kind reminder to provide an ETA for the fix. @mihirpat1

mihirpat1 commented 6 months ago

@liat-grozovik - ETA for the fix - 6/28