sonic-net / SONiC

Landing page for Software for Open Networking in the Cloud (SONiC) - https://sonic-net.github.io/SONiC/

HLD for diagnostic monitoring of CMIS based transceivers #1828

Open mihirpat1 opened 1 month ago

mihirpat1 commented 1 month ago

This HLD will provide an overview of how SONiC reads and stores the various diagnostic parameters read from a CMIS-based transceiver.

Proposed changes for existing tables in SONiC

  1. Removed CCMIS-specific VDM data from the DOM-related tables and moved it to the VDM-related tables.
  2. Not all VDM thresholds are currently shown.
  3. As part of the proposed changes, I plan to rename existing fields to be in line with the CMIS spec.
  4. Moved txfault, txlos, txcdrlol, rxlos and rxcdrlol from the TRANSCEIVER_STATUS table to the TRANSCEIVER_DOM_FLAG table.
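
To illustrate item 4, here is a minimal sketch of how a flat status row could be partitioned once the five fault/LOS/LOL fields move to TRANSCEIVER_DOM_FLAG. The helper name `split_status_fields`, the per-lane suffixes (`txfault1`), and the `module_state` field are assumptions for illustration only; they are not taken from the HLD or from the SONiC code.

```python
# Field-name prefixes listed in item 4 above.
MOVED_PREFIXES = ("txfault", "txlos", "txcdrlol", "rxlos", "rxcdrlol")


def split_status_fields(status_row):
    """Partition a flat status row into (status, dom_flag) dicts.

    Hypothetical helper: fields whose names start with one of the moved
    prefixes go to the DOM flag dict, everything else stays in status.
    """
    status, dom_flag = {}, {}
    for field, value in status_row.items():
        target = dom_flag if field.startswith(MOVED_PREFIXES) else status
        target[field] = value
    return status, dom_flag


# Hypothetical row; the per-lane suffixes and "module_state" are illustrative.
old_row = {
    "module_state": "ModuleReady",
    "txfault1": "False",
    "rxlos1": "True",
}
status, dom_flag = split_status_fields(old_row)
```

With this input, `status` keeps only `module_state`, while `txfault1` and `rxlos1` end up in `dom_flag`.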
mihirpat1 commented 1 month ago

@qinchuanares - It would be great if you could help review this PR.

mihirpat1 commented 1 month ago

@Junchao-Mellanox @keboliu - It would be great if you could help review this PR.

mihirpat1 commented 1 month ago

> @mihirpat1 I don't see how the timestamp and count for each alarm will be updated so that these times are as close as possible to the actual reporting time by the module.

@prgeor I have now added the "Diagnostic Information Update During Link Down Event" section to address this scenario.

prgeor commented 3 weeks ago

@mihirpat1 Is this something we can use? For example, the interval time and the last timestamp when the sample was collected.

image

mihirpat1 commented 3 weeks ago

> @mihirpat1 something we can use? like interval time and last time stamp when sample was collected
>
> image

@prgeor I think this would be good to add. I am assuming we plan to add this to all the tables (DOM + VDM + STATUS), that is, to the real-time value, threshold value and flag tables for each of them.
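
As a rough sketch of what this could look like, each table row could carry the time of the last sample and the interval since the previous one. The helper `stamp_row` and the field names `last_update_time` and `update_interval_sec` are hypothetical; the actual names and format would be decided in the HLD.

```python
import time


def stamp_row(row, previous_update=None, now=None):
    """Return a copy of `row` with sample-time metadata added.

    Hypothetical field names:
      last_update_time    - epoch seconds when this sample was collected
      update_interval_sec - seconds since the previous sample, or None
                            when there is no previous sample
    """
    now = time.time() if now is None else now
    stamped = dict(row)  # leave the caller's row untouched
    stamped["last_update_time"] = now
    stamped["update_interval_sec"] = (
        None if previous_update is None else now - previous_update
    )
    return stamped


# Illustrative DOM-style row with fixed times so the result is deterministic.
first = stamp_row({"temperature": "40.5"}, now=100.0)
second = stamp_row({"temperature": "41.0"}, previous_update=100.0, now=160.0)
```

The same helper could be applied to the real-time, threshold and flag rows alike, so that every table reports when its data was last refreshed.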