sonic-net / sonic-platform-daemons

Platform module daemons for SONiC

[202405] [xcvrd] change log content and log level when application is not found #524

Closed Junchao-Mellanox closed 3 months ago

Junchao-Mellanox commented 4 months ago

Backport https://github.com/sonic-net/sonic-platform-daemons/pull/503

Description

Change the log level to NOTICE and add more information to the message when no application is found for a module.

Motivation and Context

In existing setups, some ports might be set to admin down. Those ports may not have a correct speed configuration, which causes error logs like the following when media settings are applied:

Jun 17 20:27:58.041567 sonic ERR pmon#xcvrd: Failed to get desired application from {1: {'host_electrical_interface_id': 'IB HDR (Arch.Spec.Vol.2)', 'module_media_interface_id': 'Copper cable', 'media_lane_count': 4, 'host_lane_count': 4, 'host_lane_assignment_options': 17}, 2: {'host_electrical_interface_id': 'IB SDR (Arch.Spec.Vol.2)', 'module_media_interface_id': 'Copper cable', 'media_lane_count': 4, 'host_lane_count': 4, 'host_lane_assignment_options': 17}, 3: {'host_electrical_interface_id': '200GBASE-CR4 (Clause 136)', 'module_media_interface_id': 'Copper cable', 'media_lane_count': 4, 'host_lane_count': 4, 'host_lane_assignment_options': 17}, 4: {'host_electrical_interface_id': '100GBASE-CR2 (Clause 136)', 'module_media_interface_id': 'Copper cable', 'media_lane_count': 2, 'host_lane_count': 2, 'host_lane_assignment_options': 85}, 5: {'host_electrical_interface_id': '100GBASE-CR4 (Clause 92)', 'module_media_interface_id': 'Copper cable', 'media_lane_count': 4, 'host_lane_count': 4, 'host_lane_assignment_options': 17}, 6: {'host_electrical_interface_id': '50GBASE-CR2 (Ethernet Technology Consortium) with no FEC', 'module_media_interface_id': 'Copper cable', 'media_lane_count': 2, 'host_lane_count': 2, 'host_lane_assignment_options': 85}, 7: {'host_electrical_interface_id': '50GBASE-CR (Clause 126)', 'module_media_interface_id': 'Copper cable', 'media_lane_count': 1, 'host_lane_count': 1, 'host_lane_assignment_options': 255}, 8: {'host_electrical_interface_id': '25GBASE-CR CA-N (Clause 110)', 'module_media_interface_id': 'Copper cable', 'media_lane_count': 1, 'host_lane_count': 1, 'host_lane_assignment_options': 255}}

So, here is the thing:

  1. For media settings, we just print a NOTICE log when the application is not found, so that users are not forced to correct the port configuration while the port is admin down.
  2. For the CMIS state machine, we print a NOTICE log here, and there is another place that still prints an ERROR log when the application is not found, so we do not lose debuggability:
    if state == CMIS_STATE_INSERTED:
        self.port_dict[lport]['appl'] = get_cmis_application_desired(api, host_lane_count, host_speed)
        if self.port_dict[lport]['appl'] is None:
            self.log_error("{}: no suitable app for the port appl {} host_lane_count {} "
                           "host_speed {}".format(lport, appl, host_lane_count, host_speed))

So, I think it is safe to change the log level to NOTICE.
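The split described above can be illustrated with a small standalone sketch. This is not the xcvrd code: it uses Python's standard `logging` module instead of the SONiC logger, the NOTICE level value of 25 is an assumption, and `find_application`, `media_settings_lookup`, and `cmis_inserted` are hypothetical helpers with a simplified prefix-based match. It only shows the idea that the media-settings path reports a miss at NOTICE with port context, while the CMIS path still reports the same miss at ERROR.

```python
import logging

# The stdlib has no NOTICE level; 25 (between INFO and WARNING) is an
# assumption made for this sketch only.
NOTICE = 25
logging.addLevelName(NOTICE, "NOTICE")

log = logging.getLogger("xcvrd_sketch")


def find_application(appl_dict, host_lane_count, host_speed_prefix):
    """Hypothetical matcher: return the first application ID whose host
    interface name starts with the speed prefix and whose host lane count
    matches, or None when nothing matches."""
    for appl_id, info in appl_dict.items():
        if (info["host_lane_count"] == host_lane_count
                and info["host_electrical_interface_id"].startswith(host_speed_prefix)):
            return appl_id
    return None


def media_settings_lookup(lport, appl_dict, host_lane_count, host_speed_prefix):
    """Media-settings path: a miss is logged at NOTICE, with port context."""
    appl = find_application(appl_dict, host_lane_count, host_speed_prefix)
    if appl is None:
        log.log(NOTICE,
                "%s: no application found for host_lane_count %s host_speed %s in %s",
                lport, host_lane_count, host_speed_prefix, appl_dict)
    return appl


def cmis_inserted(lport, appl_dict, host_lane_count, host_speed_prefix):
    """CMIS state-machine path: the same miss is still logged at ERROR."""
    appl = find_application(appl_dict, host_lane_count, host_speed_prefix)
    if appl is None:
        log.error("%s: no suitable app for the port host_lane_count %s host_speed %s",
                  lport, host_lane_count, host_speed_prefix)
    return appl
```

With this shape, an admin-down port with a mismatched speed produces only a NOTICE from the media-settings path, and the ERROR remains available from the CMIS path for ports that are actually being brought up.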

How Has This Been Tested?

Manual test
