aristanetworks / sonic

Open source drivers and initialization library for Arista platforms running SONiC
GNU General Public License v2.0

[chassis] system-health commands and monitoring failing on the linecards #41

Closed arlakshm closed 2 years ago

arlakshm commented 2 years ago

The CLI commands for checking system-health all fail on the 100G linecard. The error call stacks are below.

admin@str2-7804-lc7-1:~$ sudo show system-health detail
Failed to set system led due to - KeyError('status')
Traceback (most recent call last):
  File "/usr/local/bin/show", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/show/system_health.py", line 103, in detail
    led = chassis.get_status_led()
  File "/usr/lib/python3/dist-packages/arista/utils/sonic_platform/chassis.py", line 148, in get_status_led
    return self._inventory.getLed('status').getColor()
  File "/usr/lib/python3/dist-packages/arista/core/metainventory.py", line 60, in callback
    return callbackItem(*args)
  File "/usr/lib/python3/dist-packages/arista/core/metainventory.py", line 56, in callbackItem
    raise KeyError(*args)
KeyError: 'status'
admin@str2-7804-lc7-1:~$ sudo show system-health summary
Failed to set system led due to - KeyError('status')
Traceback (most recent call last):
  File "/usr/local/bin/show", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.9/dist-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/show/system_health.py", line 43, in summary
    led = chassis.get_status_led()
  File "/usr/lib/python3/dist-packages/arista/utils/sonic_platform/chassis.py", line 148, in get_status_led
    return self._inventory.getLed('status').getColor()
  File "/usr/lib/python3/dist-packages/arista/core/metainventory.py", line 60, in callback
    return callbackItem(*args)
  File "/usr/lib/python3/dist-packages/arista/core/metainventory.py", line 56, in callbackItem
    raise KeyError(*args)
KeyError: 'status'
Staphylo commented 2 years ago

There are two issues behind the system-health failures on linecards.

1) The one you reported: we do not populate the status LED on linecards. I have an internal fix that populates it, which I'll make publicly available this week or next. A further change will be required for the linecard status LED to work properly.

2) The content of system_health_monitoring_config.json is not correct for linecards. This is addressed as part of https://github.com/Azure/sonic-buildimage/pull/10749

I'll update this issue once the fix for 1) is available.
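
To illustrate the first issue: the traceback shows `getLed('status')` raising `KeyError` because no LED named `status` is registered in the linecard inventory. Below is a minimal sketch of that failure mode. `FakeLed`, `FakeInventory`, and the `default` parameter are hypothetical stand-ins for illustration, not the real arista classes or the actual fix, which populates the LED rather than catching the error.

```python
class FakeLed:
    """Hypothetical stand-in for an LED object exposing getColor()."""

    def __init__(self, color):
        self._color = color

    def getColor(self):
        return self._color


class FakeInventory:
    """Hypothetical stand-in for the platform inventory."""

    def __init__(self, leds=None):
        self._leds = dict(leds or {})

    def getLed(self, name):
        # Mirrors the behaviour seen in the traceback:
        # looking up an unregistered LED name raises KeyError.
        if name not in self._leds:
            raise KeyError(name)
        return self._leds[name]


def get_status_led(inventory, default=None):
    """Return the status LED color, or `default` if none is registered."""
    try:
        return inventory.getLed('status').getColor()
    except KeyError:
        # Assumption: a caller may prefer a fallback over a crash.
        return default


# Inventory without a status LED, as on the affected linecard.
linecard = FakeInventory()
# Inventory with the status LED populated.
populated = FakeInventory({'status': FakeLed('green')})

print(get_status_led(linecard))
print(get_status_led(populated))
```

A guard like this would only hide the symptom in the CLI; the inventory still needs the status LED registered for system-health monitoring to set it.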

Staphylo commented 2 years ago

Issue number 1 will be fixed by Azure/sonic-buildimage#10800

Staphylo commented 2 years ago

PR merged