galexrt / dellhw_exporter

Prometheus exporter for Dell Hardware components using Dell OMSA.
https://dellhw-exporter.galexrt.moe
Apache License 2.0
119 stars 41 forks source link

dell_hw_nic_status metric returning 1 for functional interfaces #94

Closed B0go closed 9 months ago

B0go commented 9 months ago

Hi @galexrt, thanks for this great contribution to the community. Such an exporter is essential for Dell based on-prem observability!

While analyzing the values returned by the exporter from a couple of different servers, I noticed that some metrics were returning 1s for network interfaces that were known to be functional. After digging into the code, I found that it only marks an interface as functional (0) when the report either returns "Connected" or "Full" in the fifth field. While this covers most scenarios, some edge cases are not covered.

Here is an example of a Team Interface that currently shows 1 in the metric but is functional:

Index             : 1
Interface Name    : br0
Vendor            : Linux
Description       : Network Bridge
Redundancy Status : Not Applicable

I believe this happens because "Not Applicable" is not covered by the exporter logic. It would be a good improvement to also check the field name before matching its value.

I can open a Pull Request including this scenario!

galexrt commented 9 months ago

@B0go A pull request to fix this would be appreciated. Please add a test for this case as well, thanks!

Let me know if you need any guidance.

B0go commented 9 months ago

@galexrt I will work on this whenever possible, thanks.

galexrt commented 9 months ago

I have just released v1.13.2 containing your PR, see https://github.com/galexrt/dellhw_exporter/releases/tag/v1.13.2.

Please let me know if your fix works as expected.

B0go commented 9 months ago

Confirmed!

Screenshot 2024-02-07 at 10 15 36