linux-nvme / nvme-cli

NVMe management command line interface.
https://nvmexpress.org
GNU General Public License v2.0
1.49k stars 659 forks source link

Field descriptors in OCP SMART Log Page are different to those in standard SMART Log Page #2577

Open sbates130272 opened 6 days ago

sbates130272 commented 6 days ago

There is a lack of consistency between the field descriptors in the standard SMART log page (which uses all lower case and underscores) and the OCP SMART extended log page (which uses mixed cases and white spaces). This is rather annoying and also poses a bit of a challenge when trying to scrape the OCP metrics via a Prometheus node-exporter text collector.

I think it would be lovely to alter the format of the OCP output (both stdout and JSON) but I am concerned that might cause an issue for people who have already written tools that parse the existing log page output. So I am creating this issue so we can discuss if such a change is welcome or not.

$ nvme smart-log /dev/nvme0n1 -o json
{
  "critical_warning":0,
  "temperature":323,
  <snip>
}

versus

$ nvme ocp smart-add-log /dev/nvme0n1 -o json
{
  <snip>
  "Bad user nand blocks - Raw":0,
  "Bad user nand blocks - Normalized":0,
  <snip>
}

Note this is somewhat related to #2189 but is different enough that I wanted to create a new issue. Depending on how this resolves it may allow us to close the other issue.

igaw commented 1 day ago

Changing the existing output is likely to cause regression, which I really like to avoid. We could introduce a output format version flag so the user can decide which version should be used. Luckily we have a plugin interface for this already in place.

Obvious it means additional maintenance overhead. When we do the next major version update we just rip out the older version then.

sbates130272 commented 21 hours ago

Note there is a conversation going on in prometheus-community/node-exporter-textfile-collector-scripts#228 that also deals with JSON output format changes and some issues around per controller and per namespace SMART log pages.

sbates130272 commented 21 hours ago

@igaw I will try and put together a PR for this in the next few days. I've noted your comments above and will do something that aligns with those.

igaw commented 19 hours ago

Cool. You don't need to go overboard in the first version. I think we have to figure out how to make this maintainable somehow without duplicating too much.