prometheus-community / node-exporter-textfile-collector-scripts

Scripts for node-exporter's textfile collector
Apache License 2.0

nvme-cli v2.11 breaks compatibility with nvme textfile collector #226

Closed dswarbrick closed 4 days ago

dswarbrick commented 6 days ago

nvme-cli v2.11 now emits verbose JSON by default, apparently with no way to revert to the previous "terse" output.

For the command nvme list -o json, the previous output (with nvme-cli v2.10) resembled the following:

{
  "Devices":[
    {
      "NameSpace":1,
      "DevicePath":"/dev/nvme0n1",
      "GenericPath":"/dev/ng0n1",
      "Firmware":"3B4QFXO7",
      "ModelNumber":"Samsung SSD 980 500GB",
      "SerialNumber":"S64D...",
      "UsedBytes":46430703616,
      "MaximumLBA":976773168,
      "PhysicalSize":500107862016,
      "SectorSize":512
    }
  ]
}

With nvme-cli v2.11, that exact same command now results in this:

{
  "Devices":[
    {
      "HostNQN":"nqn.2014-08.org.nvmexpress:uuid:235adbb6-...",
      "HostID":"12718199-...",
      "Subsystems":[
        {
          "Subsystem":"nvme-subsys0",
          "SubsystemNQN":"nqn.1994-11.com.samsung:nvme:980M.2:S64D...",
          "Controllers":[
            {
              "Controller":"nvme0",
              "Cntlid":"5",
              "SerialNumber":"S64D...",
              "ModelNumber":"Samsung SSD 980 500GB",
              "Firmware":"3B4QFXO7",
              "Transport":"pcie",
              "Address":"0000:02:00.0",
              "Slot":"",
              "Namespaces":[
                {
                  "NameSpace":"nvme0n1",
                  "Generic":"ng0n1",
                  "NSID":1,
                  "UsedBytes":46439002112,
                  "MaximumLBA":976773168,
                  "PhysicalSize":500107862016,
                  "SectorSize":512
                }
              ],
              "Paths":[]
            }
          ],
          "Namespaces":[]
        }
      ]
    }
  ]
}

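This breaks the nvme textfile collector, which expects the terse structure. In the verbose output, each entry under "Devices" is a host rather than a namespace, and the namespaces are nested under "Subsystems" > "Controllers" > "Namespaces". Field names have changed as well: "NameSpace" is now the device name (e.g. "nvme0n1") instead of the numeric namespace ID, which has moved to "NSID"; "DevicePath" and "GenericPath" are gone, replaced by "NameSpace" and "Generic" without the /dev/ prefix; and "Firmware", "ModelNumber" and "SerialNumber" now live on the controller rather than the namespace.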
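To make the structural difference concrete, here is a minimal sketch of walking the verbose layout and yielding one flat record per namespace, shaped roughly like the old terse entries. It is not the collector's actual code: iter_namespaces and SAMPLE are hypothetical names, the /dev/ prefix is an assumption inferred from comparing the two samples above, and the subsystem-level "Namespaces" and "Paths" lists (both empty in the sample) are ignored.

```python
import json

# Trimmed copy of the v2.11 sample above (values elided in the issue stay elided).
SAMPLE = """
{"Devices": [{"Subsystems": [{"Subsystem": "nvme-subsys0", "Controllers": [{
    "Controller": "nvme0",
    "SerialNumber": "S64D...",
    "ModelNumber": "Samsung SSD 980 500GB",
    "Firmware": "3B4QFXO7",
    "Namespaces": [{
        "NameSpace": "nvme0n1", "Generic": "ng0n1", "NSID": 1,
        "UsedBytes": 46439002112, "MaximumLBA": 976773168,
        "PhysicalSize": 500107862016, "SectorSize": 512
    }],
    "Paths": []
}], "Namespaces": []}]}]}
"""


def iter_namespaces(doc):
    """Yield one flat dict per namespace found under each controller."""
    for host in doc.get("Devices", []):
        for subsys in host.get("Subsystems", []):
            for ctrl in subsys.get("Controllers", []):
                for ns in ctrl.get("Namespaces", []):
                    yield {
                        # Assumption: device nodes live directly under /dev/.
                        "DevicePath": "/dev/" + ns["NameSpace"],
                        "GenericPath": "/dev/" + ns["Generic"],
                        "NameSpace": ns["NSID"],
                        # Identity fields moved up to the controller level.
                        "Firmware": ctrl["Firmware"],
                        "ModelNumber": ctrl["ModelNumber"],
                        "SerialNumber": ctrl["SerialNumber"],
                        "UsedBytes": ns["UsedBytes"],
                        "MaximumLBA": ns["MaximumLBA"],
                        "PhysicalSize": ns["PhysicalSize"],
                        "SectorSize": ns["SectorSize"],
                    }


rows = list(iter_namespaces(json.loads(SAMPLE)))
```

For the sample above, rows holds a single record whose keys match the v2.10 output. Devices that expose namespaces only at the subsystem level (e.g. multipath setups) would need extra handling that this sketch does not attempt.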

As far as I can tell, this change appears in the v2.11 release notes as:

  nvme-print-json: display only verbose output

It's a bit disheartening when tools force a change like this in machine-readable output (such as JSON), as it is highly likely that the output is being consumed by some other script or application, which expects the output to be in a particular structure.

I'll leave this here as a reminder / placeholder. I'll have to try to make some time to refactor the nvme_metrics.py collector. Hopefully we can avoid having to support two different JSON formats, since the verbose output can be produced by older nvme-cli versions by appending --verbose to the command line options. This is probably the right time to finally drop the older nvme_metrics.sh collector too.
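As a rough sketch of the single-format idea: if the collector always passes --verbose, older and newer nvme-cli versions should both return the nested structure, so only one parser is needed. The names NVME_LIST_CMD and load_nvme_list are hypothetical, and the run parameter exists only so the function can be exercised without real hardware; this is not taken from nvme_metrics.py.

```python
import json
import subprocess

# Always request verbose JSON so that pre-2.11 and 2.11+ nvme-cli versions
# emit the same nested structure (per the observation above).
NVME_LIST_CMD = ["nvme", "list", "-o", "json", "--verbose"]


def load_nvme_list(run=subprocess.run):
    """Run `nvme list` in verbose JSON mode and return the parsed document.

    `run` defaults to subprocess.run; pass a stand-in with the same call
    shape to test without an nvme binary.
    """
    proc = run(NVME_LIST_CMD, capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)
```

With check=True, a non-zero exit from nvme raises subprocess.CalledProcessError instead of producing a confusing JSON parse error further down.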

cc: @SuperQ