pmem / ndctl

A "device memory" enabling project encompassing tools and libraries for CXL, NVDIMMs, DAX, memory tiering and other platform memory device topics.

`cxl list -u` can't handle device serial numbers #227

Open sscargal opened 1 year ago

sscargal commented 1 year ago

When I list the memdev information, I see a serial number:

# cxl list -vvv -m mem0
[
  {
    "memdev":"mem0",
    "pmem_size":0,
    "ram_size":68719476736,
    [...snip...]
    "serial":9947034306373222400,  <<<<<<<<<<<<
    "numa_node":0,
    "host":"0000:38:00.0",
    "state":"disabled",
    "partition_info":{
      "total_size":68719476736,
      "volatile_only_size":68719476736,
      "persistent_only_size":0,
      "partition_alignment_size":0
    }
  }
]

However, with the -u (human-readable) option the value goes out of range, and the serial number is reported incorrectly as 0x7fffffffffffffff:

# cxl list -vvv -m mem0 -u
{
  "memdev":"mem0",
  "pmem_size":0,
  "ram_size":"64.00 GiB (68.72 GB)",
  [...snip...]
  "serial":"0x7fffffffffffffff",  <<<<<<<<<<<<
  "numa_node":0,
  "host":"0000:38:00.0",
  "state":"disabled",
  "partition_info":{
    "total_size":"64.00 GiB (68.72 GB)",
    "volatile_only_size":"64.00 GiB (68.72 GB)",
    "persistent_only_size":0,
    "partition_alignment_size":0
  }
}

The device reports a serial number of:

# lspci -s 0000:38:00.0 -vvv | grep -i serial
    Capabilities: [148 v1] Device Serial Number 8a-0a-f7-00-00-00-00-00

Converting the decimal value from the first listing to hexadecimal gives the correct serial number:

9947034306373222400 (base 10) = 0x8A0AF70000000000 (base 16)
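
For anyone who wants to check the conversion, here is a small sketch in Python. It is not part of cxl or ndctl; the helper name `to_lspci_format` is made up for illustration, and it assumes the byte order lspci shows above (most significant byte first).

```python
# Decimal serial as printed by `cxl list` without -u.
serial_decimal = 9947034306373222400

# Zero-padded 64-bit hex representation of the same value.
serial_hex = format(serial_decimal, "016x")


def to_lspci_format(value: int) -> str:
    """Hypothetical helper: render a 64-bit serial as dash-separated hex bytes,
    most significant byte first, to compare against the lspci output."""
    return "-".join(f"{byte:02x}" for byte in value.to_bytes(8, "big"))


print(serial_hex)                       # 8a0af70000000000
print(to_lspci_format(serial_decimal))  # 8a-0a-f7-00-00-00-00-00
```

The second line of output matches the Device Serial Number that lspci reports for 0000:38:00.0.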

Keeping the serial number in hex should resolve this. Converting it to decimal and then to a human-readable form serves no purpose and makes it harder to map and identify devices.

djbw commented 1 year ago

I wonder if this patch fixes it: patch.txt

...i.e. the problem arises from mixing the int64 and uint64 json-c APIs. Otherwise, it's likely too late to introduce that ABI breakage for the machine-readable version of the serial number.
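
To illustrate why mixing signed and unsigned 64-bit accessors could produce exactly 0x7fffffffffffffff, here is a rough model in Python. It does not call json-c and is not the patch; it assumes the signed getter saturates at INT64_MAX when the stored value does not fit, and both function names are hypothetical.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
UINT64_MAX = 2**64 - 1


def read_as_int64_saturating(value: int) -> int:
    """Hypothetical model of a signed 64-bit getter that clamps
    out-of-range values instead of wrapping."""
    return max(INT64_MIN, min(INT64_MAX, value))


def read_as_uint64(value: int) -> int:
    """Hypothetical model of an unsigned 64-bit getter: the value is
    returned unchanged as long as it fits in 64 bits."""
    if not 0 <= value <= UINT64_MAX:
        raise ValueError("value does not fit in an unsigned 64-bit integer")
    return value


serial = 9947034306373222400  # stored as an unsigned 64-bit value

print(hex(read_as_int64_saturating(serial)))  # 0x7fffffffffffffff
print(hex(read_as_uint64(serial)))            # 0x8a0af70000000000
```

The serial is larger than INT64_MAX, so a saturating signed read loses it, while an unsigned read keeps all 64 bits. That is consistent with the bogus value in the -u output above.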

sscargal commented 1 year ago

Sorry for the delayed response. The patch does fix the issue with -u. Thank you.

# ./cxl list -vvv -m mem0
[
  {
    "memdev":"mem0",
    "pmem_size":0,
    "ram_size":68719476736,
    [...snip...]
    "serial":9947034306373222400,
    "numa_node":0,
    "host":"0000:38:00.0",
    "state":"disabled",
    "partition_info":{
      "total_size":68719476736,
      "volatile_only_size":68719476736,
      "persistent_only_size":0,
      "partition_alignment_size":0
    }
  }
]

The serial number now displays correctly with the -u option:

# ./cxl list -vvv -m mem0 -u
{
  "memdev":"mem0",
  "pmem_size":0,
  "ram_size":"64.00 GiB (68.72 GB)",
  [...snip...]
  "serial":"0x8a0af70000000000",   <<<<<< Correct
  "numa_node":0,
  "host":"0000:38:00.0",
  "state":"disabled",
  "partition_info":{
    "total_size":"64.00 GiB (68.72 GB)",
    "volatile_only_size":"64.00 GiB (68.72 GB)",
    "persistent_only_size":0,
    "partition_alignment_size":0
  }
}