open-mpi / hwloc

Hardware locality (hwloc)
https://www.open-mpi.org/projects/hwloc

CXL detection #554

Open bgoglin opened 1 year ago

bgoglin commented 1 year ago

Starting with 2.9 (or 2.8.1), CXL memory expanders (Type 3) are detected and not ignored (they have a dedicated PCI class, 0x0502). It's not clear whether CXL Type 2 (accelerators with coherent memory) or Type 1 (devices with coherent access to host memory) will get a dedicated PCI class. If not, we'll need to dig into sysfs and/or the (non-root) PCI config space to find out whether a PCI bridge or device is CXL.
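A minimal sketch of the class-based check, assuming the usual Linux sysfs layout where /sys/bus/pci/devices/<busid>/class holds the 24-bit class code as a hex string (e.g. 0x050210, with the low byte being the prog-if). The helper names and the `sysfs` parameter are hypothetical; this is not hwloc's actual code.

```python
import glob
import os

# Base class 0x05 (memory controller) + subclass 0x02, as quoted above.
CXL_MEMORY_CLASS = 0x0502


def is_cxl_memory_class(class_str):
    """Return True if a sysfs PCI 'class' string has class/subclass 0x0502."""
    # sysfs reports class, subclass and prog-if together; drop the prog-if byte.
    return (int(class_str.strip(), 16) >> 8) == CXL_MEMORY_CLASS


def find_cxl_memory_devices(sysfs="/sys"):
    """List PCI bus IDs whose class is 0x0502 (Type 3 memory expanders)."""
    found = []
    pattern = os.path.join(sysfs, "bus/pci/devices/*/class")
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            if is_cxl_memory_class(f.read()):
                # The bus ID is the name of the directory holding 'class'.
                found.append(os.path.basename(os.path.dirname(path)))
    return found
```

This only covers Type 3 devices; nothing here detects Type 1 or Type 2, which is exactly the open question.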

For now, they are exposed as normal PCI objects. We could expose them as CXL objects instead, with a "CXL" subtype (for PCIDev and bridges?) and/or with a CXL upstream/downstream type (in bridges only). It's not clear whether that would be useful or important.

bgoglin commented 1 year ago

Notes for putting CXL info in DAX and NUMA nodes:

For CXL RAM:

/sys/bus/dax/devices/dax0.0 points to something like ../../../devices/platform/ACPI0017:00/root0/decoder0.0/region0/dax_region0/dax0.0/ (the "/decoder" component might be an easy way to detect CXL here).

cat /sys/bus/cxl/devices/region0/target0 (and possibly target1... if interleaving) => "decoder4.0"

/sys/bus/cxl/devices/decoder4.0 points to ../../../devices/platform/ACPI0017:00/root0/port2/endpoint4/decoder4.0/

/sys/bus/cxl/devices/endpoint4 points to ../../../../../pci0000:0c/0000:0c:00.0/0000:0d:00.0/mem1/

By the way, cat /sys/bus/cxl/devices/decoder4.0/mode => "ram".
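The chain above (DAX device, region, decoder, endpoint, PCI device) can be sketched as a few path-parsing helpers. This is only an illustration under the assumption that the sysfs layout matches the example paths in this comment; all function names are hypothetical, and the parsing works on already-resolved symlink targets so it can be tried without CXL hardware.

```python
import os
import re

# PCI bus ID such as 0000:0d:00.0 (domain:bus:device.function).
PCI_BUSID = re.compile(r"[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]")


def cxl_region_of_dax(resolved_dax_path):
    """Return the region name if the DAX path goes through a CXL decoder."""
    parts = resolved_dax_path.strip("/").split("/")
    # Heuristic from the notes: a "decoderX.Y" component means CXL.
    if not any(re.fullmatch(r"decoder\d+\.\d+", p) for p in parts):
        return None
    for p in parts:
        if re.fullmatch(r"region\d+", p):
            return p
    return None


def endpoint_of_decoder(resolved_decoder_path):
    """Return the endpointN component of a resolved decoder path."""
    for p in resolved_decoder_path.strip("/").split("/"):
        if re.fullmatch(r"endpoint\d+", p):
            return p
    return None


def pci_busid_of_endpoint(resolved_endpoint_path):
    """Return the innermost PCI bus ID found in a resolved endpoint path."""
    ids = PCI_BUSID.findall(resolved_endpoint_path)
    return ids[-1] if ids else None


def resolve(sysfs, bus, name):
    """Resolve /sys/bus/<bus>/devices/<name> to its real device path."""
    return os.path.realpath(os.path.join(sysfs, "bus", bus, "devices", name))
```

On a real system, `resolve("/sys", "dax", "dax0.0")` would feed `cxl_region_of_dax()`, and the decoder name read from the region's target0 file would then be resolved and passed to `endpoint_of_decoder()`.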

For CXL PMEM:

/sys/bus/dax/devices/dax0.0 points to ../../../devices/platform/ACPI0017:00/root0/nvdimm-bridge0/ndbus0/region0/dax0.0/dax0.0/ (not sure whether nvdimm-bridge is CXL-specific or not).

/sys/bus/cxl/devices/region0 contains target0 (and target1... if interleaving), which contain decoder names just like in the RAM case above (except that mode contains "pmem" instead of "ram").
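Since both cases expose targetN files and a decoder mode, telling RAM from PMEM could look roughly like the sketch below. It assumes the targetN and mode files behave as described in these notes; the function names and the `sysfs` root parameter (handy for testing against a fake tree) are hypothetical.

```python
import os


def region_targets(region, sysfs="/sys"):
    """Read target0, target1, ... of a CXL region until one is missing."""
    base = os.path.join(sysfs, "bus/cxl/devices", region)
    targets = []
    index = 0
    while True:
        path = os.path.join(base, "target%d" % index)
        if not os.path.isfile(path):
            break
        with open(path) as f:
            name = f.read().strip()
        if name:  # skip empty target files
            targets.append(name)
        index += 1
    return targets


def decoder_mode(decoder, sysfs="/sys"):
    """Return the decoder 'mode' string ("ram", "pmem", ...) or None."""
    path = os.path.join(sysfs, "bus/cxl/devices", decoder, "mode")
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def region_kind(region, sysfs="/sys"):
    """Return "ram" or "pmem" if all target decoders agree, else None."""
    modes = {decoder_mode(d, sysfs) for d in region_targets(region, sysfs)}
    if modes == {"ram"}:
        return "ram"
    if modes == {"pmem"}:
        return "pmem"
    return None
```

Returning None when the targets disagree or are missing is a conservative choice for this sketch; how such a region should really be reported is left open.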

bgoglin commented 1 year ago

By the way, if we can locate the CXL devices behind DAX devices/NUMA nodes, we could use their PCI NUMA locality in case the DAX locality isn't correctly set (which is currently the case for a CXL RAM device in Qemu + Linux 6.2).
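A rough sketch of that fallback, assuming both the DAX device and the PCI device expose a numa_node file in sysfs and that "not correctly set" shows up as a missing file or a negative value (a wrong but non-negative value could not be detected this way). The function names are hypothetical.

```python
import os


def read_numa_node(device_dir):
    """Read <device_dir>/numa_node; return None if missing or negative."""
    try:
        with open(os.path.join(device_dir, "numa_node")) as f:
            node = int(f.read().strip())
    except (OSError, ValueError):
        return None
    return node if node >= 0 else None


def pick_numa_node(dax_node, pci_node):
    """Prefer the DAX locality, fall back to the PCI device's locality."""
    # Node 0 is valid, so compare against None rather than truthiness.
    return dax_node if dax_node is not None else pci_node


def locality(dax_name, pci_busid, sysfs="/sys"):
    """NUMA node for a DAX device, using its PCI device as a fallback."""
    dax = read_numa_node(os.path.join(sysfs, "bus/dax/devices", dax_name))
    pci = read_numa_node(os.path.join(sysfs, "bus/pci/devices", pci_busid))
    return pick_numa_node(dax, pci)
```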