open-mpi / hwloc

Hardware locality (hwloc)
https://www.open-mpi.org/projects/hwloc

use SRAT and dmi-sysfs to find locality of MemoryModule #139

Open bgoglin opened 9 years ago

bgoglin commented 9 years ago

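The ACPI SRAT table gives the NUMA locality of physical memory ranges. It's readable by root (just like the dmi-sysfs entries used for building MemoryModule Misc objects) in /sys/firmware/acpi/tables/SRAT. Those ranges may be linked to MemoryModule objects through the "Physical Device Handle" of a DMI type 20 (Memory Device Mapped Address) entry, which points to the matching type 17 (Memory Device) entry: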

Handle 0x0041, DMI type 17, 34 bytes
Memory Device
    Array Handle: 0x0039
    Error Information Handle: Not Provided
    Total Width: 72 bits
    Data Width: 64 bits
    Size: 8192 MB
    Form Factor: DIMM
    Set: None
    Locator: P2-DimmH1
    Bank Locator: P2-CH-H
    Type: DDR3
    Type Detail: Registered (Buffered)
    Speed: 1600 MHz
    Manufacturer: Hynix Semiconductor
    Serial Number: 30C08EC6     
    Asset Tag: DimmH1_AssetTag
    Part Number: HMT31GR7CFR4C-PB 
    Rank: 1
    Configured Clock Speed: 1600 MHz

Handle 0x0042, DMI type 20, 35 bytes
Memory Device Mapped Address
    Starting Address: 0x00E00000000
    Ending Address: 0x00FFFFFFFFF
    Range Size: 8 GB
    Physical Device Handle: 0x0041
    Memory Array Mapped Address Handle: 0x003A
    Partition Row Position: Unknown
    Interleave Position: Unknown
    Interleaved Data Depth: Unknown

It's not clear if the memory array mapped address handle could help in some cases too.
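The handle-based link described above can be sketched as follows. This is only an illustration that parses dmidecode-style text (trimmed from the output above), not how hwloc reads dmi-sysfs; the function names `parse_records` and `dimm_ranges` are hypothetical.

```python
import re

# Trimmed copy of the two records shown above (type 17 and type 20).
DMIDECODE_TEXT = """\
Handle 0x0041, DMI type 17, 34 bytes
Memory Device
    Size: 8192 MB
    Locator: P2-DimmH1

Handle 0x0042, DMI type 20, 35 bytes
Memory Device Mapped Address
    Starting Address: 0x00E00000000
    Ending Address: 0x00FFFFFFFFF
    Physical Device Handle: 0x0041
"""

HEADER = re.compile(r"Handle (0x[0-9A-Fa-f]+), DMI type (\d+)")


def parse_records(text):
    """Return {handle: (dmi_type, {field: value})} from dmidecode-style text."""
    records = {}
    fields = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            fields = {}
            records[int(header.group(1), 16)] = (int(header.group(2)), fields)
        elif fields is not None and ":" in line:
            key, _, value = line.strip().partition(":")
            fields[key] = value.strip()
    return records


def dimm_ranges(records):
    """Return [(locator, start, end_inclusive)] for every type 20 record."""
    ranges = []
    for dmi_type, fields in records.values():
        if dmi_type != 20:
            continue
        # Follow the Physical Device Handle to the type 17 Memory Device.
        device_handle = int(fields["Physical Device Handle"], 16)
        _, device_fields = records[device_handle]
        ranges.append((
            device_fields["Locator"],
            int(fields["Starting Address"], 16),
            int(fields["Ending Address"], 16),
        ))
    return ranges


ranges = dimm_ranges(parse_records(DMIDECODE_TEXT))
for locator, start, end in ranges:
    print(f"{locator}: {start:#x}-{end:#x}")
```

The result is one physical address range per DIMM locator, which is the input a SRAT lookup would need.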

Unfortunately, on some machines, the SRAT ranges are offset with respect to the DMI ranges. BIOS-e820 lines in dmesg confirm that some reserved address ranges are inserted in the middle of the DIMM mapping in physical memory, and sometimes even in the middle of a single DIMM.

bgoglin commented 9 years ago

Example from a dual E5v1 machine with 64GB of RAM.

DMI-sysfs information shows 8x 8GB DIMMs mapped from 0x00000000000 to 0x00FFFFFFFFF:

$ sudo dmidecode | grep -B 3 "8 GB"
Memory Device Mapped Address
    Starting Address: 0x00000000000
    Ending Address: 0x001FFFFFFFF
    Range Size: 8 GB
--
Memory Device Mapped Address
    Starting Address: 0x00200000000
    Ending Address: 0x003FFFFFFFF
    Range Size: 8 GB
--
Memory Device Mapped Address
    Starting Address: 0x00400000000
    Ending Address: 0x005FFFFFFFF
    Range Size: 8 GB
--
Memory Device Mapped Address
    Starting Address: 0x00600000000
    Ending Address: 0x007FFFFFFFF
    Range Size: 8 GB
--
Memory Device Mapped Address
    Starting Address: 0x00800000000
    Ending Address: 0x009FFFFFFFF
    Range Size: 8 GB
--
Memory Device Mapped Address
    Starting Address: 0x00A00000000
    Ending Address: 0x00BFFFFFFFF
    Range Size: 8 GB
--
Memory Device Mapped Address
    Starting Address: 0x00C00000000
    Ending Address: 0x00DFFFFFFFF
    Range Size: 8 GB
--
Memory Device Mapped Address
    Starting Address: 0x00E00000000
    Ending Address: 0x00FFFFFFFFF
    Range Size: 8 GB

SRAT information has a hole from 0x0080000000 to 0x0100000000 (from 2GB to 4GB):

memory from 0x0000000000 to 0x0080000000 on NUMA node #0
memory from 0x0100000000 to 0x0880000000 on NUMA node #0
memory from 0x0880000000 to 0x1080000000 on NUMA node #1

BIOS e820 information in dmesg:

[    0.000000] BIOS-provided physical RAM map:
[    0.000000]  BIOS-e820: 0000000000000000 - 000000000009b400 (usable)
[    0.000000]  BIOS-e820: 000000000009b400 - 00000000000a0000 (reserved)
[    0.000000]  BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
[    0.000000]  BIOS-e820: 0000000000100000 - 000000007e443000 (usable)
[    0.000000]  BIOS-e820: 000000007e443000 - 000000007e56a000 (ACPI NVS)
[    0.000000]  BIOS-e820: 000000007e56a000 - 000000007f207000 (reserved)
[    0.000000]  BIOS-e820: 000000007f207000 - 000000007f27c000 (ACPI data)
[    0.000000]  BIOS-e820: 000000007f27c000 - 000000007f317000 (reserved)
[    0.000000]  BIOS-e820: 000000007f317000 - 000000007f318000 (ACPI NVS)
[    0.000000]  BIOS-e820: 000000007f318000 - 000000007f339000 (reserved)
[    0.000000]  BIOS-e820: 000000007f339000 - 000000007f341000 (ACPI NVS)
[    0.000000]  BIOS-e820: 000000007f341000 - 000000007f36a000 (reserved)
[    0.000000]  BIOS-e820: 000000007f36a000 - 000000007f800000 (ACPI NVS)
[    0.000000]  BIOS-e820: 0000000080000000 - 0000000090000000 (reserved)
[    0.000000]  BIOS-e820: 00000000fed1c000 - 00000000fed40000 (reserved)
[    0.000000]  BIOS-e820: 00000000ff000000 - 0000000100000000 (reserved)
[    0.000000]  BIOS-e820: 0000000100000000 - 0000001080000000 (usable)

The hole corresponds to the 3 reserved lines at the end (between 0x80000000 and 0x100000000), just before the last usable range.
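The check against the e820 map can be reproduced with a short sketch. It assumes the old `start - end (type)` dmesg format shown above (newer kernels print these lines differently), and the helper name `entries_in` is hypothetical.

```python
import re

# Last lines of the BIOS-e820 map shown above.
E820_TAIL = """\
[    0.000000]  BIOS-e820: 000000007f36a000 - 000000007f800000 (ACPI NVS)
[    0.000000]  BIOS-e820: 0000000080000000 - 0000000090000000 (reserved)
[    0.000000]  BIOS-e820: 00000000fed1c000 - 00000000fed40000 (reserved)
[    0.000000]  BIOS-e820: 00000000ff000000 - 0000000100000000 (reserved)
[    0.000000]  BIOS-e820: 0000000100000000 - 0000001080000000 (usable)
"""

# Hole between the first two SRAT ranges (2GB to 4GB).
HOLE = (0x0080000000, 0x0100000000)

E820_LINE = re.compile(r"BIOS-e820: ([0-9a-f]+) - ([0-9a-f]+) \((.+)\)")


def entries_in(text, low, high):
    """Return [(start, end, kind)] for e820 entries fully inside [low, high]."""
    found = []
    for match in E820_LINE.finditer(text):
        start, end = int(match.group(1), 16), int(match.group(2), 16)
        if low <= start and end <= high:
            found.append((start, end, match.group(3)))
    return found


in_hole = entries_in(E820_TAIL, *HOLE)
for start, end, kind in in_hole:
    print(f"{start:#012x}-{end:#012x} {kind}")
```

Every entry it reports inside the hole is of type `reserved`; none is usable RAM.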

xiongzubiao commented 1 year ago

Strongly interested in this feature! It would be really nice to show where the memory modules belong.

sscargal commented 1 year ago

+1 Upvote. Having the MemoryModule objects correctly appear under the Package# would be very useful indeed.

bgoglin commented 1 year ago

Unfortunately, I still don't know of any reliable way to implement this :/ If some CPU vendors can help, that'd be great.