OpenMPDK / SMDK

SMDK (Scalable Memory Development Kit) is developed for the Samsung CXL (Compute Express Link) Memory Expander to enable a full-stack Software-Defined Memory system.

Zone ExMem: How to solve the "ExMem empty" problem? #26

Closed Taeung closed 1 year ago

Taeung commented 1 year ago

Hi. First of all, thank you for this open-source project! :smile:

How can I solve the "[ 0.091860] ExMem empty" problem? I would appreciate your help! :smiley:

$ dmesg
...
[    0.088526] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x7fffffff]
[    0.088542] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x107fffffff]
[    0.088555] ACPI: SRAT: Node 1 PXM 1 [mem 0x1080000000-0x207fffffff]
[    0.088707] NUMA: Initialized distance table, cnt=2
[    0.088738] NUMA: Node 0 [mem 0x00000000-0x7fffffff] + [mem 0x100000000-0x107fffffff] -> [mem 0x00000000-0x107fffffff]
[    0.088865] NODE_DATA(0) allocated [mem 0x107ffc1000-0x107fffffff]
[    0.088996] NODE_DATA(1) allocated [mem 0x204ff9e000-0x204ffdcfff]
[    0.091805] Zone ranges:
[    0.091811]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
[    0.091831]   DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
[    0.091847]   Normal   [mem 0x0000000100000000-0x000000207fffffff]
[    0.091860]   ExMem    empty
[    0.091869]   Device   empty
[    0.091877] Movable zone start for each node
[    0.091891] Early memory node ranges
[    0.091896]   node   0: [mem 0x0000000000001000-0x000000000003dfff]
[    0.091910]   node   0: [mem 0x000000000003f000-0x000000000009ffff]
[    0.091919]   node   0: [mem 0x0000000000100000-0x000000006d90efff]
[    0.091929]   node   0: [mem 0x00000000777ff000-0x00000000777fffff]
[    0.091938]   node   0: [mem 0x0000000100000000-0x000000107fffffff]
[    0.091992]   node   1: [mem 0x0000001080000000-0x000000207fffffff]
[    0.092050] Initmem setup node 0 [mem 0x0000000000001000-0x000000107fffffff]
[    0.092074] Initmem setup node 1 [mem 0x0000001080000000-0x000000207fffffff]
[    0.092093] On node 0, zone DMA: 1 pages in unavailable ranges
[    0.092110] On node 0, zone DMA: 1 pages in unavailable ranges
[    0.092239] On node 0, zone DMA: 96 pages in unavailable ranges
[    0.103132] On node 0, zone DMA32: 40688 pages in unavailable ranges
[    0.245226] On node 0, zone Normal: 2048 pages in unavailable ranges
...
[    6.773184] cxlswap using pool (allocator: cxlbud)
[    6.773734] cxlcache using pool (allocator: cxlbud)
...
[    7.466205] cxl_pci 0000:3d:00.0: enabling device (0140 -> 0142)
[    7.469079] CXL: register cxl dvsec ranges
[    7.471867] CXL: node count: 0
[    7.474721] cxl_pci 0000:3d:00.0: No component registers (-19)
[    7.477450] cxl_pci 0000:3d:00.0: Failed to request region 0x0000000000001fff-0x000000000010201e
...
[   13.062696] cxl_mem mem0: CXL port topology not found
...

My system info:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.2 LTS
Release:    22.04
Codename:   jammy

$ uname -r
6.4.0-smdk

$ cat /proc/buddyinfo 
Node 0, zone      DMA      2      2      1      0      0      0      0      0      1      1      2 
Node 0, zone    DMA32      6      6      6      5      4      4      3      5      6      2    413 
Node 0, zone   Normal    403     37     51     53     73     71     32      3     45     16  13110 
Node 1, zone   Normal     28   1896   1150    424    191     55      7      3      3      2  13893 

$ lsmod | grep cxl
cxl_mem                12288  0
cxl_port               16384  0
cxl_acpi               24576  0
cxl_pci               143360  1 cxl_acpi
cxl_core              229376  4 cxl_port,cxl_mem,cxl_pci,cxl_acpi

$ cd /lib/modules/6.4.0-smdk/kernel/drivers/cxl
$ tree
.
├── core
│   └── cxl_core.ko
├── cxl_acpi.ko
├── cxl_mem.ko
├── cxl_pci.ko
├── cxl_pmem.ko
└── cxl_port.ko

CXL Device info:

$ lspci -vvv
...
3d:00.0 CXL: Device 1b00:c000 (rev 01) (prog-if 10 [CXL Memory Device (CXL 2.x)])
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        NUMA node: 0
        Region 0: Memory at 20e000000000 (64-bit, prefetchable) [size=16M]
        Region 2: Memory at 20c000000000 (64-bit, prefetchable) [size=128G]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [50] MSI: Enable- Count=1/8 Maskable+ 64bit+
                Address: 0000000000000000  Data: 0000
                Masking: 00000000  Pending: 00000000
        Capabilities: [70] Express (v2) Root Complex Integrated Endpoint, MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0
                        ExtTag- RBE+ FLReset+
                DevCtl: CorrErr- NonFatalErr- FatalErr+ UnsupReq-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset-
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis+ NROPrPrP- LTR-
                         10BitTagComp+ 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS-
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
                         AtomicOpsCtl: ReqEn-
        Capabilities: [b0] MSI-X: Enable+ Count=7 Masked-
                Vector table: BAR=0 offset=00010000
                PBA: BAR=0 offset=00011000
        Capabilities: [100 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC+ UnsupReq+ ACSViol-
                UESvrt: DLP+ SDES+ TLP+ FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
                AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Capabilities: [148 v1] Device Serial Number 8a-0a-f7-67-60-30-00-00
        Capabilities: [374 v1] Extended Capability ID 0x2f
        Capabilities: [3c0 v1] Designated Vendor-Specific: Vendor=1e98 ID=0000 Rev=1 Len=56: CXL
                CXLCap: Cache- IO+ Mem+ Mem HW Init+ HDMCount 1 Viral+
                CXLCtl: Cache- IO+ Mem- Cache SF Cov 0 Cache SF Gran 0 Cache Clean- Viral-
                CXLSta: Viral-
        Capabilities: [3f8 v1] Designated Vendor-Specific: Vendor=1e98 ID=000a Rev=0 Len=28 <?>
        Capabilities: [d04 v1] Designated Vendor-Specific: Vendor=1e98 ID=0008 Rev=0 Len=28 <?>
        Capabilities: [d48 v1] Extended Capability ID 0x2e
        Capabilities: [d60 v1] Designated Vendor-Specific: Vendor=8086 ID=0050 Rev=0 Len=12 <?>
        Capabilities: [d6c v1] Designated Vendor-Specific: Vendor=1e98 ID=0005 Rev=0 Len=16 <?>
        Capabilities: [d80 v1] Extended Capability ID 0x2e
        Kernel driver in use: cxl_pci
        Kernel modules: cxl_pci

Taeung commented 1 year ago

@KyungsanKim I would appreciate it if you could help me!

wj28lees commented 1 year ago

Hi, thanks for the detailed log information :)

CXL memory is registered at the time of probing, so it is normal to see "ExMem empty" in the "Zone ranges" dmesg log.
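
As a rough way to check whether the zone gets populated once the driver has probed, the per-zone free lists in /proc/buddyinfo can be summed. The following is a minimal Python sketch, not part of SMDK. It assumes 4 KiB base pages, and it assumes that a populated ExMem zone would be listed in /proc/buddyinfo under the name "ExMem" (the name used in the "Zone ranges" log). The sample lines are copied from the output posted above.

```python
from collections import defaultdict

PAGE_SIZE = 4096  # assumption: 4 KiB base pages (typical on x86-64)


def free_pages_by_zone(buddyinfo_text):
    """Return {(node, zone): free_pages} from /proc/buddyinfo-style text."""
    totals = defaultdict(int)
    for line in buddyinfo_text.splitlines():
        parts = line.replace(",", " ").split()
        # Expected layout: Node <n> zone <name> <count order 0> <count order 1> ...
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node, zone = int(parts[1]), parts[3]
        counts = [int(c) for c in parts[4:]]
        # A free block of order k spans 2**k contiguous pages.
        totals[(node, zone)] += sum(c << order for order, c in enumerate(counts))
    return dict(totals)


# Two lines copied from the buddyinfo output earlier in this thread.
SAMPLE = """\
Node 0, zone      DMA      2      2      1      0      0      0      0      0      1      1      2
Node 1, zone   Normal     28   1896   1150    424    191     55      7      3      3      2  13893
"""

zones = free_pages_by_zone(SAMPLE)
has_exmem = any(zone == "ExMem" for _, zone in zones)

for (node, zone), pages in sorted(zones.items()):
    mib = (pages * PAGE_SIZE) // (1 << 20)
    print(f"node {node} zone {zone}: {pages} free pages (~{mib} MiB)")
print("ExMem zone listed:", has_exmem)
```

On a live system the same function could be fed the contents of /proc/buddyinfo instead of SAMPLE.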

Based on the dmesg log, it looks like the CXL memory range is not listed in the ACPI SRAT. For now, to use CXL memory, the BIOS must describe the CXL memory range and provide it to the OS.

Check the "BIOS-provided physical RAM map" in dmesg for the CXL memory range. If there is no CXL range information, you may need to check if your system's BIOS supports CXL memory. (BTW, what system and CXL device are you currently using?)
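
The check described above can be sketched in Python as follows. This is only an illustration under stated assumptions: it matches the "BIOS-e820:" and "ACPI: SRAT:" line formats as they appear in the dmesg logs in this thread, the helper names are made up for this example, and treating "soft reserved" e820 entries as the candidate CXL range is an assumption rather than something SMDK documents here.

```python
import re

E820_RE = re.compile(
    r"BIOS-e820: \[mem (0x[0-9a-fA-F]+)-(0x[0-9a-fA-F]+)\] (.+)$")
SRAT_RE = re.compile(
    r"ACPI: SRAT: Node (\d+) PXM (\d+) \[mem (0x[0-9a-fA-F]+)-(0x[0-9a-fA-F]+)\]")


def parse_firmware_ranges(dmesg_text):
    """Return (e820, srat) lists parsed from dmesg text.

    e820 entries are (start, end, type); srat entries are (node, start, end).
    """
    e820, srat = [], []
    for line in dmesg_text.splitlines():
        line = line.rstrip()
        m = E820_RE.search(line)
        if m:
            e820.append((int(m.group(1), 16), int(m.group(2), 16), m.group(3)))
            continue
        m = SRAT_RE.search(line)
        if m:
            srat.append((int(m.group(1)), int(m.group(3), 16), int(m.group(4), 16)))
    return e820, srat


def uncovered_soft_reserved(e820, srat):
    """Soft-reserved e820 ranges that no single SRAT entry fully covers."""
    missing = []
    for start, end, kind in e820:
        if kind != "soft reserved":
            continue
        if not any(s <= start and end <= e for _, s, e in srat):
            missing.append((start, end))
    return missing


# Sample lines copied from a dmesg log in this thread.
SAMPLE = """\
[    0.000000] BIOS-e820: [mem 0x0000002080000000-0x000000407fffffff] soft reserved
[    0.256500] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x7fffffff]
[    0.256502] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x107fffffff]
[    0.256504] ACPI: SRAT: Node 1 PXM 1 [mem 0x1080000000-0x207fffffff]
[    0.256518] ACPI: SRAT: Node 2 PXM 2 [mem 0x2080000000-0x407fffffff]
"""

e820, srat = parse_firmware_ranges(SAMPLE)
for start, end in uncovered_soft_reserved(e820, srat):
    print(f"not covered by SRAT: {start:#x}-{end:#x}")
```

With the sample lines nothing is reported, because the Node 2 entry covers the soft-reserved range. Removing the Node 2 line would make the function report that range.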

Thanks.

Taeung commented 1 year ago

Thank you for your reply @wj28lees !

I'm using a SAMSUNG CXL MEMORY EXPANDER (MXFAG1280B20-CWK00) on a Supermicro board (X13DEG-QT) with Ubuntu 22.04 server.

$ sudo dmidecode -t 1
# dmidecode 3.3
Getting SMBIOS data from sysfs.
SMBIOS 3.5.0 present.

Handle 0x0001, DMI type 1, 27 bytes
System Information
    Manufacturer: Supermicro
    Product Name: SYS-741GE-TNRT
    Version: 0123456789
    Serial Number: A...
    UUID: 00000000-0000-0000-0000-7cc2559d47fc
    Wake-up Type: Power Switch
    SKU Number: To be filled by O.E.M.
    Family: Family

$ sudo dmidecode -t 2
# dmidecode 3.3
Getting SMBIOS data from sysfs.
SMBIOS 3.5.0 present.

Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
    Manufacturer: Supermicro
    Product Name: X13DEG-QT
    Version: 1.10
    Serial Number: OM...
    Asset Tag: Base Board Asset Tag
    Features:
        Board is a hosting board
        Board is replaceable
    Location In Chassis: Part Component
    Chassis Handle: 0x0003
    Type: Motherboard
    Contained Object Handles: 0

Taeung commented 1 year ago

What you told me was very helpful. I rechecked the BIOS configuration manual (PDF), found the "CXL Type 3 Legacy" option, and set it to "Enable".

CXL Type 3 Legacy
Select Enable to use the CXL Type 3 memory device, which can be supported by the
CXL Type 2 flows, for memory bandwidth and capacity expansion. The options are
Disable and Enable.

Now I can see the CXL range information! But the ExMem zone is still empty. How can I solve this?

$ dmesg
...
[    0.000000] BIOS-e820: [mem 0x0000002080000000-0x000000407fffffff] soft reserved
[    0.000000] efi_fake_mem: add attr=0x0000000000040000 to [mem 0x0000002080000000-0x000000407fffffff]
[    0.000000] e820: remove [mem 0x2080000000-0x404fe25fff] usable
...
[    0.000000] reserve setup_data: [mem 0x0000002080000000-0x000000404fe25fff] soft reserved
[    0.000000] reserve setup_data: [mem 0x000000404fe26000-0x000000407fffffff] soft reserved
...
[    0.014957] efi: mem668: [Conventional|   |  |  |SP|  |  |  |  |  |   |WB|WT|WC|UC] range=[0x0000002080000000-0x000000404fe25fff] (130302MB)
[    0.014959] efi: mem669: [Loader Data |   |  |  |SP|  |  |  |  |  |   |WB|WT|WC|UC] range=[0x000000404fe26000-0x000000407fffffff] (769MB)
...
[    0.015600] RAMDISK: [mem 0x404fe26000-0x407fffffff]
...
[    0.256500] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x7fffffff]
[    0.256502] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x107fffffff]
[    0.256504] ACPI: SRAT: Node 1 PXM 1 [mem 0x1080000000-0x207fffffff]
[    0.256518] ACPI: SRAT: Node 2 PXM 2 [mem 0x2080000000-0x407fffffff]
...
[    0.256525] NUMA: Initialized distance table, cnt=3
[    0.256530] NUMA: Node 0 [mem 0x00000000-0x7fffffff] + [mem 0x100000000-0x107fffffff] -> [mem 0x00000000-0x107fffffff]
[    0.256545] NODE_DATA(0) allocated [mem 0x107ffc1000-0x107fffffff]
[    0.256580] NODE_DATA(1) allocated [mem 0x204fde6000-0x204fe24fff]
...
[    0.257081] Zone ranges:
[    0.257082]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
[    0.257083]   DMA32    [mem 0x0000000001000000-0x00000000ffffffff]
[    0.257085]   Normal   [mem 0x0000000100000000-0x000000207fffffff]
[    0.257086]   ExMem    empty
[    0.257087]   Device   empty
...
[    0.257090] Early memory node ranges
[    0.257091]   node   0: [mem 0x0000000000001000-0x000000000003dfff]
[    0.257093]   node   0: [mem 0x000000000003f000-0x000000000009ffff]
[    0.257094]   node   0: [mem 0x0000000000100000-0x000000006d90efff]
[    0.257095]   node   0: [mem 0x00000000777ff000-0x00000000777fffff]
[    0.257096]   node   0: [mem 0x0000000100000000-0x000000107fffffff]
[    0.257102]   node   1: [mem 0x0000001080000000-0x000000207fffffff]
[    0.257109] Initmem setup node 0 [mem 0x0000000000001000-0x000000107fffffff]
[    0.257113] Initmem setup node 1 [mem 0x0000001080000000-0x000000207fffffff]
[    0.257115] Initializing node 2 as memoryless
[    0.257158] Initmem setup node 2 as memoryless
...
[    1.645706] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.4.0-smdk root=UUID=b7248bf1-0cc5-4dca-ac40-872158720c6a ro efi_fake_mem=128G@0x2080000000:0x40000
...
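
To make the efi_fake_mem value on the kernel command line above easier to read, here is a small Python sketch that decodes one entry of the form size@start:attribute. It is a simplified parser written for this example (single entry, optional K/M/G suffix on the size only), not the kernel's own parsing code. The constant 0x40000 is the UEFI "specific purpose" (EFI_MEMORY_SP) memory attribute.

```python
EFI_MEMORY_SP = 0x40000  # UEFI "specific purpose" memory attribute bit


def parse_efi_fake_mem(value):
    """Parse one 'size@start:attr' entry into (start, end_inclusive, attr).

    Simplified for illustration: one entry only, K/M/G suffix on the size only.
    """
    size_part, rest = value.split("@", 1)
    start_part, attr_part = rest.split(":", 1)
    units = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
    suffix = size_part[-1].upper()
    if suffix in units:
        size = int(size_part[:-1], 0) * units[suffix]
    else:
        size = int(size_part, 0)
    start = int(start_part, 0)
    return start, start + size - 1, int(attr_part, 0)


# Value taken from the "Kernel command line" entry in the log above.
start, end, attr = parse_efi_fake_mem("128G@0x2080000000:0x40000")
print(f"range: {start:#x}-{end:#x}")
print("specific-purpose attribute set:", bool(attr & EFI_MEMORY_SP))
```

The decoded range, 0x2080000000 to 0x407fffffff, is the same range that the "efi_fake_mem: add attr=..." and "BIOS-e820: ... soft reserved" lines report, and its 128 GiB size matches Region 2 in the lspci output.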

CXL-related messages:

$ dmesg
[    8.508837] cxlswap using pool (allocator: cxlbud)
[    8.509402] cxlcache using pool (allocator: cxlbud)
...
[    9.195644] CXL: soft reserved: [mem 0x2080000000-0x404fe3efff]
[    9.198456] CXL: out of range: [mem 0x2080000000-0x407fffffff]
[    9.201258] CXL: failed to add cxl meminfo
[    9.204045] CXL: node count: 0
[    9.206853] cxl_pci 0000:3d:00.0: enabling device (0140 -> 0142)
[    9.209611] CXL: register cxl dvsec ranges
[    9.212326] CXL: soft reserved: [mem 0x2080000000-0x404fe3efff]
[    9.212374] AVX2 version of gcm_enc/dec engaged.
[    9.215074] CXL: out of range: [mem 0x2080000000-0x407fffffff]
[    9.215075] CXL: Failed to add cxl meminfo
[    9.215075] CXL: node count: 0
[    9.215200] cxl_pci 0000:3d:00.0: No component registers (-19)
[    9.228452] cxl_pci 0000:3d:00.0: Failed to request region 0x0000000000001fff-0x000000000010201e
[    9.231212] AES CTR mode by8 optimization enabled
[    9.232498] can't find cxl_memblk.
...
[   14.777261] cxl_mem mem0: CXL port topology not found
$ cat /proc/buddyinfo 
Node 0, zone      DMA      2      2      1      0      0      0      0      0      1      1      2 
Node 0, zone    DMA32      8      7      7      6      7      6      5      5      7      3    412 
Node 0, zone   Normal    461    319    123    284    667    733    351    215    111     52  15268 
Node 1, zone   Normal  41094  18027   2637   1534   1015    647    289     91     50     38  13786 
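
For reference, the ranges from the "CXL: soft reserved", "CXL: out of range", and "RAMDISK" lines above can be compared with a few lines of Python. This is only arithmetic on the logged addresses. It does not reproduce the SMDK driver's actual check, and reading "out of range" as "the second range does not fit inside the first" is an assumption, not something confirmed in this thread.

```python
def contains(outer, inner):
    """True if the inclusive range `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def overlaps(a, b):
    """True if the inclusive ranges `a` and `b` share at least one address."""
    return a[0] <= b[1] and b[0] <= a[1]


# Addresses copied from the dmesg lines above.
soft_reserved = (0x2080000000, 0x404FE3EFFF)  # "CXL: soft reserved"
reported = (0x2080000000, 0x407FFFFFFF)       # "CXL: out of range"
ramdisk = (0x404FE26000, 0x407FFFFFFF)        # "RAMDISK"

fits = contains(soft_reserved, reported)
# Part of the reported range that extends past the soft-reserved span.
tail = (soft_reserved[1] + 1, reported[1])
tail_mib = (tail[1] - tail[0] + 1) / (1 << 20)

print("reported range fits in soft-reserved span:", fits)
print(f"tail outside the span: {tail[0]:#x}-{tail[1]:#x} (~{tail_mib:.0f} MiB)")
print("tail overlaps RAMDISK range:", overlaps(tail, ramdisk))
```

Under that assumption, the reported range extends past the soft-reserved span, and the part that sticks out overlaps the address range the log lists for the RAMDISK.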

wj28lees commented 1 year ago

Taeung, I'm glad to hear you found this helpful!

Could you please reach out to us at these email addresses? wj28.lee@samsung.com, ks0204.kim@samsung.com

It would be convenient to analyse this further via email, including requesting more information to understand what is happening. (It would also be helpful if you could provide us with information about your research organisation or company.)

Thanks!

Taeung commented 1 year ago

OK, I'll do as you said.

wj28lees commented 1 year ago

This issue has been resolved, so I'll close it. Thanks!