intel / ipmctl

Use PMem in different modes in different sockets #170

Closed dkoutsou closed 3 years ago

dkoutsou commented 3 years ago

Hi, I have 512 GB of Intel Optane Persistent Memory divided across two sockets:

$ ipmctl show -region
 SocketID | ISetID             | PersistentMemoryType | Capacity    | FreeCapacity | HealthState
=================================================================================================
 0x0000   | 0x7002eeb8a8852444 | AppDirect            | 252.000 GiB | 252.000 GiB  | Healthy
 0x0001   | 0xde02eeb865852444 | AppDirect            | 252.000 GiB | 252.000 GiB  | Healthy

This is the system configuration:

$ ipmctl show -memoryresources
 MemoryType   | DDR                 | PMemModule  | Total
========================================================================
 Volatile     | 128.000 GiB         | 0.000 GiB   | 128.000 GiB
 AppDirect    | -                   | 504.000 GiB | 504.000 GiB
 Cache        | 0.000 GiB           | -           | 0.000 GiB
 Inaccessible | 17179869056.000 GiB | 1.689 GiB   | 17179869057.689 GiB
 Physical     | 0.000 GiB           | 505.689 GiB | 505.689 GiB

I want to set the memory in the first socket as volatile memory and the memory in the other socket in App Direct mode. To do that I use the following commands:

$ ndctl disable-namespace namespace1.0
$ ndctl destroy-namespace namespace1.0
$ ipmctl create -socket 1 -goal MemoryMode=100
$ shutdown -r now

However, the server then fails to reboot, and when it does come up, the memory is shown as inaccessible:

$ ipmctl show -memoryresources
 MemoryType   | DDR                 | PMemModule  | Total
================================================================
 Volatile     | 128.000 GiB         | 0.000 GiB   | 128.000 GiB
 AppDirect    | -                   | 252.000 GiB | 252.000 GiB
 Cache        | 0.000 GiB           | -           | 0.000 GiB
 Inaccessible | 17179869056.000 GiB | 253.689 GiB | 125.689 GiB
 Physical     | 0.000 GiB           | 505.689 GiB | 505.689 GiB

For the boot error, see the attached screenshot: 20210513-001231

My question is: Is it possible to use the memory from one socket as volatile memory and the memory from the other socket in App Direct mode? Is the error a limitation of my server configuration, of ipmctl, or something else?

Thanks!

sscargal commented 3 years ago

Hi @dkoutsou,

I want to set the memory in the first socket as volatile memory and the memory in the other socket in App Direct mode.

That is not recommended. HPE supports PMem:DRAM ratios from 2:1 to 16:1, so you will be fine using 100% Memory Mode, and Mixed Mode needs to be configured with these ratios in mind.

In Mixed Mode, 100% of the DRAM across all CPU sockets is used as the cache and is not visible to the host, regardless of the percentage value, the PMem modules, or the CPU sockets you target in the ipmctl create -goal request.

App Direct, Memory Mode, or Mixed Mode should be applied across all CPU sockets in a symmetric manner (the default). This allows you to run any workload on any CPU socket. If you provision an asymmetric configuration, apps that need volatile memory but run on the CPU socket with only App Direct are forced to go over the UPI link, which introduces additional latency and reduces performance, and vice versa for any app that needs App Direct/persistence.
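
For context on the NUMA aspect, the commands below show how PMem capacity maps to sockets/NUMA nodes (a sketch; it assumes ndctl and numactl are installed, and field names vary slightly between versions):

$ ndctl list -R -v   # lists each App Direct region with its size, type, and numa_node
$ numactl -H         # shows per-node memory sizes, e.g. after Memory Mode capacity is added to a node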

Is the error a limitation of my server configuration, of ipmctl or something else?

Can you post the output of ipmctl show -topology, please? It'll confirm how many DDR modules you have in the host and in which socket. Or you can post a screenshot of the Memory section in the iDRAC, which has the same info.
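
For reference, these are the commands that collect that information (the exact columns and slot names depend on your platform and ipmctl version):

$ ipmctl show -topology   # lists every DDR and PMem module with its capacity and physical slot
$ ipmctl show -dimm       # lists only the PMem modules with their DimmID, capacity, and health state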

We first need to confirm the DDR and PMem modules are installed in the correct slots. HPE has a DDR and PMem population matrix for each server. Make sure you read and follow the guidelines for your server and the Intel Optane Persistent Memory 100 Series for HPE User Guide.

If you're using ipmctl v2.0.0.3809, then you'll need to perform a secure erase and provision PMem through the iDRAC or update ipmctl to a version greater than or equal to 2.0.0.3820. There was a bug in the 3809 release that can cause provisioning issues.
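
You can check the installed release with:

$ ipmctl version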

dkoutsou commented 3 years ago

Hi @sscargal,

Thanks a lot for your reply! Unfortunately, ipmctl show -topology doesn't return anything. Here is a screenshot of the iDRAC: Screenshot 2021-05-20 at 11 06 24

The ipmctl version is 2.00.00.3852.

sscargal commented 3 years ago

Thank you. The population is correct and the version of ipmctl is fine. The BIOS is enforcing a symmetric configuration across all CPU sockets, which leads us back to $ ipmctl create -socket 1 -goal MemoryMode=100 as the root cause of the errors you're getting during POST.

To resolve the issue, I recommend switching to 100% Memory Mode, 100% App Direct, or 50% Mixed Mode, depending on your needs. You can do this using ipmctl or the iDRAC/BIOS by following the HPE instructions.

# 100% App Direct
ipmctl create -goal PersistentMemoryType=AppDirect

# 100% Memory Mode
ipmctl create -goal MemoryMode=100

# 50% Mixed Mode
ipmctl create -goal MemoryMode=50 PersistentMemoryType=AppDirect
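
After creating a goal, a typical apply-and-verify sequence looks roughly like the sketch below (region and namespace names will differ on your system):

# a goal that has not yet been applied by a reboot can be discarded with:
ipmctl delete -goal

# after creating the new goal, confirm it before rebooting
ipmctl show -goal

# the BIOS applies the goal during the next boot
shutdown -r now

# after the reboot, verify the new provisioning
ipmctl show -memoryresources

# for App Direct capacity, create a namespace on each new region
# (repeat per region, or target one with -r)
ndctl create-namespace
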
dkoutsou commented 3 years ago

Thanks a lot for your reply, @sscargal!