pmem / ndctl

A "device memory" enabling project encompassing tools and libraries for CXL, NVDIMMs, DAX, memory tiering and other platform memory device topics.

nvdimm mode change after power reset #145

Open wm8120 opened 4 years ago

wm8120 commented 4 years ago

We use ndctl to configure an NVDIMM namespace in devdax mode. The command we use is:

ndctl create-namespace --mode devdax --map dev -e namespace0.0 -f

Before the power reset, ndctl list returns:

$ sudo ndctl list -N
[
    {
        "dev":"namespace0.0",
        "mode":"devdax",
        "map":"dev",
        "size":16909336576,
        "chardev":"dax0.0",
        "align":2097152
    }
]

We then run traffic and power-reset the system. After the reboot, the namespace mode has changed:

$ sudo ndctl list -N
[
    {
        "dev":"namespace0.0",
        "mode":"raw",
        "size":17179869184,
        "sector_size":"512",
        "blockdev":"pmem0"
    }
]
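
The two listings differ in size as well as in mode, and the difference can be checked with a little arithmetic. The sketch below assumes 4 KiB pages and 64 bytes of page metadata per page (both are assumptions, not values reported in this thread). With `--map dev`, that metadata is stored on the device itself, so it is subtracted from the usable devdax size.

```shell
#!/bin/sh
# Sizes copied from the two `ndctl list -N` outputs above.
raw_size=17179869184      # "size" reported in raw mode
dax_size=16909336576      # "size" reported in devdax mode

# Assumptions for illustration only:
page_size=4096            # 4 KiB base pages
page_struct=64            # bytes of page metadata per page

overhead=$((raw_size - dax_size))
memmap=$((raw_size / page_size * page_struct))
remainder=$((overhead - memmap))

echo "overhead:  $((overhead / 1048576)) MiB"    # prints "overhead:  258 MiB"
echo "page map:  $((memmap / 1048576)) MiB"      # prints "page map:  256 MiB"
echo "remainder: $((remainder / 1048576)) MiB"   # prints "remainder: 2 MiB"
```

Under these assumptions the 258 MiB gap is a 256 MiB page map plus 2 MiB, which matches the 2097152-byte alignment in the first listing. A namespace that comes back as raw at the full 16 GiB suggests the kernel no longer recognised the devdax metadata at the start of the namespace, although nothing in this thread confirms that directly.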

The application can no longer open the NVDIMM in devdax mode or retrieve the saved data. This does not happen on every reboot, but each time it does, we have to recreate the namespace in devdax mode and we lose the saved data. What causes the mode change? Is there anything wrong with how we are using ndctl and the NVDIMM?
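
Because recreating the namespace with `-f` discards whatever is on the device, it may help to check the mode at boot and stop before anything is rewritten. The following is a minimal sketch, not part of ndctl: `ns_mode` and `require_devdax` are hypothetical helper names, and the parsing assumes the one-key-per-line layout shown in the listings above, with "dev" appearing before "mode".

```shell
#!/bin/sh
# Print the "mode" of one namespace from `ndctl list -N` JSON read on stdin.
# Assumes the one-key-per-line layout shown in this thread; a JSON-aware
# tool would be more robust where one is available.
ns_mode() {
    # $1 = namespace name, e.g. namespace0.0
    awk -v want="$1" '
        /"dev":/  { gsub(/[",]/, ""); split($0, kv, ":"); dev = kv[2]; gsub(/[ \t]/, "", dev) }
        /"mode":/ { gsub(/[",]/, ""); split($0, kv, ":"); m = kv[2]; gsub(/[ \t]/, "", m)
                    if (dev == want) { print m; exit } }
    '
}

# Return 0 only if the namespace is in devdax mode; otherwise complain and
# return 1 so the caller can stop before recreating the namespace.
require_devdax() {
    mode=$(ns_mode "$1")
    if [ "$mode" = "devdax" ]; then
        return 0
    fi
    echo "refusing to continue: $1 is in mode '${mode:-unknown}', expected devdax" >&2
    return 1
}

# Usage on a real system (not executed here):
#   sudo ndctl list -N | require_devdax namespace0.0 || exit 1
```

Stopping at this point leaves the device untouched, so its state can still be shown to the vendor before the namespace is recreated.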

OS: Ubuntu 16.04; Kernel: 4.20.8; ndctl version: 65+

sscargal commented 4 years ago

@wm8120 Does this system have physical PMem installed, or are you emulating it with the kernel memmap option or kvm/qemu? If you're emulating NVDIMMs with the memmap option (i.e., using DRAM as the backing store), then I would expect the observed behaviour.

wm8120 commented 4 years ago

@sscargal It's a battery-backed NVDIMM. We expect it to persist data across every reset, but the namespace mode reported by ndctl sometimes changes after a reset.

sscargal commented 4 years ago

It sounds like the issue is more likely on the NVDIMM-N hardware side than in the kernel. Make sure the BIOS and NVDIMM firmware are up to date. Also, try a newer kernel and/or OS version. I doubt that a newer kernel or distro will resolve the problem, but it's worth verifying on the most recent software 'just in case'.

Q) How exactly do you perform the reset and power-cycle? Are you issuing systemctl reboot or yanking the power?

The NVDIMMs should receive a SAVE instruction/signal when power is lost. This tells the modules to begin copying the data from the volatile DRAM to the non-volatile media. If the DIMM(s) do not complete the copy before their energy reserves are exhausted, you'll end up with data corruption (similar to what you describe). If any diagnostic logs can be retrieved from the modules, they can hopefully be used to debug further and confirm whether the copy to and from non-volatile memory succeeded. I only know what's possible on the Intel Optane Persistent Memory side.

wm8120 commented 4 years ago

It's a JEDEC NVDIMM-N compatible DIMM. We test by yanking the power. Is there a way to dump register values with ndctl, so we can ask the vendor for more information? There may be some vendor-specific registers that ndctl does not know about, but I think ndctl could access the registers defined by the JEDEC standard.

sscargal commented 4 years ago

The ndctl utility is intended to be vendor neutral. It conforms to the SNIA Persistent Memory Programming Model. You'll need to ask your NVDIMM-N vendor for a product-specific tool to gather the detailed information you're looking for.
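
Even without a vendor tool, the vendor-neutral view that ndctl does expose can be captured right after a failed boot and attached to a support request. The sketch below is one possible way to do that: `collect_nvdimm_state` and the `NDCTL` override variable are hypothetical names introduced for illustration. `ndctl list -D -H` only reports health data for DIMM families that ndctl supports, so the output for a given NVDIMM-N module may be sparse or empty.

```shell
#!/bin/sh
# Save the namespace, region and DIMM health listings into a directory.
# NDCTL is a hypothetical override so the command can be swapped out;
# it defaults to the real ndctl binary.
collect_nvdimm_state() {
    outdir=$1
    ndctl_cmd=${NDCTL:-ndctl}
    mkdir -p "$outdir" || return 1
    "$ndctl_cmd" list -N    > "$outdir/namespaces.json"    2>&1   # namespaces and their modes
    "$ndctl_cmd" list -R    > "$outdir/regions.json"       2>&1   # regions backing the namespaces
    "$ndctl_cmd" list -D -H > "$outdir/dimms-health.json"  2>&1   # DIMMs plus health, where supported
    return 0
}

# Usage on a real system, as root (not executed here):
#   collect_nvdimm_state /var/tmp/nvdimm-state
```

Running it once after a clean reboot and once after a failed one gives two snapshots to compare.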

If the issue can only be reproduced when removing external power, and not when rebooting or doing a scheduled power off/on, this would lean more towards the problem I described: not enough stored energy to complete the copy. If the NVDIMMs have direct-attached supercaps or batteries, perhaps one or more of these has reached the end of its working life. If you're drawing power through the DIMM slot pins from a single power source, again, perhaps this unit is reaching the end of its usable life.

It's not easy to dump registers in the power-loss scenario as the system usually stops processing. This scenario typically requires dedicated diagnostic equipment.

If you're using the HPE NVDIMM product on an HPE server, check the Integrated Management Log (IML) for any reported issues. HPE has firmware available for the 16GB modules.