CGCL-codes / HME

HME is a hybrid memory emulator for studying the performance and energy characteristics of upcoming NVM technologies. HME exploits features available in commodity NUMA architectures to emulate two kinds of memory: fast, local DRAM, and slower, remote NVM on other NUMA nodes. HME can emulate a wide range of NVM latencies and bandwidths by injecting different memory access delays on the remote NUMA nodes. To help programmers and researchers evaluate the impact of NVM on application performance, a high-level programming interface is also provided to allocate memory from NVM or DRAM nodes.

HME: A Lightweight Emulator for Hybrid Memory

HME is a DRAM-based performance emulator that emulates the performance and energy characteristics of upcoming NVM technologies. HME exploits features available in commodity NUMA architectures to emulate two kinds of memory: fast, local DRAM, and slower, remote NVM on other NUMA nodes. HME can emulate a wide range of NVM latencies and bandwidths by injecting different memory access delays on the remote NUMA nodes. To help programmers and researchers evaluate the impact of NVM on application performance, we also provide a high-level programming interface, named AHME, to allocate memory from NVM or DRAM pools.

HME provides the following functions:

Citing HME

If you use HME, please cite our research paper published at DATE 2018, included as ./HME.pdf.

Zhuohui Duan, Haikun Liu, Xiaofei Liao, Hai Jin, HME: A Lightweight Emulator for Hybrid Memory, in: Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE'18), Dresden, Germany, March 19-23, 2018

@inproceedings{duan2018hme,
  title={{HME: A Lightweight Emulator for Hybrid Memory}},
  author={Duan, Zhuohui and Liu, Haikun and Liao, Xiaofei and Jin, Hai},
  booktitle={Proceedings of the 2018 Design, Automation \& Test in Europe Conference \& Exhibition (DATE)},
  pages={1375--1380},
  year={2018},
  organization={IEEE}
}

Acknowledgements

A portion of the source code in HME is based on a prototype developed by Mingyu Chen, Dejun Jiang, et al. at the Institute of Computing Technology, Chinese Academy of Sciences. In their work, they emulated NVM read latency and bandwidth using mechanisms similar to Quartz. We go further and also attempt to emulate NVM write latency. We appreciate their previous contribution and valuable advice for this work.

HME Setup, Compiling, Configuration and How to Use

1. External Dependencies

Before installing the hybrid memory emulator HME, make sure that the dependencies listed below are already installed.

You can run 'sudo /scripts/install.sh' to install some of these dependencies automatically.

2. Compiling


First, compile the emulator's module. From the ./HME folder of the emulator's source code, execute make.

[root @node1 HME]# cd HME
[root @node1 HME]# make  //compile HME

3. Multi-core version

Make sure your PMU-TOOL works without errors.

Because PMU-TOOL relies on ocperf as its base component, the multi-core version of the emulator is kept as a separate branch and uses the Linux 4.12 kernel, which supports ocperf.

Check whether the ./cache/pmu-events/ directory on your Linux system is empty. If it is, the pmu-events files are missing. You can get them from https://download.01.org/ or use the offline copies in ./Multcore_version/: GenuineIntel-6-3F-core.json, GenuineIntel-6-3F-offcore.json, GenuineIntel-6-3F-uncore.json, HaswellX_core_V17.json, HaswellX_matrix_V17.json, HaswellX_uncore_V17.json, and mapfile.csv.

Different operating systems may correspond to different files.
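The check described above can be scripted. The following is a minimal sketch, not part of HME: the function names pmu_events_missing and install_pmu_events are hypothetical, and the two directories are passed as arguments because their exact location depends on your setup.

```shell
# Succeeds (returns 0) when the given directory is missing or has no entries.
pmu_events_missing() {
    dir="$1"
    [ ! -d "$dir" ] && return 0
    [ -z "$(ls -A "$dir" 2>/dev/null)" ]
}

# Copies the offline event files (*.json and mapfile.csv) from SRC into DEST,
# but only when DEST is missing or empty.
install_pmu_events() {
    src="$1"
    dest="$2"
    if pmu_events_missing "$dest"; then
        mkdir -p "$dest" || return 1
        cp "$src"/*.json "$src"/mapfile.csv "$dest"/ || return 1
        echo "copied event files into $dest"
    else
        echo "$dest already populated, nothing to do"
    fi
}
```

With the paths named in this section, a call would look like: install_pmu_events ./Multcore_version ./cache/pmu-events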

Modify the perf_event_paranoid file so that PMU-TOOL can work. Execute

sudo sh -c 'echo -1 > /proc/sys/kernel/perf_event_paranoid'

to modify the file. Note that the value of perf_event_paranoid is reset to 2 after the machine restarts, so the file must be modified again after each reboot before the emulator can run normally.
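Since the setting does not survive a reboot, it can help to check it before each run. The following is a minimal sketch, not part of HME: the names paranoid_ok, ensure_paranoid and PARANOID_FILE are hypothetical, and the variable exists only so that the check can be tried against a copy of the file.

```shell
# Path of the kernel setting; overridable so the check can be tried on a copy.
PARANOID_FILE="${PARANOID_FILE:-/proc/sys/kernel/perf_event_paranoid}"

# Succeeds when the current value is -1, the value this section asks for.
paranoid_ok() {
    [ "$(cat "$PARANOID_FILE" 2>/dev/null)" = "-1" ]
}

# Rewrites the value only when needed; writing under /proc requires root.
ensure_paranoid() {
    if paranoid_ok; then
        echo "perf_event_paranoid is already -1"
    else
        sudo sh -c "echo -1 > '$PARANOID_FILE'"
    fi
}
```

Calling ensure_paranoid before starting the emulator avoids a failed PMU-TOOL run after a restart.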

/Multcore_version/HME/run.sh starts the tool.

/Multcore_version/HME/core_NVM.c implements the driver core_NVM.ko, which receives the performance delta of every core and sends this information to every core.

/Multcore_version/HME/delay_count.py calculates the performance delta of every core. You can change the NVM delay in this file.

/Multcore_version/HME/NVM_emulate_bandwidth controls the NVM bandwidth. It is the same as in the regular version described above.

How to use: Run the run.sh script in a shell to start the emulator. Then open a new shell and run the program you want to test; the program will run in the emulated environment.

AHME: PROGRAMMING INTERFACE

This section describes the programming interface of HME, named AHME, which helps programmers use HME more conveniently. We extend the Glibc library with an nvm_malloc function so that applications can allocate hybrid memory through malloc or nvm_malloc. To implement nvm_malloc, we modify the Linux kernel to provide a branch for nvm_mmap memory allocation. The branch handles the alloc_page procedure of nvm_malloc. Meanwhile, NVM pages are differentiated from DRAM pages in the original VMA (virtual memory area). AHME marks the NVM flag in the VMA and, on a page fault, calls the dedicated do_nvm_page_fault function to allocate physical memory on the remote NUMA node.

When nvm_mmap() from the extended Glibc is called, the kernel calls do_mmap() and do_mmap_pgoff(), and sets the NVM_VMA flag on the VMA structure when do_mmap_pgoff() requests the VMA. When an NVM page is accessed for the first time, it triggers a page fault and the kernel calls handle_mm_fault() to handle it. If the NVM_VMA flag matches, do_nvm_numa() is used to allocate a physical page, which alloc_page() takes from the DRAM of HME's remote node (the emulated NVM). If the page fault does not refer to an NVM page, the kernel uses the normal do_page() to allocate DRAM.

We implement nvm_malloc in the Glibc library by following the malloc implementation, and it calls nvm_mmap through a new syscall. nvm_malloc() calls nvm_mmap() to pass the mmap parameters to the kernel with the MAP_NVM parameter provided by the AHME kernel. After that, the NVM allocation described above is performed by the AHME kernel.

AHME Setup, Compiling, Configuration and How to Use

1. AHME kernel compiling and installation

From the emulator's source directory */AHME/kernel, run:

[root @node1 kernel]# cp config .config                            //use the provided Linux kernel configuration
[root @node1 kernel]# sh -c 'yes "" | make oldconfig'               //reuse the old kernel configuration and accept the default setting for each new option
[root @node1 kernel]# sudo make -j20 bzImage
[root @node1 kernel]# sudo make -j20 modules
[root @node1 kernel]# sudo make -j20 modules_install
[root @node1 kernel]# sudo make install

You can also run 'sudo */AHME/kernel/bulid.sh' to perform these steps automatically.
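If you prefer to run the steps by hand but want them to stop at the first failure, they can be wrapped as follows. This is a minimal sketch under stated assumptions, not the contents of the build script mentioned above: the names run_step, build_ahme_kernel, JOBS and DRY_RUN are hypothetical, and the job count defaults to the number of CPUs instead of the fixed 20 used above.

```shell
# Number of parallel make jobs; falls back to 1 when nproc is unavailable.
JOBS="${JOBS:-$(nproc 2>/dev/null || echo 1)}"

# Prints a command and, unless DRY_RUN=1, executes it and reports a failure.
run_step() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] && return 0
    "$@" || { echo "step failed: $*" >&2; return 1; }
}

# Runs the kernel build steps of this section inside the directory given as $1.
# The body is a subshell, so the caller's working directory is not changed.
build_ahme_kernel() (
    cd "$1" || exit 1
    run_step cp config .config &&
    run_step sh -c 'yes "" | make oldconfig' &&
    run_step sudo make -j"$JOBS" bzImage &&
    run_step sudo make -j"$JOBS" modules &&
    run_step sudo make -j"$JOBS" modules_install &&
    run_step sudo make install
)
```

Setting DRY_RUN=1 before calling build_ahme_kernel with the kernel source directory prints the six commands without executing them, which is a cheap way to review the sequence first.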

2. AHME Glibc compiling and installation

3. How to use AHME

Allocate NVM memory as follows when you need it:

#include <malloc.h>

void *p = nvm_malloc(1024*8);   /* request 8 KB from the emulated NVM */

The program must then be run with HME in type 0 (see ./HME/scripts/nvmini.in).

Limitation

We plan to address all of the above problems in future work. Where a problem cannot be solved, we will try to reduce its impact. If you have good ideas or methods, please contact us.

Support or Contact

If you have any questions, please contact ZhuoHui Duan (zhduan@hust.edu.cn), Haikun Liu (hkliu@hust.edu.cn) and Xiaofei Liao (xfliao@hust.edu.cn).