1. Overview
2. HW Description
3. Setup
4. Programming Guide
PIMSimulator is a cycle-accurate model of Single Instruction, Multiple Data (SIMD) execution units that use the bank-level parallelism of the PIM block to reach performance that would otherwise require several times the bandwidth, by accessing all banks simultaneously. The simulator includes memory with an embedded PIM block, which consists of programmable command registers, general-purpose register files, and execution units.
Based on https://github.com/umd-memsys/DRAMSim2, the simulator includes:
* HBM2 (`ini/HBM2_samsung_2M_16B_x64.ini`)
* PIM, an HBM stack that is pin-compatible with HBM2 and has a PIM block embedded in it
|--------------|          |--------------|          |--------------|
|              |   (A)    |              |   (B)    |              |
|     HOST     |----------|  Controller  |----------|    Memory    |
|              |          |              |          |              |
|--------------|          |--------------|          |--------------|
|<-rank->|<-row->|<-col high->|<-bg->|<-bank->|<-chan->|<-col low->|<-offset ->|
// Static Setting in system_*.ini
ADDRESS_MAPPING_SCHEME=Scheme8
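As a rough illustration of the field order above, the following sketch splits a physical address into the Scheme8 fields, assuming the leftmost field occupies the most significant bits. `FieldBits`, `DecodedAddress`, `popField`, and `decodeScheme8` are hypothetical names, and the bit widths are supplied by the caller as placeholders; they are not taken from the simulator's `.ini` files.

```cpp
#include <cassert>
#include <cstdint>

// Width of each field in bits. The caller supplies these; any concrete values
// used with this sketch are illustrative assumptions, not simulator settings.
struct FieldBits {
    unsigned rank, row, colHigh, bg, bank, chan, colLow, offset;
};

struct DecodedAddress {
    uint64_t rank, row, colHigh, bg, bank, chan, colLow, offset;
};

// Returns the lowest `bits` bits of addr and shifts them out of addr.
inline uint64_t popField(uint64_t& addr, unsigned bits)
{
    assert(bits < 64);
    const uint64_t value = addr & ((uint64_t{1} << bits) - 1);
    addr >>= bits;
    return value;
}

// Decodes |rank|row|col high|bg|bank|chan|col low|offset|, reading the
// fields from the least significant end (offset) upward.
inline DecodedAddress decodeScheme8(uint64_t addr, const FieldBits& w)
{
    DecodedAddress d{};
    d.offset  = popField(addr, w.offset);
    d.colLow  = popField(addr, w.colLow);
    d.chan    = popField(addr, w.chan);
    d.bank    = popField(addr, w.bank);
    d.bg      = popField(addr, w.bg);
    d.colHigh = popField(addr, w.colHigh);
    d.row     = popField(addr, w.row);
    d.rank    = popField(addr, w.rank);
    return d;
}
```

Placing the channel and low column bits near the bottom of the address means consecutive bursts tend to spread across channels, which is one plausible reason for this ordering.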
|--------| |--------|
| | | |
| BANK_0 | | BANK_2 |
| | | |
|--------| |--------|
| PB_0 | | PB_1 |
|--------| |--------|
| | | |
| BANK_1 | | BANK_3 |
| | | |
|--------| |--------|
If 2 * NUM_PIM_BLOCKS == NUM_BANKS, one PIM block (PB) is shared by every two banks.
If NUM_PIM_BLOCKS == NUM_BANKS, each bank has its own PIM block.
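The relation between banks and PIM blocks can be sketched as a small index calculation. `pimBlockOfBank` is a hypothetical helper, not a function from the simulator, and it assumes that consecutive banks share a block, as the diagram suggests (BANK_0 and BANK_1 with PB_0, BANK_2 and BANK_3 with PB_1).

```cpp
#include <cassert>

// Hypothetical helper: returns the index of the PIM block serving `bank`.
// Assumes numBanks is a multiple of numPimBlocks and that neighbouring
// banks are grouped onto the same block.
inline unsigned pimBlockOfBank(unsigned bank, unsigned numBanks, unsigned numPimBlocks)
{
    assert(numPimBlocks > 0 && numBanks % numPimBlocks == 0);
    assert(bank < numBanks);
    const unsigned banksPerBlock = numBanks / numPimBlocks;  // 2 or 1 in the cases above
    return bank / banksPerBlock;
}
```

With four banks and two PIM blocks this maps banks 0 and 1 to block 0 and banks 2 and 3 to block 1; with as many blocks as banks it is the identity.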
Type | Command | Description | Result (DST) | Operand (SRC0) | Operand (SRC1) |
---|---|---|---|---|---|
Arithmetic | ADD | addition | GRF | GRF, BANK, SRF | GRF, BANK, SRF |
Arithmetic | MUL | multiplication | GRF | GRF, BANK | GRF, BANK, SRF |
Arithmetic | MAC | multiply-accumulate | GRF_B | GRF, BANK | GRF, BANK, SRF |
Arithmetic | MAD | multiply-and-add | GRF | GRF, BANK | GRF, BANK, SRF |
Data | MOV | load or store data from register to bank | GRF, SRF | GRF, BANK | |
Data | FILL | copy data from bank to register | GRF, BANK | GRF, BANK | |
Control | NOP | do nothing | | | |
Control | JUMP | jump instruction | | | |
Control | EXIT | exit instruction | | | |
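The operand columns of the arithmetic rows can be read as a simple validity rule. The sketch below encodes only the SRC0 column; `PimCmd`, `Operand`, and `isValidSrc0` are hypothetical names introduced for illustration and do not mirror the simulator's own types.

```cpp
#include <cassert>

// Local stand-ins for the arithmetic commands and operand sources in the table.
enum class PimCmd { ADD, MUL, MAC, MAD };
enum class Operand { GRF, BANK, SRF };

// True if `src` is listed as a SRC0 operand of `cmd` in the table above.
inline bool isValidSrc0(PimCmd cmd, Operand src)
{
    if (src == Operand::SRF) {
        return cmd == PimCmd::ADD;  // only ADD lists SRF in the SRC0 column
    }
    return true;  // GRF and BANK are listed for all four arithmetic commands
}
```

SRC1 needs no such check for these four commands, since the table lists GRF, BANK, and SRF for each of them.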
Mode | Transaction | PIM Instruction | Operation |
---|---|---|---|
SB | Read | - | Normal Memory Read |
SB | Write | - | Normal Memory Write |
HAB | Write | - | PIM Write (Host to PIM Register) |
PIM | - | MOV | read or write from bank to PIM Register |
PIM | - | FILL | write from bank to PIM Registers |
* `Scons`, the tool for compiling PIMSimulator: `sudo apt install scons`
* `gtest`, for running test cases: `sudo apt install libgtest-dev`
* Compile: `scons`
* Show the list of test cases: `./sim --gtest_list_tests`
  * `PIMKernelFixture.`: `gemv_tree`, `gemv`, `mul`, `add`, `relu`
  * `MemBandwidthFixture.`: `hbm_read_bandwidth`, `hbm_write_bandwidth`
  * `PIMBenchFixture.`: `gemv`, `mul`, `add`, `relu`
* Test Running
```bash
# Running: functionality test (GEMV)
./sim --gtest_filter=PIMKernelFixture.gemv
# Running: functionality test (MUL)
./sim --gtest_filter=PIMKernelFixture.mul
# Running: performance test (GEMV)
./sim --gtest_filter=PIMBenchFixture.gemv
# Running: performance test (ADD)
./sim --gtest_filter=PIMBenchFixture.add
```
To run the functionality tests with other dimensions, use the gen script in `./data` to generate data for the new dimension, then add the generated dimension to `src/tests/KernelTestCases.cpp`.
To build in no-data mode, run `scons NO_STORAGE=1`.
We highly recommend referring to `src/tests/*` (especially `src/tests/PIMKernel.cpp` and `src/tests/PIMBenchTestCases.cpp`).
To attach PIMSimulator to a host simulator, refer to `src/tests/PIMKernel.cpp`.
It shows the commands that request memory transactions from the memory controller for GEMV and element-wise operations on PIM.
It includes a basic PIM procedure for the GEMV operation in `PIMKernel::executeGemv()` and for the element-wise operations (add, mul, relu) in `PIMKernel::executeEltwise()`.
* With a tag: `mem->addTransaction(is_write, address, tag, buffer);`
* Without a tag: `addTransaction(is_write, address, buffer)`, for example with a `BurstType nullBst`: `mem->addTransaction(isWrite, addr, &nullBst);`
* Read: `mem->addTransaction(false, addr, tag, buffer);`
* Write: `mem->addTransaction(true, addr, tag, buffer);`

Here, the buffer must be a container of at least 256 bits.
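The 256-bit requirement can be checked at compile time. The sketch below uses a hypothetical `ExampleBurst` type as a stand-in; the simulator's real `BurstType` may be laid out differently, and the choice of 16-bit lanes is only an assumption for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a burst buffer: 16 lanes of 16 bits = 256 bits.
struct ExampleBurst {
    std::array<uint16_t, 16> lanes{};
};

// Fails to compile if the buffer is smaller than one 256-bit burst.
static_assert(sizeof(ExampleBurst) * 8 >= 256,
              "burst buffer must hold at least 256 bits");

// Size of the example buffer in bits.
constexpr std::size_t burstBits() { return sizeof(ExampleBurst) * 8; }

// Writes one 16-bit lane, checking the lane index in debug builds.
inline void setLane(ExampleBurst& burst, std::size_t lane, uint16_t value)
{
    assert(lane < burst.lanes.size());
    burst.lanes[lane] = value;
}
```

A `static_assert` of this kind catches an undersized buffer type before any transaction is issued.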
* alu_pim (dataflow is similar to a normal write): `mem->addTransaction(true, addr, tag, buffer);`
  See `src/tests/PIMCmdGen.h` and the procedures that use it (`src/tests/PIMKernel.cpp`).
* read_pim (dataflow is similar to a normal read): `mem->addTransaction(false, addr, tag, buffer);`
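To make the direction flag concrete, here is a minimal mock that records transactions using the same argument order as the calls above. `MockMemory`, `RecordedTransaction`, `aluPim`, and `readPim` are hypothetical, and the parameter types (a `std::string` tag and a `void*` buffer) are assumptions made for this sketch rather than the simulator's actual signature.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct RecordedTransaction {
    bool isWrite;
    uint64_t addr;
    std::string tag;
};

// Mock memory: logs each request instead of simulating it.
class MockMemory {
public:
    bool addTransaction(bool isWrite, uint64_t addr, const std::string& tag, void* buffer)
    {
        assert(buffer != nullptr);
        (void)buffer;  // the payload itself is ignored by the mock
        log_.push_back({isWrite, addr, tag});
        return true;
    }
    const std::vector<RecordedTransaction>& log() const { return log_; }

private:
    std::vector<RecordedTransaction> log_;
};

// alu_pim follows the write dataflow, so the first argument is true.
inline void aluPim(MockMemory& mem, uint64_t addr, void* buffer)
{
    mem.addTransaction(true, addr, "alu_pim", buffer);
}

// read_pim follows the read dataflow, so the first argument is false.
inline void readPim(MockMemory& mem, uint64_t addr, void* buffer)
{
    mem.addTransaction(false, addr, "read_pim", buffer);
}
```

A host-side test could issue a few such calls and inspect `log()` to confirm that each request carries the intended direction and address.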
The following shows the high-level steps of a generic PIM operation.
A similar procedure at the source level can be found in `src/tests/PIMKernel.cpp`.
/* Example Code - PIMKernel::executeEltwise() */
parkIn();
changePIMMode(dramMode::SB, dramMode::HAB); // Switch to HAB
programCrf(pim_cmds); // Program CRF
changePIMMode(dramMode::HAB, dramMode::HAB_PIM); // Enable PIM
if (ktype == KernelType::ADD || ktype == KernelType::MUL)
    computeAddOrMul(num_tile, input0_row, result_row, input1_row); // Execute PIM
else if (ktype == KernelType::RELU)
    computeRelu(num_tile, input0_row, result_row); // Execute PIM
changePIMMode(dramMode::HAB_PIM, dramMode::HAB); // Disable PIM
changePIMMode(dramMode::HAB, dramMode::SB); // Switch to SB mode
parkOut();
* The other basic operation flows on PIM, for GEMV (matrix-vector multiplication) and element-wise operations, are described in `src/tests/PIMKernel.cpp`.
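The mode changes in the example flow (SB to HAB, HAB to HAB_PIM, and back) can be summarized as a small transition check. This is a sketch under the assumption that only the transitions appearing in that flow are valid; `Mode` and `isAllowedTransition` are local, hypothetical names and not the simulator's `dramMode` API.

```cpp
#include <cassert>

// Local stand-in mirroring the mode names used in the example flow.
enum class Mode { SB, HAB, HAB_PIM };

// True if the example flow performs this transition in a single step.
inline bool isAllowedTransition(Mode from, Mode to)
{
    switch (from) {
        case Mode::SB:      return to == Mode::HAB;
        case Mode::HAB:     return to == Mode::SB || to == Mode::HAB_PIM;
        case Mode::HAB_PIM: return to == Mode::HAB;
    }
    return false;  // unreachable for valid enum values
}
```

Under this assumption, entering PIM execution from SB always passes through HAB, which is where the CRF is programmed in the example.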
### Contact
* Shin-haeng Kang (s-h.kang@samsung.com)
* Sanghoon Cha (s.h.cha@samsung.com)
* Seungwoo Seo (sgwoo.seo@samsung.com)
* Jin-seong Kim (jseong82.kim@samsung.com)