An Open-Source Posit Dot-Product Unit (PDPU) for Deep Learning Applications
Authors: Qiong Li, Chao Fang, and Zhongfeng Wang @ Nanjing University
The proposed PDPU computes the dot product of two input vectors $V_a$ and $V_b$ in a low-precision format, then accumulates the result with the previous output $acc$ into a high-precision value $out$, as shown below: $$out = acc + V_a \times V_b = acc + a_0\cdot b_0 + a_1\cdot b_1 + \dots + a_{N-1}\cdot b_{N-1}$$
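To make the operation concrete, here is a minimal software reference sketch in Python. It is not the hardware datapath: it decodes posit bit patterns into exact rationals and evaluates $acc + V_a \times V_b$ without any intermediate rounding, and it omits the final re-encoding of the result into a posit. The function names `decode_posit` and `pdpu_reference`, and the default widths (8-bit inputs, 16-bit accumulator, `es = 2`), are assumptions chosen for illustration.

```python
from fractions import Fraction


def decode_posit(bits: int, n: int, es: int) -> Fraction:
    """Decode an n-bit posit with es exponent bits into an exact rational."""
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return Fraction(0)
    if bits == 1 << (n - 1):
        raise ValueError("NaR (not a real) has no numeric value")

    sign = bits >> (n - 1)
    if sign:
        bits = (-bits) & mask  # negative posits are stored in two's complement

    body = bits & ((1 << (n - 1)) - 1)  # the n-1 bits after the sign
    i = n - 2                           # index of the first regime bit
    first = (body >> i) & 1
    run = 0
    while i >= 0 and ((body >> i) & 1) == first:
        run += 1
        i -= 1
    regime = run - 1 if first else -run
    i -= 1                              # skip the regime terminator bit

    remaining = max(i + 1, 0)           # bits left for exponent and fraction
    tail = body & ((1 << remaining) - 1)
    exp_bits = min(es, remaining)
    # Exponent bits cut off by a long regime are treated as zeros.
    exponent = (tail >> (remaining - exp_bits)) << (es - exp_bits)
    frac_bits = remaining - exp_bits
    frac = tail & ((1 << frac_bits) - 1)

    scale = regime * (1 << es) + exponent
    magnitude = (1 + Fraction(frac, 1 << frac_bits)) * Fraction(2) ** scale
    return -magnitude if sign else magnitude


def pdpu_reference(acc_bits, va, vb, n_in=8, n_out=16, es=2) -> Fraction:
    """Exact value of acc + sum(a_i * b_i); no rounding, no output encoding."""
    if len(va) != len(vb):
        raise ValueError("input vectors must have the same length")
    acc = decode_posit(acc_bits, n_out, es)
    products = (decode_posit(a, n_in, es) * decode_posit(b, n_in, es)
                for a, b in zip(va, vb))
    return acc + sum(products)
```

For example, with `acc_bits = 0x4000` (1.0 as a 16-bit posit), `va = [0x40, 0x48, 0x44, 0xC0]` (1, 2, 1.5, -1) and `vb = [0x40, 0x38, 0x48, 0x50]` (1, 0.5, 2, 4), the exact result is 1 + (1 + 1 + 3 - 4) = 2. A model like this can serve as a golden reference when checking RTL outputs, provided the hardware's rounding and alignment behaviour is accounted for separately.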
PDPU introduces the following contributions and features:
The architecture of PDPU, equipped with a fine-grained 6-stage pipeline, is depicted as follows:
The dataflow at each pipeline stage is as follows:
The PDPU is implemented using SystemVerilog, and the module hierarchy is as follows:
pdpu_top.sv  # top module, combinationally implemented
pdpu_top_pipelined.sv  # PDPU equipped with a fine-grained 6-stage pipeline
├── registers.svh  # register header file
├── pdpu_pkg.sv  # package, packaging common functions, etc.
├── posit_decoder.sv  # posit decoder, extracting valid components of posit inputs
│   ├── pdpu_pkg.sv
│   ├── lzc.sv  # leading zero count
│   │   └── cf_math_pkg.sv
│   └── barrel_shifter.sv  # barrel shifter
├── radix4_booth_multiplier.sv  # modified radix-4 Booth Wallace multiplier
│   ├── gen_prods.sv  # generate partial products
│   │   └── gen_product.sv  # generate a partial product according to Booth encoding result
│   │       └── booth_encoder.sv  # radix-4 Booth encoder
│   └── csa_tree.sv  # recursive carry-save-adder (CSA) tree
│       ├── compressor_3to2.sv  # 3:2 compressor
│       │   └── fulladder.sv  # full adder
│       └── compressor_4to2.sv  # 4:2 compressor
│           └── counter_5to3.sv  # 5:3 counter
├── comp_tree.sv  # recursive comparator tree
│   └── comparator.sv  # comparator between two signed numbers
├── barrel_shifter.sv
├── csa_tree.sv
│   ├── compressor_3to2.sv
│   │   └── fulladder.sv
│   └── compressor_4to2.sv
│       └── counter_5to3.sv
├── mantissa_norm.sv  # mantissa normalization
│   ├── lzc.sv
│   │   └── cf_math_pkg.sv
│   └── barrel_shifter.sv
└── posit_encoder.sv  # posit encoder, packing result components into posit output
    ├── pdpu_pkg.sv
    └── barrel_shifter.sv
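The multiplier branch of the hierarchy combines radix-4 Booth encoding with a carry-save-adder tree. The Python sketch below is a behavioural illustration of that general technique, not a translation of the SystemVerilog sources: it recodes the multiplier into digits in {-2, ..., 2}, forms one shifted partial product per digit, and reduces them with 3:2 compressors only (the 4:2 compressors and 5:3 counters listed above are not modelled). All function names here are hypothetical.

```python
def booth_radix4_digits(y: int, width: int) -> list[int]:
    """Recode a signed `width`-bit multiplier into radix-4 Booth digits."""
    def bit(k: int) -> int:
        # Python's >> on negative ints is arithmetic, so sign bits extend.
        return (y >> k) & 1 if k >= 0 else 0

    return [bit(2 * i) + bit(2 * i - 1) - 2 * bit(2 * i + 1)
            for i in range((width + 1) // 2)]


def compressor_3to2(a: int, b: int, c: int) -> tuple[int, int]:
    """Carry-save step: returns (sum, carry) with a + b + c == sum + carry."""
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return s, carry


def csa_reduce(terms: list[int]) -> list[int]:
    """Reduce a list of addends to at most two using 3:2 compressors."""
    terms = list(terms)
    while len(terms) > 2:
        nxt = []
        for i in range(0, len(terms) - 2, 3):
            nxt.extend(compressor_3to2(*terms[i:i + 3]))
        nxt.extend(terms[len(terms) - len(terms) % 3:])  # leftover 1 or 2 terms
        terms = nxt
    return terms


def booth_multiply(x: int, y: int, width: int = 8) -> int:
    """Multiply two signed `width`-bit integers via Booth digits and a CSA tree."""
    partials = [(d * x) << (2 * i)
                for i, d in enumerate(booth_radix4_digits(y, width))]
    return sum(csa_reduce(partials))  # final carry-propagate addition
```

Radix-4 recoding halves the number of partial products compared with a plain shift-and-add multiplier, and the carry-save tree defers carry propagation to a single final addition. In hardware the digit multiplication `d * x` is a shift and optional negation rather than a true multiply.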
Thanks to its highly parameterized sub-modules, PDPU can be configured along several dimensions: posit format, dot-product size, and alignment width.
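As a rough illustration of how these knobs interact, the following Python sketch groups them into one configuration record and derives two quantities from it. The class name `PdpuConfig`, its field names, and the default values are hypothetical and do not necessarily match the parameter names in the SystemVerilog sources. The alignment width is only stored here; its effect on the datapath is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PdpuConfig:
    """Hypothetical configuration record for one PDPU instance."""
    n_in: int = 8          # input posit word size in bits (illustrative default)
    es: int = 2            # exponent field size in bits
    n_out: int = 16        # output/accumulator posit word size in bits
    dot_size: int = 4      # N, the number of element pairs per dot product
    align_width: int = 14  # alignment width (stored only, not modelled)

    def __post_init__(self):
        if self.n_in < 2 or self.n_out < 2:
            raise ValueError("posit word sizes must be at least 2 bits")
        if self.es < 0 or self.dot_size < 1 or self.align_width < 1:
            raise ValueError("es must be >= 0; sizes and widths must be >= 1")

    def max_scale(self, n: int) -> int:
        """log2 of maxpos for an n-bit posit: (n - 2) * 2**es."""
        return (n - 2) * (1 << self.es)

    def input_bus_width(self) -> int:
        """Total bits needed to carry one input vector."""
        return self.dot_size * self.n_in


cfg = PdpuConfig()
print(f"inputs span 2^-{cfg.max_scale(cfg.n_in)} .. 2^{cfg.max_scale(cfg.n_in)}")
print(f"each input vector occupies {cfg.input_bus_width()} bits")
```

Widening `es` enlarges the dynamic range at the cost of fraction bits, while `dot_size` scales the input bus width and the number of multipliers linearly.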
If you find PDPU helpful in your work, please cite us:
@inproceedings{li2023pdpu,
title={PDPU: An Open-Source Posit Dot-Product Unit for Deep Learning Applications},
author={Li, Qiong and Fang, Chao and Wang, Zhongfeng},
booktitle={2023 IEEE International Symposium on Circuits and Systems (ISCAS)},
year={2023},
organization={IEEE}
}