Xenooooooooo / optimized_uBlock_algorithm


optimized_uBlock_algorithm

Introduction

uBlock is a family of block ciphers that supports 128-bit and 256-bit block sizes and key sizes. Its three versions are denoted uBlock-128/128, uBlock-128/256, and uBlock-256/256, where the first number is the block size and the second is the key size in bits. The overall structure, S-box, diffusion matrix, key schedule, and other details are designed to balance security, implementation performance, and adaptability. For the design details, see the original paper.

This work optimizes the software implementation of the uBlock algorithm by combining the AVX2 instruction set, an optimized data storage structure, a high degree of parallelism, and other methods. A single-key version and a multi-key version are provided for each of the three variants above, for flexible use in various scenarios. The paper on our optimized uBlock can be accessed here.
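The difference between the single-key and multi-key modes can be sketched as follows. This is a minimal illustration, not the repository's code: the "round" here is a plain XOR with a round key (a stand-in for the real uBlock round function), and the function names `add_round_key_1key` and `add_round_key_4key` are hypothetical. It assumes four 128-bit blocks stored back to back, so the batch fills two 256-bit registers.

```cpp
#include <immintrin.h>
#include <cassert>
#include <cstdint>

// Lets GCC/Clang compile the AVX2 code without -mavx2 on the command line.
#if defined(__GNUC__)
#define AVX2_TARGET __attribute__((target("avx2")))
#else
#define AVX2_TARGET
#endif

// Toy key addition (XOR only); NOT the uBlock round function.
// state: four 16-byte blocks stored contiguously (64 bytes).

// Single-key mode: one 16-byte round key is broadcast to all four blocks.
AVX2_TARGET void add_round_key_1key(std::uint8_t* state, const std::uint8_t* rk) {
    assert(state != nullptr && rk != nullptr);
    const __m256i k = _mm256_broadcastsi128_si256(
        _mm_loadu_si128(reinterpret_cast<const __m128i*>(rk)));
    for (int i = 0; i < 2; ++i) {
        __m256i* p = reinterpret_cast<__m256i*>(state + 32 * i);
        _mm256_storeu_si256(p, _mm256_xor_si256(_mm256_loadu_si256(p), k));
    }
}

// Multi-key mode: rk holds four 16-byte round keys, one per block.
AVX2_TARGET void add_round_key_4key(std::uint8_t* state, const std::uint8_t* rk) {
    assert(state != nullptr && rk != nullptr);
    for (int i = 0; i < 2; ++i) {
        __m256i* p = reinterpret_cast<__m256i*>(state + 32 * i);
        const __m256i k =
            _mm256_loadu_si256(reinterpret_cast<const __m256i*>(rk + 32 * i));
        _mm256_storeu_si256(p, _mm256_xor_si256(_mm256_loadu_si256(p), k));
    }
}
```

In this sketch the only difference between the two modes is how the key register is filled: a broadcast of one key versus a load of four independent keys. The data path for the blocks is the same in both.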

Repository structure

Besides the original code and the algorithm paper, our implementation includes:

├── avx2_normal
│   ├── AVX2_uBlock_Windows_128_128.cpp
│   ├── AVX2_uBlock_Windows_128_256.cpp
│   └── AVX2_uBlock_Windows_256_256.cpp
└── optimized_codes
    ├── uBlock-128-128
    │   ├── 128_128_4block_1key.cpp
    │   └── 128_128_4block_4key.cpp
    ├── uBlock-128-256
    │   ├── 128_256_4block_1key.cpp
    │   └── 128_256_4block_4key.cpp
    └── uBlock-256-256
        ├── 256_256_2block_1key.cpp
        └── 256_256_2block_2key.cpp

The three source files in ./avx2_normal are a straightforward port of uBlock that replaces the original SSE intrinsics with AVX2 intrinsics and contains no structural optimization. This version serves as the baseline for comparison.
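As an illustration of what replacing an SSE intrinsic with its AVX2 counterpart involves, here is a hedged sketch of a 4-bit table lookup done with a byte shuffle. It is not taken from the repository: the table is a placeholder (it maps x to 15 - x and is not the uBlock S-box), and the names `kTable`, `lookup16_sse`, and `lookup32_avx2` are hypothetical. The point it shows is that `_mm256_shuffle_epi8` shuffles within each 128-bit lane, so the 16-byte table has to be duplicated into both lanes.

```cpp
#include <immintrin.h>
#include <cassert>
#include <cstdint>

#if defined(__GNUC__)
#define SSSE3_TARGET __attribute__((target("ssse3")))
#define AVX2_TARGET __attribute__((target("avx2")))
#else
#define SSSE3_TARGET
#define AVX2_TARGET
#endif

// Placeholder 4-bit table (NOT the uBlock S-box): maps x to 15 - x.
static const std::uint8_t kTable[16] = {15, 14, 13, 12, 11, 10, 9, 8,
                                        7,  6,  5,  4,  3,  2,  1, 0};

// SSE version: looks up 16 nibbles, one per input byte (values 0..15).
SSSE3_TARGET void lookup16_sse(const std::uint8_t* in, std::uint8_t* out) {
    for (int i = 0; i < 16; ++i) assert(in[i] < 16);  // indices must be nibbles
    const __m128i t = _mm_loadu_si128(reinterpret_cast<const __m128i*>(kTable));
    const __m128i x = _mm_loadu_si128(reinterpret_cast<const __m128i*>(in));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), _mm_shuffle_epi8(t, x));
}

// AVX2 version: looks up 32 nibbles at once. The table is broadcast to both
// 128-bit lanes because the 256-bit shuffle never crosses a lane boundary.
AVX2_TARGET void lookup32_avx2(const std::uint8_t* in, std::uint8_t* out) {
    for (int i = 0; i < 32; ++i) assert(in[i] < 16);  // indices must be nibbles
    const __m256i t = _mm256_broadcastsi128_si256(
        _mm_loadu_si128(reinterpret_cast<const __m128i*>(kTable)));
    const __m256i x = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(in));
    _mm256_storeu_si256(reinterpret_cast<__m256i*>(out), _mm256_shuffle_epi8(t, x));
}
```

A one-to-one swap like this doubles the data handled per instruction but leaves the surrounding data layout unchanged, which matches the "no structural optimization" description of the baseline.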

The three directories in ./optimized_codes contain our fully optimized version of uBlock: six implementations in total, with a single-key and a multi-key version for each of the three uBlock variants.

We also include the original code and design paper in this repository, under the original_codes directory.
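The file names suggest how much data each implementation handles per call. The sketch below works out the batch size under the assumption that "4block" and "2block" mean the number of blocks processed together and that "1key" versus "4key"/"2key" means one shared key versus one key per block. The `Variant` struct and helper names are hypothetical and only serve this arithmetic.

```cpp
#include <cassert>
#include <cstddef>

// Parameters read off the file names (an assumption, not a documented API).
struct Variant {
    const char* name;
    std::size_t block_bits;
    std::size_t key_bits;
    std::size_t blocks_per_call;
};

constexpr Variant kVariants[] = {
    {"uBlock-128/128", 128, 128, 4},
    {"uBlock-128/256", 128, 256, 4},
    {"uBlock-256/256", 256, 256, 2},
};

// Bytes of plaintext processed per call.
constexpr std::size_t batch_bytes(const Variant& v) {
    return v.block_bits / 8 * v.blocks_per_call;
}

// Number of 256-bit (AVX2) registers needed to hold one batch.
constexpr std::size_t ymm_registers(const Variant& v) {
    return batch_bytes(v) * 8 / 256;
}

// Bytes of master key material per call: `keys` is 1 in single-key mode and
// equal to blocks_per_call in multi-key mode.
inline std::size_t key_bytes(const Variant& v, std::size_t keys) {
    assert(keys == 1 || keys == v.blocks_per_call);
    return v.key_bits / 8 * keys;
}

// Under these assumptions every variant processes 512 bits of data per call.
static_assert(batch_bytes(kVariants[0]) == 64, "4 x 128-bit blocks");
static_assert(batch_bytes(kVariants[2]) == 64, "2 x 256-bit blocks");
```

Read this way, all six implementations work on the same amount of data per call (two 256-bit registers) and differ only in block width and in how many keys accompany the batch.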

Building

You can use CMake to compile all the optimized .cpp files at once. Run the following commands in the root directory of this project:

mkdir build && cd build
cmake .. && make

After the build finishes, the compiled executables appear in your build directory. Run them from the terminal.

Benchmarking

The detailed benchmarking results for this repository are given here. The tests were run on Microsoft Windows 10 Professional Edition 20H2 (Build 19042.1466) with an AMD Ryzen 9 5900X @ 3.70 GHz and 32 GB of RAM. The compiler is gcc (x86_64-win32-seh-rev0, Built by MinGW-W64 project) 8.1.0.

[Figures: benchmark result charts (output__12.jpg, output__14.jpg, output__13 - .jpg, output__13.jpg)]
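For readers who want to take comparable measurements on their own machine, the following is a minimal timing sketch. It is not the harness used to produce the results above: the unit (megabits per second), the function names, and the `toy_encrypt` workload are all assumptions made for illustration, and `toy_encrypt` should be replaced with a call into one of the actual implementations.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts a byte count and an elapsed time into megabits per second.
inline double throughput_mbps(std::size_t bytes, double seconds) {
    assert(seconds > 0.0);
    return static_cast<double>(bytes) * 8.0 / seconds / 1e6;
}

// Stand-in workload (NOT uBlock): XORs each byte with a position-dependent mask.
inline void toy_encrypt(std::uint8_t* p, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) p[i] ^= static_cast<std::uint8_t>(0xA5 + i);
}

// Calls encrypt(buffer, batch_bytes) `iterations` times and returns the
// measured throughput in Mbps, or 0.0 if the run was too short to time.
template <class F>
double benchmark_mbps(F&& encrypt, std::size_t batch_bytes, std::size_t iterations) {
    std::vector<std::uint8_t> buf(batch_bytes, 0x5A);
    const auto t0 = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) encrypt(buf.data(), buf.size());
    const auto t1 = std::chrono::steady_clock::now();
    volatile std::uint8_t sink = buf[0];  // keeps the work from being optimized away
    (void)sink;
    const double s = std::chrono::duration<double>(t1 - t0).count();
    if (s <= 0.0) return 0.0;  // timer resolution too coarse for this run
    return throughput_mbps(batch_bytes * iterations, s);
}
```

A call such as `benchmark_mbps(toy_encrypt, 64, 10000000)` times ten million 64-byte batches. Using a large iteration count and repeating the run a few times helps reduce noise from timer resolution and frequency scaling.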

Contributors

Longxin Wang

Lei Tian

Yang Hu