cartesi / machine-emulator

The off-chain implementation of the Cartesi Machine
GNU Lesser General Public License v3.0

refactor: replace crypto++ with XKCP #154

Closed edubart closed 11 months ago

edubart commented 11 months ago

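This replaces crypto++ with the XKCP library.

It also adds support for AVX2 and AVX-512 in the Keccak-256 hash on x86_64, using GCC's function multi-versioning. In the benchmarks below, the AVX2 build is faster than crypto++ both with `--concurrency=update_merkle_tree:1` (3.704 s vs 4.472 s) and with the default concurrency (485.6 ms vs 701.9 ms).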

AMD Ryzen 9 7940HS

    keccak-256

        XKCP AVX512
            Benchmark 1: cartesi-machine --initial-hash --max-mcycle=0 --concurrency=update_merkle_tree:1
              Time (mean ± σ):      4.030 s ±  0.051 s    [User: 4.007 s, System: 0.021 s]
              Range (min … max):    4.002 s …  4.174 s    10 runs
            Benchmark 1: cartesi-machine --initial-hash --max-mcycle=0
              Time (mean ± σ):     430.9 ms ±   7.1 ms    [User: 6226.7 ms, System: 52.9 ms]
              Range (min … max):   418.8 ms … 438.8 ms    10 runs

        XKCP AVX2
            Benchmark 1: cartesi-machine --initial-hash --max-mcycle=0 --concurrency=update_merkle_tree:1
              Time (mean ± σ):      3.704 s ±  0.056 s    [User: 3.687 s, System: 0.016 s]
              Range (min … max):    3.675 s …  3.853 s    10 runs
            Benchmark 1: cartesi-machine --initial-hash --max-mcycle=0
              Time (mean ± σ):     485.6 ms ±  11.3 ms    [User: 7031.8 ms, System: 48.8 ms]
              Range (min … max):   467.2 ms … 500.7 ms    10 runs

        XKCP AVX
            Benchmark 1: cartesi-machine --initial-hash --max-mcycle=0 --concurrency=update_merkle_tree:1
              Time (mean ± σ):      6.986 s ±  0.018 s    [User: 6.971 s, System: 0.015 s]
              Range (min … max):    6.952 s …  7.010 s    10 runs
            Benchmark 1: cartesi-machine --initial-hash --max-mcycle=0
              Time (mean ± σ):      1.042 s ±  0.016 s    [User: 15.814 s, System: 0.045 s]
              Range (min … max):    1.021 s …  1.066 s    10 runs

        XKCP SSSE3
            Benchmark 1: cartesi-machine --initial-hash --max-mcycle=0 --concurrency=update_merkle_tree:1
              Time (mean ± σ):      4.636 s ±  0.017 s    [User: 4.621 s, System: 0.016 s]
              Range (min … max):    4.610 s …  4.655 s    10 runs
            Benchmark 1: cartesi-machine --initial-hash --max-mcycle=0
              Time (mean ± σ):     692.5 ms ±  22.0 ms    [User: 10326.8 ms, System: 48.0 ms]
              Range (min … max):   649.9 ms … 723.2 ms    10 runs

        XKCP generic64
            Benchmark 1: cartesi-machine --initial-hash --max-mcycle=0 --concurrency=update_merkle_tree:1
              Time (mean ± σ):      5.082 s ±  0.007 s    [User: 5.064 s, System: 0.018 s]
              Range (min … max):    5.073 s …  5.091 s    10 runs
            Benchmark 1: cartesi-machine --initial-hash --max-mcycle=0
              Time (mean ± σ):     809.4 ms ±  15.8 ms    [User: 12062.4 ms, System: 58.2 ms]
              Range (min … max):   779.1 ms … 825.4 ms    10 runs

        crypto++
            Benchmark 1: cartesi-machine --initial-hash --max-mcycle=0 --concurrency=update_merkle_tree:1
              Time (mean ± σ):      4.472 s ±  0.021 s    [User: 4.451 s, System: 0.019 s]
              Range (min … max):    4.451 s …  4.526 s    10 runs
            Benchmark 1: cartesi-machine --initial-hash --max-mcycle=0
              Time (mean ± σ):     701.9 ms ±  12.1 ms    [User: 10505.0 ms, System: 43.0 ms]
              Range (min … max):   683.8 ms … 717.9 ms    10 runs

The XKCP build depends on the xsltproc CLI tool to generate Makefiles during the build phase. It is provided by the libxslt package, which is available in most Linux distributions as well as in Homebrew and MacPorts, and it has been added to the README. Other than that, no additional package is required.

edubart commented 11 months ago

Closing. We decided not to use XKCP because it is not distributed as a package on many systems (not even Ubuntu), and bundling it with multi-versioning for AVX2 support is too hacky.