Closed by hicder 5 days ago
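StreamingWithSIMDOptimized inlines the SIMD calculator into the PQ distance computation. In the benchmark below, this makes PQ distance roughly 2x faster than the Scalar baseline in most configurations, and about 1.2x to 2x faster than StreamingWithSIMD.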
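To illustrate why inlining the calculator can matter, here is a minimal sketch, not the PR's actual code. The trait `DistanceCalculator`, the struct `L2Calculator`, the codebook layout, and both `pq_distance_*` functions are hypothetical names introduced for illustration, and the four-lane accumulator is only a portable stand-in for real SIMD intrinsics. The first function calls the calculator through dynamic dispatch once per subvector. The second is generic, so the compiler can monomorphize it and inline the calculator body into the loop.

```rust
// Hypothetical sketch: none of these names come from the PR itself.

/// A per-subvector distance "calculator" (assumed abstraction).
trait DistanceCalculator {
    fn accumulate(&self, a: &[f32], b: &[f32]) -> f32;
}

struct L2Calculator;

impl DistanceCalculator for L2Calculator {
    #[inline(always)]
    fn accumulate(&self, a: &[f32], b: &[f32]) -> f32 {
        debug_assert_eq!(a.len(), b.len());
        // Four independent accumulators stand in for SIMD lanes.
        let mut lanes = [0.0f32; 4];
        let mut ca = a.chunks_exact(4);
        let mut cb = b.chunks_exact(4);
        for (x, y) in ca.by_ref().zip(cb.by_ref()) {
            for i in 0..4 {
                let d = x[i] - y[i];
                lanes[i] += d * d;
            }
        }
        let mut sum: f32 = lanes.iter().sum();
        // Scalar tail for lengths that are not a multiple of 4.
        for (x, y) in ca.remainder().iter().zip(cb.remainder()) {
            let d = x - y;
            sum += d * d;
        }
        sum
    }
}

/// Assumed layout: codebook[(subspace * num_centroids + code) * sub_dim ..].
#[inline(always)]
fn centroid(codebook: &[f32], subspace: usize, code: u8, sub_dim: usize, num_centroids: usize) -> &[f32] {
    let start = (subspace * num_centroids + code as usize) * sub_dim;
    &codebook[start..start + sub_dim]
}

/// Dynamic dispatch: one indirect call per subvector, which the compiler
/// usually cannot inline.
fn pq_distance_dyn(
    calc: &dyn DistanceCalculator,
    query: &[f32],
    codebook: &[f32],
    codes: &[u8],
    sub_dim: usize,
    num_centroids: usize,
) -> f32 {
    codes
        .iter()
        .enumerate()
        .map(|(s, &code)| {
            let q = &query[s * sub_dim..(s + 1) * sub_dim];
            calc.accumulate(q, centroid(codebook, s, code, sub_dim, num_centroids))
        })
        .sum()
}

/// Static dispatch: the generic parameter lets the compiler inline
/// `accumulate` directly into the per-subvector loop.
fn pq_distance_inlined<C: DistanceCalculator>(
    calc: &C,
    query: &[f32],
    codebook: &[f32],
    codes: &[u8],
    sub_dim: usize,
    num_centroids: usize,
) -> f32 {
    codes
        .iter()
        .enumerate()
        .map(|(s, &code)| {
            let q = &query[s * sub_dim..(s + 1) * sub_dim];
            calc.accumulate(q, centroid(codebook, s, code, sub_dim, num_centroids))
        })
        .sum()
}
```

Both functions return the same value for the same inputs. Only the call overhead per subvector differs, which is why the effect would be expected to matter most when subvectors are short and the per-call work is small.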
Benchmark:
PQ Distance/pq_distance_128_4_4/Scalar time: [111.75 ns 111.78 ns 111.82 ns] Found 3 outliers among 100 measurements (3.00%) 3 (3.00%) high mild
PQ Distance/pq_distance_128_4_4/SIMD time: [213.27 ns 213.38 ns 213.50 ns] Found 5 outliers among 100 measurements (5.00%) 5 (5.00%) high mild
PQ Distance/pq_distance_128_4_4/StreamingWithSIMD time: [95.488 ns 95.529 ns 95.572 ns] Found 4 outliers among 100 measurements (4.00%) 1 (1.00%) low mild 2 (2.00%) high mild 1 (1.00%) high severe
PQ Distance/pq_distance_128_4_4/StreamingWithSIMDOptimized time: [47.953 ns 48.041 ns 48.161 ns] Found 10 outliers among 100 measurements (10.00%) 2 (2.00%) high mild 8 (8.00%) high severe
PQ Distance/pq_distance_128_4_8/Scalar time: [111.75 ns 111.79 ns 111.83 ns] Found 6 outliers among 100 measurements (6.00%) 2 (2.00%) low severe 2 (2.00%) high mild 2 (2.00%) high severe
PQ Distance/pq_distance_128_4_8/SIMD time: [213.01 ns 213.11 ns 213.22 ns] Found 4 outliers among 100 measurements (4.00%) 3 (3.00%) high mild 1 (1.00%) high severe
PQ Distance/pq_distance_128_4_8/StreamingWithSIMD time: [95.836 ns 95.891 ns 95.946 ns] Found 3 outliers among 100 measurements (3.00%) 3 (3.00%) high mild
PQ Distance/pq_distance_128_4_8/StreamingWithSIMDOptimized time: [48.087 ns 48.115 ns 48.143 ns] Found 8 outliers among 100 measurements (8.00%) 1 (1.00%) low mild 5 (5.00%) high mild 2 (2.00%) high severe
PQ Distance/pq_distance_128_4_16/Scalar time: [112.38 ns 112.75 ns 113.04 ns]
PQ Distance/pq_distance_128_4_16/SIMD time: [215.37 ns 215.49 ns 215.62 ns] Found 10 outliers among 100 measurements (10.00%) 8 (8.00%) high mild 2 (2.00%) high severe
PQ Distance/pq_distance_128_4_16/StreamingWithSIMD time: [96.036 ns 96.077 ns 96.123 ns] Found 4 outliers among 100 measurements (4.00%) 2 (2.00%) low mild 1 (1.00%) high mild 1 (1.00%) high severe
PQ Distance/pq_distance_128_4_16/StreamingWithSIMDOptimized time: [48.538 ns 48.605 ns 48.686 ns] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild
PQ Distance/pq_distance_128_8_4/Scalar time: [66.942 ns 67.039 ns 67.125 ns] Found 7 outliers among 100 measurements (7.00%) 2 (2.00%) low severe 3 (3.00%) low mild 2 (2.00%) high mild
PQ Distance/pq_distance_128_8_4/SIMD time: [214.79 ns 214.85 ns 214.93 ns] Found 14 outliers among 100 measurements (14.00%) 1 (1.00%) low mild 2 (2.00%) high mild 11 (11.00%) high severe
PQ Distance/pq_distance_128_8_4/StreamingWithSIMD time: [56.940 ns 56.962 ns 56.985 ns] Found 11 outliers among 100 measurements (11.00%) 2 (2.00%) low severe 5 (5.00%) low mild 3 (3.00%) high mild 1 (1.00%) high severe
PQ Distance/pq_distance_128_8_4/StreamingWithSIMDOptimized time: [29.395 ns 29.441 ns 29.489 ns] Found 4 outliers among 100 measurements (4.00%) 4 (4.00%) high mild
PQ Distance/pq_distance_128_8_8/Scalar time: [67.882 ns 68.211 ns 68.590 ns]
PQ Distance/pq_distance_128_8_8/SIMD time: [214.80 ns 214.85 ns 214.91 ns] Found 18 outliers among 100 measurements (18.00%) 3 (3.00%) high mild 15 (15.00%) high severe
PQ Distance/pq_distance_128_8_8/StreamingWithSIMD time: [56.836 ns 56.866 ns 56.897 ns] Found 7 outliers among 100 measurements (7.00%) 2 (2.00%) low severe 3 (3.00%) low mild 1 (1.00%) high mild 1 (1.00%) high severe
PQ Distance/pq_distance_128_8_8/StreamingWithSIMDOptimized time: [29.268 ns 29.306 ns 29.347 ns] Found 4 outliers among 100 measurements (4.00%) 4 (4.00%) high mild
PQ Distance/pq_distance_128_8_16/Scalar time: [66.077 ns 66.131 ns 66.188 ns] Found 4 outliers among 100 measurements (4.00%) 2 (2.00%) low severe 1 (1.00%) low mild 1 (1.00%) high mild
PQ Distance/pq_distance_128_8_16/SIMD time: [206.74 ns 206.88 ns 207.07 ns] Found 9 outliers among 100 measurements (9.00%) 3 (3.00%) high mild 6 (6.00%) high severe
PQ Distance/pq_distance_128_8_16/StreamingWithSIMD time: [55.219 ns 55.304 ns 55.391 ns] Found 9 outliers among 100 measurements (9.00%) 7 (7.00%) low mild 1 (1.00%) high mild 1 (1.00%) high severe
PQ Distance/pq_distance_128_8_16/StreamingWithSIMDOptimized time: [28.727 ns 28.776 ns 28.826 ns] Found 4 outliers among 100 measurements (4.00%) 3 (3.00%) high mild 1 (1.00%) high severe
PQ Distance/pq_distance_128_16_4/Scalar time: [53.886 ns 53.967 ns 54.052 ns] Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild
PQ Distance/pq_distance_128_16_4/SIMD time: [133.67 ns 133.71 ns 133.74 ns] Found 7 outliers among 100 measurements (7.00%) 1 (1.00%) low mild 3 (3.00%) high mild 3 (3.00%) high severe
PQ Distance/pq_distance_128_16_4/StreamingWithSIMD time: [41.210 ns 41.344 ns 41.488 ns] Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild
PQ Distance/pq_distance_128_16_4/StreamingWithSIMDOptimized time: [27.865 ns 27.949 ns 28.045 ns] Found 15 outliers among 100 measurements (15.00%) 8 (8.00%) high mild 7 (7.00%) high severe
PQ Distance/pq_distance_128_16_8/Scalar time: [54.232 ns 54.313 ns 54.407 ns] Found 4 outliers among 100 measurements (4.00%) 4 (4.00%) high mild
PQ Distance/pq_distance_128_16_8/SIMD time: [133.60 ns 133.64 ns 133.69 ns] Found 13 outliers among 100 measurements (13.00%) 1 (1.00%) low severe 2 (2.00%) low mild 2 (2.00%) high mild 8 (8.00%) high severe
PQ Distance/pq_distance_128_16_8/StreamingWithSIMD time: [41.559 ns 41.679 ns 41.799 ns] Found 3 outliers among 100 measurements (3.00%) 3 (3.00%) high mild
PQ Distance/pq_distance_128_16_8/StreamingWithSIMDOptimized time: [28.074 ns 28.225 ns 28.385 ns] Found 6 outliers among 100 measurements (6.00%) 6 (6.00%) high mild
PQ Distance/pq_distance_128_16_16/Scalar time: [53.951 ns 54.024 ns 54.095 ns] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild
PQ Distance/pq_distance_128_16_16/SIMD time: [129.19 ns 129.23 ns 129.27 ns] Found 8 outliers among 100 measurements (8.00%) 3 (3.00%) high mild 5 (5.00%) high severe
PQ Distance/pq_distance_128_16_16/StreamingWithSIMD time: [42.261 ns 42.312 ns 42.367 ns] Found 3 outliers among 100 measurements (3.00%) 2 (2.00%) high mild 1 (1.00%) high severe
PQ Distance/pq_distance_128_16_16/StreamingWithSIMDOptimized time: [28.453 ns 28.542 ns 28.630 ns] Found 4 outliers among 100 measurements (4.00%) 4 (4.00%) low mild
PQ Distance/pq_distance_128_32_4/Scalar time: [63.255 ns 63.305 ns 63.362 ns] Found 4 outliers among 100 measurements (4.00%) 3 (3.00%) high mild 1 (1.00%) high severe
PQ Distance/pq_distance_128_32_4/SIMD time: [63.086 ns 63.144 ns 63.209 ns] Found 8 outliers among 100 measurements (8.00%) 6 (6.00%) high mild 2 (2.00%) high severe
PQ Distance/pq_distance_128_32_4/StreamingWithSIMD time: [27.536 ns 27.611 ns 27.689 ns] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild
PQ Distance/pq_distance_128_32_4/StreamingWithSIMDOptimized time: [17.026 ns 17.156 ns 17.290 ns]
PQ Distance/pq_distance_128_32_8/Scalar time: [67.020 ns 67.048 ns 67.079 ns] Found 5 outliers among 100 measurements (5.00%) 3 (3.00%) high mild 2 (2.00%) high severe
PQ Distance/pq_distance_128_32_8/SIMD time: [66.837 ns 66.869 ns 66.900 ns] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild
PQ Distance/pq_distance_128_32_8/StreamingWithSIMD time: [34.449 ns 34.499 ns 34.549 ns] Found 5 outliers among 100 measurements (5.00%) 5 (5.00%) high mild
PQ Distance/pq_distance_128_32_8/StreamingWithSIMDOptimized time: [25.263 ns 25.303 ns 25.346 ns] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild
PQ Distance/pq_distance_128_32_16/Scalar time: [66.733 ns 66.761 ns 66.791 ns] Found 6 outliers among 100 measurements (6.00%) 4 (4.00%) high mild 2 (2.00%) high severe
PQ Distance/pq_distance_128_32_16/SIMD time: [66.378 ns 66.417 ns 66.456 ns] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild
PQ Distance/pq_distance_128_32_16/StreamingWithSIMD time: [29.528 ns 29.622 ns 29.716 ns] Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) low mild 1 (1.00%) high mild
PQ Distance/pq_distance_128_32_16/StreamingWithSIMDOptimized time: [19.432 ns 19.665 ns 19.897 ns] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild
PQ Distance/pq_distance_128_64_4/Scalar time: [32.272 ns 32.288 ns 32.305 ns] Found 6 outliers among 100 measurements (6.00%) 1 (1.00%) low mild 5 (5.00%) high mild
PQ Distance/pq_distance_128_64_4/SIMD time: [34.098 ns 34.121 ns 34.146 ns] Found 4 outliers among 100 measurements (4.00%) 3 (3.00%) high mild 1 (1.00%) high severe
PQ Distance/pq_distance_128_64_4/StreamingWithSIMD time: [20.305 ns 20.423 ns 20.541 ns] Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild
PQ Distance/pq_distance_128_64_4/StreamingWithSIMDOptimized time: [14.519 ns 14.953 ns 15.449 ns]
PQ Distance/pq_distance_128_64_8/Scalar time: [32.211 ns 32.227 ns 32.243 ns] Found 6 outliers among 100 measurements (6.00%) 1 (1.00%) low mild 4 (4.00%) high mild 1 (1.00%) high severe
PQ Distance/pq_distance_128_64_8/SIMD time: [34.045 ns 34.062 ns 34.079 ns] Found 3 outliers among 100 measurements (3.00%) 1 (1.00%) high mild 2 (2.00%) high severe
PQ Distance/pq_distance_128_64_8/StreamingWithSIMD time: [20.404 ns 20.503 ns 20.607 ns] Found 5 outliers among 100 measurements (5.00%) 3 (3.00%) low mild 2 (2.00%) high mild
PQ Distance/pq_distance_128_64_8/StreamingWithSIMDOptimized time: [15.654 ns 16.234 ns 16.879 ns]
PQ Distance/pq_distance_128_64_16/Scalar time: [31.293 ns 31.316 ns 31.338 ns] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild
PQ Distance/pq_distance_128_64_16/SIMD time: [33.303 ns 33.350 ns 33.399 ns] Found 5 outliers among 100 measurements (5.00%) 5 (5.00%) high mild
PQ Distance/pq_distance_128_64_16/StreamingWithSIMD time: [22.811 ns 22.922 ns 23.044 ns] Found 4 outliers among 100 measurements (4.00%) 1 (1.00%) high mild 3 (3.00%) high severe
PQ Distance/pq_distance_128_64_16/StreamingWithSIMDOptimized time: [17.949 ns 18.738 ns 19.576 ns]
PQ Distance/pq_distance_128_128_4/Scalar time: [19.348 ns 19.433 ns 19.527 ns] Found 4 outliers among 100 measurements (4.00%) 4 (4.00%) high mild
PQ Distance/pq_distance_128_128_4/SIMD time: [22.293 ns 22.362 ns 22.438 ns] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild
PQ Distance/pq_distance_128_128_4/StreamingWithSIMD time: [20.037 ns 20.098 ns 20.161 ns] Found 3 outliers among 100 measurements (3.00%) 3 (3.00%) high mild
PQ Distance/pq_distance_128_128_4/StreamingWithSIMDOptimized time: [11.898 ns 11.948 ns 12.005 ns] Found 8 outliers among 100 measurements (8.00%) 6 (6.00%) high mild 2 (2.00%) high severe
PQ Distance/pq_distance_128_128_8/Scalar time: [31.731 ns 31.762 ns 31.794 ns] Found 3 outliers among 100 measurements (3.00%) 3 (3.00%) high mild
PQ Distance/pq_distance_128_128_8/SIMD time: [31.631 ns 31.684 ns 31.743 ns] Found 5 outliers among 100 measurements (5.00%) 4 (4.00%) high mild 1 (1.00%) high severe
PQ Distance/pq_distance_128_128_8/StreamingWithSIMD time: [27.717 ns 27.750 ns 27.786 ns] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild
PQ Distance/pq_distance_128_128_8/StreamingWithSIMDOptimized time: [23.095 ns 23.120 ns 23.147 ns] Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild
PQ Distance/pq_distance_128_128_16/Scalar time: [20.898 ns 20.921 ns 20.946 ns] Found 7 outliers among 100 measurements (7.00%) 2 (2.00%) low mild 5 (5.00%) high mild
PQ Distance/pq_distance_128_128_16/SIMD time: [23.345 ns 23.508 ns 23.661 ns]
PQ Distance/pq_distance_128_128_16/StreamingWithSIMD time: [22.527 ns 22.568 ns 22.607 ns] Found 7 outliers among 100 measurements (7.00%) 2 (2.00%) low severe 4 (4.00%) low mild 1 (1.00%) high severe
PQ Distance/pq_distance_128_128_16/StreamingWithSIMDOptimized time: [13.975 ns 14.117 ns 14.270 ns] Found 4 outliers among 100 measurements (4.00%) 4 (4.00%) high mild
PQ Distance/pq_distance_256_4_4/Scalar time: [209.05 ns 209.10 ns 209.16 ns] Found 4 outliers among 100 measurements (4.00%) 1 (1.00%) high mild 3 (3.00%) high severe
PQ Distance/pq_distance_256_4_4/SIMD time: [408.44 ns 408.71 ns 408.99 ns] Found 3 outliers among 100 measurements (3.00%) 2 (2.00%) high mild 1 (1.00%) high severe
PQ Distance/pq_distance_256_4_4/StreamingWithSIMD time: [175.01 ns 175.07 ns 175.13 ns] Found 3 outliers among 100 measurements (3.00%) 1 (1.00%) high mild 2 (2.00%) high severe
PQ Distance/pq_distance_256_4_4/StreamingWithSIMDOptimized time: [86.214 ns 86.516 ns 86.879 ns]
PQ Distance/pq_distance_256_4_8/Scalar time: [215.74 ns 215.78 ns 215.83 ns] Found 11 outliers among 100 measurements (11.00%) 2 (2.00%) low severe 3 (3.00%) high mild 6 (6.00%) high severe
PQ Distance/pq_distance_256_4_8/SIMD time: [408.05 ns 408.23 ns 408.43 ns] Found 14 outliers among 100 measurements (14.00%) 6 (6.00%) high mild 8 (8.00%) high severe
PQ Distance/pq_distance_256_4_8/StreamingWithSIMD time: [174.98 ns 175.02 ns 175.07 ns] Found 11 outliers among 100 measurements (11.00%) 3 (3.00%) low mild 3 (3.00%) high mild 5 (5.00%) high severe
PQ Distance/pq_distance_256_4_8/StreamingWithSIMDOptimized time: [89.142 ns 89.434 ns 89.686 ns] Found 11 outliers among 100 measurements (11.00%) 6 (6.00%) low severe 3 (3.00%) low mild 2 (2.00%) high mild
PQ Distance/pq_distance_256_4_16/Scalar time: [218.03 ns 218.12 ns 218.23 ns] Found 3 outliers among 100 measurements (3.00%) 1 (1.00%) high mild 2 (2.00%) high severe
PQ Distance/pq_distance_256_4_16/SIMD time: [414.44 ns 414.73 ns 415.05 ns] Found 5 outliers among 100 measurements (5.00%) 4 (4.00%) high mild 1 (1.00%) high severe
PQ Distance/pq_distance_256_4_16/StreamingWithSIMD time: [176.47 ns 176.57 ns 176.68 ns] Found 3 outliers among 100 measurements (3.00%) 1 (1.00%) low mild 2 (2.00%) high severe
PQ Distance/pq_distance_256_4_16/StreamingWithSIMDOptimized time: [89.714 ns 89.792 ns 89.860 ns] Found 8 outliers among 100 measurements (8.00%) 2 (2.00%) low severe 2 (2.00%) low mild 2 (2.00%) high mild 2 (2.00%) high severe
PQ Distance/pq_distance_256_8_4/Scalar time: [123.45 ns 123.50 ns 123.56 ns] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild
PQ Distance/pq_distance_256_8_4/SIMD time: [403.73 ns 403.85 ns 403.99 ns] Found 8 outliers among 100 measurements (8.00%) 5 (5.00%) high mild 3 (3.00%) high severe
PQ Distance/pq_distance_256_8_4/StreamingWithSIMD time: [94.539 ns 94.603 ns 94.661 ns] Found 8 outliers among 100 measurements (8.00%) 6 (6.00%) low mild 2 (2.00%) high mild
PQ Distance/pq_distance_256_8_4/StreamingWithSIMDOptimized time: [49.279 ns 49.342 ns 49.408 ns] Found 7 outliers among 100 measurements (7.00%) 5 (5.00%) low mild 2 (2.00%) high mild
PQ Distance/pq_distance_256_8_8/Scalar time: [123.58 ns 123.64 ns 123.70 ns]
PQ Distance/pq_distance_256_8_8/SIMD time: [400.15 ns 400.26 ns 400.37 ns] Found 9 outliers among 100 measurements (9.00%) 2 (2.00%) high mild 7 (7.00%) high severe
PQ Distance/pq_distance_256_8_8/StreamingWithSIMD time: [94.156 ns 94.218 ns 94.285 ns] Found 4 outliers among 100 measurements (4.00%) 1 (1.00%) low mild 2 (2.00%) high mild 1 (1.00%) high severe
PQ Distance/pq_distance_256_8_8/StreamingWithSIMDOptimized time: [48.542 ns 48.596 ns 48.652 ns] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild
PQ Distance/pq_distance_256_8_16/Scalar time: [124.66 ns 124.72 ns 124.78 ns]
PQ Distance/pq_distance_256_8_16/SIMD time: [406.10 ns 406.32 ns 406.60 ns] Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild
PQ Distance/pq_distance_256_8_16/StreamingWithSIMD time: [98.313 ns 98.348 ns 98.387 ns] Found 11 outliers among 100 measurements (11.00%) 2 (2.00%) low severe 1 (1.00%) low mild 7 (7.00%) high mild 1 (1.00%) high severe
PQ Distance/pq_distance_256_8_16/StreamingWithSIMDOptimized time: [50.778 ns 50.824 ns 50.873 ns] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) low mild
PQ Distance/pq_distance_256_16_4/Scalar time: [104.12 ns 104.23 ns 104.34 ns] Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild
PQ Distance/pq_distance_256_16_4/SIMD time: [262.32 ns 264.12 ns 265.78 ns] Found 13 outliers among 100 measurements (13.00%) 12 (12.00%) high mild 1 (1.00%) high severe
PQ Distance/pq_distance_256_16_4/StreamingWithSIMD time: [65.708 ns 65.847 ns 66.001 ns] Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild
PQ Distance/pq_distance_256_16_4/StreamingWithSIMDOptimized time: [42.713 ns 42.809 ns 42.907 ns] Found 5 outliers among 100 measurements (5.00%) 5 (5.00%) high mild
PQ Distance/pq_distance_256_16_8/Scalar time: [102.64 ns 102.73 ns 102.82 ns] Found 5 outliers among 100 measurements (5.00%) 5 (5.00%) high mild
PQ Distance/pq_distance_256_16_8/SIMD time: [258.57 ns 258.65 ns 258.74 ns] Found 12 outliers among 100 measurements (12.00%) 1 (1.00%) low mild 4 (4.00%) high mild 7 (7.00%) high severe
PQ Distance/pq_distance_256_16_8/StreamingWithSIMD time: [62.677 ns 62.779 ns 62.883 ns]
PQ Distance/pq_distance_256_16_8/StreamingWithSIMDOptimized time: [41.550 ns 41.671 ns 41.785 ns]
PQ Distance/pq_distance_256_16_16/Scalar time: [100.43 ns 100.53 ns 100.64 ns] Found 3 outliers among 100 measurements (3.00%) 3 (3.00%) high mild
PQ Distance/pq_distance_256_16_16/SIMD time: [261.29 ns 261.40 ns 261.55 ns] Found 8 outliers among 100 measurements (8.00%) 2 (2.00%) high mild 6 (6.00%) high severe
PQ Distance/pq_distance_256_16_16/StreamingWithSIMD time: [65.905 ns 66.046 ns 66.186 ns] Found 6 outliers among 100 measurements (6.00%) 2 (2.00%) low mild 4 (4.00%) high mild
PQ Distance/pq_distance_256_16_16/StreamingWithSIMDOptimized time: [42.725 ns 42.851 ns 42.976 ns] Found 3 outliers among 100 measurements (3.00%) 2 (2.00%) low mild 1 (1.00%) high mild
PQ Distance/pq_distance_256_32_4/Scalar time: [133.14 ns 133.20 ns 133.27 ns] Found 8 outliers among 100 measurements (8.00%) 3 (3.00%) high mild 5 (5.00%) high severe
PQ Distance/pq_distance_256_32_4/SIMD time: [132.55 ns 132.72 ns 132.91 ns] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild
PQ Distance/pq_distance_256_32_4/StreamingWithSIMD time: [47.306 ns 47.444 ns 47.564 ns] Found 4 outliers among 100 measurements (4.00%) 1 (1.00%) low severe 2 (2.00%) low mild 1 (1.00%) high mild
PQ Distance/pq_distance_256_32_4/StreamingWithSIMDOptimized time: [33.471 ns 33.635 ns 33.800 ns] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild
PQ Distance/pq_distance_256_32_8/Scalar time: [132.66 ns 132.72 ns 132.78 ns] Found 8 outliers among 100 measurements (8.00%) 6 (6.00%) high mild 2 (2.00%) high severe
PQ Distance/pq_distance_256_32_8/SIMD time: [131.43 ns 131.49 ns 131.56 ns] Found 9 outliers among 100 measurements (9.00%) 4 (4.00%) high mild 5 (5.00%) high severe
PQ Distance/pq_distance_256_32_8/StreamingWithSIMD time: [47.014 ns 47.125 ns 47.239 ns]
PQ Distance/pq_distance_256_32_8/StreamingWithSIMDOptimized time: [33.018 ns 33.289 ns 33.566 ns]
PQ Distance/pq_distance_256_32_16/Scalar time: [134.78 ns 134.83 ns 134.88 ns] Found 8 outliers among 100 measurements (8.00%) 6 (6.00%) high mild 2 (2.00%) high severe
PQ Distance/pq_distance_256_32_16/SIMD time: [132.72 ns 132.78 ns 132.87 ns] Found 5 outliers among 100 measurements (5.00%) 2 (2.00%) high mild 3 (3.00%) high severe
PQ Distance/pq_distance_256_32_16/StreamingWithSIMD time: [48.935 ns 49.024 ns 49.111 ns]
PQ Distance/pq_distance_256_32_16/StreamingWithSIMDOptimized time: [33.833 ns 34.123 ns 34.411 ns] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) low mild
PQ Distance/pq_distance_256_64_4/Scalar time: [70.670 ns 70.721 ns 70.774 ns] Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild
PQ Distance/pq_distance_256_64_4/SIMD time: [70.441 ns 70.473 ns 70.508 ns] Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) high mild 1 (1.00%) high severe
PQ Distance/pq_distance_256_64_4/StreamingWithSIMD time: [38.372 ns 38.424 ns 38.477 ns] Found 5 outliers among 100 measurements (5.00%) 4 (4.00%) low mild 1 (1.00%) high mild
PQ Distance/pq_distance_256_64_4/StreamingWithSIMDOptimized time: [23.068 ns 23.190 ns 23.310 ns] Found 3 outliers among 100 measurements (3.00%) 2 (2.00%) low mild 1 (1.00%) high mild
PQ Distance/pq_distance_256_64_8/Scalar time: [70.654 ns 70.701 ns 70.750 ns] Found 3 outliers among 100 measurements (3.00%) 2 (2.00%) high mild 1 (1.00%) high severe
PQ Distance/pq_distance_256_64_8/SIMD time: [70.579 ns 70.620 ns 70.662 ns] Found 8 outliers among 100 measurements (8.00%) 2 (2.00%) low mild 5 (5.00%) high mild 1 (1.00%) high severe
PQ Distance/pq_distance_256_64_8/StreamingWithSIMD time: [37.805 ns 37.864 ns 37.923 ns] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild
PQ Distance/pq_distance_256_64_8/StreamingWithSIMDOptimized time: [24.155 ns 24.253 ns 24.357 ns] Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) high mild 1 (1.00%) high severe
PQ Distance/pq_distance_256_64_16/Scalar time: [68.409 ns 68.464 ns 68.524 ns] Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild
PQ Distance/pq_distance_256_64_16/SIMD time: [68.118 ns 68.178 ns 68.241 ns] Found 8 outliers among 100 measurements (8.00%) 7 (7.00%) high mild 1 (1.00%) high severe
PQ Distance/pq_distance_256_64_16/StreamingWithSIMD time: [38.220 ns 38.293 ns 38.366 ns] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high severe
PQ Distance/pq_distance_256_64_16/StreamingWithSIMDOptimized time: [24.347 ns 24.462 ns 24.581 ns] Found 3 outliers among 100 measurements (3.00%) 1 (1.00%) low mild 2 (2.00%) high mild
PQ Distance/pq_distance_256_128_4/Scalar time: [35.418 ns 35.469 ns 35.519 ns] Found 5 outliers among 100 measurements (5.00%) 5 (5.00%) high mild
PQ Distance/pq_distance_256_128_4/SIMD time: [35.168 ns 35.247 ns 35.331 ns] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild
PQ Distance/pq_distance_256_128_4/StreamingWithSIMD time: [33.506 ns 33.956 ns 34.395 ns] Found 6 outliers among 100 measurements (6.00%) 6 (6.00%) low mild
PQ Distance/pq_distance_256_128_4/StreamingWithSIMDOptimized time: [23.185 ns 23.253 ns 23.318 ns] Found 5 outliers among 100 measurements (5.00%) 2 (2.00%) low severe 3 (3.00%) low mild
PQ Distance/pq_distance_256_128_8/Scalar time: [36.406 ns 36.462 ns 36.524 ns] Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild
PQ Distance/pq_distance_256_128_8/SIMD time: [36.213 ns 36.265 ns 36.318 ns] Found 4 outliers among 100 measurements (4.00%) 2 (2.00%) high mild 2 (2.00%) high severe
PQ Distance/pq_distance_256_128_8/StreamingWithSIMD time: [30.876 ns 32.099 ns 33.236 ns]
PQ Distance/pq_distance_256_128_8/StreamingWithSIMDOptimized time: [23.784 ns 23.929 ns 24.073 ns]
PQ Distance/pq_distance_256_128_16/Scalar time: [36.807 ns 36.846 ns 36.885 ns]
PQ Distance/pq_distance_256_128_16/SIMD time: [37.935 ns 37.972 ns 38.009 ns] Found 5 outliers among 100 measurements (5.00%) 2 (2.00%) low mild 3 (3.00%) high mild
PQ Distance/pq_distance_256_128_16/StreamingWithSIMD time: [31.003 ns 31.569 ns 32.101 ns] Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild
PQ Distance/pq_distance_256_128_16/StreamingWithSIMDOptimized time: [25.266 ns 25.332 ns 25.401 ns] Found 4 outliers among 100 measurements (4.00%) 3 (3.00%) low mild 1 (1.00%) high mild
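The speedup can be checked directly from the middle value of each reported time interval. The sketch below does this for four configurations copied from the benchmark output above. The constant `MEDIANS_NS` and the functions `speedup` and `speedups` are hypothetical helpers written for this illustration, and the four rows are a hand-picked sample, not the full result set.

```rust
/// (configuration, Scalar, StreamingWithSIMD, StreamingWithSIMDOptimized),
/// middle value of each reported interval in nanoseconds, copied from the
/// benchmark output.
const MEDIANS_NS: [(&str, f64, f64, f64); 4] = [
    ("128_4_4", 111.78, 95.529, 48.041),
    ("128_32_4", 63.305, 27.611, 17.156),
    ("128_128_8", 31.762, 27.750, 23.120),
    ("256_4_4", 209.10, 175.07, 86.516),
];

/// How many times faster the optimized variant is than the baseline.
fn speedup(baseline_ns: f64, optimized_ns: f64) -> f64 {
    baseline_ns / optimized_ns
}

/// Returns (configuration, speedup vs Scalar, speedup vs StreamingWithSIMD).
fn speedups() -> Vec<(&'static str, f64, f64)> {
    MEDIANS_NS
        .iter()
        .map(|&(name, scalar, streaming, optimized)| {
            (name, speedup(scalar, optimized), speedup(streaming, optimized))
        })
        .collect()
}
```

For this sample, the gain over StreamingWithSIMD is close to 2x with 4 subvectors and shrinks to about 1.2x with 128 subvectors and 8 bits, while the gain over Scalar stays above 2x except in the 128-subvector case.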