NVIDIA / cccl

Prototype tile state for decoupled look-back that bit-packs both tile state and offset into 64-bit offset types #2055

Closed elstehle closed 1 month ago

elstehle commented 1 month ago

Issue https://github.com/NVIDIA/cccl/issues/220 describes a tile state that uses bit-packing to combine a tile's message type with its status flag. The proposal here is to apply the same approach to CUB algorithms that use decoupled look-back with the offset type as the tile's message type (i.e., to propagate the partial or inclusive result of the scan), so that the status flag is bit-packed into the offset value itself.
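To make the bit-packing idea concrete, here is a minimal host-side sketch. It is not CUB's implementation: the names (`TileStatus`, `pack`, `unpack_status`, `unpack_offset`, `fits`) and the layout (status in the top 2 bits, offset in the low 62 bits of a 64-bit word) are assumptions chosen for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical status values for a tile in decoupled look-back.
enum class TileStatus : std::uint64_t
{
  Invalid   = 0, // no result published yet
  Partial   = 1, // tile-local aggregate is available
  Inclusive = 2  // inclusive prefix is available
};

// Assumed layout: the top 2 bits hold the status, the low 62 bits hold the offset.
constexpr int status_bits           = 2;
constexpr int offset_bits           = 64 - status_bits;
constexpr std::uint64_t offset_mask = (std::uint64_t{1} << offset_bits) - 1;

// Combine status and offset into one 64-bit word, so that a single
// 64-bit store can publish both at once.
constexpr std::uint64_t pack(TileStatus status, std::uint64_t offset)
{
  return (static_cast<std::uint64_t>(status) << offset_bits) | (offset & offset_mask);
}

constexpr TileStatus unpack_status(std::uint64_t word)
{
  return static_cast<TileStatus>(word >> offset_bits);
}

constexpr std::uint64_t unpack_offset(std::uint64_t word)
{
  return word & offset_mask;
}

// True if the offset fits in the reduced 62-bit range, i.e. packing loses no bits.
constexpr bool fits(std::uint64_t offset)
{
  return offset <= offset_mask;
}

// Round-trip check for one (status, offset) pair; aborts on mismatch.
inline void check_roundtrip(TileStatus status, std::uint64_t offset)
{
  const std::uint64_t word = pack(status, offset);
  assert(unpack_status(word) == status);
  assert(unpack_offset(word) == offset);
}
```

The trade-off in this sketch is that the usable offset range shrinks by the number of status bits, which is why `fits` is checked before packing; whether that reduced range is acceptable for a given algorithm is a separate design question.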

elstehle commented 1 month ago

In the following, I have benchmarked three versions of the DeviceSelect algorithm:

  1. baseline (main branch, no modifications)
  2. baseline, but using the same tuning policy for 32-bit offset types as for 64-bit offset types
  3. bit-packed tile state, using the same tuning policy for 32-bit offset types as for 64-bit offset types

Benchmark results when using the same policies for 32-bit and 64-bit offsets:

Absolute performance numbers

| T{ct} | OffsetT{ct} | IsInPlace{ct} | Elements{io} | Entropy | Samples | CPU Time | Noise | GPU Time | Noise | Elem/s | GlobalMem BW | BWUtil |
|-------|-------------|---------------|--------------|---------|---------|----------|-------|----------|-------|--------|--------------|--------|
| I8 | I32 | false | 2^16 = 65536 | 1.000 | 51232x | 14.932 us | 53.11% | 9.761 us | 2.25% | 6.714G | 13.401 GB/s | 0.66% |
| I8 | I32 | false | 2^20 = 1048576 | 1.000 | 40640x | 17.458 us | 41.95% | 12.308 us | 1.90% | 85.196G | 170.061 GB/s | 8.34% |
| I8 | I32 | false | 2^24 = 16777216 | 1.000 | 9664x | 57.327 us | 10.75% | 51.803 us | 1.28% | 323.863G | 646.467 GB/s | 31.70% |
| I8 | I32 | false | 2^28 = 268435456 | 1.000 | 1060x | 702.723 us | 0.93% | 697.287 us | 0.50% | 384.971G | 768.439 GB/s | 37.69% |
| I8 | I32 | false | 2^16 = 65536 | 0.544 | 50976x | 15.200 us | 56.81% | 9.809 us | 2.29% | 6.681G | 10.289 GB/s | 0.50% |
| I8 | I32 | false | 2^20 = 1048576 | 0.544 | 42016x | 17.149 us | 44.25% | 11.902 us | 2.06% | 88.103G | 135.603 GB/s | 6.65% |
| I8 | I32 | false | 2^24 = 16777216 | 0.544 | 10320x | 54.026 us | 11.54% | 48.468 us | 1.10% | 346.147G | 532.797 GB/s | 26.13% |
| I8 | I32 | false | 2^28 = 268435456 | 0.544 | 981x | 647.775 us | 0.98% | 642.356 us | 0.50% | 417.892G | 643.156 GB/s | 31.54% |
| I8 | I32 | false | 2^16 = 65536 | 0.000 | 52096x | 14.987 us | 56.26% | 9.599 us | 2.23% | 6.827G | 6.828 GB/s | 0.33% |
| I8 | I32 | false | 2^20 = 1048576 | 0.000 | 43408x | 16.747 us | 45.49% | 11.521 us | 2.11% | 91.013G | 91.013 GB/s | 4.46% |
| I8 | I32 | false | 2^24 = 16777216 | 0.000 | 11792x | 47.801 us | 14.28% | 42.429 us | 1.14% | 395.418G | 395.418 GB/s | 19.39% |
| I8 | I32 | false | 2^28 = 268435456 | 0.000 | 1883x | 533.049 us | 1.17% | 527.489 us | 0.50% | 508.893G | 508.893 GB/s | 24.96% |
| I8 | U32 | false | 2^16 = 65536 | 1.000 | 52240x | 14.762 us | 54.46% | 9.574 us | 2.14% | 6.845G | 13.664 GB/s | 0.67% |
| I8 | U32 | false | 2^20 = 1048576 | 1.000 | 40880x | 17.633 us | 44.26% | 12.235 us | 1.98% | 85.705G | 171.076 GB/s | 8.39% |
| I8 | U32 | false | 2^24 = 16777216 | 1.000 | 9968x | 55.492 us | 10.69% | 50.175 us | 1.23% | 334.374G | 667.447 GB/s | 32.73% |
| I8 | U32 | false | 2^28 = 268435456 | 1.000 | 1413x | 683.893 us | 0.95% | 678.384 us | 0.50% | 395.698G | 789.852 GB/s | 38.74% |
| I8 | U32 | false | 2^16 = 65536 | 0.544 | 52240x | 14.837 us | 55.13% | 9.572 us | 2.30% | 6.847G | 10.544 GB/s | 0.52% |
| I8 | U32 | false | 2^20 = 1048576 | 0.544 | 41536x | 17.421 us | 44.85% | 12.038 us | 2.06% | 87.106G | 134.068 GB/s | 6.58% |
| I8 | U32 | false | 2^24 = 16777216 | 0.544 | 10416x | 53.383 us | 11.26% | 48.022 us | 1.09% | 349.367G | 537.754 GB/s | 26.37% |
| I8 | U32 | false | 2^28 = 268435456 | 0.544 | 786x | 642.081 us | 1.01% | 636.531 us | 0.50% | 421.716G | 649.042 GB/s | 31.83% |
| I8 | U32 | false | 2^16 = 65536 | 0.000 | 53360x | 14.632 us | 56.29% | 9.371 us | 2.29% | 6.993G | 6.994 GB/s | 0.34% |
| I8 | U32 | false | 2^20 = 1048576 | 0.000 | 42480x | 17.150 us | 45.77% | 11.773 us | 1.80% | 89.065G | 89.065 GB/s | 4.37% |
| I8 | U32 | false | 2^24 = 16777216 | 0.000 | 11808x | 47.724 us | 12.72% | 42.373 us | 1.14% | 395.938G | 395.939 GB/s | 19.42% |
| I8 | U32 | false | 2^28 = 268435456 | 0.000 | 1801x | 531.237 us | 1.16% | 525.709 us | 0.50% | 510.616G | 510.616 GB/s | 25.04% |
| I8 | I64 | false | 2^16 = 65536 | 1.000 | 53008x | 14.632 us | 55.28% | 9.433 us | 2.32% | 6.948G | 13.868 GB/s | 0.68% |
| I8 | I64 | false | 2^20 = 1048576 | 1.000 | 40992x | 17.542 us | 43.85% | 12.202 us | 1.58% | 85.936G | 171.537 GB/s | 8.41% |
| I8 | I64 | false | 2^24 = 16777216 | 1.000 | 7952x | 68.267 us | 8.50% | 62.936 us | 0.57% | 266.576G | 532.116 GB/s | 26.10% |
| I8 | I64 | false | 2^28 = 268435456 | 1.000 | 568x | 886.192 us | 1.12% | 880.366 us | 0.20% | 304.913G | 608.637 GB/s | 29.85% |
| I8 | I64 | false | 2^16 = 65536 | 0.544 | 53408x | 14.613 us | 56.20% | 9.363 us | 2.02% | 7.000G | 10.780 GB/s | 0.53% |
| I8 | I64 | false | 2^20 = 1048576 | 0.544 | 41968x | 17.275 us | 45.13% | 11.916 us | 1.92% | 88.001G | 135.446 GB/s | 6.64% |
| I8 | I64 | false | 2^24 = 16777216 | 0.544 | 8384x | 65.052 us | 8.99% | 59.705 us | 0.60% | 281.002G | 432.524 GB/s | 21.21% |
| I8 | I64 | false | 2^28 = 268435456 | 0.544 | 613x | 822.489 us | 0.73% | 816.964 us | 0.28% | 328.577G | 505.696 GB/s | 24.80% |
| I8 | I64 | false | 2^16 = 65536 | 0.000 | 53872x | 14.452 us | 55.80% | 9.282 us | 1.87% | 7.061G | 7.062 GB/s | 0.35% |
| I8 | I64 | false | 2^20 = 1048576 | 0.000 | 41936x | 17.382 us | 45.87% | 11.927 us | 2.54% | 87.920G | 87.920 GB/s | 4.31% |
| I8 | I64 | false | 2^24 = 16777216 | 0.000 | 9248x | 59.601 us | 10.14% | 54.134 us | 0.64% | 309.922G | 309.922 GB/s | 15.20% |
| I8 | I64 | false | 2^28 = 268435456 | 0.000 | 703x | 717.358 us | 0.88% | 711.734 us | 0.38% | 377.157G | 377.157 GB/s | 18.50% |
| I8 | U64 | false | 2^16 = 65536 | 1.000 | 51216x | 15.079 us | 54.61% | 9.764 us | 1.71% | 6.712G | 13.398 GB/s | 0.66% |
| I8 | U64 | false | 2^20 = 1048576 | 1.000 | 40048x | 17.952 us | 43.89% | 12.488 us | 1.98% | 83.967G | 167.607 GB/s | 8.22% |
| I8 | U64 | false | 2^24 = 16777216 | 1.000 | 7904x | 68.986 us | 8.93% | 63.348 us | 0.56% | 264.842G | 528.653 GB/s | 25.93% |
| I8 | U64 | false | 2^28 = 268435456 | 1.000 | 568x | 886.301 us | 0.67% | 880.735 us | 0.21% | 304.786G | 608.382 GB/s | 29.84% |
| I8 | U64 | false | 2^16 = 65536 | 0.544 | 50784x | 15.322 us | 55.73% | 9.849 us | 1.60% | 6.654G | 10.248 GB/s | 0.50% |
| I8 | U64 | false | 2^20 = 1048576 | 0.544 | 41568x | 17.342 us | 44.28% | 12.031 us | 2.27% | 87.156G | 134.145 GB/s | 6.58% |
| I8 | U64 | false | 2^24 = 16777216 | 0.544 | 8336x | 65.587 us | 9.37% | 59.988 us | 0.65% | 279.674G | 430.481 GB/s | 21.11% |
| I8 | U64 | false | 2^28 = 268435456 | 0.544 | 612x | 822.863 us | 0.74% | 817.292 us | 0.27% | 328.445G | 505.493 GB/s | 24.79% |
| I8 | U64 | false | 2^16 = 65536 | 0.000 | 50816x | 15.328 us | 55.95% | 9.840 us | 1.80% | 6.660G | 6.661 GB/s | 0.33% |
| I8 | U64 | false | 2^20 = 1048576 | 0.000 | 42592x | 17.046 us | 45.30% | 11.743 us | 2.58% | 89.294G | 89.295 GB/s | 4.38% |
| I8 | U64 | false | 2^24 = 16777216 | 0.000 | 9248x | 59.725 us | 10.47% | 54.085 us | 0.74% | 310.199G | 310.199 GB/s | 15.21% |
| I8 | U64 | false | 2^28 = 268435456 | 0.000 | 703x | 717.234 us | 0.87% | 711.693 us | 0.38% | 377.179G | 377.179 GB/s | 18.50% |
| I16 | I32 | false | 2^16 = 65536 | 1.000 | 48912x | 15.713 us | 53.83% | 10.225 us | 2.05% | 6.409G | 25.638 GB/s | 1.26% |
| I16 | I32 | false | 2^20 = 1048576 | 1.000 | 37552x | 18.607 us | 39.82% | 13.317 us | 1.82% | 78.742G | 314.967 GB/s | 15.45% |
| I16 | I32 | false | 2^24 = 16777216 | 1.000 | 7920x | 68.862 us | 9.55% | 63.162 us | 3.07% | 265.623G | 1.062 TB/s | 52.11% |
| I16 | I32 | false | 2^28 = 268435456 | 1.000 | 1344x | 881.924 us | 1.56% | 876.298 us | 1.42% | 306.329G | 1.225 TB/s | 60.09% |
| I16 | I32 | false | 2^16 = 65536 | 0.544 | 48752x | 15.751 us | 53.69% | 10.258 us | 1.66% | 6.389G | 19.742 GB/s | 0.97% |
| I16 | I32 | false | 2^20 = 1048576 | 0.544 | 37200x | 18.813 us | 40.03% | 13.444 us | 1.71% | 77.996G | 240.871 GB/s | 11.81% |
| I16 | I32 | false | 2^24 = 16777216 | 0.544 | 8912x | 61.730 us | 10.47% | 56.111 us | 2.89% | 299.000G | 923.412 GB/s | 45.29% |
| I16 | I32 | false | 2^28 = 268435456 | 0.544 | 1648x | 707.047 us | 1.65% | 701.473 us | 1.44% | 382.674G | 1.182 TB/s | 57.95% |
| I16 | I32 | false | 2^16 = 65536 | 0.000 | 49312x | 15.619 us | 54.14% | 10.142 us | 2.67% | 6.462G | 12.924 GB/s | 0.63% |
| I16 | I32 | false | 2^20 = 1048576 | 0.000 | 39328x | 17.991 us | 41.60% | 12.718 us | 2.01% | 82.450G | 164.900 GB/s | 8.09% |
| I16 | I32 | false | 2^24 = 16777216 | 0.000 | 10464x | 53.512 us | 12.13% | 47.840 us | 2.53% | 350.691G | 701.383 GB/s | 34.40% |
| I16 | I32 | false | 2^28 = 268435456 | 0.000 | 2160x | 516.337 us | 1.47% | 510.817 us | 0.99% | 525.502G | 1.051 TB/s | 51.54% |
| I16 | U32 | false | 2^16 = 65536 | 1.000 | 49520x | 15.606 us | 54.72% | 10.097 us | 2.50% | 6.490G | 25.962 GB/s | 1.27% |
| I16 | U32 | false | 2^20 = 1048576 | 1.000 | 37648x | 18.546 us | 39.73% | 13.281 us | 1.47% | 78.951G | 315.800 GB/s | 15.49% |
| I16 | U32 | false | 2^24 = 16777216 | 1.000 | 7968x | 68.405 us | 9.47% | 62.790 us | 3.10% | 267.194G | 1.069 TB/s | 52.42% |
| I16 | U32 | false | 2^28 = 268435456 | 1.000 | 1392x | 876.157 us | 1.74% | 870.367 us | 1.41% | 308.416G | 1.234 TB/s | 60.50% |
| I16 | U32 | false | 2^16 = 65536 | 0.544 | 48768x | 15.733 us | 53.60% | 10.254 us | 2.09% | 6.391G | 19.749 GB/s | 0.97% |
| I16 | U32 | false | 2^20 = 1048576 | 0.544 | 37248x | 18.749 us | 39.75% | 13.429 us | 2.23% | 78.084G | 241.143 GB/s | 11.83% |
| I16 | U32 | false | 2^24 = 16777216 | 0.544 | 8960x | 61.453 us | 10.43% | 55.857 us | 2.82% | 300.360G | 927.610 GB/s | 45.49% |
| I16 | U32 | false | 2^28 = 268435456 | 0.544 | 1552x | 703.689 us | 1.62% | 698.129 us | 1.41% | 384.507G | 1.187 TB/s | 58.23% |
| I16 | U32 | false | 2^16 = 65536 | 0.000 | 49296x | 15.643 us | 54.38% | 10.144 us | 2.78% | 6.461G | 12.922 GB/s | 0.63% |
| I16 | U32 | false | 2^20 = 1048576 | 0.000 | 39104x | 18.040 us | 41.19% | 12.786 us | 1.67% | 82.007G | 164.014 GB/s | 8.04% |
| I16 | U32 | false | 2^24 = 16777216 | 0.000 | 10480x | 53.185 us | 11.65% | 47.768 us | 2.59% | 351.223G | 702.446 GB/s | 34.45% |
| I16 | U32 | false | 2^28 = 268435456 | 0.000 | 2160x | 515.826 us | 1.48% | 510.231 us | 0.99% | 526.105G | 1.052 TB/s | 51.60% |
| I16 | I64 | false | 2^16 = 65536 | 1.000 | 50624x | 15.106 us | 53.09% | 9.877 us | 2.30% | 6.635G | 26.542 GB/s | 1.30% |
| I16 | I64 | false | 2^20 = 1048576 | 1.000 | 36864x | 19.028 us | 40.35% | 13.568 us | 1.50% | 77.282G | 309.125 GB/s | 15.16% |
| I16 | I64 | false | 2^24 = 16777216 | 1.000 | 7744x | 70.133 us | 8.67% | 64.672 us | 1.81% | 259.422G | 1.038 TB/s | 50.89% |
| I16 | I64 | false | 2^28 = 268435456 | 1.000 | 2080x | 855.634 us | 1.07% | 850.004 us | 0.84% | 315.805G | 1.263 TB/s | 61.95% |
| I16 | I64 | false | 2^16 = 65536 | 0.544 | 51984x | 14.805 us | 54.03% | 9.620 us | 1.92% | 6.813G | 21.052 GB/s | 1.03% |
| I16 | I64 | false | 2^20 = 1048576 | 0.544 | 37856x | 18.617 us | 40.99% | 13.212 us | 1.34% | 79.365G | 245.099 GB/s | 12.02% |
| I16 | I64 | false | 2^24 = 16777216 | 0.544 | 8576x | 63.774 us | 9.37% | 58.392 us | 1.66% | 287.321G | 887.343 GB/s | 43.52% |
| I16 | I64 | false | 2^28 = 268435456 | 0.544 | 2192x | 746.975 us | 0.97% | 741.448 us | 0.61% | 362.042G | 1.118 TB/s | 54.83% |
| I16 | I64 | false | 2^16 = 65536 | 0.000 | 52800x | 14.734 us | 55.66% | 9.472 us | 2.10% | 6.919G | 13.839 GB/s | 0.68% |
| I16 | I64 | false | 2^20 = 1048576 | 0.000 | 39040x | 18.171 us | 41.94% | 12.810 us | 1.47% | 81.853G | 163.707 GB/s | 8.03% |
| I16 | I64 | false | 2^24 = 16777216 | 0.000 | 9696x | 57.150 us | 10.93% | 51.584 us | 1.55% | 325.242G | 650.484 GB/s | 31.90% |
| I16 | I64 | false | 2^28 = 268435456 | 0.000 | 2112x | 612.839 us | 1.05% | 607.408 us | 0.55% | 441.936G | 883.872 GB/s | 43.35% |
| I16 | U64 | false | 2^16 = 65536 | 1.000 | 50432x | 15.281 us | 54.22% | 9.916 us | 1.97% | 6.609G | 26.437 GB/s | 1.30% |
| I16 | U64 | false | 2^20 = 1048576 | 1.000 | 38176x | 18.293 us | 39.75% | 13.100 us | 1.64% | 80.044G | 320.175 GB/s | 15.70% |
| I16 | U64 | false | 2^24 = 16777216 | 1.000 | 7792x | 69.805 us | 8.83% | 64.267 us | 1.85% | 261.055G | 1.044 TB/s | 51.21% |
| I16 | U64 | false | 2^28 = 268435456 | 1.000 | 1968x | 855.564 us | 1.09% | 850.115 us | 0.88% | 315.764G | 1.263 TB/s | 61.94% |
| I16 | U64 | false | 2^16 = 65536 | 0.544 | 50960x | 15.191 us | 54.91% | 9.814 us | 2.00% | 6.678G | 20.635 GB/s | 1.01% |
| I16 | U64 | false | 2^20 = 1048576 | 0.544 | 38784x | 18.119 us | 40.66% | 12.892 us | 1.58% | 81.335G | 251.183 GB/s | 12.32% |
| I16 | U64 | false | 2^24 = 16777216 | 0.544 | 8560x | 63.951 us | 9.64% | 58.437 us | 1.66% | 287.101G | 886.663 GB/s | 43.48% |
| I16 | U64 | false | 2^28 = 268435456 | 0.544 | 2288x | 746.798 us | 0.95% | 741.380 us | 0.60% | 362.075G | 1.118 TB/s | 54.83% |
| I16 | U64 | false | 2^16 = 65536 | 0.000 | 51328x | 15.157 us | 55.70% | 9.742 us | 2.03% | 6.727G | 13.456 GB/s | 0.66% |
| I16 | U64 | false | 2^20 = 1048576 | 0.000 | 39856x | 17.743 us | 41.52% | 12.548 us | 1.82% | 83.566G | 167.133 GB/s | 8.20% |
| I16 | U64 | false | 2^24 = 16777216 | 0.000 | 9696x | 57.098 us | 10.80% | 51.592 us | 1.55% | 325.191G | 650.383 GB/s | 31.90% |
| I16 | U64 | false | 2^28 = 268435456 | 0.000 | 2256x | 612.887 us | 1.04% | 607.502 us | 0.53% | 441.868G | 883.736 GB/s | 43.34% |
| I32 | I32 | false | 2^16 = 65536 | 1.000 | 47872x | 15.857 us | 51.88% | 10.446 us | 1.99% | 6.274G | 50.189 GB/s | 2.46% |
| I32 | I32 | false | 2^20 = 1048576 | 1.000 | 33840x | 19.955 us | 35.08% | 14.782 us | 1.57% | 70.935G | 567.480 GB/s | 27.83% |
| I32 | I32 | false | 2^24 = 16777216 | 1.000 | 5552x | 95.573 us | 6.51% | 90.207 us | 2.63% | 185.985G | 1.488 TB/s | 72.97% |
| I32 | I32 | false | 2^28 = 268435456 | 1.000 | 1280x | 1.348 ms | 1.07% | 1.343 ms | 0.99% | 199.906G | 1.599 TB/s | 78.43% |
| I32 | I32 | false | 2^16 = 65536 | 0.544 | 50128x | 15.230 us | 52.78% | 9.976 us | 2.12% | 6.569G | 40.599 GB/s | 1.99% |
| I32 | I32 | false | 2^20 = 1048576 | 0.544 | 33920x | 20.128 us | 36.61% | 14.744 us | 1.69% | 71.118G | 439.261 GB/s | 21.54% |
| I32 | I32 | false | 2^24 = 16777216 | 0.544 | 6416x | 83.380 us | 7.39% | 78.023 us | 2.70% | 215.028G | 1.328 TB/s | 65.14% |
| I32 | I32 | false | 2^28 = 268435456 | 0.544 | 1648x | 1.077 ms | 1.25% | 1.072 ms | 1.14% | 250.455G | 1.547 TB/s | 75.86% |
| I32 | I32 | false | 2^16 = 65536 | 0.000 | 51152x | 15.024 us | 53.77% | 9.777 us | 1.68% | 6.703G | 26.813 GB/s | 1.31% |
| I32 | I32 | false | 2^20 = 1048576 | 0.000 | 35856x | 19.292 us | 38.42% | 13.948 us | 1.72% | 75.180G | 300.721 GB/s | 14.75% |
| I32 | I32 | false | 2^24 = 16777216 | 0.000 | 8032x | 67.741 us | 8.89% | 62.353 us | 2.01% | 269.069G | 1.076 TB/s | 52.78% |
| I32 | I32 | false | 2^28 = 268435456 | 0.000 | 2752x | 654.725 us | 1.70% | 649.132 us | 1.46% | 413.530G | 1.654 TB/s | 81.12% |
| I32 | U32 | false | 2^16 = 65536 | 1.000 | 50608x | 15.086 us | 52.85% | 9.880 us | 2.05% | 6.633G | 53.066 GB/s | 2.60% |
| I32 | U32 | false | 2^20 = 1048576 | 1.000 | 33536x | 20.247 us | 35.82% | 14.916 us | 1.50% | 70.297G | 562.379 GB/s | 27.58% |
| I32 | U32 | false | 2^24 = 16777216 | 1.000 | 5552x | 95.475 us | 6.56% | 90.100 us | 2.67% | 186.207G | 1.490 TB/s | 73.06% |
| I32 | U32 | false | 2^28 = 268435456 | 1.000 | 1424x | 1.348 ms | 1.05% | 1.342 ms | 0.97% | 200.017G | 1.600 TB/s | 78.47% |
| I32 | U32 | false | 2^16 = 65536 | 0.544 | 50880x | 15.021 us | 52.93% | 9.829 us | 1.92% | 6.668G | 41.207 GB/s | 2.02% |
| I32 | U32 | false | 2^20 = 1048576 | 0.544 | 34048x | 20.067 us | 36.74% | 14.687 us | 1.78% | 71.393G | 440.956 GB/s | 21.63% |
| I32 | U32 | false | 2^24 = 16777216 | 0.544 | 6416x | 83.388 us | 7.31% | 78.080 us | 2.65% | 214.872G | 1.327 TB/s | 65.09% |
| I32 | U32 | false | 2^28 = 268435456 | 0.544 | 1568x | 1.076 ms | 1.25% | 1.070 ms | 1.14% | 250.832G | 1.549 TB/s | 75.97% |
| I32 | U32 | false | 2^16 = 65536 | 0.000 | 52096x | 14.771 us | 54.02% | 9.598 us | 2.16% | 6.828G | 27.312 GB/s | 1.34% |
| I32 | U32 | false | 2^20 = 1048576 | 0.000 | 35856x | 19.272 us | 38.25% | 13.948 us | 1.40% | 75.177G | 300.709 GB/s | 14.75% |
| I32 | U32 | false | 2^24 = 16777216 | 0.000 | 8000x | 68.014 us | 8.96% | 62.565 us | 2.02% | 268.157G | 1.073 TB/s | 52.60% |
| I32 | U32 | false | 2^28 = 268435456 | 0.000 | 2752x | 654.500 us | 1.68% | 648.897 us | 1.44% | 413.680G | 1.655 TB/s | 81.15% |
| I32 | I64 | false | 2^16 = 65536 | 1.000 | 48816x | 15.498 us | 51.38% | 10.245 us | 1.59% | 6.397G | 51.178 GB/s | 2.51% |
| I32 | I64 | false | 2^20 = 1048576 | 1.000 | 32304x | 20.883 us | 34.97% | 15.483 us | 1.73% | 67.724G | 541.792 GB/s | 26.57% |
| I32 | I64 | false | 2^24 = 16777216 | 1.000 | 5376x | 98.567 us | 6.04% | 93.141 us | 1.55% | 180.128G | 1.441 TB/s | 70.67% |
| I32 | I64 | false | 2^28 = 268435456 | 1.000 | 1664x | 1.359 ms | 0.81% | 1.354 ms | 0.69% | 198.284G | 1.586 TB/s | 77.79% |
| I32 | I64 | false | 2^16 = 65536 | 0.544 | 49424x | 15.373 us | 52.02% | 10.119 us | 1.54% | 6.476G | 40.024 GB/s | 1.96% |
| I32 | I64 | false | 2^20 = 1048576 | 0.544 | 32960x | 20.617 us | 35.98% | 15.171 us | 1.79% | 69.116G | 426.896 GB/s | 20.94% |
| I32 | I64 | false | 2^24 = 16777216 | 0.544 | 6080x | 87.864 us | 6.88% | 82.308 us | 1.27% | 203.836G | 1.259 TB/s | 61.75% |
| I32 | I64 | false | 2^28 = 268435456 | 0.544 | 1936x | 1.139 ms | 0.83% | 1.133 ms | 0.66% | 236.868G | 1.463 TB/s | 71.74% |
| I32 | I64 | false | 2^16 = 65536 | 0.000 | 49392x | 15.581 us | 54.04% | 10.123 us | 2.50% | 6.474G | 25.896 GB/s | 1.27% |
| I32 | I64 | false | 2^20 = 1048576 | 0.000 | 35424x | 19.326 us | 36.97% | 14.119 us | 1.33% | 74.267G | 297.070 GB/s | 14.57% |
| I32 | I64 | false | 2^24 = 16777216 | 0.000 | 7232x | 74.833 us | 8.16% | 69.230 us | 0.96% | 242.340G | 969.361 GB/s | 47.54% |
| I32 | I64 | false | 2^28 = 268435456 | 0.000 | 2992x | 793.658 us | 1.09% | 788.107 us | 0.82% | 340.608G | 1.362 TB/s | 66.82% |
| I32 | U64 | false | 2^16 = 65536 | 1.000 | 47568x | 15.925 us | 51.62% | 10.512 us | 2.60% | 6.234G | 49.876 GB/s | 2.45% |
| I32 | U64 | false | 2^20 = 1048576 | 1.000 | 32720x | 20.548 us | 34.53% | 15.285 us | 1.97% | 68.602G | 548.816 GB/s | 26.92% |
| I32 | U64 | false | 2^24 = 16777216 | 1.000 | 5376x | 98.710 us | 6.21% | 93.131 us | 1.58% | 180.146G | 1.441 TB/s | 70.68% |
| I32 | U64 | false | 2^28 = 268435456 | 1.000 | 1616x | 1.359 ms | 0.81% | 1.354 ms | 0.69% | 198.284G | 1.586 TB/s | 77.80% |
| I32 | U64 | false | 2^16 = 65536 | 0.544 | 47760x | 15.967 us | 52.61% | 10.472 us | 2.08% | 6.258G | 38.678 GB/s | 1.90% |
| I32 | U64 | false | 2^20 = 1048576 | 0.544 | 33776x | 20.050 us | 35.51% | 14.807 us | 1.51% | 70.819G | 437.410 GB/s | 21.45% |
| I32 | U64 | false | 2^24 = 16777216 | 0.544 | 6080x | 87.895 us | 6.92% | 82.310 us | 1.25% | 203.829G | 1.259 TB/s | 61.74% |
| I32 | U64 | false | 2^28 = 268435456 | 0.544 | 2032x | 1.139 ms | 0.84% | 1.133 ms | 0.68% | 236.860G | 1.463 TB/s | 71.74% |
| I32 | U64 | false | 2^16 = 65536 | 0.000 | 51056x | 15.158 us | 54.88% | 9.795 us | 2.10% | 6.691G | 26.764 GB/s | 1.31% |
| I32 | U64 | false | 2^20 = 1048576 | 0.000 | 35856x | 19.087 us | 36.89% | 13.950 us | 1.37% | 75.169G | 300.675 GB/s | 14.75% |
| I32 | U64 | false | 2^24 = 16777216 | 0.000 | 7280x | 74.255 us | 8.06% | 68.756 us | 0.97% | 244.013G | 976.050 GB/s | 47.87% |
| I32 | U64 | false | 2^28 = 268435456 | 0.000 | 2992x | 794.224 us | 1.07% | 788.785 us | 0.81% | 340.315G | 1.361 TB/s | 66.76% |
| I64 | I32 | false | 2^16 = 65536 | 1.000 | 48608x | 15.574 us | 51.48% | 10.289 us | 1.85% | 6.369G | 101.910 GB/s | 5.00% |
| I64 | I32 | false | 2^20 = 1048576 | 1.000 | 27744x | 23.167 us | 28.57% | 18.028 us | 1.35% | 58.164G | 930.621 GB/s | 45.64% |
| I64 | I32 | false | 2^24 = 16777216 | 1.000 | 3024x | 171.087 us | 3.88% | 165.528 us | 1.92% | 101.356G | 1.622 TB/s | 79.53% |
| I64 | I32 | false | 2^28 = 268435456 | 1.000 | 384x | 2.564 ms | 0.57% | 2.559 ms | 0.53% | 104.917G | 1.679 TB/s | 82.33% |
| I64 | I32 | false | 2^16 = 65536 | 0.544 | 48976x | 15.552 us | 52.41% | 10.211 us | 2.13% | 6.418G | 79.327 GB/s | 3.89% |
| I64 | I32 | false | 2^20 = 1048576 | 0.544 | 28336x | 22.812 us | 29.31% | 17.651 us | 1.28% | 59.406G | 733.835 GB/s | 35.99% |
| I64 | I32 | false | 2^24 = 16777216 | 0.544 | 3632x | 143.685 us | 4.76% | 138.176 us | 2.58% | 121.419G | 1.500 TB/s | 73.56% |
| I64 | I32 | false | 2^28 = 268435456 | 0.544 | 672x | 2.031 ms | 0.81% | 2.025 ms | 0.76% | 132.534G | 1.637 TB/s | 80.29% |
| I64 | I32 | false | 2^16 = 65536 | 0.000 | 51536x | 15.050 us | 55.27% | 9.702 us | 2.03% | 6.755G | 54.038 GB/s | 2.65% |
| I64 | I32 | false | 2^20 = 1048576 | 0.000 | 30560x | 21.484 us | 31.37% | 16.361 us | 1.28% | 64.088G | 512.707 GB/s | 25.14% |
| I64 | I32 | false | 2^24 = 16777216 | 0.000 | 5152x | 102.755 us | 5.95% | 97.269 us | 1.83% | 172.483G | 1.380 TB/s | 67.67% |
| I64 | I32 | false | 2^28 = 268435456 | 0.000 | 2448x | 1.236 ms | 0.97% | 1.231 ms | 0.86% | 218.138G | 1.745 TB/s | 85.58% |
| I64 | U32 | false | 2^16 = 65536 | 1.000 | 48528x | 15.674 us | 52.24% | 10.303 us | 1.99% | 6.361G | 101.770 GB/s | 4.99% |
| I64 | U32 | false | 2^20 = 1048576 | 1.000 | 27792x | 23.156 us | 28.76% | 17.992 us | 1.32% | 58.281G | 932.500 GB/s | 45.73% |
| I64 | U32 | false | 2^24 = 16777216 | 1.000 | 3024x | 170.785 us | 3.75% | 165.426 us | 1.89% | 101.418G | 1.623 TB/s | 79.58% |
| I64 | U32 | false | 2^28 = 268435456 | 1.000 | 196x | 2.561 ms | 0.53% | 2.556 ms | 0.47% | 105.040G | 1.681 TB/s | 82.42% |
| I64 | U32 | false | 2^16 = 65536 | 0.544 | 49664x | 15.282 us | 51.85% | 10.070 us | 1.76% | 6.508G | 80.439 GB/s | 3.94% |
| I64 | U32 | false | 2^20 = 1048576 | 0.544 | 27392x | 23.630 us | 29.46% | 18.264 us | 1.32% | 57.413G | 709.218 GB/s | 34.78% |
| I64 | U32 | false | 2^24 = 16777216 | 0.544 | 3616x | 143.984 us | 4.72% | 138.492 us | 2.54% | 121.142G | 1.497 TB/s | 73.39% |
| I64 | U32 | false | 2^28 = 268435456 | 0.544 | 720x | 2.034 ms | 0.85% | 2.028 ms | 0.80% | 132.364G | 1.635 TB/s | 80.18% |
| I64 | U32 | false | 2^16 = 65536 | 0.000 | 50928x | 15.051 us | 53.45% | 9.818 us | 2.17% | 6.675G | 53.401 GB/s | 2.62% |
| I64 | U32 | false | 2^20 = 1048576 | 0.000 | 29808x | 22.193 us | 32.40% | 16.774 us | 1.55% | 62.511G | 500.084 GB/s | 24.53% |
| I64 | U32 | false | 2^24 = 16777216 | 0.000 | 5120x | 103.150 us | 5.89% | 97.699 us | 1.83% | 171.724G | 1.374 TB/s | 67.37% |
| I64 | U32 | false | 2^28 = 268435456 | 0.000 | 2384x | 1.237 ms | 1.02% | 1.232 ms | 0.91% | 217.964G | 1.744 TB/s | 85.52% |
| I64 | I64 | false | 2^16 = 65536 | 1.000 | 46800x | 16.013 us | 49.97% | 10.687 us | 2.00% | 6.132G | 98.117 GB/s | 4.81% |
| I64 | I64 | false | 2^20 = 1048576 | 1.000 | 25552x | 24.987 us | 27.85% | 19.573 us | 2.88% | 53.573G | 857.173 GB/s | 42.04% |
| I64 | I64 | false | 2^24 = 16777216 | 1.000 | 3008x | 171.918 us | 3.57% | 166.465 us | 1.39% | 100.785G | 1.613 TB/s | 79.08% |
| I64 | I64 | false | 2^28 = 268435456 | 1.000 | 197x | 2.553 ms | 0.46% | 2.547 ms | 0.39% | 105.398G | 1.686 TB/s | 82.70% |
| I64 | I64 | false | 2^16 = 65536 | 0.544 | 47232x | 15.885 us | 50.18% | 10.587 us | 1.80% | 6.190G | 76.513 GB/s | 3.75% |
| I64 | I64 | false | 2^20 = 1048576 | 0.544 | 26240x | 24.464 us | 28.50% | 19.058 us | 2.17% | 55.020G | 679.667 GB/s | 33.33% |
| I64 | I64 | false | 2^24 = 16777216 | 0.544 | 3584x | 145.394 us | 4.20% | 139.905 us | 1.48% | 119.918G | 1.481 TB/s | 72.65% |
| I64 | I64 | false | 2^28 = 268435456 | 0.544 | 344x | 2.079 ms | 0.57% | 2.074 ms | 0.50% | 129.459G | 1.599 TB/s | 78.42% |
| I64 | I64 | false | 2^16 = 65536 | 0.000 | 50784x | 15.121 us | 53.67% | 9.846 us | 2.23% | 6.656G | 53.249 GB/s | 2.61% |
| I64 | I64 | false | 2^20 = 1048576 | 0.000 | 28384x | 23.085 us | 31.14% | 17.616 us | 1.90% | 59.522G | 476.180 GB/s | 23.35% |
| I64 | I64 | false | 2^24 = 16777216 | 0.000 | 4800x | 109.901 us | 5.28% | 104.476 us | 0.93% | 160.584G | 1.285 TB/s | 63.00% |
| I64 | I64 | false | 2^28 = 268435456 | 0.000 | 2928x | 1.337 ms | 0.78% | 1.332 ms | 0.66% | 201.580G | 1.613 TB/s | 79.09% |
| I64 | U64 | false | 2^16 = 65536 | 1.000 | 47536x | 15.822 us | 50.52% | 10.521 us | 2.03% | 6.229G | 99.666 GB/s | 4.89% |
| I64 | U64 | false | 2^20 = 1048576 | 1.000 | 25264x | 25.228 us | 29.28% | 19.801 us | 2.98% | 52.956G | 847.291 GB/s | 41.55% |
| I64 | U64 | false | 2^24 = 16777216 | 1.000 | 3024x | 171.542 us | 3.50% | 166.210 us | 1.38% | 100.940G | 1.615 TB/s | 79.21% |
| I64 | U64 | false | 2^28 = 268435456 | 1.000 | 197x | 2.552 ms | 0.48% | 2.547 ms | 0.42% | 105.413G | 1.687 TB/s | 82.72% |
| I64 | U64 | false | 2^16 = 65536 | 0.544 | 48928x | 15.450 us | 51.24% | 10.222 us | 2.19% | 6.411G | 79.242 GB/s | 3.89% |
| I64 | U64 | false | 2^20 = 1048576 | 0.544 | 26464x | 24.237 us | 28.38% | 18.897 us | 2.14% | 55.490G | 685.461 GB/s | 33.62% |
| I64 | U64 | false | 2^24 = 16777216 | 0.544 | 3584x | 145.235 us | 4.25% | 139.701 us | 1.53% | 120.094G | 1.484 TB/s | 72.76% |
| I64 | U64 | false | 2^28 = 268435456 | 0.544 | 328x | 2.078 ms | 0.57% | 2.072 ms | 0.50% | 129.523G | 1.600 TB/s | 78.46% |
| I64 | U64 | false | 2^16 = 65536 | 0.000 | 50288x | 15.310 us | 54.04% | 9.945 us | 1.68% | 6.590G | 52.717 GB/s | 2.59% |
| I64 | U64 | false | 2^20 = 1048576 | 0.000 | 29200x | 22.322 us | 30.43% | 17.128 us | 1.86% | 61.218G | 489.747 GB/s | 24.02% |
| I64 | U64 | false | 2^24 = 16777216 | 0.000 | 4800x | 109.786 us | 5.37% | 104.278 us | 0.92% | 160.889G | 1.287 TB/s | 63.12% |
| I64 | U64 | false | 2^28 = 268435456 | 0.000 | 2960x | 1.337 ms | 0.79% | 1.331 ms | 0.67% | 201.639G | 1.613 TB/s | 79.11% |
| I128 | I32 | false | 2^16 = 65536 | 1.000 | 43440x | 16.920 us | 47.06% | 11.512 us | 1.83% | 5.693G | 182.166 GB/s | 8.93% |
| I128 | I32 | false | 2^20 = 1048576 | 1.000 | 16496x | 35.613 us | 17.58% | 30.330 us | 2.08% | 34.572G | 1.106 TB/s | 54.26% |
| I128 | I32 | false | 2^24 = 16777216 | 1.000 | 1472x | 348.191 us | 2.11% | 342.639 us | 1.35% | 48.965G | 1.567 TB/s | 76.84% |
| I128 | I32 | false | 2^28 = 268435456 | 1.000 | 94x | 5.359 ms | 0.44% | 5.353 ms | 0.42% | 50.147G | 1.605 TB/s | 78.70% |
| I128 | I32 | false | 2^16 = 65536 | 0.544 | 44016x | 16.752 us | 47.59% | 11.360 us | 2.05% | 5.769G | 142.605 GB/s | 6.99% |
| I128 | I32 | false | 2^20 = 1048576 | 0.544 | 17728x | 33.487 us | 18.91% | 28.207 us | 2.48% | 37.174G | 918.418 GB/s | 45.04% |
| I128 | I32 | false | 2^24 = 16777216 | 0.544 | 1792x | 285.170 us | 2.46% | 279.679 us | 1.46% | 59.987G | 1.482 TB/s | 72.69% |
| I128 | I32 | false | 2^28 = 268435456 | 0.544 | 117x | 4.280 ms | 0.41% | 4.274 ms | 0.38% | 62.802G | 1.551 TB/s | 76.09% |
| I128 | I32 | false | 2^16 = 65536 | 0.000 | 47808x | 15.787 us | 51.04% | 10.460 us | 2.09% | 6.266G | 100.250 GB/s | 4.92% |
| I128 | I32 | false | 2^20 = 1048576 | 0.000 | 19360x | 31.023 us | 20.25% | 25.833 us | 2.31% | 40.590G | 649.445 GB/s | 31.85% |
| I128 | I32 | false | 2^24 = 16777216 | 0.000 | 2736x | 188.662 us | 3.15% | 183.330 us | 1.17% | 91.514G | 1.464 TB/s | 71.81% |
| I128 | I32 | false | 2^28 = 268435456 | 0.000 | 199x | 2.520 ms | 0.36% | 2.514 ms | 0.28% | 106.783G | 1.709 TB/s | 83.79% |
| I128 | U32 | false | 2^16 = 65536 | 1.000 | 45088x | 16.304 us | 47.11% | 11.092 us | 1.97% | 5.909G | 189.073 GB/s | 9.27% |
| I128 | U32 | false | 2^20 = 1048576 | 1.000 | 16512x | 35.624 us | 17.82% | 30.286 us | 2.41% | 34.623G | 1.108 TB/s | 54.34% |
| I128 | U32 | false | 2^24 = 16777216 | 1.000 | 1456x | 349.691 us | 2.03% | 344.316 us | 1.30% | 48.726G | 1.559 TB/s | 76.47% |
| I128 | U32 | false | 2^28 = 268435456 | 1.000 | 93x | 5.384 ms | 0.39% | 5.378 ms | 0.37% | 49.910G | 1.597 TB/s | 78.33% |
| I128 | U32 | false | 2^16 = 65536 | 0.544 | 45824x | 16.052 us | 47.21% | 10.912 us | 2.09% | 6.006G | 148.465 GB/s | 7.28% |
| I128 | U32 | false | 2^20 = 1048576 | 0.544 | 17552x | 33.954 us | 19.33% | 28.508 us | 2.80% | 36.782G | 908.725 GB/s | 44.57% |
| I128 | U32 | false | 2^24 = 16777216 | 0.544 | 1792x | 286.476 us | 2.41% | 281.121 us | 1.47% | 59.680G | 1.474 TB/s | 72.31% |
| I128 | U32 | false | 2^28 = 268435456 | 0.544 | 117x | 4.308 ms | 0.46% | 4.302 ms | 0.43% | 62.400G | 1.542 TB/s | 75.60% |
| I128 | U32 | false | 2^16 = 65536 | 0.000 | 49312x | 15.355 us | 51.51% | 10.143 us | 2.12% | 6.461G | 103.384 GB/s | 5.07% |
| I128 | U32 | false | 2^20 = 1048576 | 0.000 | 19296x | 31.383 us | 21.21% | 25.930 us | 2.46% | 40.438G | 647.008 GB/s | 31.73% |
| I128 | U32 | false | 2^24 = 16777216 | 0.000 | 2720x | 189.497 us | 3.19% | 184.049 us | 1.19% | 91.156G | 1.458 TB/s | 71.53% |
| I128 | U32 | false | 2^28 = 268435456 | 0.000 | 198x | 2.534 ms | 0.35% | 2.528 ms | 0.27% | 106.181G | 1.699 TB/s | 83.32% |
| I128 | I64 | false | 2^16 = 65536 | 1.000 | 43632x | 16.843 us | 47.06% | 11.463 us | 1.69% | 5.717G | 182.958 GB/s | 8.97% |
| I128 | I64 | false | 2^20 = 1048576 | 1.000 | 16592x | 35.379 us | 17.50% | 30.141 us | 1.72% | 34.789G | 1.113 TB/s | 54.60% |
| I128 | I64 | false | 2^24 = 16777216 | 1.000 | 1504x | 340.379 us | 1.96% | 334.873 us | 1.06% | 50.100G | 1.603 TB/s | 78.63% |
| I128 | I64 | false | 2^28 = 268435456 | 1.000 | 96x | 5.226 ms | 0.31% | 5.220 ms | 0.29% | 51.424G | 1.646 TB/s | 80.70% |
| I128 | I64 | false | 2^16 = 65536 | 0.544 | 44512x | 16.612 us | 47.98% | 11.234 us | 1.50% | 5.834G | 144.210 GB/s | 7.07% |
| I128 | I64 | false | 2^20 = 1048576 | 0.544 | 17776x | 33.288 us | 18.39% | 28.142 us | 1.79% | 37.260G | 920.532 GB/s | 45.15% |
| I128 | I64 | false | 2^24 = 16777216 | 0.544 | 1824x | 281.167 us | 2.33% | 275.627 us | 1.17% | 60.869G | 1.504 TB/s | 73.75% |
| I128 | I64 | false | 2^28 = 268435456 | 0.544 | 119x | 4.235 ms | 0.33% | 4.229 ms | 0.30% | 63.468G | 1.568 TB/s | 76.89% |
| I128 | I64 | false | 2^16 = 65536 | 0.000 | 47712x | 15.807 us | 50.92% | 10.480 us | 1.77% | 6.253G | 100.053 GB/s | 4.91% |
| I128 | I64 | false | 2^20 = 1048576 | 0.000 | 19440x | 31.013 us | 20.61% | 25.738 us | 1.91% | 40.741G | 651.848 GB/s | 31.97% |
| I128 | I64 | false | 2^24 = 16777216 | 0.000 | 2720x | 189.881 us | 3.13% | 184.332 us | 0.84% | 91.016G | 1.456 TB/s | 71.42% |
| I128 | I64 | false | 2^28 = 268435456 | 0.000 | 198x | 2.539 ms | 0.31% | 2.534 ms | 0.20% | 105.951G | 1.695 TB/s | 83.14% |
| I128 | U64 | false | 2^16 = 65536 | 1.000 | 42848x | 17.160 us | 47.15% | 11.671 us | 2.16% | 5.615G | 179.688 GB/s | 8.81% |
| I128 | U64 | false | 2^20 = 1048576 | 1.000 | 16432x | 35.763 us | 17.54% | 30.457 us | 1.66% | 34.428G | 1.102 TB/s | 54.03% |
| I128 | U64 | false | 2^24 = 16777216 | 1.000 | 1504x | 341.267 us | 2.00% | 335.663 us | 1.09% | 49.982G | 1.599 TB/s | 78.44% |
| I128 | U64 | false | 2^28 = 268435456 | 1.000 | 96x | 5.223 ms | 0.31% | 5.216 ms | 0.28% | 51.460G | 1.647 TB/s | 80.76% |
| I128 | U64 | false | 2^16 = 65536 | 0.544 | 43424x | 16.956 us | 47.36% | 11.515 us | 1.71% | 5.691G | 140.685 GB/s | 6.90% |
| I128 | U64 | false | 2^20 = 1048576 | 0.544 | 17680x | 33.643 us | 18.99% | 28.302 us | 1.81% | 37.049G | 915.328 GB/s | 44.89% |
| I128 | U64 | false | 2^24 = 16777216 | 0.544 | 1824x | 281.700 us | 2.36% | 276.116 us | 1.21% | 60.761G | 1.501 TB/s | 73.62% |
| I128 | U64 | false | 2^28 = 268435456 | 0.544 | 119x | 4.236 ms | 0.37% | 4.230 ms | 0.34% | 63.466G | 1.568 TB/s | 76.89% |
| I128 | U64 | false | 2^16 = 65536 | 0.000 | 45872x | 16.342 us | 51.79% | 10.901 us | 2.07% | 6.012G | 96.191 GB/s | 4.72% |
| I128 | U64 | false | 2^20 = 1048576 | 0.000 | 19600x | 30.792 us | 20.80% | 25.518 us | 2.08% | 41.092G | 657.465 GB/s | 32.24% |
| I128 | U64 | false | 2^24 = 16777216 | 0.000 | 2720x | 189.965 us | 3.14% | 184.395 us | 0.83% | 90.985G | 1.456 TB/s | 71.39% |
| I128 | U64 | false | 2^28 = 268435456 | 0.000 | 198x | 2.540 ms | 0.31% | 2.535 ms | 0.21% | 105.909G | 1.695 TB/s | 83.11% |
| F32 | I32 | false | 2^16 = 65536 | 1.000 | 47952x | 15.883 us | 52.46% | 10.428 us | 1.80% | 6.285G | 50.279 GB/s | 2.47% |
| F32 | I32 | false | 2^20 = 1048576 | 1.000 | 33216x | 20.367 us | 35.38% | 15.055 us | 1.63% | 69.651G | 557.210 GB/s | 27.33% |
| F32 | I32 | false | 2^24 = 16777216 | 1.000 | 5536x | 96.034 us | 6.76% | 90.431 us | 2.66% | 185.526G | 1.484 TB/s | 72.79% |
| F32 | I32 | false | 2^28 = 268435456 | 1.000 | 1488x | 1.371 ms | 1.07% | 1.365 ms | 0.99% | 196.663G | 1.573 TB/s | 77.16% |
| F32 | I32 | false | 2^16 = 65536 | 0.544 | 49360x | 15.623 us | 54.31% | 10.132 us | 2.49% | 6.468G | 28.190 GB/s | 1.38% |
| F32 | I32 | false | 2^20 = 1048576 | 0.544 | 35456x | 19.352 us | 37.30% | 14.103 us | 1.37% | 74.349G | 323.633 GB/s | 15.87% |
| F32 | I32 | false | 2^24 = 16777216 | 0.544 | 7808x | 69.533 us | 8.67% | 64.156 us | 2.15% | 261.507G | 1.138 TB/s | 55.82% |
| F32 | I32 | false | 2^28 = 268435456 | 0.544 | 2208x | 777.911 us | 1.27% | 772.320 us | 1.04% | 347.570G | 1.513 TB/s | 74.18% |
| F32 | I32 | false | 2^16 = 65536 | 0.000 | 51904x | 14.863 us | 54.36% | 9.636 us | 2.06% | 6.802G | 27.206 GB/s | 1.33% |
| F32 | I32 | false | 2^20 = 1048576 | 0.000 | 35472x | 19.518 us | 38.54% | 14.098 us | 1.85% | 74.377G | 297.509 GB/s | 14.59% |
| F32 | I32 | false | 2^24 = 16777216 | 0.000 | 8048x | 67.623 us | 10.02% | 62.238 us | 2.04% | 269.565G | 1.078 TB/s | 52.88% |
| F32 | I32 | false | 2^28 = 268435456 | 0.000 | 2768x | 654.614 us | 1.67% | 649.088 us | 1.43% | 413.558G | 1.654 TB/s | 81.13% |
| F32 | U32 | false | 2^16 = 65536 | 1.000 | 50864x | 15.061 us | 53.34% | 9.831 us | 1.98% | 6.666G | 53.332 GB/s | 2.62% |
| F32 | U32 | false | 2^20 = 1048576 | 1.000 | 33744x | 20.093 us | 35.67% | 14.821 us | 1.34% | 70.747G | 565.977 GB/s | 27.76% |
| F32 | U32 | false | 2^24 = 16777216 | 1.000 | 5552x | 95.477 us | 6.50% | 90.143 us | 2.66% | 186.117G | 1.489 TB/s | 73.02% |
| F32 | U32 | false | 2^28 = 268435456 | 1.000 | 1472x | 1.368 ms | 1.07% | 1.363 ms | 0.99% | 196.990G | 1.576 TB/s | 77.29% |
| F32 | U32 | false | 2^16 = 65536 | 0.544 | 52800x | 14.620 us | 54.51% | 9.471 us | 1.94% | 6.919G | 30.155 GB/s | 1.48% |
| F32 | U32 | false | 2^20 = 1048576 | 0.544 | 35744x | 19.251 us | 37.67% | 13.992 us | 1.78% | 74.940G | 326.203 GB/s | 16.00% |
| F32 | U32 | false | 2^24 = 16777216 | 0.544 | 7824x | 69.221 us | 8.60% | 63.913 us | 2.16% | 262.503G | 1.142 TB/s | 56.03% |
| F32 | U32 | false | 2^28 = 268435456 | 0.544 | 2176x | 776.628 us | 1.25% | 771.119 us | 1.03% | 348.111G | 1.515 TB/s | 74.30% |
| F32 | U32 | false | 2^16 = 65536 | 0.000 | 52368x | 14.711 us | 54.18% | 9.548 us | 2.08% | 6.864G | 27.455 GB/s | 1.35% |
| F32 | U32 | false | 2^20 = 1048576 | 0.000 | 35456x | 19.368 us | 37.41% | 14.103 us | 1.62% | 74.350G | 297.400 GB/s | 14.59% |
| F32 | U32 | false | 2^24 = 16777216 | 0.000 | 8048x | 67.550 us | 8.80% | 62.239 us | 2.05% | 269.559G | 1.078 TB/s | 52.88% |
| F32 | U32 | false | 2^28 = 268435456 | 0.000 | 2768x | 654.745 us | 1.69% | 649.220 us | 1.44% | 413.474G | 1.654 TB/s | 81.11% |
| F32 | I64 | false | 2^16 = 65536 | 1.000 | 50224x | 15.191 us | 52.69% | 9.957 us | 2.02% | 6.582G | 52.657 GB/s | 2.58% |
| F32 | I64 | false | 2^20 = 1048576 | 1.000 | 32688x | 20.626 us | 34.89% | 15.301 us | 1.59% | 68.529G | 548.229 GB/s | 26.89% |
| F32 | I64 | false | 2^24 = 16777216 | 1.000 | 5392x | 98.184 us | 5.98% | 92.832 us | 1.56% | 180.728G | 1.446 TB/s | 70.91% |
| F32 | I64 | false | 2^28 = 268435456 | 1.000 | 1632x | 1.376 ms | 0.78% | 1.370 ms | 0.66% | 195.936G | 1.567 TB/s | 76.87% |
| F32 | I64 | false | 2^16 = 65536 | 0.544 | 52528x | 14.747 us | 55.05% | 9.519 us | 2.07% | 6.885G | 30.005 GB/s | 1.47% |
| F32 | I64 | false | 2^20 = 1048576 | 0.544 | 35104x | 19.596 us | 37.63% | 14.245 us | 1.48% | 73.608G | 320.408 GB/s | 15.71% |
| F32 | I64 | false | 2^24 = 16777216 | 0.544 | 7104x | 75.764 us | 7.68% | 70.411 us | 0.97% | 238.275G | 1.037 TB/s | 50.86% |
| F32 | I64 | false | 2^28 = 268435456 | 0.544 | 2896x | 868.595 us | 0.92% | 863.031 us | 0.66% | 311.038G | 1.354 TB/s | 66.39% |
| F32 | I64 | false | 2^16 = 65536 | 0.000 | 51616x | 14.897 us | 53.92% | 9.689 us | 2.75% | 6.764G | 27.058 GB/s | 1.33% |
| F32 | I64 | false | 2^20 = 1048576 | 0.000 | 34992x | 19.643 us | 37.50% | 14.294 us | 1.40% | 73.360G | 293.441 GB/s | 14.39% |
| F32 | I64 | false | 2^24 = 16777216 | 0.000 | 7280x | 74.319 us | 8.05% | 68.824 us | 0.95% | 243.771G | 975.083 GB/s | 47.82% |
| F32 | I64 | false | 2^28 = 268435456 | 0.000 | 2992x | 793.377 us | 1.10% | 787.897 us | 0.83% | 340.699G | 1.363 TB/s | 66.84% |
| F32 | U64 | false | 2^16 = 65536 | 1.000 | 48656x | 15.659 us | 52.42% | 10.280 us | 1.85% | 6.375G | 51.004 GB/s | 2.50% | | F32 | U64 | false | 2^20 = 1048576 | 1.000 | 33248x | 20.191 us | 34.29% | 15.044 us | 1.48% | 69.700G | 557.600 GB/s | 27.35% | | F32 | U64 | false | 2^24 = 16777216 | 1.000 | 5408x | 98.215 us | 6.16% | 92.708 us | 1.58% | 180.968G | 1.448 TB/s | 71.00% | | F32 | U64 | false | 2^28 = 268435456 | 1.000 | 1552x | 1.375 ms | 0.79% | 1.370 ms | 0.68% | 195.993G | 1.568 TB/s | 76.90% | | F32 | U64 | false | 2^16 = 65536 | 0.544 | 51264x | 15.116 us | 55.08% | 9.755 us | 2.08% | 6.718G | 29.281 GB/s | 1.44% | | F32 | U64 | false | 2^20 = 1048576 | 0.544 | 35600x | 19.228 us | 36.97% | 14.048 us | 1.49% | 74.644G | 324.917 GB/s | 15.93% | | F32 | U64 | false | 2^24 = 16777216 | 0.544 | 7120x | 75.884 us | 7.94% | 70.345 us | 0.96% | 238.499G | 1.038 TB/s | 50.91% | | F32 | U64 | false | 2^28 = 268435456 | 0.544 | 2896x | 868.463 us | 0.92% | 862.962 us | 0.66% | 311.063G | 1.354 TB/s | 66.39% | | F32 | U64 | false | 2^16 = 65536 | 0.000 | 51104x | 15.112 us | 54.58% | 9.787 us | 2.82% | 6.696G | 26.786 GB/s | 1.31% | | F32 | U64 | false | 2^20 = 1048576 | 0.000 | 35536x | 19.258 us | 36.95% | 14.073 us | 1.42% | 74.509G | 298.037 GB/s | 14.62% | | F32 | U64 | false | 2^24 = 16777216 | 0.000 | 7280x | 74.261 us | 8.06% | 68.765 us | 0.96% | 243.980G | 975.920 GB/s | 47.86% | | F32 | U64 | false | 2^28 = 268435456 | 0.000 | 2976x | 793.254 us | 1.23% | 787.756 us | 0.82% | 340.760G | 1.363 TB/s | 66.85% | | F64 | I32 | false | 2^16 = 65536 | 1.000 | 49040x | 15.553 us | 52.65% | 10.197 us | 1.85% | 6.427G | 102.834 GB/s | 5.04% | | F64 | I32 | false | 2^20 = 1048576 | 1.000 | 27264x | 23.660 us | 29.06% | 18.346 us | 1.37% | 57.156G | 914.489 GB/s | 44.85% | | F64 | I32 | false | 2^24 = 16777216 | 1.000 | 3024x | 170.989 us | 3.88% | 165.419 us | 1.90% | 101.422G | 1.623 TB/s | 79.58% | | F64 | I32 | false | 2^28 = 268435456 | 1.000 | 544x | 
2.562 ms | 0.57% | 2.557 ms | 0.52% | 104.995G | 1.680 TB/s | 82.39% | | F64 | I32 | false | 2^16 = 65536 | 0.544 | 49248x | 15.624 us | 54.02% | 10.153 us | 2.18% | 6.455G | 56.263 GB/s | 2.76% | | F64 | I32 | false | 2^20 = 1048576 | 0.544 | 29584x | 22.160 us | 31.17% | 16.904 us | 1.17% | 62.033G | 540.041 GB/s | 26.49% | | F64 | I32 | false | 2^24 = 16777216 | 0.544 | 4656x | 113.189 us | 5.73% | 107.603 us | 2.41% | 155.918G | 1.357 TB/s | 66.56% | | F64 | I32 | false | 2^28 = 268435456 | 0.544 | 1712x | 1.471 ms | 0.92% | 1.465 ms | 0.83% | 183.211G | 1.595 TB/s | 78.21% | | F64 | I32 | false | 2^16 = 65536 | 0.000 | 50128x | 15.448 us | 54.96% | 9.976 us | 1.61% | 6.570G | 52.557 GB/s | 2.58% | | F64 | I32 | false | 2^20 = 1048576 | 0.000 | 30592x | 21.597 us | 32.20% | 16.348 us | 1.49% | 64.143G | 513.141 GB/s | 25.17% | | F64 | I32 | false | 2^24 = 16777216 | 0.000 | 5136x | 103.109 us | 6.01% | 97.513 us | 1.79% | 172.050G | 1.376 TB/s | 67.50% | | F64 | I32 | false | 2^28 = 268435456 | 0.000 | 2496x | 1.234 ms | 0.99% | 1.228 ms | 0.88% | 218.610G | 1.749 TB/s | 85.77% | | F64 | U32 | false | 2^16 = 65536 | 1.000 | 48144x | 15.809 us | 52.31% | 10.387 us | 2.39% | 6.309G | 100.948 GB/s | 4.95% | | F64 | U32 | false | 2^20 = 1048576 | 1.000 | 27568x | 23.406 us | 29.11% | 18.145 us | 1.81% | 57.789G | 924.620 GB/s | 45.35% | | F64 | U32 | false | 2^24 = 16777216 | 1.000 | 3024x | 170.820 us | 3.85% | 165.357 us | 1.95% | 101.460G | 1.623 TB/s | 79.61% | | F64 | U32 | false | 2^28 = 268435456 | 1.000 | 576x | 2.558 ms | 0.57% | 2.552 ms | 0.52% | 105.166G | 1.683 TB/s | 82.52% | | F64 | U32 | false | 2^16 = 65536 | 0.544 | 51600x | 14.976 us | 54.69% | 9.691 us | 2.83% | 6.763G | 58.945 GB/s | 2.89% | | F64 | U32 | false | 2^20 = 1048576 | 0.544 | 29248x | 22.510 us | 31.76% | 17.098 us | 1.27% | 61.327G | 533.893 GB/s | 26.18% | | F64 | U32 | false | 2^24 = 16777216 | 0.544 | 4672x | 112.577 us | 5.62% | 107.126 us | 2.38% | 156.611G | 1.363 TB/s | 
66.86% | | F64 | U32 | false | 2^28 = 268435456 | 0.544 | 1856x | 1.466 ms | 0.91% | 1.460 ms | 0.82% | 183.802G | 1.600 TB/s | 78.46% | | F64 | U32 | false | 2^16 = 65536 | 0.000 | 51920x | 14.900 us | 54.81% | 9.633 us | 1.64% | 6.804G | 54.429 GB/s | 2.67% | | F64 | U32 | false | 2^20 = 1048576 | 0.000 | 29872x | 22.167 us | 32.43% | 16.747 us | 1.70% | 62.613G | 500.900 GB/s | 24.57% | | F64 | U32 | false | 2^24 = 16777216 | 0.000 | 5136x | 102.955 us | 5.89% | 97.519 us | 1.85% | 172.041G | 1.376 TB/s | 67.50% | | F64 | U32 | false | 2^28 = 268435456 | 0.000 | 2512x | 1.234 ms | 0.99% | 1.228 ms | 0.87% | 218.513G | 1.748 TB/s | 85.73% | | F64 | I64 | false | 2^16 = 65536 | 1.000 | 48032x | 15.692 us | 50.84% | 10.413 us | 2.88% | 6.294G | 100.702 GB/s | 4.94% | | F64 | I64 | false | 2^20 = 1048576 | 1.000 | 25568x | 24.966 us | 27.79% | 19.566 us | 3.01% | 53.591G | 857.457 GB/s | 42.05% | | F64 | I64 | false | 2^24 = 16777216 | 1.000 | 3024x | 171.446 us | 3.58% | 165.995 us | 1.42% | 101.071G | 1.617 TB/s | 79.31% | | F64 | I64 | false | 2^28 = 268435456 | 1.000 | 197x | 2.545 ms | 0.47% | 2.539 ms | 0.41% | 105.734G | 1.692 TB/s | 82.97% | | F64 | I64 | false | 2^16 = 65536 | 0.544 | 50048x | 15.298 us | 53.25% | 9.992 us | 2.85% | 6.559G | 57.168 GB/s | 2.80% | | F64 | I64 | false | 2^20 = 1048576 | 0.544 | 28256x | 23.157 us | 30.98% | 17.697 us | 1.98% | 59.253G | 515.842 GB/s | 25.30% | | F64 | I64 | false | 2^24 = 16777216 | 0.544 | 4560x | 115.618 us | 5.20% | 110.023 us | 1.07% | 152.488G | 1.327 TB/s | 65.10% | | F64 | I64 | false | 2^28 = 268435456 | 0.544 | 2208x | 1.546 ms | 0.70% | 1.541 ms | 0.59% | 174.245G | 1.517 TB/s | 74.38% | | F64 | I64 | false | 2^16 = 65536 | 0.000 | 49184x | 15.646 us | 54.00% | 10.167 us | 2.48% | 6.446G | 51.567 GB/s | 2.53% | | F64 | I64 | false | 2^20 = 1048576 | 0.000 | 28944x | 22.590 us | 30.83% | 17.280 us | 1.76% | 60.682G | 485.460 GB/s | 23.81% | | F64 | I64 | false | 2^24 = 16777216 | 0.000 | 4864x | 
108.662 us | 5.53% | 103.069 us | 0.97% | 162.777G | 1.302 TB/s | 63.86% | | F64 | I64 | false | 2^28 = 268435456 | 0.000 | 2864x | 1.305 ms | 0.84% | 1.300 ms | 0.72% | 206.521G | 1.652 TB/s | 81.03% | | F64 | U64 | false | 2^16 = 65536 | 1.000 | 46656x | 16.209 us | 51.31% | 10.718 us | 1.95% | 6.114G | 97.831 GB/s | 4.80% | | F64 | U64 | false | 2^20 = 1048576 | 1.000 | 25920x | 24.553 us | 27.46% | 19.298 us | 3.16% | 54.337G | 869.386 GB/s | 42.64% | | F64 | U64 | false | 2^24 = 16777216 | 1.000 | 3024x | 171.755 us | 4.47% | 166.001 us | 1.37% | 101.067G | 1.617 TB/s | 79.31% | | F64 | U64 | false | 2^28 = 268435456 | 1.000 | 197x | 2.546 ms | 0.50% | 2.540 ms | 0.44% | 105.670G | 1.691 TB/s | 82.92% | | F64 | U64 | false | 2^16 = 65536 | 0.544 | 49088x | 15.658 us | 53.81% | 10.189 us | 2.93% | 6.432G | 56.065 GB/s | 2.75% | | F64 | U64 | false | 2^20 = 1048576 | 0.544 | 28544x | 22.826 us | 30.35% | 17.524 us | 1.86% | 59.835G | 520.912 GB/s | 25.55% | | F64 | U64 | false | 2^24 = 16777216 | 0.544 | 4560x | 115.524 us | 5.22% | 109.922 us | 1.08% | 152.629G | 1.329 TB/s | 65.16% | | F64 | U64 | false | 2^28 = 268435456 | 0.544 | 2240x | 1.546 ms | 0.69% | 1.541 ms | 0.59% | 174.241G | 1.517 TB/s | 74.38% | | F64 | U64 | false | 2^16 = 65536 | 0.000 | 50912x | 15.179 us | 54.64% | 9.824 us | 2.12% | 6.671G | 53.371 GB/s | 2.62% | | F64 | U64 | false | 2^20 = 1048576 | 0.000 | 29312x | 22.198 us | 30.25% | 17.060 us | 2.07% | 61.463G | 491.705 GB/s | 24.11% | | F64 | U64 | false | 2^24 = 16777216 | 0.000 | 4880x | 107.820 us | 5.27% | 102.514 us | 0.95% | 163.657G | 1.309 TB/s | 64.21% | | F64 | U64 | false | 2^28 = 268435456 | 0.000 | 2848x | 1.305 ms | 0.87% | 1.300 ms | 0.76% | 206.561G | 1.652 TB/s | 81.04% |
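
In the relative comparison that follows, many rows are marked FAIL even though Cmp Time is lower than Ref Time, so FAIL there indicates a difference outside the noise band, not necessarily a regression. The snippet below is a minimal sketch of how the Diff, %Diff and Status columns could be derived from the two timings. It assumes (this is not stated in the issue) that Status is PASS only when the absolute %Diff does not exceed the smaller of the two noise values, which is consistent with the rows shown here. The function name `compare` and its parameters are hypothetical.

```python
def compare(ref_time_us, ref_noise_pct, cmp_time_us, cmp_noise_pct):
    """Return (diff_us, pct_diff, status) for one benchmark configuration.

    Assumption: a row passes when |%Diff| is at or below the smaller of
    the reference and comparison noise percentages.
    """
    diff_us = cmp_time_us - ref_time_us        # negative means cmp is faster
    pct_diff = 100.0 * diff_us / ref_time_us   # relative to the reference run
    threshold = min(ref_noise_pct, cmp_noise_pct)
    status = "PASS" if abs(pct_diff) <= threshold else "FAIL"
    return diff_us, pct_diff, status


# Row: T = I8, OffsetT = I64, 2^20 elements, entropy 1
diff_us, pct_diff, status = compare(13.216, 1.50, 12.202, 1.58)
print(f"{diff_us:.3f} us, {pct_diff:.2f}%, {status}")
```

Because the table entries are rounded, values recomputed from them can differ from the printed Diff in the last digit.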
Performance numbers relative to baseline (original main)

| T{ct} | OffsetT{ct} | IsInPlace{ct} | Elements{io} | Entropy | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---------|---------------|-----------------|----------------|-----------|------------|-------------|------------|-------------|-------------|---------|----------|
| I8 | I32 | false | 2^16 | 1 | 9.701 us | 2.06% | 9.761 us | 2.25% | 0.061 us | 0.62% | PASS |
| I8 | I32 | false | 2^20 | 1 | 12.331 us | 1.86% | 12.308 us | 1.90% | -0.024 us | -0.19% | PASS |
| I8 | I32 | false | 2^24 | 1 | 51.762 us | 1.28% | 51.803 us | 1.28% | 0.041 us | 0.08% | PASS |
| I8 | I32 | false | 2^28 | 1 | 697.274 us | 0.50% | 697.287 us | 0.50% | 0.013 us | 0.00% | PASS |
| I8 | I32 | false | 2^16 | 0.544 | 9.852 us | 2.38% | 9.809 us | 2.29% | -0.043 us | -0.44% | PASS |
| I8 | I32 | false | 2^20 | 0.544 | 11.947 us | 1.91% | 11.902 us | 2.06% | -0.046 us | -0.38% | PASS |
| I8 | I32 | false | 2^24 | 0.544 | 48.573 us | 1.16% | 48.468 us | 1.10% | -0.105 us | -0.22% | PASS |
| I8 | I32 | false | 2^28 | 0.544 | 642.183 us | 0.48% | 642.356 us | 0.50% | 0.173 us | 0.03% | PASS |
| I8 | I32 | false | 2^16 | 0 | 9.760 us | 2.53% | 9.599 us | 2.23% | -0.161 us | -1.65% | PASS |
| I8 | I32 | false | 2^20 | 0 | 11.512 us | 2.08% | 11.521 us | 2.11% | 0.009 us | 0.08% | PASS |
| I8 | I32 | false | 2^24 | 0 | 42.578 us | 1.21% | 42.429 us | 1.14% | -0.149 us | -0.35% | PASS |
| I8 | I32 | false | 2^28 | 0 | 527.477 us | 0.50% | 527.489 us | 0.50% | 0.012 us | 0.00% | PASS |
| I8 | U32 | false | 2^16 | 1 | 9.631 us | 2.08% | 9.574 us | 2.14% | -0.057 us | -0.59% | PASS |
| I8 | U32 | false | 2^20 | 1 | 12.239 us | 1.79% | 12.235 us | 1.98% | -0.005 us | -0.04% | PASS |
| I8 | U32 | false | 2^24 | 1 | 50.466 us | 1.28% | 50.175 us | 1.23% | -0.291 us | -0.58% | PASS |
| I8 | U32 | false | 2^28 | 1 | 678.418 us | 0.50% | 678.384 us | 0.50% | -0.034 us | -0.01% | PASS |
| I8 | U32 | false | 2^16 | 0.544 | 9.450 us | 2.15% | 9.572 us | 2.30% | 0.122 us | 1.29% | PASS |
| I8 | U32 | false | 2^20 | 0.544 | 12.092 us | 1.88% | 12.038 us | 2.06% | -0.054 us | -0.45% | PASS |
| I8 | U32 | false | 2^24 | 0.544 | 48.070 us | 1.09% | 48.022 us | 1.09% | -0.048 us | -0.10% | PASS |
| I8 | U32 | false | 2^28 | 0.544 | 636.864 us | 0.50% | 636.531 us | 0.50% | -0.333 us | -0.05% | PASS |
| I8 | U32 | false | 2^16 | 0 | 9.291 us | 2.31% | 9.371 us | 2.29% | 0.080 us | 0.86% | PASS |
| I8 | U32 | false | 2^20 | 0 | 11.793 us | 1.78% | 11.773 us | 1.80% | -0.020 us | -0.17% | PASS |
| I8 | U32 | false | 2^24 | 0 | 42.392 us | 1.12% | 42.373 us | 1.14% | -0.019 us | -0.04% | PASS |
| I8 | U32 | false | 2^28 | 0 | 525.763 us | 0.50% | 525.709 us | 0.50% | -0.054 us | -0.01% | PASS |
| I8 | I64 | false | 2^16 | 1 | 9.437 us | 2.11% | 9.433 us | 2.32% | -0.004 us | -0.04% | PASS |
| I8 | I64 | false | 2^20 | 1 | 13.216 us | 1.50% | 12.202 us | 1.58% | -1.014 us | -7.67% | FAIL |
| I8 | I64 | false | 2^24 | 1 | 71.274 us | 0.53% | 62.936 us | 0.57% | -8.338 us | -11.70% | FAIL |
| I8 | I64 | false | 2^28 | 1 | 1.036 ms | 0.29% | 880.366 us | 0.20% | -155.973 us | -15.05% | FAIL |
| I8 | I64 | false | 2^16 | 0.544 | 9.298 us | 2.02% | 9.363 us | 2.02% | 0.065 us | 0.70% | PASS |
| I8 | I64 | false | 2^20 | 0.544 | 13.090 us | 1.59% | 11.916 us | 1.92% | -1.175 us | -8.97% | FAIL |
| I8 | I64 | false | 2^24 | 0.544 | 68.409 us | 0.52% | 59.705 us | 0.60% | -8.704 us | -12.72% | FAIL |
| I8 | I64 | false | 2^28 | 0.544 | 985.480 us | 0.32% | 816.964 us | 0.28% | -168.516 us | -17.10% | FAIL |
| I8 | I64 | false | 2^16 | 0 | 9.275 us | 2.17% | 9.282 us | 1.87% | 0.006 us | 0.07% | PASS |
| I8 | I64 | false | 2^20 | 0 | 12.841 us | 1.53% | 11.927 us | 2.54% | -0.915 us | -7.12% | FAIL |
| I8 | I64 | false | 2^24 | 0 | 63.570 us | 0.53% | 54.134 us | 0.64% | -9.437 us | -14.84% | FAIL |
| I8 | I64 | false | 2^28 | 0 | 883.478 us | 0.43% | 711.734 us | 0.38% | -171.744 us | -19.44% | FAIL |
| I8 | U64 | false | 2^16 | 1 | 9.439 us | 1.96% | 9.764 us | 1.71% | 0.324 us | 3.44% | FAIL |
| I8 | U64 | false | 2^20 | 1 | 13.214 us | 1.46% | 12.488 us | 1.98% | -0.726 us | -5.49% | FAIL |
| I8 | U64 | false | 2^24 | 1 | 71.411 us | 0.51% | 63.348 us | 0.56% | -8.063 us | -11.29% | FAIL |
| I8 | U64 | false | 2^28 | 1 | 1.036 ms | 0.30% | 880.735 us | 0.21% | -155.567 us | -15.01% | FAIL |
| I8 | U64 | false | 2^16 | 0.544 | 9.563 us | 1.88% | 9.849 us | 1.60% | 0.286 us | 2.99% | FAIL |
| I8 | U64 | false | 2^20 | 0.544 | 13.068 us | 1.47% | 12.031 us | 2.27% | -1.037 us | -7.94% | FAIL |
| I8 | U64 | false | 2^24 | 0.544 | 68.605 us | 0.50% | 59.988 us | 0.65% | -8.616 us | -12.56% | FAIL |
| I8 | U64 | false | 2^28 | 0.544 | 985.166 us | 0.31% | 817.292 us | 0.27% | -167.874 us | -17.04% | FAIL |
| I8 | U64 | false | 2^16 | 0 | 9.615 us | 2.21% | 9.840 us | 1.80% | 0.226 us | 2.35% | FAIL |
| I8 | U64 | false | 2^20 | 0 | 12.748 us | 1.70% | 11.743 us | 2.58% | -1.005 us | -7.88% | FAIL |
| I8 | U64 | false | 2^24 | 0 | 63.636 us | 0.53% | 54.085 us | 0.74% | -9.551 us | -15.01% | FAIL |
| I8 | U64 | false | 2^28 | 0 | 883.465 us | 0.42% | 711.693 us | 0.38% | -171.772 us | -19.44% | FAIL |
| I16 | I32 | false | 2^16 | 1 | 9.906 us | 2.29% | 10.225 us | 2.05% | 0.319 us | 3.22% | FAIL |
| I16 | I32 | false | 2^20 | 1 | 13.072 us | 1.69% | 13.317 us | 1.82% | 0.245 us | 1.87% | FAIL |
| I16 | I32 | false | 2^24 | 1 | 62.885 us | 3.11% | 63.162 us | 3.07% | 0.276 us | 0.44% | PASS |
| I16 | I32 | false | 2^28 | 1 | 875.434 us | 1.42% | 876.298 us | 1.42% | 0.864 us | 0.10% | PASS |
| I16 | I32 | false | 2^16 | 0.544 | 9.973 us | 1.78% | 10.258 us | 1.66% | 0.284 us | 2.85% | FAIL |
| I16 | I32 | false | 2^20 | 0.544 | 13.017 us | 1.79% | 13.444 us | 1.71% | 0.427 us | 3.28% | FAIL |
| I16 | I32 | false | 2^24 | 0.544 | 55.748 us | 2.85% | 56.111 us | 2.89% | 0.363 us | 0.65% | PASS |
| I16 | I32 | false | 2^28 | 0.544 | 701.370 us | 1.52% | 701.473 us | 1.44% | 0.103 us | 0.01% | PASS |
| I16 | I32 | false | 2^16 | 0 | 9.829 us | 2.28% | 10.142 us | 2.67% | 0.314 us | 3.19% | FAIL |
| I16 | I32 | false | 2^20 | 0 | 12.486 us | 1.69% | 12.718 us | 2.01% | 0.232 us | 1.86% | FAIL |
| I16 | I32 | false | 2^24 | 0 | 47.503 us | 2.56% | 47.840 us | 2.53% | 0.337 us | 0.71% | PASS |
| I16 | I32 | false | 2^28 | 0 | 510.375 us | 1.03% | 510.817 us | 0.99% | 0.442 us | 0.09% | PASS |
| I16 | U32 | false | 2^16 | 1 | 9.868 us | 2.22% | 10.097 us | 2.50% | 0.229 us | 2.32% | FAIL |
| I16 | U32 | false | 2^20 | 1 | 13.090 us | 1.58% | 13.281 us | 1.47% | 0.191 us | 1.46% | PASS |
| I16 | U32 | false | 2^24 | 1 | 62.391 us | 3.13% | 62.790 us | 3.10% | 0.400 us | 0.64% | PASS |
| I16 | U32 | false | 2^28 | 1 | 869.818 us | 1.48% | 870.367 us | 1.41% | 0.549 us | 0.06% | PASS |
| I16 | U32 | false | 2^16 | 0.544 | 10.012 us | 2.08% | 10.254 us | 2.09% | 0.242 us | 2.42% | FAIL |
| I16 | U32 | false | 2^20 | 0.544 | 13.048 us | 1.88% | 13.429 us | 2.23% | 0.381 us | 2.92% | FAIL |
| I16 | U32 | false | 2^24 | 0.544 | 55.571 us | 2.84% | 55.857 us | 2.82% | 0.286 us | 0.52% | PASS |
| I16 | U32 | false | 2^28 | 0.544 | 698.082 us | 1.43% | 698.129 us | 1.41% | 0.048 us | 0.01% | PASS |
| I16 | U32 | false | 2^16 | 0 | 9.872 us | 2.11% | 10.144 us | 2.78% | 0.272 us | 2.75% | FAIL |
| I16 | U32 | false | 2^20 | 0 | 12.505 us | 1.77% | 12.786 us | 1.67% | 0.282 us | 2.25% | FAIL |
| I16 | U32 | false | 2^24 | 0 | 47.552 us | 2.53% | 47.768 us | 2.59% | 0.216 us | 0.45% | PASS |
| I16 | U32 | false | 2^28 | 0 | 509.648 us | 0.97% | 510.231 us | 0.99% | 0.583 us | 0.11% | PASS |
| I16 | I64 | false | 2^16 | 1 | 9.535 us | 2.07% | 9.877 us | 2.30% | 0.341 us | 3.58% | FAIL |
| I16 | I64 | false | 2^20 | 1 | 13.857 us | 1.42% | 13.568 us | 1.50% | -0.288 us | -2.08% | FAIL |
| I16 | I64 | false | 2^24 | 1 | 77.764 us | 0.74% | 64.672 us | 1.81% | -13.092 us | -16.84% | FAIL |
| I16 | I64 | false | 2^28 | 1 | 1.105 ms | 0.50% | 850.004 us | 0.84% | -255.007 us | -23.08% | FAIL |
| I16 | I64 | false | 2^16 | 0.544 | 9.982 us | 2.47% | 9.620 us | 1.92% | -0.362 us | -3.63% | FAIL |
| I16 | I64 | false | 2^20 | 0.544 | 14.008 us | 1.94% | 13.212 us | 1.34% | -0.796 us | -5.68% | FAIL |
| I16 | I64 | false | 2^24 | 0.544 | 75.555 us | 0.68% | 58.392 us | 1.66% | -17.163 us | -22.72% | FAIL |
| I16 | I64 | false | 2^28 | 0.544 | 1.071 ms | 0.40% | 741.448 us | 0.61% | -329.215 us | -30.75% | FAIL |
| I16 | I64 | false | 2^16 | 0 | 9.732 us | 1.93% | 9.472 us | 2.10% | -0.260 us | -2.67% | FAIL |
| I16 | I64 | false | 2^20 | 0 | 13.756 us | 1.37% | 12.810 us | 1.47% | -0.945 us | -6.87% | FAIL |
| I16 | I64 | false | 2^24 | 0 | 68.441 us | 0.69% | 51.584 us | 1.55% | -16.857 us | -24.63% | FAIL |
| I16 | I64 | false | 2^28 | 0 | 930.454 us | 0.50% | 607.408 us | 0.55% | -323.046 us | -34.72% | FAIL |
| I16 | U64 | false | 2^16 | 1 | 10.050 us | 2.25% | 9.916 us | 1.97% | -0.134 us | -1.33% | PASS |
| I16 | U64 | false | 2^20 | 1 | 14.095 us | 1.75% | 13.100 us | 1.64% | -0.995 us | -7.06% | FAIL |
| I16 | U64 | false | 2^24 | 1 | 78.038 us | 0.70% | 64.267 us | 1.85% | -13.771 us | -17.65% | FAIL |
| I16 | U64 | false | 2^28 | 1 | 1.106 ms | 0.50% | 850.115 us | 0.88% | -255.472 us | -23.11% | FAIL |
| I16 | U64 | false | 2^16 | 0.544 | 9.991 us | 2.50% | 9.814 us | 2.00% | -0.177 us | -1.77% | PASS |
| I16 | U64 | false | 2^20 | 0.544 | 14.041 us | 1.64% | 12.892 us | 1.58% | -1.149 us | -8.18% | FAIL |
| I16 | U64 | false | 2^24 | 0.544 | 75.371 us | 0.68% | 58.437 us | 1.66% | -16.935 us | -22.47% | FAIL |
| I16 | U64 | false | 2^28 | 0.544 | 1.070 ms | 0.38% | 741.380 us | 0.60% | -329.076 us | -30.74% | FAIL |
| I16 | U64 | false | 2^16 | 0 | 9.929 us | 2.79% | 9.742 us | 2.03% | -0.187 us | -1.89% | PASS |
| I16 | U64 | false | 2^20 | 0 | 13.692 us | 1.26% | 12.548 us | 1.82% | -1.144 us | -8.35% | FAIL |
| I16 | U64 | false | 2^24 | 0 | 68.324 us | 0.74% | 51.592 us | 1.55% | -16.732 us | -24.49% | FAIL |
| I16 | U64 | false | 2^28 | 0 | 930.468 us | 0.50% | 607.502 us | 0.53% | -322.967 us | -34.71% | FAIL |
| I32 | I32 | false | 2^16 | 1 | 10.432 us | 2.46% | 10.446 us | 1.99% | 0.015 us | 0.14% | PASS |
| I32 | I32 | false | 2^20 | 1 | 14.747 us | 1.56% | 14.782 us | 1.57% | 0.035 us | 0.24% | PASS |
| I32 | I32 | false | 2^24 | 1 | 90.114 us | 2.58% | 90.207 us | 2.63% | 0.093 us | 0.10% | PASS |
| I32 | I32 | false | 2^28 | 1 | 1.343 ms | 0.96% | 1.343 ms | 0.99% | 0.056 us | 0.00% | PASS |
| I32 | I32 | false | 2^16 | 0.544 | 9.790 us | 2.11% | 9.976 us | 2.12% | 0.186 us | 1.90% | PASS |
| I32 | I32 | false | 2^20 | 0.544 | 14.614 us | 1.86% | 14.744 us | 1.69% | 0.130 us | 0.89% | PASS |
| I32 | I32 | false | 2^24 | 0.544 | 78.126 us | 2.71% | 78.023 us | 2.70% | -0.103 us | -0.13% | PASS |
| I32 | I32 | false | 2^28 | 0.544 | 1.072 ms | 1.10% | 1.072 ms | 1.14% | -0.143 us | -0.01% | PASS |
| I32 | I32 | false | 2^16 | 0 | 9.574 us | 1.98% | 9.777 us | 1.68% | 0.203 us | 2.12% | FAIL |
| I32 | I32 | false | 2^20 | 0 | 14.174 us | 1.49% | 13.948 us | 1.72% | -0.226 us | -1.60% | FAIL |
| I32 | I32 | false | 2^24 | 0 | 62.279 us | 2.05% | 62.353 us | 2.01% | 0.074 us | 0.12% | PASS |
| I32 | I32 | false | 2^28 | 0 | 648.978 us | 1.43% | 649.132 us | 1.46% | 0.154 us | 0.02% | PASS |
| I32 | U32 | false | 2^16 | 1 | 9.885 us | 2.03% | 9.880 us | 2.05% | -0.005 us | -0.05% | PASS |
| I32 | U32 | false | 2^20 | 1 | 14.989 us | 1.38% | 14.916 us | 1.50% | -0.073 us | -0.49% | PASS |
| I32 | U32 | false | 2^24 | 1 | 90.135 us | 2.63% | 90.100 us | 2.67% | -0.035 us | -0.04% | PASS |
| I32 | U32 | false | 2^28 | 1 | 1.341 ms | 1.01% | 1.342 ms | 0.97% | 0.923 us | 0.07% | PASS |
| I32 | U32 | false | 2^16 | 0.544 | 9.762 us | 1.97% | 9.829 us | 1.92% | 0.067 us | 0.68% | PASS |
| I32 | U32 | false | 2^20 | 0.544 | 14.591 us | 1.65% | 14.687 us | 1.78% | 0.097 us | 0.66% | PASS |
| I32 | U32 | false | 2^24 | 0.544 | 78.146 us | 2.70% | 78.080 us | 2.65% | -0.066 us | -0.08% | PASS |
| I32 | U32 | false | 2^28 | 0.544 | 1.071 ms | 1.16% | 1.070 ms | 1.14% | -0.435 us | -0.04% | PASS |
| I32 | U32 | false | 2^16 | 0 | 9.528 us | 2.09% | 9.598 us | 2.16% | 0.071 us | 0.74% | PASS |
| I32 | U32 | false | 2^20 | 0 | 14.019 us | 1.81% | 13.948 us | 1.40% | -0.071 us | -0.51% | PASS |
| I32 | U32 | false | 2^24 | 0 | 62.292 us | 2.03% | 62.565 us | 2.02% | 0.273 us | 0.44% | PASS |
| I32 | U32 | false | 2^28 | 0 | 648.977 us | 1.44% | 648.897 us | 1.44% | -0.080 us | -0.01% | PASS |
| I32 | I64 | false | 2^16 | 1 | 9.736 us | 1.69% | 10.245 us | 1.59% | 0.509 us | 5.23% | FAIL |
| I32 | I64 | false | 2^20 | 1 | 15.399 us | 1.36% | 15.483 us | 1.73% | 0.084 us | 0.55% | PASS |
| I32 | I64 | false | 2^24 | 1 | 101.096 us | 1.42% | 93.141 us | 1.55% | -7.956 us | -7.87% | FAIL |
| I32 | I64 | false | 2^28 | 1 | 1.500 ms | 0.61% | 1.354 ms | 0.69% | -146.385 us | -9.76% | FAIL |
| I32 | I64 | false | 2^16 | 0.544 | 9.759 us | 2.36% | 10.119 us | 1.54% | 0.360 us | 3.69% | FAIL |
| I32 | I64 | false | 2^20 | 0.544 | 15.137 us | 1.44% | 15.171 us | 1.79% | 0.034 us | 0.23% | PASS |
| I32 | I64 | false | 2^24 | 0.544 | 90.179 us | 1.04% | 82.308 us | 1.27% | -7.871 us | -8.73% | FAIL |
| I32 | I64 | false | 2^28 | 0.544 | 1.246 ms | 0.55% | 1.133 ms | 0.66% | -112.676 us | -9.04% | FAIL |
| I32 | I64 | false | 2^16 | 0 | 9.590 us | 2.34% | 10.123 us | 2.50% | 0.534 us | 5.56% | FAIL |
| I32 | I64 | false | 2^20 | 0 | 14.721 us | 1.63% | 14.119 us | 1.33% | -0.602 us | -4.09% | FAIL |
| I32 | I64 | false | 2^24 | 0 | 77.380 us | 0.86% | 69.230 us | 0.96% | -8.150 us | -10.53% | FAIL |
| I32 | I64 | false | 2^28 | 0 | 977.917 us | 0.50% | 788.107 us | 0.82% | -189.810 us | -19.41% | FAIL |
| I32 | U64 | false | 2^16 | 1 | 9.927 us | 2.24% | 10.512 us | 2.60% | 0.585 us | 5.89% | FAIL |
| I32 | U64 | false | 2^20 | 1 | 15.282 us | 1.29% | 15.285 us | 1.97% | 0.003 us | 0.02% | PASS |
| I32 | U64 | false | 2^24 | 1 | 100.992 us | 1.38% | 93.131 us | 1.58% | -7.861 us | -7.78% | FAIL |
| I32 | U64 | false | 2^28 | 1 | 1.501 ms | 0.63% | 1.354 ms | 0.69% | -146.852 us | -9.79% | FAIL |
| I32 | U64 | false | 2^16 | 0.544 | 9.996 us | 2.43% | 10.472 us | 2.08% | 0.476 us | 4.76% | FAIL |
| I32 | U64 | false | 2^20 | 0.544 | 15.027 us | 1.42% | 14.807 us | 1.51% | -0.221 us | -1.47% | FAIL |
| I32 | U64 | false | 2^24 | 0.544 | 90.088 us | 1.05% | 82.310 us | 1.25% | -7.777 us | -8.63% | FAIL |
| I32 | U64 | false | 2^28 | 0.544 | 1.246 ms | 0.56% | 1.133 ms | 0.68% | -112.464 us | -9.03% | FAIL |
| I32 | U64 | false | 2^16 | 0 | 9.946 us | 2.59% | 9.795 us | 2.10% | -0.151 us | -1.52% | PASS |
| I32 | U64 | false | 2^20 | 0 | 15.075 us | 1.40% | 13.950 us | 1.37% | -1.125 us | -7.46% | FAIL |
| I32 | U64 | false | 2^24 | 0 | 77.620 us | 0.82% | 68.756 us | 0.97% | -8.864 us | -11.42% | FAIL |
| I32 | U64 | false | 2^28 | 0 | 978.208 us | 0.50% | 788.785 us | 0.81% | -189.423 us | -19.36% | FAIL |
| I64 | I32 | false | 2^16 | 1 | 10.507 us | 2.17% | 10.289 us | 1.85% | -0.217 us | -2.07% | FAIL |
| I64 | I32 | false | 2^20 | 1 | 18.305 us | 1.46% | 18.028 us | 1.35% | -0.277 us | -1.51% | FAIL |
| I64 | I32 | false | 2^24 | 1 | 165.740 us | 1.90% | 165.528 us | 1.92% | -0.212 us | -0.13% | PASS |
| I64 | I32 | false | 2^28 | 1 | 2.559 ms | 0.53% | 2.559 ms | 0.53% | -0.113 us | -0.00% | PASS |
| I64 | I32 | false | 2^16 | 0.544 | 10.505 us | 2.00% | 10.211 us | 2.13% | -0.294 us | -2.80% | FAIL |
| I64 | I32 | false | 2^20 | 0.544 | 17.958 us | 1.71% | 17.651 us | 1.28% | -0.307 us | -1.71% | FAIL |
| I64 | I32 | false | 2^24 | 0.544 | 138.502 us | 2.55% | 138.176 us | 2.58% | -0.326 us | -0.24% | PASS |
| I64 | I32 | false | 2^28 | 0.544 | 2.026 ms | 0.80% | 2.025 ms | 0.76% | -0.976 us | -0.05% | PASS |
| I64 | I32 | false | 2^16 | 0 | 10.096 us | 2.16% | 9.702 us | 2.03% | -0.394 us | -3.90% | FAIL |
| I64 | I32 | false | 2^20 | 0 | 16.653 us | 1.55% | 16.361 us | 1.28% | -0.292 us | -1.75% | FAIL |
| I64 | I32 | false | 2^24 | 0 | 97.615 us | 1.77% | 97.269 us | 1.83% | -0.346 us | -0.35% | PASS |
| I64 | I32 | false | 2^28 | 0 | 1.231 ms | 0.89% | 1.231 ms | 0.86% | -0.325 us | -0.03% | PASS |
| I64 | U32 | false | 2^16 | 1 | 10.444 us | 2.16% | 10.303 us | 1.99% | -0.140 us | -1.34% | PASS |
| I64 | U32 | false | 2^20 | 1 | 18.452 us | 1.34% | 17.992 us | 1.32% | -0.461 us | -2.50% | FAIL |
| I64 | U32 | false | 2^24 | 1 | 165.700 us | 1.90% | 165.426 us | 1.89% | -0.274 us | -0.17% | PASS |
| I64 | U32 | false | 2^28 | 1 | 2.557 ms | 0.49% | 2.556 ms | 0.47% | -1.125 us | -0.04% | PASS |
| I64 | U32 | false | 2^16 | 0.544 | 10.465 us | 1.98% | 10.070 us | 1.76% | -0.395 us | -3.77% | FAIL |
| I64 | U32 | false | 2^20 | 0.544 | 18.293 us | 1.56% | 18.264 us | 1.32% | -0.029 us | -0.16% | PASS |
| I64 | U32 | false | 2^24 | 0.544 | 138.655 us | 2.54% | 138.492 us | 2.54% | -0.163 us | -0.12% | PASS |
| I64 | U32 | false | 2^28 | 0.544 | 2.027 ms | 0.79% | 2.028 ms | 0.80% | 1.374 us | 0.07% | PASS |
| I64 | U32 | false | 2^16 | 0 | 9.823 us | 2.63% | 9.818 us | 2.17% | -0.004 us | -0.05% | PASS |
| I64 | U32 | false | 2^20 | 0 | 16.838 us | 1.49% | 16.774 us | 1.55% | -0.063 us | -0.38% | PASS |
| I64 | U32 | false | 2^24 | 0 | 97.698 us | 1.87% | 97.699 us | 1.83% | 0.000 us | 0.00% | PASS |
| I64 | U32 | false | 2^28 | 0 | 1.231 ms | 0.88% | 1.232 ms | 0.91% | 0.069 us | 0.01% | PASS |
| I64 | I64 | false | 2^16 | 1 | 10.236 us | 2.07% | 10.687 us | 2.00% | 0.451 us | 4.41% | FAIL |
| I64 | I64 | false | 2^20 | 1 | 19.773 us | 2.29% | 19.573 us | 2.88% | -0.200 us | -1.01% | PASS |
| I64 | I64 | false | 2^24 | 1 | 182.261 us | 1.49% | 166.465 us | 1.39% | -15.796 us | -8.67% | FAIL |
| I64 | I64 | false | 2^28 | 1 | 2.825 ms | 0.41% | 2.547 ms | 0.39% | -277.915 us | -9.84% | FAIL |
| I64 | I64 | false | 2^16 | 0.544 | 10.275 us | 1.62% | 10.587 us | 1.80% | 0.312 us | 3.04% | FAIL |
| I64 | I64 | false | 2^20 | 0.544 | 19.681 us | 2.06% | 19.058 us | 2.17% | -0.623 us | -3.17% | FAIL |
| I64 | I64 | false | 2^24 | 0.544 | 153.301 us | 1.28% | 139.905 us | 1.48% | -13.395 us | -8.74% | FAIL |
| I64 | I64 | false | 2^28 | 0.544 | 2.285 ms | 0.39% | 2.074 ms | 0.50% | -211.636 us | -9.26% | FAIL |
| I64 | I64 | false | 2^16 | 0 | 10.173 us | 2.83% | 9.846 us | 2.23% | -0.327 us | -3.21% | FAIL |
| I64 | I64 | false | 2^20 | 0 | 19.002 us | 2.08% | 17.616 us | 1.90% | -1.386 us | -7.29% | FAIL |
| I64 | I64 | false | 2^24 | 0 | 122.660 us | 0.84% | 104.476 us | 0.93% | -18.184 us | -14.82% | FAIL |
| I64 | I64 | false | 2^28 | 0 | 1.627 ms | 0.50% | 1.332 ms | 0.66% | -295.460 us | -18.16% | FAIL |
| I64 | U64 | false | 2^16 | 1 | 10.397 us | 2.18% | 10.521 us | 2.03% | 0.124 us | 1.19% | PASS |
| I64 | U64 | false | 2^20 | 1 | 19.960 us | 2.08% | 19.801 us | 2.98% | -0.159 us | -0.80% | PASS |
| I64 | U64 | false | 2^24 | 1 | 182.402 us | 1.50% | 166.210 us | 1.38% | -16.191 us | -8.88% | FAIL |
| I64 | U64 | false | 2^28 | 1 | 2.825 ms | 0.38% | 2.547 ms | 0.42% | -278.520 us | -9.86% | FAIL |
| I64 | U64 | false | 2^16 | 0.544 | 10.278 us | 2.91% | 10.222 us | 2.19% | -0.056 us | -0.54% | PASS |
| I64 | U64 | false | 2^20 | 0.544 | 19.654 us | 2.15% | 18.897 us | 2.14% | -0.757 us | -3.85% | FAIL |
| I64 | U64 | false | 2^24 | 0.544 | 153.247 us | 1.25% | 139.701 us | 1.53% | -13.546 us | -8.84% | FAIL |
| I64 | U64 | false | 2^28 | 0.544 | 2.286 ms | 0.40% | 2.072 ms | 0.50% | -213.562 us | -9.34% | FAIL |
| I64 | U64 | false | 2^16 | 0 | 10.483 us | 1.59% | 9.945 us | 1.68% | -0.538 us | -5.13% | FAIL |
| I64 | U64 | false | 2^20 | 0 | 19.084 us | 1.91% | 17.128 us | 1.86% | -1.955 us | -10.25% | FAIL |
| I64 | U64 | false | 2^24 | 0 | 122.519 us | 0.82% | 104.278 us | 0.92% | -18.241 us | -14.89% | FAIL |
| I64 | U64 | false | 2^28 | 0 | 1.627 ms | 0.50% | 1.331 ms | 0.67% | -295.762 us | -18.18% | FAIL |
| I128 | I32 | false | 2^16 | 1 | 11.884 us | 1.97% | 11.512 us | 1.83% | -0.371 us | -3.12% | FAIL |
| I128 | I32 | false | 2^20 | 1 | 30.717 us | 2.13% | 30.330 us | 2.08% | -0.388 us | -1.26% | PASS |
| I128 | I32 | false | 2^24 | 1 | 342.904 us | 1.35% | 342.639 us | 1.35% | -0.265 us | -0.08% | PASS |
| I128 | I32 | false | 2^28 | 1 | 5.353 ms | 0.34% | 5.353 ms | 0.42% | -0.384 us | -0.01% | PASS |
| I128 | I32 | false | 2^16 | 0.544 | 11.689 us | 2.15% | 11.360 us | 2.05% | -0.329 us | -2.81% | FAIL |
| I128 | I32 | false | 2^20 | 0.544 | 28.616 us | 2.74% | 28.207 us | 2.48% | -0.409 us | -1.43% | PASS |
| I128 | I32 | false | 2^24 | 0.544 | 280.021 us | 1.49% | 279.679 us | 1.46% | -0.342 us | -0.12% | PASS |
| I128 | I32 | false | 2^28 | 0.544 | 4.277 ms | 0.37% | 4.274 ms | 0.38% | -2.274 us | -0.05% | PASS |
| I128 | I32 | false | 2^16 | 0 | 10.794 us | 2.58% | 10.460 us | 2.09% | -0.334 us | -3.10% | FAIL |
| I128 | I32 | false | 2^20 | 0 | 26.136 us | 2.29% | 25.833 us | 2.31% | -0.303 us | -1.16% | PASS |
| I128 | I32 | false | 2^24 | 0 | 183.619 us | 1.17% | 183.330 us | 1.17% | -0.289 us | -0.16% | PASS |
| I128 | I32 | false | 2^28 | 0 | 2.513 ms | 0.25% | 2.514 ms | 0.28% | 0.905 us | 0.04% | PASS |
| I128 | U32 | false | 2^16 | 1 | 11.098 us | 1.96% | 11.092 us | 1.97% | -0.006 us | -0.06% | PASS |
| I128 | U32 | false | 2^20 | 1 | 30.321 us | 2.41% | 30.286 us | 2.41% | -0.035 us | -0.11% | PASS |
| I128 | U32 | false | 2^24 | 1 | 344.149 us | 1.35% | 344.316 us | 1.30% | 0.167 us | 0.05% | PASS |
| I128 | U32 | false | 2^28 | 1 | 5.384 ms | 0.38% | 5.378 ms | 0.37% | -5.226 us | -0.10% | PASS |
| I128 | U32 | false | 2^16 | 0.544 | 10.933 us | 2.10% | 10.912 us | 2.09% | -0.021 us | -0.19% | PASS |
| I128 | U32 | false | 2^20 | 0.544 | 28.412 us | 2.82% | 28.508 us | 2.80% | 0.096 us | 0.34% | PASS |
| I128 | U32 | false | 2^24 | 0.544 | 281.071 us | 1.47% | 281.121 us | 1.47% | 0.051 us | 0.02% | PASS |
| I128 | U32 | false | 2^28 | 0.544 | 4.301 ms | 0.42% | 4.302 ms | 0.43% | 0.838 us | 0.02% | PASS |
| I128 | U32 | false | 2^16 | 0 | 10.229 us | 2.08% | 10.143 us | 2.12% | -0.087 us | -0.85% | PASS |
| I128 | U32 | false | 2^20 | 0 | 25.853 us | 2.45% | 25.930 us | 2.46% | 0.077 us | 0.30% | PASS |
| I128 | U32 | false | 2^24 | 0 | 183.970 us | 1.18% | 184.049 us | 1.19% | 0.079 us | 0.04% | PASS |
| I128 | U32 | false | 2^28 | 0 | 2.528 ms | 0.29% | 2.528 ms | 0.27% | 0.166 us | 0.01% | PASS |
| I128 | I64 | false | 2^16 | 1 | 11.497 us | 1.87% | 11.463 us | 1.69% | -0.035 us | -0.30% | PASS |
| I128 | I64 | false | 2^20 | 1 | 32.148 us | 2.09% | 30.141 us | 1.72% | -2.007 us | -6.24% | FAIL |
| I128 | I64 | false | 2^24 | 1 | 361.154 us | 1.13% | 334.873 us | 1.06% | -26.281 us | -7.28% | FAIL |
| I128 | I64 | false | 2^28 | 1 | 5.687 ms | 0.27% | 5.220 ms | 0.29% | -466.832 us | -8.21% | FAIL |
| I128 | I64 | false | 2^16 | 0.544 | 11.489 us | 1.51% | 11.234 us | 1.50% | -0.255 us | -2.22% | FAIL |
| I128 | I64 | false | 2^20 | 0.544 | 30.437 us | 2.43% | 28.142 us | 1.79% | -2.295 us | -7.54% | FAIL |
| I128 | I64 | false | 2^24 | 0.544 | 295.782 us | 1.03% | 275.627 us | 1.17% | -20.155 us | -6.81% | FAIL |
| I128 | I64 | false | 2^28 | 0.544 | 4.604 ms | 0.26% | 4.229 ms | 0.30% | -374.450 us | -8.13% | FAIL |
| I128 | I64 | false | 2^16 | 0 | 11.307 us | 1.55% | 10.480 us | 1.77% | -0.827 us | -7.31% | FAIL |
| I128 | I64 | false | 2^20 | 0 | 28.572 us | 1.85% | 25.738 us | 1.91% | -2.834 us | -9.92% | FAIL |
| I128 | I64 | false | 2^24 | 0 | 234.228 us | 0.43% | 184.332 us | 0.84% | -49.896 us | -21.30% | FAIL |
| I128 | I64 | false | 2^28 | 0 | 3.431 ms | 0.09% | 2.534 ms | 0.20% | -897.459 us | -26.16% | FAIL |
| I128 | U64 | false | 2^16 | 1 | 11.750 us | 2.19% | 11.671 us | 2.16% | -0.079 us | -0.67% | PASS |
| I128 | U64 | false | 2^20 | 1 | 32.391 us | 2.14% | 30.457 us | 1.66% | -1.934 us | -5.97% | FAIL |
| I128 | U64 | false | 2^24 | 1 | 361.308 us | 1.05% | 335.663 us | 1.09% | -25.645 us | -7.10% | FAIL |
| I128 | U64 | false | 2^28 | 1 | 5.690 ms | 0.26% | 5.216 ms | 0.28% | -473.259 us | -8.32% | FAIL |
| I128 | U64 | false | 2^16 | 0.544 | 11.835 us | 2.11% | 11.515 us | 1.71% | -0.320 us | -2.70% | FAIL |
| I128 | U64 | false | 2^20 | 0.544 | 30.649 us | 2.56% | 28.302 us | 1.81% | -2.347 us | -7.66% | FAIL |
| I128 | U64 | false | 2^24 | 0.544 | 296.084 us | 1.03% | 276.116 us | 1.21% | -19.967 us | -6.74% | FAIL |
| I128 | U64 | false | 2^28 | 0.544 | 4.602 ms | 0.25% | 4.230 ms | 0.34% | -372.728 us | -8.10% | FAIL |
| I128 | U64 | false | 2^16 | 0 | 11.635 us | 2.36% | 10.901 us | 2.07% | -0.733 us | -6.30% | FAIL |
| I128 | U64 | false | 2^20 | 0 | 28.577 us | 1.84% | 25.518 us | 2.08% | -3.059 us | -10.71% | FAIL |
| I128 | U64 | false | 2^24 | 0 | 234.356 us | 0.42% | 184.395 us | 0.83% | -49.961 us | -21.32% | FAIL |
| I128 | U64 | false | 2^28 | 0 | 3.432 ms | 0.10% | 2.535 ms | 0.21% | -897.257 us | -26.15% | FAIL |
| F32 | I32 | false | 2^16 | 1 | 10.434 us | 1.91% | 10.428 us | 1.80% | -0.006 us | -0.06% | PASS |
| F32 | I32 | false | 2^20 | 1 | 14.972 us | 1.96% | 15.055 us | 1.63% | 0.083 us | 0.55% | PASS |
| F32 | I32 | false | 2^24 | 1 | 90.343 us | 2.63% | 90.431 us | 2.66% | 0.087 us | 0.10% | PASS |
| F32 | I32 | false | 2^28 | 1 | 1.366 ms | 0.96% | 1.365 ms | 0.99% | -0.580 us | -0.04% | PASS |
| F32 | I32 | false | 2^16 | 0.544 | 10.127 us | 1.67% | 10.132 us | 2.49% | 0.005 us | 0.05% | PASS |
| F32 | I32 | false | 2^20 | 0.544 | 14.060 us | 1.51% | 14.103 us | 1.37% | 0.043 us | 0.31% | PASS |
| F32 | I32 | false | 2^24 | 0.544 | 63.984 us | 2.18% | 64.156 us | 2.15% | 0.172 us | 0.27% | PASS |
| F32 | I32 | false | 2^28 | 0.544 | 772.164 us | 1.07% | 772.320 us | 1.04% | 0.156 us | 0.02% | PASS |
| F32 | I32 | false | 2^16 | 0 | 9.543 us | 2.06% | 9.636 us | 2.06% | 0.093 us | 0.97% | PASS |
| F32 | I32 | false | 2^20 | 0 | 14.073 us | 1.51% | 14.098 us | 1.85% | 0.025 us | 0.18% | PASS |
| F32 | I32 | false | 2^24 | 0 | 62.306 us | 2.05% | 62.238 us | 2.04% | -0.068 us | -0.11% | PASS |
| F32 | I32 | false 
| 2^28 | 0 | 648.968 us | 1.45% | 649.088 us | 1.43% | 0.119 us | 0.02% | PASS | | F32 | U32 | false | 2^16 | 1 | 9.823 us | 1.93% | 9.831 us | 1.98% | 0.007 us | 0.07% | PASS | | F32 | U32 | false | 2^20 | 1 | 14.866 us | 1.38% | 14.821 us | 1.34% | -0.045 us | -0.30% | PASS | | F32 | U32 | false | 2^24 | 1 | 90.025 us | 2.67% | 90.143 us | 2.66% | 0.118 us | 0.13% | PASS | | F32 | U32 | false | 2^28 | 1 | 1.363 ms | 1.00% | 1.363 ms | 0.99% | 0.085 us | 0.01% | PASS | | F32 | U32 | false | 2^16 | 0.544 | 9.507 us | 1.93% | 9.471 us | 1.94% | -0.035 us | -0.37% | PASS | | F32 | U32 | false | 2^20 | 0.544 | 13.984 us | 1.79% | 13.992 us | 1.78% | 0.008 us | 0.06% | PASS | | F32 | U32 | false | 2^24 | 0.544 | 63.986 us | 2.15% | 63.913 us | 2.16% | -0.073 us | -0.11% | PASS | | F32 | U32 | false | 2^28 | 0.544 | 770.759 us | 1.07% | 771.119 us | 1.03% | 0.360 us | 0.05% | PASS | | F32 | U32 | false | 2^16 | 0 | 9.544 us | 2.09% | 9.548 us | 2.08% | 0.004 us | 0.04% | PASS | | F32 | U32 | false | 2^20 | 0 | 14.076 us | 1.29% | 14.103 us | 1.62% | 0.027 us | 0.19% | PASS | | F32 | U32 | false | 2^24 | 0 | 62.263 us | 2.05% | 62.239 us | 2.05% | -0.023 us | -0.04% | PASS | | F32 | U32 | false | 2^28 | 0 | 649.214 us | 1.45% | 649.220 us | 1.44% | 0.006 us | 0.00% | PASS | | F32 | I64 | false | 2^16 | 1 | 9.684 us | 2.18% | 9.957 us | 2.02% | 0.273 us | 2.82% | FAIL | | F32 | I64 | false | 2^20 | 1 | 15.242 us | 1.45% | 15.301 us | 1.59% | 0.060 us | 0.39% | PASS | | F32 | I64 | false | 2^24 | 1 | 100.227 us | 1.51% | 92.832 us | 1.56% | -7.395 us | -7.38% | FAIL | | F32 | I64 | false | 2^28 | 1 | 1.502 ms | 0.66% | 1.370 ms | 0.66% | -131.563 us | -8.76% | FAIL | | F32 | I64 | false | 2^16 | 0.544 | 9.526 us | 2.12% | 9.519 us | 2.07% | -0.007 us | -0.07% | PASS | | F32 | I64 | false | 2^20 | 0.544 | 14.891 us | 1.46% | 14.245 us | 1.48% | -0.645 us | -4.33% | FAIL | | F32 | I64 | false | 2^24 | 0.544 | 79.071 us | 0.88% | 70.411 us | 0.97% | -8.659 us | -10.95% | FAIL 
| | F32 | I64 | false | 2^28 | 0.544 | 1.049 ms | 0.50% | 863.031 us | 0.66% | -185.722 us | -17.71% | FAIL | | F32 | I64 | false | 2^16 | 0 | 9.462 us | 2.30% | 9.689 us | 2.75% | 0.226 us | 2.39% | FAIL | | F32 | I64 | false | 2^20 | 0 | 14.941 us | 1.38% | 14.294 us | 1.40% | -0.648 us | -4.34% | FAIL | | F32 | I64 | false | 2^24 | 0 | 76.605 us | 0.82% | 68.824 us | 0.95% | -7.782 us | -10.16% | FAIL | | F32 | I64 | false | 2^28 | 0 | 977.621 us | 0.50% | 787.897 us | 0.83% | -189.725 us | -19.41% | FAIL | | F32 | U64 | false | 2^16 | 1 | 9.970 us | 2.02% | 10.280 us | 1.85% | 0.309 us | 3.10% | FAIL | | F32 | U64 | false | 2^20 | 1 | 15.285 us | 1.59% | 15.044 us | 1.48% | -0.241 us | -1.57% | FAIL | | F32 | U64 | false | 2^24 | 1 | 100.398 us | 1.48% | 92.708 us | 1.58% | -7.690 us | -7.66% | FAIL | | F32 | U64 | false | 2^28 | 1 | 1.502 ms | 0.65% | 1.370 ms | 0.68% | -131.974 us | -8.79% | FAIL | | F32 | U64 | false | 2^16 | 0.544 | 9.785 us | 2.35% | 9.755 us | 2.08% | -0.031 us | -0.31% | PASS | | F32 | U64 | false | 2^20 | 0.544 | 14.964 us | 1.47% | 14.048 us | 1.49% | -0.916 us | -6.12% | FAIL | | F32 | U64 | false | 2^24 | 0.544 | 79.118 us | 0.89% | 70.345 us | 0.96% | -8.773 us | -11.09% | FAIL | | F32 | U64 | false | 2^28 | 0.544 | 1.049 ms | 0.50% | 862.962 us | 0.66% | -185.950 us | -17.73% | FAIL | | F32 | U64 | false | 2^16 | 0 | 9.669 us | 2.72% | 9.787 us | 2.82% | 0.118 us | 1.22% | PASS | | F32 | U64 | false | 2^20 | 0 | 14.800 us | 1.38% | 14.073 us | 1.42% | -0.727 us | -4.91% | FAIL | | F32 | U64 | false | 2^24 | 0 | 76.663 us | 0.82% | 68.765 us | 0.96% | -7.898 us | -10.30% | FAIL | | F32 | U64 | false | 2^28 | 0 | 977.636 us | 0.50% | 787.756 us | 0.82% | -189.881 us | -19.42% | FAIL | | F64 | I32 | false | 2^16 | 1 | 10.180 us | 1.92% | 10.197 us | 1.85% | 0.017 us | 0.17% | PASS | | F64 | I32 | false | 2^20 | 1 | 17.959 us | 1.31% | 18.346 us | 1.37% | 0.387 us | 2.15% | FAIL | | F64 | I32 | false | 2^24 | 1 | 165.401 us | 1.94% | 
165.419 us | 1.90% | 0.018 us | 0.01% | PASS | | F64 | I32 | false | 2^28 | 1 | 2.557 ms | 0.53% | 2.557 ms | 0.52% | 0.129 us | 0.01% | PASS | | F64 | I32 | false | 2^16 | 0.544 | 9.738 us | 2.16% | 10.153 us | 2.18% | 0.415 us | 4.26% | FAIL | | F64 | I32 | false | 2^20 | 0.544 | 16.750 us | 1.31% | 16.904 us | 1.17% | 0.154 us | 0.92% | PASS | | F64 | I32 | false | 2^24 | 0.544 | 107.458 us | 2.48% | 107.603 us | 2.41% | 0.145 us | 0.13% | PASS | | F64 | I32 | false | 2^28 | 0.544 | 1.465 ms | 0.83% | 1.465 ms | 0.83% | 0.309 us | 0.02% | PASS | | F64 | I32 | false | 2^16 | 0 | 9.645 us | 1.51% | 9.976 us | 1.61% | 0.331 us | 3.43% | FAIL | | F64 | I32 | false | 2^20 | 0 | 16.296 us | 1.40% | 16.348 us | 1.49% | 0.052 us | 0.32% | PASS | | F64 | I32 | false | 2^24 | 0 | 97.251 us | 1.84% | 97.513 us | 1.79% | 0.262 us | 0.27% | PASS | | F64 | I32 | false | 2^28 | 0 | 1.228 ms | 0.88% | 1.228 ms | 0.88% | 0.340 us | 0.03% | PASS | | F64 | U32 | false | 2^16 | 1 | 10.262 us | 2.03% | 10.387 us | 2.39% | 0.126 us | 1.22% | PASS | | F64 | U32 | false | 2^20 | 1 | 17.929 us | 1.34% | 18.145 us | 1.81% | 0.216 us | 1.21% | PASS | | F64 | U32 | false | 2^24 | 1 | 165.159 us | 1.89% | 165.357 us | 1.95% | 0.198 us | 0.12% | PASS | | F64 | U32 | false | 2^28 | 1 | 2.553 ms | 0.49% | 2.552 ms | 0.52% | -0.383 us | -0.02% | PASS | | F64 | U32 | false | 2^16 | 0.544 | 9.484 us | 2.03% | 9.691 us | 2.83% | 0.207 us | 2.18% | FAIL | | F64 | U32 | false | 2^20 | 0.544 | 16.812 us | 1.31% | 17.098 us | 1.27% | 0.287 us | 1.70% | FAIL | | F64 | U32 | false | 2^24 | 0.544 | 106.985 us | 2.40% | 107.126 us | 2.38% | 0.141 us | 0.13% | PASS | | F64 | U32 | false | 2^28 | 0.544 | 1.460 ms | 0.83% | 1.460 ms | 0.82% | 0.477 us | 0.03% | PASS | | F64 | U32 | false | 2^16 | 0 | 9.375 us | 1.61% | 9.633 us | 1.64% | 0.257 us | 2.74% | FAIL | | F64 | U32 | false | 2^20 | 0 | 16.487 us | 1.40% | 16.747 us | 1.70% | 0.260 us | 1.58% | FAIL | | F64 | U32 | false | 2^24 | 0 | 97.120 us | 
1.81% | 97.519 us | 1.85% | 0.398 us | 0.41% | PASS | | F64 | U32 | false | 2^28 | 0 | 1.228 ms | 0.87% | 1.228 ms | 0.87% | 0.460 us | 0.04% | PASS | | F64 | I64 | false | 2^16 | 1 | 9.995 us | 1.88% | 10.413 us | 2.88% | 0.418 us | 4.18% | FAIL | | F64 | I64 | false | 2^20 | 1 | 19.563 us | 2.25% | 19.566 us | 3.01% | 0.003 us | 0.02% | PASS | | F64 | I64 | false | 2^24 | 1 | 181.840 us | 1.50% | 165.995 us | 1.42% | -15.845 us | -8.71% | FAIL | | F64 | I64 | false | 2^28 | 1 | 2.822 ms | 0.38% | 2.539 ms | 0.41% | -283.461 us | -10.04% | FAIL | | F64 | I64 | false | 2^16 | 0.544 | 10.008 us | 2.01% | 9.992 us | 2.85% | -0.016 us | -0.16% | PASS | | F64 | I64 | false | 2^20 | 0.544 | 18.468 us | 1.72% | 17.697 us | 1.98% | -0.772 us | -4.18% | FAIL | | F64 | I64 | false | 2^24 | 0.544 | 129.040 us | 1.07% | 110.023 us | 1.07% | -19.016 us | -14.74% | FAIL | | F64 | I64 | false | 2^28 | 0.544 | 1.761 ms | 0.50% | 1.541 ms | 0.59% | -220.227 us | -12.51% | FAIL | | F64 | I64 | false | 2^16 | 0 | 10.240 us | 1.75% | 10.167 us | 2.48% | -0.072 us | -0.71% | PASS | | F64 | I64 | false | 2^20 | 0 | 18.588 us | 1.81% | 17.280 us | 1.76% | -1.309 us | -7.04% | FAIL | | F64 | I64 | false | 2^24 | 0 | 121.236 us | 0.88% | 103.069 us | 0.97% | -18.167 us | -14.98% | FAIL | | F64 | I64 | false | 2^28 | 0 | 1.606 ms | 0.50% | 1.300 ms | 0.72% | -306.354 us | -19.07% | FAIL | | F64 | U64 | false | 2^16 | 1 | 10.253 us | 1.67% | 10.718 us | 1.95% | 0.465 us | 4.54% | FAIL | | F64 | U64 | false | 2^20 | 1 | 19.622 us | 2.37% | 19.298 us | 3.16% | -0.324 us | -1.65% | PASS | | F64 | U64 | false | 2^24 | 1 | 182.051 us | 1.50% | 166.001 us | 1.37% | -16.050 us | -8.82% | FAIL | | F64 | U64 | false | 2^28 | 1 | 2.824 ms | 0.39% | 2.540 ms | 0.44% | -283.277 us | -10.03% | FAIL | | F64 | U64 | false | 2^16 | 0.544 | 10.166 us | 2.25% | 10.189 us | 2.93% | 0.023 us | 0.22% | PASS | | F64 | U64 | false | 2^20 | 0.544 | 18.532 us | 1.67% | 17.524 us | 1.86% | -1.007 us | -5.44% | FAIL 
| | F64 | U64 | false | 2^24 | 0.544 | 129.031 us | 1.08% | 109.922 us | 1.08% | -19.109 us | -14.81% | FAIL | | F64 | U64 | false | 2^28 | 0.544 | 1.761 ms | 0.50% | 1.541 ms | 0.59% | -220.501 us | -12.52% | FAIL | | F64 | U64 | false | 2^16 | 0 | 10.264 us | 1.84% | 9.824 us | 2.12% | -0.440 us | -4.29% | FAIL | | F64 | U64 | false | 2^20 | 0 | 18.526 us | 1.81% | 17.060 us | 2.07% | -1.466 us | -7.91% | FAIL | | F64 | U64 | false | 2^24 | 0 | 121.144 us | 0.86% | 102.514 us | 0.95% | -18.630 us | -15.38% | FAIL | | F64 | U64 | false | 2^28 | 0 | 1.606 ms | 0.50% | 1.300 ms | 0.76% | -306.756 us | -19.10% | FAIL |
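
For readers who want a concrete picture of what "bit-packing the status flag into the offset type" could look like, here is a minimal host-side sketch. It is not the prototype's code: the names (`TileStatus`, `pack`, `unpack_status`, `unpack_offset`), the choice of two status bits, and placing them in the most significant bits are all assumptions made for illustration. The point it shows is only that the status and a 62-bit offset fit into one 64-bit word.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tile states for decoupled look-back (names are illustrative).
enum class TileStatus : std::uint64_t {
  Invalid   = 0, // nothing has been published for this tile yet
  Partial   = 1, // the tile's own aggregate is available
  Inclusive = 2  // the tile's inclusive prefix is available
};

// Assumed layout: the top 2 bits hold the status, the low 62 bits hold the offset.
constexpr int kStatusBits = 2;
constexpr int kValueBits  = 64 - kStatusBits;
constexpr std::uint64_t kValueMask = (std::uint64_t{1} << kValueBits) - 1;

// Combine status and offset into one 64-bit word.
// Precondition: the offset must be representable in 62 bits.
inline std::uint64_t pack(TileStatus status, std::uint64_t offset) {
  assert(offset <= kValueMask && "offset must fit in 62 bits");
  return (static_cast<std::uint64_t>(status) << kValueBits) | offset;
}

// Extract the status from the top bits of a packed word.
constexpr TileStatus unpack_status(std::uint64_t word) {
  return static_cast<TileStatus>(word >> kValueBits);
}

// Extract the offset from the low bits of a packed word.
constexpr std::uint64_t unpack_offset(std::uint64_t word) {
  return word & kValueMask;
}

// Compile-time check of the assumed layout on a hand-built word.
static_assert(unpack_offset((std::uint64_t{2} << kValueBits) | 7) == 7,
              "low bits carry the offset");
static_assert(unpack_status((std::uint64_t{2} << kValueBits) | 7) == TileStatus::Inclusive,
              "high bits carry the status");
```

The trade-off in this sketch is that the usable offset range shrinks from 64 to 62 bits, in exchange for the status and the offset travelling together in a single 64-bit word.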

Benchmark results when using the bit-packed tile state and the same policies for 32-bit and 64-bit offsets:

Absolute performance numbers

| T{ct} | OffsetT{ct} | IsInPlace{ct} | Elements{io} | Entropy | Samples | CPU Time | Noise | GPU Time | Noise | Elem/s | GlobalMem BW | BWUtil |
|-------|-------------|---------------|------------------|---------|---------|------------|--------|------------|-------|----------|--------------|--------|
| I8 | I32 | false | 2^16 = 65536 | 1.000 | 51184x | 15.065 us | 54.36% | 9.769 us | 1.83% | 6.708G | 13.390 GB/s | 0.66% |
| I8 | I32 | false | 2^20 = 1048576 | 1.000 | 40976x | 17.464 us | 43.25% | 12.203 us | 1.94% | 85.930G | 171.525 GB/s | 8.41% |
| I8 | I32 | false | 2^24 = 16777216 | 1.000 | 9664x | 57.134 us | 10.41% | 51.801 us | 1.28% | 323.881G | 646.503 GB/s | 31.71% |
| I8 | I32 | false | 2^28 = 268435456 | 1.000 | 1057x | 702.539 us | 0.93% | 697.065 us | 0.50% | 385.094G | 768.684 GB/s | 37.70% |
| I8 | I32 | false | 2^16 = 65536 | 0.544 | 50720x | 15.030 us | 52.79% | 9.858 us | 2.09% | 6.648G | 10.237 GB/s | 0.50% |
| I8 | I32 | false | 2^20 = 1048576 | 0.544 | 41872x | 17.240 us | 44.44% | 11.946 us | 1.95% | 87.780G | 135.106 GB/s | 6.63% |
| I8 | I32 | false | 2^24 = 16777216 | 0.544 | 10336x | 53.682 us | 11.03% | 48.391 us | 1.09% | 346.702G | 533.652 GB/s | 26.17% |
| I8 | I32 | false | 2^28 = 268435456 | 0.544 | 779x | 647.722 us | 0.99% | 642.228 us | 0.50% | 417.975G | 643.284 GB/s | 31.55% |
| I8 | I32 | false | 2^16 = 65536 | 0.000 | 51968x | 14.771 us | 53.73% | 9.623 us | 2.24% | 6.810G | 6.811 GB/s | 0.33% |
| I8 | I32 | false | 2^20 = 1048576 | 0.000 | 43568x | 16.752 us | 46.12% | 11.477 us | 2.18% | 91.364G | 91.365 GB/s | 4.48% |
| I8 | I32 | false | 2^24 = 16777216 | 0.000 | 11776x | 47.908 us | 12.84% | 42.482 us | 1.12% | 394.924G | 394.924 GB/s | 19.37% |
| I8 | I32 | false | 2^28 = 268435456 | 0.000 | 1959x | 532.762 us | 1.14% | 527.411 us | 0.50% | 508.968G | 508.968 GB/s | 24.96% |
| I8 | U32 | false | 2^16 = 65536 | 1.000 | 52176x | 14.894 us | 55.49% | 9.585 us | 2.07% | 6.838G | 13.648 GB/s | 0.67% |
| I8 | U32 | false | 2^20 = 1048576 | 1.000 | 40288x | 17.594 us | 41.96% | 12.411 us | 1.76% | 84.486G | 168.643 GB/s | 8.27% |
| I8 | U32 | false | 2^24 = 16777216 | 1.000 | 9952x | 55.684 us | 10.85% | 50.274 us | 1.21% | 333.718G | 666.139 GB/s | 32.67% |
| I8 | U32 | false | 2^28 = 268435456 | 1.000 | 1108x | 683.481 us | 0.94% | 678.118 us | 0.50% | 395.854G | 790.162 GB/s | 38.75% |
| I8 | U32 | false | 2^16 = 65536 | 0.544 | 52752x | 14.818 us | 56.46% | 9.479 us | 2.23% | 6.914G | 10.647 GB/s | 0.52% |
| I8 | U32 | false | 2^20 = 1048576 | 0.544 | 41632x | 17.228 us | 43.58% | 12.014 us | 1.87% | 87.276G | 134.330 GB/s | 6.59% |
| I8 | U32 | false | 2^24 = 16777216 | 0.544 | 10432x | 53.373 us | 11.31% | 47.983 us | 1.08% | 349.650G | 538.189 GB/s | 26.39% |
| I8 | U32 | false | 2^28 = 268435456 | 0.544 | 786x | 642.074 us | 0.99% | 636.642 us | 0.49% | 421.643G | 648.929 GB/s | 31.83% |
| I8 | U32 | false | 2^16 = 65536 | 0.000 | 53392x | 14.706 us | 57.15% | 9.366 us | 2.25% | 6.998G | 6.998 GB/s | 0.34% |
| I8 | U32 | false | 2^20 = 1048576 | 0.000 | 42336x | 16.973 us | 43.92% | 11.810 us | 2.00% | 88.784G | 88.784 GB/s | 4.35% |
| I8 | U32 | false | 2^24 = 16777216 | 0.000 | 11792x | 47.837 us | 12.90% | 42.402 us | 1.13% | 395.667G | 395.667 GB/s | 19.40% |
| I8 | U32 | false | 2^28 = 268435456 | 0.000 | 1915x | 531.039 us | 1.14% | 525.691 us | 0.50% | 510.634G | 510.634 GB/s | 25.04% |
| I8 | I64 | false | 2^16 = 65536 | 1.000 | 52128x | 14.911 us | 55.55% | 9.594 us | 2.00% | 6.831G | 13.636 GB/s | 0.67% |
| I8 | I64 | false | 2^20 = 1048576 | 1.000 | 39888x | 17.734 us | 41.64% | 12.536 us | 1.70% | 83.643G | 166.959 GB/s | 8.19% |
| I8 | I64 | false | 2^24 = 16777216 | 1.000 | 7904x | 68.798 us | 8.61% | 63.370 us | 0.67% | 264.752G | 528.474 GB/s | 25.92% |
| I8 | I64 | false | 2^28 = 268435456 | 1.000 | 560x | 898.778 us | 0.68% | 893.315 us | 0.30% | 300.494G | 599.814 GB/s | 29.42% |
| I8 | I64 | false | 2^16 = 65536 | 0.544 | 52688x | 14.808 us | 56.15% | 9.493 us | 2.10% | 6.904G | 10.632 GB/s | 0.52% |
| I8 | I64 | false | 2^20 = 1048576 | 0.544 | 40784x | 17.395 us | 42.03% | 12.263 us | 2.05% | 85.510G | 131.612 GB/s | 6.45% |
| I8 | I64 | false | 2^24 = 16777216 | 0.544 | 8336x | 65.526 us | 9.09% | 60.093 us | 0.65% | 279.187G | 429.731 GB/s | 21.08% |
| I8 | I64 | false | 2^28 = 268435456 | 0.544 | 602x | 836.818 us | 0.73% | 831.383 us | 0.33% | 322.878G | 496.925 GB/s | 24.37% |
| I8 | I64 | false | 2^16 = 65536 | 0.000 | 52896x | 14.783 us | 56.49% | 9.453 us | 2.09% | 6.933G | 6.933 GB/s | 0.34% |
| I8 | I64 | false | 2^20 = 1048576 | 0.000 | 41312x | 17.349 us | 43.50% | 12.106 us | 2.36% | 86.616G | 86.616 GB/s | 4.25% |
| I8 | I64 | false | 2^24 = 16777216 | 0.000 | 9232x | 59.697 us | 10.21% | 54.191 us | 0.69% | 309.595G | 309.595 GB/s | 15.18% |
| I8 | I64 | false | 2^28 = 268435456 | 0.000 | 702x | 718.459 us | 0.91% | 712.957 us | 0.48% | 376.510G | 376.510 GB/s | 18.47% |
| I8 | U64 | false | 2^16 = 65536 | 1.000 | 50336x | 15.317 us | 54.30% | 9.934 us | 1.89% | 6.597G | 13.168 GB/s | 0.65% |
| I8 | U64 | false | 2^20 = 1048576 | 1.000 | 38576x | 18.316 us | 41.44% | 12.966 us | 2.20% | 80.871G | 161.427 GB/s | 7.92% |
| I8 | U64 | false | 2^24 = 16777216 | 1.000 | 7840x | 69.232 us | 8.44% | 63.871 us | 0.68% | 262.675G | 524.328 GB/s | 25.71% |
| I8 | U64 | false | 2^28 = 268435456 | 1.000 | 560x | 899.328 us | 0.70% | 893.751 us | 0.30% | 300.347G | 599.522 GB/s | 29.40% |
| I8 | U64 | false | 2^16 = 65536 | 0.544 | 49648x | 15.354 us | 52.63% | 10.072 us | 1.76% | 6.507G | 10.021 GB/s | 0.49% |
| I8 | U64 | false | 2^20 = 1048576 | 0.544 | 40432x | 17.732 us | 43.51% | 12.369 us | 2.51% | 84.776G | 130.483 GB/s | 6.40% |
| I8 | U64 | false | 2^24 = 16777216 | 0.544 | 8304x | 65.629 us | 8.85% | 60.320 us | 0.74% | 278.136G | 428.113 GB/s | 21.00% |
| I8 | U64 | false | 2^28 = 268435456 | 0.544 | 602x | 837.229 us | 0.77% | 831.632 us | 0.36% | 322.781G | 496.776 GB/s | 24.36% |
| I8 | U64 | false | 2^16 = 65536 | 0.000 | 50064x | 15.267 us | 52.99% | 9.990 us | 1.79% | 6.560G | 6.561 GB/s | 0.32% |
| I8 | U64 | false | 2^20 = 1048576 | 0.000 | 41872x | 17.294 us | 44.94% | 11.944 us | 2.62% | 87.792G | 87.793 GB/s | 4.31% |
| I8 | U64 | false | 2^24 = 16777216 | 0.000 | 9232x | 59.566 us | 9.98% | 54.188 us | 0.79% | 309.612G | 309.612 GB/s | 15.18% |
| I8 | U64 | false | 2^28 = 268435456 | 0.000 | 702x | 718.536 us | 0.91% | 712.963 us | 0.47% | 376.507G | 376.507 GB/s | 18.46% |
| I16 | I32 | false | 2^16 = 65536 | 1.000 | 49232x | 15.399 us | 51.80% | 10.157 us | 2.30% | 6.452G | 25.808 GB/s | 1.27% |
| I16 | I32 | false | 2^20 = 1048576 | 1.000 | 37360x | 18.755 us | 40.25% | 13.384 us | 1.43% | 78.347G | 313.387 GB/s | 15.37% |
| I16 | I32 | false | 2^24 = 16777216 | 1.000 | 7920x | 68.584 us | 9.01% | 63.248 us | 3.09% | 265.261G | 1.061 TB/s | 52.04% |
| I16 | I32 | false | 2^28 = 268435456 | 1.000 | 1248x | 881.576 us | 1.64% | 875.949 us | 1.51% | 306.451G | 1.226 TB/s | 60.12% |
| I16 | I32 | false | 2^16 = 65536 | 0.544 | 48944x | 15.448 us | 51.37% | 10.217 us | 1.76% | 6.415G | 19.821 GB/s | 0.97% |
| I16 | I32 | false | 2^20 = 1048576 | 0.544 | 37360x | 18.747 us | 40.21% | 13.384 us | 2.10% | 78.346G | 241.951 GB/s | 11.87% |
| I16 | I32 | false | 2^24 = 16777216 | 0.544 | 8992x | 60.957 us | 9.90% | 55.702 us | 2.90% | 301.197G | 930.196 GB/s | 45.62% |
| I16 | I32 | false | 2^28 = 268435456 | 0.544 | 1552x | 707.095 us | 1.72% | 701.589 us | 1.53% | 382.611G | 1.181 TB/s | 57.94% |
| I16 | I32 | false | 2^16 = 65536 | 0.000 | 50816x | 15.058 us | 53.21% | 9.842 us | 2.32% | 6.659G | 13.317 GB/s | 0.65% |
| I16 | I32 | false | 2^20 = 1048576 | 0.000 | 39888x | 17.856 us | 42.54% | 12.536 us | 1.63% | 83.648G | 167.295 GB/s | 8.20% |
| I16 | I32 | false | 2^24 = 16777216 | 0.000 | 10512x | 52.833 us | 11.29% | 47.611 us | 2.55% | 352.382G | 704.764 GB/s | 34.56% |
| I16 | I32 | false | 2^28 = 268435456 | 0.000 | 2144x | 515.842 us | 1.48% | 510.336 us | 1.00% | 525.998G | 1.052 TB/s | 51.59% |
| I16 | U32 | false | 2^16 = 65536 | 1.000 | 50144x | 15.181 us | 52.41% | 9.972 us | 2.09% | 6.572G | 26.288 GB/s | 1.29% |
| I16 | U32 | false | 2^20 = 1048576 | 1.000 | 38288x | 18.366 us | 40.67% | 13.064 us | 1.74% | 80.265G | 321.057 GB/s | 15.75% |
| I16 | U32 | false | 2^24 = 16777216 | 1.000 | 8016x | 67.737 us | 9.00% | 62.474 us | 3.10% | 268.547G | 1.074 TB/s | 52.68% |
| I16 | U32 | false | 2^28 = 268435456 | 1.000 | 1344x | 875.372 us | 1.59% | 869.823 us | 1.46% | 308.609G | 1.234 TB/s | 60.54% |
| I16 | U32 | false | 2^16 = 65536 | 0.544 | 49968x | 15.164 us | 51.70% | 10.006 us | 1.90% | 6.549G | 20.237 GB/s | 0.99% |
| I16 | U32 | false | 2^20 = 1048576 | 0.544 | 38048x | 18.440 us | 40.42% | 13.141 us | 1.89% | 79.792G | 246.418 GB/s | 12.08% |
| I16 | U32 | false | 2^24 = 16777216 | 0.544 | 9008x | 60.868 us | 9.94% | 55.592 us | 2.87% | 301.792G | 932.032 GB/s | 45.71% |
| I16 | U32 | false | 2^28 = 268435456 | 0.544 | 1536x | 703.324 us | 1.67% | 697.808 us | 1.47% | 384.684G | 1.188 TB/s | 58.26% |
| I16 | U32 | false | 2^16 = 65536 | 0.000 | 50688x | 15.070 us | 52.99% | 9.865 us | 2.19% | 6.643G | 13.287 GB/s | 0.65% |
| I16 | U32 | false | 2^20 = 1048576 | 0.000 | 39840x | 17.874 us | 42.56% | 12.550 us | 1.83% | 83.550G | 167.101 GB/s | 8.20% |
| I16 | U32 | false | 2^24 = 16777216 | 0.000 | 10544x | 52.893 us | 11.78% | 47.454 us | 2.58% | 353.547G | 707.094 GB/s | 34.68% |
| I16 | U32 | false | 2^28 = 268435456 | 0.000 | 2176x | 515.010 us | 1.47% | 509.623 us | 1.02% | 526.733G | 1.053 TB/s | 51.66% |
| I16 | I64 | false | 2^16 = 65536 | 1.000 | 50768x | 15.205 us | 54.56% | 9.849 us | 2.34% | 6.654G | 26.616 GB/s | 1.31% |
| I16 | I64 | false | 2^20 = 1048576 | 1.000 | 36976x | 18.653 us | 38.01% | 13.527 us | 1.46% | 77.518G | 310.069 GB/s | 15.21% |
| I16 | I64 | false | 2^24 = 16777216 | 1.000 | 7376x | 73.352 us | 8.48% | 67.906 us | 2.56% | 247.064G | 988.250 GB/s | 48.47% |
| I16 | I64 | false | 2^28 = 268435456 | 1.000 | 1312x | 936.132 us | 1.28% | 930.638 us | 1.13% | 288.442G | 1.154 TB/s | 56.58% |
| I16 | I64 | false | 2^16 = 65536 | 0.544 | 50976x | 15.165 us | 56.57% | 9.811 us | 2.58% | 6.680G | 20.641 GB/s | 1.01% |
| I16 | I64 | false | 2^20 = 1048576 | 0.544 | 36608x | 18.862 us | 38.22% | 13.662 us | 1.47% | 76.753G | 237.031 GB/s | 11.62% |
| I16 | I64 | false | 2^24 = 16777216 | 0.544 | 8176x | 66.627 us | 9.15% | 61.223 us | 2.24% | 274.035G | 846.310 GB/s | 41.51% |
| I16 | I64 | false | 2^28 = 268435456 | 0.544 | 1856x | 791.507 us | 1.07% | 786.068 us | 0.81% | 341.491G | 1.055 TB/s | 51.72% |
| I16 | I64 | false | 2^16 = 65536 | 0.000 | 51712x | 15.017 us | 55.46% | 9.670 us | 2.27% | 6.777G | 13.555 GB/s | 0.66% |
| I16 | I64 | false | 2^20 = 1048576 | 0.000 | 38160x | 18.264 us | 39.50% | 13.107 us | 1.56% | 80.004G | 160.009 GB/s | 7.85% |
| I16 | I64 | false | 2^24 = 16777216 | 0.000 | 9264x | 59.261 us | 9.96% | 54.009 us | 1.95% | 310.637G | 621.274 GB/s | 30.47% |
| I16 | I64 | false | 2^28 = 268435456 | 0.000 | 2048x | 641.988 us | 1.07% | 636.548 us | 0.64% | 421.705G | 843.410 GB/s | 41.36% |
| I16 | U64 | false | 2^16 = 65536 | 1.000 | 50080x | 15.108 us | 51.53% | 9.984 us | 2.32% | 6.564G | 26.257 GB/s | 1.29% |
| I16 | U64 | false | 2^20 = 1048576 | 1.000 | 37104x | 18.786 us | 39.46% | 13.481 us | 1.96% | 77.783G | 311.129 GB/s | 15.26% |
| I16 | U64 | false | 2^24 = 16777216 | 1.000 | 7344x | 73.415 us | 8.13% | 68.170 us | 2.57% | 246.108G | 984.424 GB/s | 48.28% |
| I16 | U64 | false | 2^28 = 268435456 | 1.000 | 1584x | 936.036 us | 1.29% | 930.460 us | 1.14% | 288.498G | 1.154 TB/s | 56.59% |
| I16 | U64 | false | 2^16 = 65536 | 0.544 | 49520x | 15.300 us | 51.66% | 10.099 us | 2.03% | 6.489G | 20.052 GB/s | 0.98% |
| I16 | U64 | false | 2^20 = 1048576 | 0.544 | 37568x | 18.617 us | 39.95% | 13.311 us | 1.76% | 78.773G | 243.269 GB/s | 11.93% |
| I16 | U64 | false | 2^24 = 16777216 | 0.544 | 8192x | 66.422 us | 8.97% | 61.134 us | 2.27% | 274.433G | 847.538 GB/s | 41.57% |
| I16 | U64 | false | 2^28 = 268435456 | 0.544 | 1888x | 792.087 us | 1.08% | 786.573 us | 0.82% | 341.272G | 1.054 TB/s | 51.68% |
| I16 | U64 | false | 2^16 = 65536 | 0.000 | 50208x | 15.176 us | 52.54% | 9.961 us | 1.96% | 6.579G | 13.159 GB/s | 0.65% |
| I16 | U64 | false | 2^20 = 1048576 | 0.000 | 38752x | 18.226 us | 41.31% | 12.908 us | 1.77% | 81.238G | 162.476 GB/s | 7.97% |
| I16 | U64 | false | 2^24 = 16777216 | 0.000 | 9312x | 58.988 us | 9.99% | 53.742 us | 1.94% | 312.183G | 624.367 GB/s | 30.62% |
| I16 | U64 | false | 2^28 = 268435456 | 0.000 | 2064x | 642.171 us | 1.08% | 636.639 us | 0.64% | 421.645G | 843.289 GB/s | 41.36% |
| I32 | I32 | false | 2^16 = 65536 | 1.000 | 49440x | 15.307 us | 51.55% | 10.115 us | 1.92% | 6.479G | 51.835 GB/s | 2.54% |
| I32 | I32 | false | 2^20 = 1048576 | 1.000 | 33904x | 20.032 us | 35.90% | 14.750 us | 1.55% | 71.092G | 568.733 GB/s | 27.89% |
| I32 | I32 | false | 2^24 = 16777216 | 1.000 | 5552x | 95.588 us | 6.63% | 90.144 us | 2.65% | 186.115G | 1.489 TB/s | 73.02% |
| I32 | I32 | false | 2^28 = 268435456 | 1.000 | 1472x | 1.348 ms | 1.04% | 1.342 ms | 0.95% | 199.974G | 1.600 TB/s | 78.46% |
| I32 | I32 | false | 2^16 = 65536 | 0.544 | 51168x | 15.109 us | 54.76% | 9.772 us | 2.11% | 6.707G | 41.447 GB/s | 2.03% |
| I32 | I32 | false | 2^20 = 1048576 | 0.544 | 34208x | 19.806 us | 35.62% | 14.622 us | 1.85% | 71.714G | 442.940 GB/s | 21.72% |
| I32 | I32 | false | 2^24 = 16777216 | 0.544 | 6400x | 83.717 us | 7.45% | 78.300 us | 2.72% | 214.269G | 1.323 TB/s | 64.91% |
| I32 | I32 | false | 2^28 = 268435456 | 0.544 | 1584x | 1.077 ms | 1.31% | 1.072 ms | 1.13% | 250.508G | 1.547 TB/s | 75.88% |
| I32 | I32 | false | 2^16 = 65536 | 0.000 | 52064x | 14.896 us | 55.25% | 9.606 us | 2.13% | 6.822G | 27.289 GB/s | 1.34% |
| I32 | I32 | false | 2^20 = 1048576 | 0.000 | 35616x | 19.236 us | 37.13% | 14.045 us | 1.59% | 74.660G | 298.640 GB/s | 14.65% |
| I32 | I32 | false | 2^24 = 16777216 | 0.000 | 8000x | 68.049 us | 8.99% | 62.602 us | 2.06% | 267.997G | 1.072 TB/s | 52.57% |
| I32 | I32 | false | 2^28 = 268435456 | 0.000 | 2736x | 654.486 us | 1.68% | 649.091 us | 1.46% | 413.556G | 1.654 TB/s | 81.13% |
| I32 | U32 | false | 2^16 = 65536 | 1.000 | 50832x | 15.143 us | 54.06% | 9.838 us | 2.02% | 6.662G | 53.293 GB/s | 2.61% |
| I32 | U32 | false | 2^20 = 1048576 | 1.000 | 33328x | 20.206 us | 34.76% | 15.007 us | 1.39% | 69.873G | 558.981 GB/s | 27.41% |
| I32 | U32 | false | 2^24 = 16777216 | 1.000 | 5568x | 95.474 us | 6.61% | 90.046 us | 2.68% | 186.319G | 1.491 TB/s | 73.10% |
| I32 | U32 | false | 2^28 = 268435456 | 1.000 | 1472x | 1.348 ms | 1.08% | 1.342 ms | 0.99% | 200.040G | 1.600 TB/s | 78.48% |
| I32 | U32 | false | 2^16 = 65536 | 0.544 | 51232x | 15.080 us | 54.66% | 9.761 us | 1.99% | 6.714G | 41.495 GB/s | 2.04% |
| I32 | U32 | false | 2^20 = 1048576 | 0.544 | 33840x | 19.968 us | 35.25% | 14.779 us | 1.61% | 70.949G | 438.216 GB/s | 21.49% |
| I32 | U32 | false | 2^24 = 16777216 | 0.544 | 6384x | 83.787 us | 7.47% | 78.326 us | 2.64% | 214.196G | 1.323 TB/s | 64.88% |
| I32 | U32 | false | 2^28 = 268435456 | 0.544 | 1504x | 1.076 ms | 1.23% | 1.071 ms | 1.11% | 250.753G | 1.549 TB/s | 75.95% |
| I32 | U32 | false | 2^16 = 65536 | 0.000 | 52240x | 14.848 us | 55.24% | 9.572 us | 2.19% | 6.847G | 27.387 GB/s | 1.34% |
| I32 | U32 | false | 2^20 = 1048576 | 0.000 | 35424x | 19.281 us | 36.71% | 14.117 us | 1.49% | 74.280G | 297.121 GB/s | 14.57% |
| I32 | U32 | false | 2^24 = 16777216 | 0.000 | 8000x | 68.045 us | 9.03% | 62.559 us | 2.05% | 268.181G | 1.073 TB/s | 52.61% |
| I32 | U32 | false | 2^28 = 268435456 | 0.000 | 2768x | 654.946 us | 1.70% | 649.470 us | 1.47% | 413.315G | 1.653 TB/s | 81.08% |
| I32 | I64 | false | 2^16 = 65536 | 1.000 | 48240x | 15.731 us | 51.91% | 10.366 us | 2.53% | 6.322G | 50.580 GB/s | 2.48% |
| I32 | I64 | false | 2^20 = 1048576 | 1.000 | 32224x | 20.814 us | 34.24% | 15.524 us | 1.72% | 67.547G | 540.376 GB/s | 26.50% |
| I32 | I64 | false | 2^24 = 16777216 | 1.000 | 5184x | 102.211 us | 6.18% | 96.674 us | 2.31% | 173.544G | 1.388 TB/s | 68.09% |
| I32 | I64 | false | 2^28 = 268435456 | 1.000 | 1472x | 1.413 ms | 0.91% | 1.408 ms | 0.81% | 190.691G | 1.526 TB/s | 74.82% |
| I32 | I64 | false | 2^16 = 65536 | 0.544 | 48496x | 15.707 us | 52.52% | 10.312 us | 2.71% | 6.355G | 39.275 GB/s | 1.93% |
| I32 | I64 | false | 2^20 = 1048576 | 0.544 | 32896x | 20.476 us | 34.84% | 15.202 us | 1.48% | 68.975G | 426.024 GB/s | 20.89% |
| I32 | I64 | false | 2^24 = 16777216 | 0.544 | 5952x | 89.600 us | 6.68% | 84.217 us | 1.90% | 199.215G | 1.230 TB/s | 60.35% |
| I32 | I64 | false | 2^28 = 268435456 | 0.544 | 1888x | 1.159 ms | 0.89% | 1.153 ms | 0.74% | 232.771G | 1.438 TB/s | 70.50% |
| I32 | I64 | false | 2^16 = 65536 | 0.000 | 48912x | 15.523 us | 52.03% | 10.224 us | 1.89% | 6.410G | 25.642 GB/s | 1.26% |
| I32 | I64 | false | 2^20 = 1048576 | 0.000 | 34384x | 19.880 us | 36.74% | 14.547 us | 1.67% | 72.081G | 288.326 GB/s | 14.14% |
| I32 | I64 | false | 2^24 = 16777216 | 0.000 | 7152x | 75.318 us | 7.71% | 69.995 us | 1.20% | 239.693G | 958.771 GB/s | 47.02% |
| I32 | I64 | false | 2^28 = 268435456 | 0.000 | 2816x | 812.419 us | 1.07% | 806.833 us | 0.82% | 332.702G | 1.331 TB/s | 65.27% |
| I32 | U64 | false | 2^16 = 65536 | 1.000 | 47296x | 15.846 us | 50.08% | 10.575 us | 2.56% | 6.197G | 49.578 GB/s | 2.43% |
| I32 | U64 | false | 2^20 = 1048576 | 1.000 | 32624x | 20.690 us | 35.05% | 15.333 us | 2.02% | 68.386G | 547.085 GB/s | 26.83% |
| I32 | U64 | false | 2^24 = 16777216 | 1.000 | 5184x | 102.120 us | 6.13% | 96.669 us | 2.36% | 173.553G | 1.388 TB/s | 68.09% |
| I32 | U64 | false | 2^28 = 268435456 | 1.000 | 1488x | 1.412 ms | 0.94% | 1.407 ms | 0.85% | 190.821G | 1.527 TB/s | 74.87% |
| I32 | U64 | false | 2^16 = 65536 | 0.544 | 48032x | 15.659 us | 50.63% | 10.412 us | 2.63% | 6.295G | 38.901 GB/s | 1.91% |
| I32 | U64 | false | 2^20 = 1048576 | 0.544 | 33456x | 20.343 us | 36.22% | 14.946 us | 1.54% | 70.159G | 433.335 GB/s | 21.25% |
| I32 | U64 | false | 2^24 = 16777216 | 0.544 | 5952x | 89.553 us | 6.65% | 84.192 us | 1.90% | 199.274G | 1.231 TB/s | 60.36% |
| I32 | U64 | false | 2^28 = 268435456 | 0.544 | 1920x | 1.159 ms | 0.87% | 1.153 ms | 0.71% | 232.791G | 1.438 TB/s | 70.51% |
| I32 | U64 | false | 2^16 = 65536 | 0.000 | 48592x | 15.586 us | 51.82% | 10.293 us | 2.15% | 6.367G | 25.469 GB/s | 1.25% |
| I32 | U64 | false | 2^20 = 1048576 | 0.000 | 34688x | 19.767 us | 37.20% | 14.417 us | 1.33% | 72.730G | 290.921 GB/s | 14.27% |
| I32 | U64 | false | 2^24 = 16777216 | 0.000 | 7168x | 75.270 us | 7.88% | 69.853 us | 1.22% | 240.178G | 960.712 GB/s | 47.12% |
| I32 | U64 | false | 2^28 = 268435456 | 0.000 | 2800x | 812.477 us | 1.08% | 806.894 us | 0.83% | 332.678G | 1.331 TB/s | 65.26% |
| I64 | I32 | false | 2^16 = 65536 | 1.000 | 47232x | 15.919 us | 50.53% | 10.589 us | 1.82% | 6.189G | 99.026 GB/s | 4.86% |
| I64 | I32 | false | 2^20 = 1048576 | 1.000 | 27360x | 23.601 us | 29.19% | 18.283 us | 1.24% | 57.352G | 917.628 GB/s | 45.00% |
| I64 | I32 | false | 2^24 = 16777216 | 1.000 | 3024x | 171.300 us | 3.77% | 165.891 us | 1.87% | 101.134G | 1.618 TB/s | 79.36% |
| I64 | I32 | false | 2^28 = 268435456 | 1.000 | 336x | 2.564 ms | 0.58% | 2.558 ms | 0.54% | 104.949G | 1.679 TB/s | 82.35% |
| I64 | I32 | false | 2^16 = 65536 | 0.544 | 47600x | 15.759 us | 50.21% | 10.504 us | 1.91% | 6.239G | 77.113 GB/s | 3.78% |
| I64 | I32 | false | 2^20 = 1048576 | 0.544 | 27856x | 23.273 us | 29.72% | 17.953 us | 1.76% | 58.405G | 721.478 GB/s | 35.38% |
| I64 | I32 | false | 2^24 = 16777216 | 0.544 | 3616x | 143.986 us | 4.69% | 138.601 us | 2.61% | 121.047G | 1.495 TB/s | 73.34% |
| I64 | I32 | false | 2^28 = 268435456 | 0.544 | 688x | 2.032 ms | 0.82% | 2.026 ms | 0.77% | 132.486G | 1.636 TB/s | 80.26% |
| I64 | I32 | false | 2^16 = 65536 | 0.000 | 49664x | 15.353 us | 52.70% | 10.070 us | 2.42% | 6.508G | 52.063 GB/s | 2.55% |
| I64 | I32 | false | 2^20 = 1048576 | 0.000 | 29952x | 22.041 us | 32.13% | 16.695 us | 1.62% | 62.806G | 502.451 GB/s | 24.64% |
| I64 | I32 | false | 2^24 = 16777216 | 0.000 | 5136x | 103.040 us | 5.82% | 97.655 us | 1.83% | 171.801G | 1.374 TB/s | 67.40% |
| I64 | I32 | false | 2^28 = 268435456 | 0.000 | 2400x | 1.237 ms | 1.01% | 1.231 ms | 0.90% | 218.089G | 1.745 TB/s | 85.57% |
| I64 | U32 | false | 2^16 = 65536 | 1.000 | 47888x | 15.706 us | 50.61% | 10.442 us | 2.24% | 6.276G | 100.421 GB/s | 4.92% |
| I64 | U32 | false | 2^20 = 1048576 | 1.000 | 27488x | 23.499 us | 29.30% | 18.191 us | 1.38% | 57.643G | 922.286 GB/s | 45.23% |
| I64 | U32 | false | 2^24 = 16777216 | 1.000 | 3024x | 171.172 us | 3.82% | 165.697 us | 1.90% | 101.252G | 1.620 TB/s | 79.45% |
| I64 | U32 | false | 2^28 = 268435456 | 1.000 | 196x | 2.565 ms | 0.54% | 2.559 ms | 0.49% | 104.883G | 1.678 TB/s | 82.30% |
| I64 | U32 | false | 2^16 = 65536 | 0.544 | 48336x | 15.662 us | 51.57% | 10.346 us | 1.40% | 6.334G | 78.290 GB/s | 3.84% |
| I64 | U32 | false | 2^20 = 1048576 | 0.544 | 27360x | 23.621 us | 29.35% | 18.278 us | 1.32% | 57.367G | 708.652 GB/s | 34.75% |
| I64 | U32 | false | 2^24 = 16777216 | 0.544 | 3616x | 144.139 us | 4.86% | 138.624 us | 2.62% | 121.027G | 1.495 TB/s | 73.32% |
| I64 | U32 | false | 2^28 = 268435456 | 0.544 | 752x | 2.033 ms | 0.81% | 2.027 ms | 0.76% | 132.404G | 1.635 TB/s | 80.21% |
| I64 | U32 | false | 2^16 = 65536 | 0.000 | 50768x | 15.214 us | 54.56% | 9.851 us | 2.16% | 6.653G | 53.223 GB/s | 2.61% |
| I64 | U32 | false | 2^20 = 1048576 | 0.000 | 29600x | 22.197 us | 31.52% | 16.895 us | 1.26% | 62.066G | 496.529 GB/s | 24.35% |
| I64 | U32 | false | 2^24 = 16777216 | 0.000 | 5120x | 103.209 us | 5.91% | 97.741 us | 1.86% | 171.650G | 1.373 TB/s | 67.35% |
| I64 | U32 | false | 2^28 = 268435456 | 0.000 | 2368x | 1.237 ms | 0.98% | 
1.232 ms | 0.87% | 217.952G | 1.744 TB/s | 85.51% | | I64 | I64 | false | 2^16 = 65536 | 1.000 | 47600x | 15.835 us | 50.88% | 10.505 us | 2.22% | 6.238G | 99.815 GB/s | 4.90% | | I64 | I64 | false | 2^20 = 1048576 | 1.000 | 25264x | 25.011 us | 26.71% | 19.794 us | 3.51% | 52.975G | 847.593 GB/s | 41.57% | | I64 | I64 | false | 2^24 = 16777216 | 1.000 | 2896x | 178.605 us | 3.74% | 173.111 us | 1.96% | 96.916G | 1.551 TB/s | 76.05% | | I64 | I64 | false | 2^28 = 268435456 | 1.000 | 528x | 2.664 ms | 0.58% | 2.658 ms | 0.54% | 100.976G | 1.616 TB/s | 79.23% | | I64 | I64 | false | 2^16 = 65536 | 0.544 | 46832x | 16.019 us | 50.16% | 10.678 us | 1.71% | 6.138G | 75.860 GB/s | 3.72% | | I64 | I64 | false | 2^20 = 1048576 | 0.544 | 26064x | 24.418 us | 27.43% | 19.194 us | 2.48% | 54.631G | 674.860 GB/s | 33.10% | | I64 | I64 | false | 2^24 = 16777216 | 0.544 | 3504x | 148.727 us | 4.31% | 143.226 us | 1.94% | 117.138G | 1.447 TB/s | 70.97% | | I64 | I64 | false | 2^28 = 268435456 | 0.544 | 864x | 2.128 ms | 0.60% | 2.123 ms | 0.53% | 126.465G | 1.562 TB/s | 76.61% | | I64 | I64 | false | 2^16 = 65536 | 0.000 | 50592x | 15.231 us | 54.22% | 9.884 us | 1.99% | 6.630G | 53.044 GB/s | 2.60% | | I64 | I64 | false | 2^20 = 1048576 | 0.000 | 28048x | 23.091 us | 29.68% | 17.831 us | 2.00% | 58.807G | 470.459 GB/s | 23.07% | | I64 | I64 | false | 2^24 = 16777216 | 0.000 | 4736x | 111.203 us | 5.35% | 105.704 us | 1.15% | 158.719G | 1.270 TB/s | 62.27% | | I64 | I64 | false | 2^28 = 268435456 | 0.000 | 2768x | 1.354 ms | 0.80% | 1.348 ms | 0.68% | 199.078G | 1.593 TB/s | 78.11% | | I64 | U64 | false | 2^16 = 65536 | 1.000 | 46672x | 16.067 us | 50.11% | 10.713 us | 2.14% | 6.117G | 97.876 GB/s | 4.80% | | I64 | U64 | false | 2^20 = 1048576 | 1.000 | 25232x | 25.045 us | 26.71% | 19.818 us | 3.47% | 52.910G | 846.561 GB/s | 41.52% | | I64 | U64 | false | 2^24 = 16777216 | 1.000 | 2896x | 178.154 us | 3.65% | 172.740 us | 1.87% | 97.124G | 1.554 TB/s | 76.21% | | I64 | U64 | 
false | 2^28 = 268435456 | 1.000 | 189x | 2.665 ms | 0.52% | 2.659 ms | 0.48% | 100.956G | 1.615 TB/s | 79.22% | | I64 | U64 | false | 2^16 = 65536 | 0.544 | 48672x | 15.580 us | 51.79% | 10.276 us | 1.89% | 6.378G | 78.830 GB/s | 3.87% | | I64 | U64 | false | 2^20 = 1048576 | 0.544 | 26288x | 24.210 us | 27.49% | 19.031 us | 2.50% | 55.100G | 680.646 GB/s | 33.38% | | I64 | U64 | false | 2^24 = 16777216 | 0.544 | 3504x | 148.316 us | 4.21% | 143.031 us | 1.97% | 117.298G | 1.449 TB/s | 71.06% | | I64 | U64 | false | 2^28 = 268435456 | 0.544 | 736x | 2.128 ms | 0.60% | 2.122 ms | 0.53% | 126.503G | 1.563 TB/s | 76.63% | | I64 | U64 | false | 2^16 = 65536 | 0.000 | 50544x | 15.047 us | 52.28% | 9.895 us | 1.57% | 6.623G | 52.987 GB/s | 2.60% | | I64 | U64 | false | 2^20 = 1048576 | 0.000 | 29056x | 22.499 us | 30.82% | 17.214 us | 2.00% | 60.914G | 487.315 GB/s | 23.90% | | I64 | U64 | false | 2^24 = 16777216 | 0.000 | 4752x | 110.616 us | 5.12% | 105.366 us | 1.13% | 159.228G | 1.274 TB/s | 62.47% | | I64 | U64 | false | 2^28 = 268435456 | 0.000 | 2800x | 1.354 ms | 0.79% | 1.348 ms | 0.67% | 199.130G | 1.593 TB/s | 78.13% | | I128 | I32 | false | 2^16 = 65536 | 1.000 | 43552x | 16.766 us | 46.15% | 11.483 us | 1.62% | 5.707G | 182.635 GB/s | 8.96% | | I128 | I32 | false | 2^20 = 1048576 | 1.000 | 16480x | 35.766 us | 18.07% | 30.345 us | 2.12% | 34.556G | 1.106 TB/s | 54.23% | | I128 | I32 | false | 2^24 = 16777216 | 1.000 | 1472x | 347.940 us | 2.05% | 342.597 us | 1.34% | 48.971G | 1.567 TB/s | 76.85% | | I128 | I32 | false | 2^28 = 268435456 | 1.000 | 94x | 5.356 ms | 0.35% | 5.350 ms | 0.33% | 50.175G | 1.606 TB/s | 78.74% | | I128 | I32 | false | 2^16 = 65536 | 0.544 | 43968x | 16.590 us | 46.08% | 11.374 us | 2.00% | 5.762G | 142.429 GB/s | 6.99% | | I128 | I32 | false | 2^20 = 1048576 | 0.544 | 17632x | 33.671 us | 18.95% | 28.367 us | 2.65% | 36.964G | 913.237 GB/s | 44.79% | | I128 | I32 | false | 2^24 = 16777216 | 0.544 | 1792x | 285.140 us | 2.55% | 
279.740 us | 1.46% | 59.974G | 1.482 TB/s | 72.67% | | I128 | I32 | false | 2^28 = 268435456 | 0.544 | 117x | 4.282 ms | 0.39% | 4.276 ms | 0.37% | 62.780G | 1.551 TB/s | 76.06% | | I128 | I32 | false | 2^16 = 65536 | 0.000 | 47584x | 15.734 us | 49.92% | 10.510 us | 2.00% | 6.235G | 99.765 GB/s | 4.89% | | I128 | I32 | false | 2^20 = 1048576 | 0.000 | 19408x | 31.216 us | 21.28% | 25.776 us | 2.31% | 40.681G | 650.889 GB/s | 31.92% | | I128 | I32 | false | 2^24 = 16777216 | 0.000 | 2736x | 188.607 us | 3.19% | 183.165 us | 1.13% | 91.596G | 1.466 TB/s | 71.87% | | I128 | I32 | false | 2^28 = 268435456 | 0.000 | 200x | 2.518 ms | 0.34% | 2.512 ms | 0.25% | 106.842G | 1.709 TB/s | 83.84% | | I128 | U32 | false | 2^16 = 65536 | 1.000 | 44944x | 16.398 us | 47.50% | 11.126 us | 1.95% | 5.890G | 188.483 GB/s | 9.24% | | I128 | U32 | false | 2^20 = 1048576 | 1.000 | 16480x | 35.624 us | 17.62% | 30.349 us | 2.41% | 34.551G | 1.106 TB/s | 54.22% | | I128 | U32 | false | 2^24 = 16777216 | 1.000 | 1456x | 350.054 us | 2.05% | 344.605 us | 1.30% | 48.685G | 1.558 TB/s | 76.41% | | I128 | U32 | false | 2^28 = 268435456 | 1.000 | 93x | 5.388 ms | 0.35% | 5.382 ms | 0.33% | 49.876G | 1.596 TB/s | 78.27% | | I128 | U32 | false | 2^16 = 65536 | 0.544 | 45632x | 16.250 us | 48.47% | 10.958 us | 1.83% | 5.980G | 147.836 GB/s | 7.25% | | I128 | U32 | false | 2^20 = 1048576 | 0.544 | 17568x | 33.706 us | 18.73% | 28.462 us | 2.83% | 36.841G | 910.185 GB/s | 44.64% | | I128 | U32 | false | 2^24 = 16777216 | 0.544 | 1792x | 286.572 us | 2.45% | 281.105 us | 1.48% | 59.683G | 1.475 TB/s | 72.32% | | I128 | U32 | false | 2^28 = 268435456 | 0.544 | 117x | 4.307 ms | 0.42% | 4.301 ms | 0.39% | 62.406G | 1.542 TB/s | 75.61% | | I128 | U32 | false | 2^16 = 65536 | 0.000 | 49184x | 15.471 us | 52.28% | 10.168 us | 2.22% | 6.445G | 103.125 GB/s | 5.06% | | I128 | U32 | false | 2^20 = 1048576 | 0.000 | 19376x | 31.136 us | 20.81% | 25.820 us | 2.48% | 40.611G | 649.780 GB/s | 31.87% | | I128 | 
U32 | false | 2^24 = 16777216 | 0.000 | 2720x | 189.195 us | 3.11% | 183.907 us | 1.17% | 91.227G | 1.460 TB/s | 71.58% | | I128 | U32 | false | 2^28 = 268435456 | 0.000 | 198x | 2.534 ms | 0.36% | 2.529 ms | 0.29% | 106.160G | 1.699 TB/s | 83.30% | | I128 | I64 | false | 2^16 = 65536 | 1.000 | 42896x | 16.889 us | 45.07% | 11.659 us | 1.83% | 5.621G | 179.880 GB/s | 8.82% | | I128 | I64 | false | 2^20 = 1048576 | 1.000 | 16400x | 35.957 us | 18.07% | 30.510 us | 2.46% | 34.369G | 1.100 TB/s | 53.94% | | I128 | I64 | false | 2^24 = 16777216 | 1.000 | 1440x | 352.572 us | 1.94% | 347.250 us | 1.19% | 48.314G | 1.546 TB/s | 75.82% | | I128 | I64 | false | 2^28 = 268435456 | 1.000 | 93x | 5.433 ms | 0.35% | 5.427 ms | 0.33% | 49.467G | 1.583 TB/s | 77.63% | | I128 | I64 | false | 2^16 = 65536 | 0.544 | 43888x | 16.532 us | 45.28% | 11.396 us | 1.69% | 5.751G | 142.163 GB/s | 6.97% | | I128 | I64 | false | 2^20 = 1048576 | 0.544 | 17536x | 33.863 us | 18.96% | 28.534 us | 2.93% | 36.749G | 907.914 GB/s | 44.53% | | I128 | I64 | false | 2^24 = 16777216 | 0.544 | 1760x | 289.630 us | 2.31% | 284.354 us | 1.37% | 59.001G | 1.458 TB/s | 71.49% | | I128 | I64 | false | 2^28 = 268435456 | 0.544 | 115x | 4.365 ms | 0.38% | 4.359 ms | 0.36% | 61.587G | 1.521 TB/s | 74.62% | | I128 | I64 | false | 2^16 = 65536 | 0.000 | 46512x | 15.964 us | 48.77% | 10.750 us | 1.81% | 6.096G | 97.540 GB/s | 4.78% | | I128 | I64 | false | 2^20 = 1048576 | 0.000 | 19008x | 31.831 us | 21.11% | 26.315 us | 2.23% | 39.846G | 637.542 GB/s | 31.27% | | I128 | I64 | false | 2^24 = 16777216 | 0.000 | 2656x | 194.491 us | 3.04% | 189.124 us | 1.05% | 88.710G | 1.419 TB/s | 69.61% | | I128 | I64 | false | 2^28 = 268435456 | 0.000 | 192x | 2.614 ms | 0.34% | 2.608 ms | 0.25% | 102.936G | 1.647 TB/s | 80.77% | | I128 | U64 | false | 2^16 = 65536 | 1.000 | 42368x | 17.042 us | 44.56% | 11.802 us | 2.13% | 5.553G | 177.700 GB/s | 8.71% | | I128 | U64 | false | 2^20 = 1048576 | 1.000 | 16224x | 36.373 us | 
18.22% | 30.824 us | 2.41% | 34.018G | 1.089 TB/s | 53.39% | | I128 | U64 | false | 2^24 = 16777216 | 1.000 | 1440x | 352.978 us | 1.98% | 347.450 us | 1.18% | 48.287G | 1.545 TB/s | 75.78% | | I128 | U64 | false | 2^28 = 268435456 | 1.000 | 93x | 5.437 ms | 0.36% | 5.431 ms | 0.34% | 49.430G | 1.582 TB/s | 77.57% | | I128 | U64 | false | 2^16 = 65536 | 0.544 | 42672x | 16.944 us | 44.81% | 11.719 us | 2.37% | 5.592G | 138.235 GB/s | 6.78% | | I128 | U64 | false | 2^20 = 1048576 | 0.544 | 17296x | 34.416 us | 19.29% | 28.910 us | 2.92% | 36.270G | 896.094 GB/s | 43.95% | | I128 | U64 | false | 2^24 = 16777216 | 0.544 | 1760x | 289.880 us | 2.30% | 284.517 us | 1.31% | 58.967G | 1.457 TB/s | 71.45% | | I128 | U64 | false | 2^28 = 268435456 | 0.544 | 115x | 4.366 ms | 0.36% | 4.360 ms | 0.34% | 61.573G | 1.521 TB/s | 74.60% | | I128 | U64 | false | 2^16 = 65536 | 0.000 | 45504x | 16.273 us | 48.24% | 10.990 us | 1.80% | 5.963G | 95.409 GB/s | 4.68% | | I128 | U64 | false | 2^20 = 1048576 | 0.000 | 18912x | 31.984 us | 21.12% | 26.448 us | 2.33% | 39.646G | 634.343 GB/s | 31.11% | | I128 | U64 | false | 2^24 = 16777216 | 0.000 | 2656x | 194.447 us | 3.03% | 189.099 us | 1.05% | 88.722G | 1.420 TB/s | 69.62% | | I128 | U64 | false | 2^28 = 268435456 | 0.000 | 192x | 2.613 ms | 0.33% | 2.607 ms | 0.24% | 102.969G | 1.648 TB/s | 80.80% | | F32 | I32 | false | 2^16 = 65536 | 1.000 | 47920x | 15.629 us | 50.01% | 10.435 us | 2.63% | 6.281G | 50.245 GB/s | 2.46% | | F32 | I32 | false | 2^20 = 1048576 | 1.000 | 33392x | 20.292 us | 35.63% | 14.974 us | 1.38% | 70.026G | 560.210 GB/s | 27.47% | | F32 | I32 | false | 2^24 = 16777216 | 1.000 | 5536x | 95.807 us | 6.52% | 90.477 us | 2.68% | 185.431G | 1.483 TB/s | 72.75% | | F32 | I32 | false | 2^28 = 268435456 | 1.000 | 1456x | 1.371 ms | 1.03% | 1.365 ms | 0.95% | 196.588G | 1.573 TB/s | 77.13% | | F32 | I32 | false | 2^16 = 65536 | 0.544 | 49808x | 15.210 us | 51.71% | 10.039 us | 1.72% | 6.528G | 28.449 GB/s | 1.40% | | F32 
| I32 | false | 2^20 = 1048576 | 0.544 | 35680x | 19.371 us | 38.30% | 14.019 us | 1.94% | 74.796G | 325.577 GB/s | 15.97% | | F32 | I32 | false | 2^24 = 16777216 | 0.544 | 7824x | 69.348 us | 8.66% | 63.994 us | 2.12% | 262.170G | 1.141 TB/s | 55.96% | | F32 | I32 | false | 2^28 = 268435456 | 0.544 | 2096x | 777.676 us | 1.28% | 772.284 us | 1.07% | 347.586G | 1.513 TB/s | 74.19% | | F32 | I32 | false | 2^16 = 65536 | 0.000 | 52496x | 14.849 us | 56.02% | 9.527 us | 2.45% | 6.879G | 27.515 GB/s | 1.35% | | F32 | I32 | false | 2^20 = 1048576 | 0.000 | 35616x | 19.170 us | 36.70% | 14.043 us | 1.80% | 74.669G | 298.676 GB/s | 14.65% | | F32 | I32 | false | 2^24 = 16777216 | 0.000 | 8032x | 67.702 us | 8.93% | 62.305 us | 2.05% | 269.277G | 1.077 TB/s | 52.82% | | F32 | I32 | false | 2^28 = 268435456 | 0.000 | 2752x | 654.680 us | 1.66% | 649.292 us | 1.43% | 413.428G | 1.654 TB/s | 81.10% | | F32 | U32 | false | 2^16 = 65536 | 1.000 | 50832x | 15.139 us | 54.00% | 9.838 us | 1.89% | 6.662G | 53.294 GB/s | 2.61% | | F32 | U32 | false | 2^20 = 1048576 | 1.000 | 33552x | 20.079 us | 34.85% | 14.905 us | 1.23% | 70.350G | 562.798 GB/s | 27.60% | | F32 | U32 | false | 2^24 = 16777216 | 1.000 | 5568x | 95.370 us | 6.63% | 89.919 us | 2.64% | 186.581G | 1.493 TB/s | 73.20% | | F32 | U32 | false | 2^28 = 268435456 | 1.000 | 1408x | 1.368 ms | 1.07% | 1.362 ms | 0.99% | 197.044G | 1.576 TB/s | 77.31% | | F32 | U32 | false | 2^16 = 65536 | 0.544 | 52848x | 14.787 us | 56.42% | 9.462 us | 2.07% | 6.927G | 30.187 GB/s | 1.48% | | F32 | U32 | false | 2^20 = 1048576 | 0.544 | 35536x | 19.226 us | 36.75% | 14.074 us | 1.64% | 74.502G | 324.300 GB/s | 15.90% | | F32 | U32 | false | 2^24 = 16777216 | 0.544 | 7824x | 69.443 us | 8.76% | 64.029 us | 2.17% | 262.026G | 1.140 TB/s | 55.93% | | F32 | U32 | false | 2^28 = 268435456 | 0.544 | 2176x | 776.399 us | 1.26% | 770.927 us | 1.04% | 348.198G | 1.515 TB/s | 74.32% | | F32 | U32 | false | 2^16 = 65536 | 0.000 | 52768x | 14.809 us | 
56.44% | 9.478 us | 2.12% | 6.915G | 27.659 GB/s | 1.36% | | F32 | U32 | false | 2^20 = 1048576 | 0.000 | 35504x | 19.232 us | 36.65% | 14.087 us | 1.26% | 74.435G | 297.742 GB/s | 14.60% | | F32 | U32 | false | 2^24 = 16777216 | 0.000 | 8032x | 67.726 us | 8.99% | 62.291 us | 2.08% | 269.334G | 1.077 TB/s | 52.84% | | F32 | U32 | false | 2^28 = 268435456 | 0.000 | 2736x | 654.988 us | 1.68% | 649.568 us | 1.45% | 413.252G | 1.653 TB/s | 81.07% | | F32 | I64 | false | 2^16 = 65536 | 1.000 | 50224x | 15.254 us | 53.32% | 9.956 us | 1.91% | 6.583G | 52.662 GB/s | 2.58% | | F32 | I64 | false | 2^20 = 1048576 | 1.000 | 32672x | 20.549 us | 34.37% | 15.310 us | 1.47% | 68.489G | 547.916 GB/s | 26.87% | | F32 | I64 | false | 2^24 = 16777216 | 1.000 | 5216x | 101.521 us | 6.19% | 96.061 us | 2.38% | 174.652G | 1.397 TB/s | 68.52% | | F32 | I64 | false | 2^28 = 268435456 | 1.000 | 1584x | 1.438 ms | 0.91% | 1.433 ms | 0.82% | 187.368G | 1.499 TB/s | 73.51% | | F32 | I64 | false | 2^16 = 65536 | 0.544 | 50160x | 15.338 us | 54.13% | 9.969 us | 2.21% | 6.574G | 28.650 GB/s | 1.41% | | F32 | I64 | false | 2^20 = 1048576 | 0.544 | 34368x | 19.823 us | 36.30% | 14.554 us | 1.39% | 72.045G | 313.605 GB/s | 15.38% | | F32 | I64 | false | 2^24 = 16777216 | 0.544 | 7008x | 77.010 us | 7.84% | 71.485 us | 1.27% | 234.696G | 1.021 TB/s | 50.10% | | F32 | I64 | false | 2^28 = 268435456 | 0.544 | 2672x | 884.985 us | 0.94% | 879.428 us | 0.69% | 305.239G | 1.328 TB/s | 65.15% | | F32 | I64 | false | 2^16 = 65536 | 0.000 | 50944x | 15.133 us | 54.27% | 9.817 us | 2.41% | 6.676G | 26.704 GB/s | 1.31% | | F32 | I64 | false | 2^20 = 1048576 | 0.000 | 34176x | 19.954 us | 36.52% | 14.632 us | 1.54% | 71.665G | 286.660 GB/s | 14.06% | | F32 | I64 | false | 2^24 = 16777216 | 0.000 | 7168x | 75.213 us | 7.87% | 69.812 us | 1.23% | 240.322G | 961.287 GB/s | 47.14% | | F32 | I64 | false | 2^28 = 268435456 | 0.000 | 2816x | 810.050 us | 1.10% | 804.417 us | 0.85% | 333.702G | 1.335 TB/s | 65.46% 
| | F32 | U64 | false | 2^16 = 65536 | 1.000 | 47632x | 15.787 us | 50.50% | 10.498 us | 1.72% | 6.243G | 49.942 GB/s | 2.45% | | F32 | U64 | false | 2^20 = 1048576 | 1.000 | 32336x | 20.773 us | 34.42% | 15.465 us | 1.69% | 67.803G | 542.421 GB/s | 26.60% | | F32 | U64 | false | 2^24 = 16777216 | 1.000 | 5184x | 102.031 us | 6.13% | 96.570 us | 2.34% | 173.731G | 1.390 TB/s | 68.16% | | F32 | U64 | false | 2^28 = 268435456 | 1.000 | 1456x | 1.439 ms | 0.91% | 1.433 ms | 0.82% | 187.261G | 1.498 TB/s | 73.47% | | F32 | U64 | false | 2^16 = 65536 | 0.544 | 49952x | 15.340 us | 53.44% | 10.012 us | 2.21% | 6.546G | 28.527 GB/s | 1.40% | | F32 | U64 | false | 2^20 = 1048576 | 0.544 | 34848x | 19.672 us | 37.18% | 14.350 us | 1.30% | 73.071G | 318.070 GB/s | 15.60% | | F32 | U64 | false | 2^24 = 16777216 | 0.544 | 7008x | 76.756 us | 7.68% | 71.381 us | 1.34% | 235.039G | 1.023 TB/s | 50.17% | | F32 | U64 | false | 2^28 = 268435456 | 0.544 | 2672x | 884.872 us | 0.94% | 879.305 us | 0.70% | 305.281G | 1.329 TB/s | 65.16% | | F32 | U64 | false | 2^16 = 65536 | 0.000 | 49600x | 15.328 us | 52.25% | 10.081 us | 2.76% | 6.501G | 26.004 GB/s | 1.28% | | F32 | U64 | false | 2^20 = 1048576 | 0.000 | 34736x | 19.732 us | 37.19% | 14.397 us | 1.33% | 72.834G | 291.338 GB/s | 14.29% | | F32 | U64 | false | 2^24 = 16777216 | 0.000 | 7184x | 75.125 us | 7.83% | 69.746 us | 1.19% | 240.548G | 962.194 GB/s | 47.19% | | F32 | U64 | false | 2^28 = 268435456 | 0.000 | 2816x | 809.681 us | 1.09% | 804.095 us | 0.83% | 333.836G | 1.335 TB/s | 65.49% | | F64 | I32 | false | 2^16 = 65536 | 1.000 | 47728x | 15.727 us | 50.31% | 10.476 us | 2.23% | 6.256G | 100.093 GB/s | 4.91% | | F64 | I32 | false | 2^20 = 1048576 | 1.000 | 27872x | 23.222 us | 29.47% | 17.947 us | 1.49% | 58.427G | 934.833 GB/s | 45.85% | | F64 | I32 | false | 2^24 = 16777216 | 1.000 | 3040x | 170.727 us | 3.82% | 165.272 us | 1.92% | 101.513G | 1.624 TB/s | 79.66% | | F64 | I32 | false | 2^28 = 268435456 | 1.000 | 528x | 
2.561 ms | 0.60% | 2.555 ms | 0.55% | 105.063G | 1.681 TB/s | 82.44% | | F64 | I32 | false | 2^16 = 65536 | 0.544 | 51184x | 15.021 us | 53.92% | 9.770 us | 1.98% | 6.708G | 58.467 GB/s | 2.87% | | F64 | I32 | false | 2^20 = 1048576 | 0.544 | 29920x | 22.005 us | 31.68% | 16.719 us | 1.29% | 62.718G | 546.005 GB/s | 26.78% | | F64 | I32 | false | 2^24 = 16777216 | 0.544 | 4672x | 112.662 us | 5.52% | 107.361 us | 2.43% | 156.269G | 1.360 TB/s | 66.71% | | F64 | I32 | false | 2^28 = 268435456 | 0.544 | 1792x | 1.470 ms | 0.94% | 1.465 ms | 0.86% | 183.252G | 1.595 TB/s | 78.23% | | F64 | I32 | false | 2^16 = 65536 | 0.000 | 52032x | 14.834 us | 54.56% | 9.610 us | 1.71% | 6.819G | 54.555 GB/s | 2.68% | | F64 | I32 | false | 2^20 = 1048576 | 0.000 | 30784x | 21.517 us | 32.52% | 16.246 us | 1.41% | 64.544G | 516.354 GB/s | 25.32% | | F64 | I32 | false | 2^24 = 16777216 | 0.000 | 5152x | 102.510 us | 5.78% | 97.188 us | 1.82% | 172.627G | 1.381 TB/s | 67.73% | | F64 | I32 | false | 2^28 = 268435456 | 0.000 | 2496x | 1.233 ms | 1.00% | 1.228 ms | 0.89% | 218.611G | 1.749 TB/s | 85.77% | | F64 | U32 | false | 2^16 = 65536 | 1.000 | 49072x | 15.391 us | 51.25% | 10.191 us | 2.14% | 6.431G | 102.892 GB/s | 5.05% | | F64 | U32 | false | 2^20 = 1048576 | 1.000 | 27888x | 23.206 us | 29.52% | 17.930 us | 1.38% | 58.483G | 935.726 GB/s | 45.89% | | F64 | U32 | false | 2^24 = 16777216 | 1.000 | 3040x | 170.641 us | 3.83% | 165.186 us | 1.94% | 101.565G | 1.625 TB/s | 79.70% | | F64 | U32 | false | 2^28 = 268435456 | 1.000 | 544x | 2.559 ms | 0.60% | 2.554 ms | 0.56% | 105.116G | 1.682 TB/s | 82.48% | | F64 | U32 | false | 2^16 = 65536 | 0.544 | 52768x | 14.747 us | 55.71% | 9.476 us | 2.14% | 6.916G | 60.283 GB/s | 2.96% | | F64 | U32 | false | 2^20 = 1048576 | 0.544 | 29808x | 21.977 us | 31.10% | 16.779 us | 1.38% | 62.493G | 544.049 GB/s | 26.68% | | F64 | U32 | false | 2^24 = 16777216 | 0.544 | 4672x | 112.515 us | 5.64% | 107.064 us | 2.40% | 156.702G | 1.364 TB/s | 
66.90% | | F64 | U32 | false | 2^28 = 268435456 | 0.544 | 1792x | 1.466 ms | 0.92% | 1.460 ms | 0.83% | 183.826G | 1.600 TB/s | 78.47% | | F64 | U32 | false | 2^16 = 65536 | 0.000 | 53296x | 14.694 us | 56.71% | 9.384 us | 1.92% | 6.984G | 55.870 GB/s | 2.74% | | F64 | U32 | false | 2^20 = 1048576 | 0.000 | 30288x | 21.774 us | 32.02% | 16.508 us | 1.39% | 63.518G | 508.147 GB/s | 24.92% | | F64 | U32 | false | 2^24 = 16777216 | 0.000 | 5152x | 102.568 us | 5.93% | 97.100 us | 1.84% | 172.783G | 1.382 TB/s | 67.79% | | F64 | U32 | false | 2^28 = 268435456 | 0.000 | 2528x | 1.234 ms | 1.00% | 1.228 ms | 0.89% | 218.564G | 1.749 TB/s | 85.75% | | F64 | I64 | false | 2^16 = 65536 | 1.000 | 48800x | 15.513 us | 51.48% | 10.247 us | 1.99% | 6.395G | 102.326 GB/s | 5.02% | | F64 | I64 | false | 2^20 = 1048576 | 1.000 | 25680x | 24.698 us | 27.18% | 19.471 us | 3.73% | 53.852G | 861.640 GB/s | 42.26% | | F64 | I64 | false | 2^24 = 16777216 | 1.000 | 2912x | 177.894 us | 3.69% | 172.439 us | 1.90% | 97.294G | 1.557 TB/s | 76.34% | | F64 | I64 | false | 2^28 = 268435456 | 1.000 | 189x | 2.657 ms | 0.53% | 2.651 ms | 0.49% | 101.257G | 1.620 TB/s | 79.45% | | F64 | I64 | false | 2^16 = 65536 | 0.544 | 51680x | 14.948 us | 54.58% | 9.676 us | 1.96% | 6.773G | 59.034 GB/s | 2.90% | | F64 | I64 | false | 2^20 = 1048576 | 0.544 | 28656x | 22.619 us | 29.75% | 17.453 us | 2.01% | 60.081G | 523.048 GB/s | 25.65% | | F64 | I64 | false | 2^24 = 16777216 | 0.544 | 4480x | 117.080 us | 4.90% | 111.841 us | 1.38% | 150.010G | 1.306 TB/s | 64.04% | | F64 | I64 | false | 2^28 = 268435456 | 0.544 | 2160x | 1.559 ms | 0.71% | 1.553 ms | 0.61% | 172.862G | 1.505 TB/s | 73.79% | | F64 | I64 | false | 2^16 = 65536 | 0.000 | 50784x | 14.999 us | 52.49% | 9.848 us | 1.57% | 6.655G | 53.239 GB/s | 2.61% | | F64 | I64 | false | 2^20 = 1048576 | 0.000 | 29536x | 22.216 us | 31.34% | 16.931 us | 2.16% | 61.931G | 495.447 GB/s | 24.30% | | F64 | I64 | false | 2^24 = 16777216 | 0.000 | 4816x | 
109.129 us | 5.25% | 103.834 us | 1.18% | 161.577G | 1.293 TB/s | 63.39% | | F64 | I64 | false | 2^28 = 268435456 | 0.000 | 2816x | 1.323 ms | 0.86% | 1.317 ms | 0.75% | 203.764G | 1.630 TB/s | 79.95% | | F64 | U64 | false | 2^16 = 65536 | 1.000 | 48032x | 15.635 us | 50.35% | 10.410 us | 1.77% | 6.295G | 100.726 GB/s | 4.94% | | F64 | U64 | false | 2^20 = 1048576 | 1.000 | 25904x | 24.618 us | 27.93% | 19.302 us | 4.32% | 54.324G | 869.179 GB/s | 42.63% | | F64 | U64 | false | 2^24 = 16777216 | 1.000 | 2912x | 177.789 us | 3.65% | 172.509 us | 1.97% | 97.254G | 1.556 TB/s | 76.31% | | F64 | U64 | false | 2^28 = 268435456 | 1.000 | 448x | 2.657 ms | 0.57% | 2.651 ms | 0.52% | 101.266G | 1.620 TB/s | 79.46% | | F64 | U64 | false | 2^16 = 65536 | 0.544 | 50256x | 15.124 us | 52.16% | 9.951 us | 2.11% | 6.586G | 57.402 GB/s | 2.82% | | F64 | U64 | false | 2^20 = 1048576 | 0.544 | 28976x | 22.534 us | 30.68% | 17.263 us | 2.16% | 60.741G | 528.798 GB/s | 25.93% | | F64 | U64 | false | 2^24 = 16777216 | 0.544 | 4480x | 117.176 us | 4.94% | 111.894 us | 1.40% | 149.939G | 1.305 TB/s | 64.01% | | F64 | U64 | false | 2^28 = 268435456 | 0.544 | 2128x | 1.559 ms | 0.70% | 1.553 ms | 0.60% | 172.839G | 1.504 TB/s | 73.78% | | F64 | U64 | false | 2^16 = 65536 | 0.000 | 49232x | 15.469 us | 52.53% | 10.157 us | 2.57% | 6.452G | 51.617 GB/s | 2.53% | | F64 | U64 | false | 2^20 = 1048576 | 0.000 | 29072x | 22.555 us | 31.29% | 17.199 us | 2.11% | 60.968G | 487.742 GB/s | 23.92% | | F64 | U64 | false | 2^24 = 16777216 | 0.000 | 4816x | 109.690 us | 5.45% | 104.150 us | 1.15% | 161.087G | 1.289 TB/s | 63.20% | | F64 | U64 | false | 2^28 = 268435456 | 0.000 | 2752x | 1.323 ms | 0.86% | 1.318 ms | 0.74% | 203.724G | 1.630 TB/s | 79.93% |
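
For reading the relative comparison that follows: the `Diff`, `%Diff`, and `Status` columns appear to be derived from the reference and comparison GPU times and their noise. The sketch below is a reconstruction inferred from the rows in this thread, not code taken from the benchmark tooling. The function name `compare` and the exact threshold rule (a row passes when the absolute relative difference does not exceed the smaller of the two noise values) are assumptions.

```python
def compare(ref_time, ref_noise, cmp_time, cmp_noise):
    """Derive (diff, pct_diff, status) for one benchmark row.

    ref_time / cmp_time: GPU times in the same unit (e.g. microseconds).
    ref_noise / cmp_noise: relative noise in percent.
    The PASS/FAIL rule is an assumption inferred from the table rows.
    """
    diff = cmp_time - ref_time            # positive means the compared build is slower
    pct_diff = 100.0 * diff / ref_time    # difference relative to the reference time
    min_noise = min(ref_noise, cmp_noise) # stricter of the two noise levels
    status = "PASS" if abs(pct_diff) <= min_noise else "FAIL"
    return diff, pct_diff, status


if __name__ == "__main__":
    # Two rows copied from the relative table: (ref time, ref noise, cmp time, cmp noise)
    rows = {
        "I8/I32/2^16/entropy 1": (9.761, 2.25, 9.769, 1.83),
        "I8/I64/2^24/entropy 1": (62.936, 0.57, 63.370, 0.67),
    }
    for name, args in rows.items():
        diff, pct, status = compare(*args)
        print(f"{name}: {diff:+.3f} us, {pct:+.2f}%, {status}")
```

Under this reading, a `FAIL` only means the measured difference is larger than the measurement noise. It can mark a speedup (negative `%Diff`) as well as a slowdown.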
Performance numbers relative to baseline but using same policies for 32 and 64 bit offsets

| T{ct} | OffsetT{ct} | IsInPlace{ct} | Elements{io} | Entropy | Ref Time | Ref Noise | Cmp Time | Cmp Noise | Diff | %Diff | Status |
|---------|---------------|-----------------|----------------|-----------|------------|-------------|------------|-------------|------------|---------|----------|
| I8 | I32 | false | 2^16 | 1 | 9.761 us | 2.25% | 9.769 us | 1.83% | 0.008 us | 0.08% | PASS |
| I8 | I32 | false | 2^20 | 1 | 12.308 us | 1.90% | 12.203 us | 1.94% | -0.105 us | -0.85% | PASS |
| I8 | I32 | false | 2^24 | 1 | 51.803 us | 1.28% | 51.801 us | 1.28% | -0.003 us | -0.01% | PASS |
| I8 | I32 | false | 2^28 | 1 | 697.287 us | 0.50% | 697.065 us | 0.50% | -0.222 us | -0.03% | PASS |
| I8 | I32 | false | 2^16 | 0.544 | 9.809 us | 2.29% | 9.858 us | 2.09% | 0.049 us | 0.50% | PASS |
| I8 | I32 | false | 2^20 | 0.544 | 11.902 us | 2.06% | 11.946 us | 1.95% | 0.044 us | 0.37% | PASS |
| I8 | I32 | false | 2^24 | 0.544 | 48.468 us | 1.10% | 48.391 us | 1.09% | -0.078 us | -0.16% | PASS |
| I8 | I32 | false | 2^28 | 0.544 | 642.356 us | 0.50% | 642.228 us | 0.50% | -0.127 us | -0.02% | PASS |
| I8 | I32 | false | 2^16 | 0 | 9.599 us | 2.23% | 9.623 us | 2.24% | 0.024 us | 0.25% | PASS |
| I8 | I32 | false | 2^20 | 0 | 11.521 us | 2.11% | 11.477 us | 2.18% | -0.044 us | -0.39% | PASS |
| I8 | I32 | false | 2^24 | 0 | 42.429 us | 1.14% | 42.482 us | 1.12% | 0.053 us | 0.13% | PASS |
| I8 | I32 | false | 2^28 | 0 | 527.489 us | 0.50% | 527.411 us | 0.50% | -0.078 us | -0.01% | PASS |
| I8 | U32 | false | 2^16 | 1 | 9.574 us | 2.14% | 9.585 us | 2.07% | 0.011 us | 0.11% | PASS |
| I8 | U32 | false | 2^20 | 1 | 12.235 us | 1.98% | 12.411 us | 1.76% | 0.176 us | 1.44% | PASS |
| I8 | U32 | false | 2^24 | 1 | 50.175 us | 1.23% | 50.274 us | 1.21% | 0.099 us | 0.20% | PASS |
| I8 | U32 | false | 2^28 | 1 | 678.384 us | 0.50% | 678.118 us | 0.50% | -0.266 us | -0.04% | PASS |
| I8 | U32 | false | 2^16 | 0.544 | 9.572 us | 2.30% | 9.479 us | 2.23% | -0.093 us | -0.97% | PASS |
| I8 | U32 | false | 2^20 | 0.544 | 12.038 us | 2.06% | 12.014 us | 1.87% | -0.023 us | -0.20% | PASS |
| I8 | U32 | false | 2^24 | 0.544 | 48.022 us | 1.09% | 47.983 us | 1.08% | -0.039 us | -0.08% | PASS |
| I8 | U32 | false | 2^28 | 0.544 | 636.531 us | 0.50% | 636.642 us | 0.49% | 0.111 us | 0.02% | PASS |
| I8 | U32 | false | 2^16 | 0 | 9.371 us | 2.29% | 9.366 us | 2.25% | -0.006 us | -0.06% | PASS |
| I8 | U32 | false | 2^20 | 0 | 11.773 us | 1.80% | 11.810 us | 2.00% | 0.037 us | 0.32% | PASS |
| I8 | U32 | false | 2^24 | 0 | 42.373 us | 1.14% | 42.402 us | 1.13% | 0.029 us | 0.07% | PASS |
| I8 | U32 | false | 2^28 | 0 | 525.709 us | 0.50% | 525.691 us | 0.50% | -0.018 us | -0.00% | PASS |
| I8 | I64 | false | 2^16 | 1 | 9.433 us | 2.32% | 9.594 us | 2.00% | 0.161 us | 1.70% | PASS |
| I8 | I64 | false | 2^20 | 1 | 12.202 us | 1.58% | 12.536 us | 1.70% | 0.335 us | 2.74% | FAIL |
| I8 | I64 | false | 2^24 | 1 | 62.936 us | 0.57% | 63.370 us | 0.67% | 0.434 us | 0.69% | FAIL |
| I8 | I64 | false | 2^28 | 1 | 880.366 us | 0.20% | 893.315 us | 0.30% | 12.949 us | 1.47% | FAIL |
| I8 | I64 | false | 2^16 | 0.544 | 9.363 us | 2.02% | 9.493 us | 2.10% | 0.130 us | 1.39% | PASS |
| I8 | I64 | false | 2^20 | 0.544 | 11.916 us | 1.92% | 12.263 us | 2.05% | 0.347 us | 2.91% | FAIL |
| I8 | I64 | false | 2^24 | 0.544 | 59.705 us | 0.60% | 60.093 us | 0.65% | 0.388 us | 0.65% | FAIL |
| I8 | I64 | false | 2^28 | 0.544 | 816.964 us | 0.28% | 831.383 us | 0.33% | 14.419 us | 1.76% | FAIL |
| I8 | I64 | false | 2^16 | 0 | 9.282 us | 1.87% | 9.453 us | 2.09% | 0.172 us | 1.85% | PASS |
| I8 | I64 | false | 2^20 | 0 | 11.927 us | 2.54% | 12.106 us | 2.36% | 0.180 us | 1.51% | PASS |
| I8 | I64 | false | 2^24 | 0 | 54.134 us | 0.64% | 54.191 us | 0.69% | 0.057 us | 0.11% | PASS |
| I8 | I64 | false | 2^28 | 0 | 711.734 us | 0.38% | 712.957 us | 0.48% | 1.223 us | 0.17% | PASS |
| I8 | U64 | false | 2^16 | 1 | 9.764 us | 1.71% | 9.934 us | 1.89% | 0.171 us | 1.75% | FAIL |
| I8 | U64 | false | 2^20 | 1 | 12.488 us | 1.98% | 12.966 us | 2.20% | 0.478 us | 3.83% | FAIL |
| I8 | U64 | false | 2^24 | 1 | 63.348 us | 0.56% | 63.871 us | 0.68% | 0.523 us | 0.82% | FAIL |
| I8 | U64 | false | 2^28 | 1 | 880.735 us | 0.21% | 893.751 us | 0.30% | 13.016 us | 1.48% | FAIL |
| I8 | U64 | false | 2^16 | 0.544 | 9.849 us | 1.60% | 10.072 us | 1.76% | 0.223 us | 2.26% | FAIL |
| I8 | U64 | false | 2^20 | 0.544 | 12.031 us | 2.27% | 12.369 us | 2.51% | 0.338 us | 2.81% | FAIL |
| I8 | U64 | false | 2^24 | 0.544 | 59.988 us | 0.65% | 60.320 us | 0.74% | 0.332 us | 0.55% | PASS |
| I8 | U64 | false | 2^28 | 0.544 | 817.292 us | 0.27% | 831.632 us | 0.36% | 14.340 us | 1.75% | FAIL |
| I8 | U64 | false | 2^16 | 0 | 9.840 us | 1.80% | 9.990 us | 1.79% | 0.150 us | 1.52% | PASS |
| I8 | U64 | false | 2^20 | 0 | 11.743 us | 2.58% | 11.944 us | 2.62% | 0.201 us | 1.71% | PASS |
| I8 | U64 | false | 2^24 | 0 | 54.085 us | 0.74% | 54.188 us | 0.79% | 0.103 us | 0.19% | PASS |
| I8 | U64 | false | 2^28 | 0 | 711.693 us | 0.38% | 712.963 us | 0.47% | 1.270 us | 0.18% | PASS |
| I16 | I32 | false | 2^16 | 1 | 10.225 us | 2.05% | 10.157 us | 2.30% | -0.068 us | -0.66% | PASS |
| I16 | I32 | false | 2^20 | 1 | 13.317 us | 1.82% | 13.384 us | 1.43% | 0.067 us | 0.50% | PASS |
| I16 | I32 | false | 2^24 | 1 | 63.162 us | 3.07% | 63.248 us | 3.09% | 0.086 us | 0.14% | PASS |
| I16 | I32 | false | 2^28 | 1 | 876.298 us | 1.42% | 875.949 us | 1.51% | -0.349 us | -0.04% | PASS |
| I16 | I32 | false | 2^16 | 0.544 | 10.258 us | 1.66% | 10.217 us | 1.76% | -0.041 us | -0.40% | PASS |
| I16 | I32 | false | 2^20 | 0.544 | 13.444 us | 1.71% | 13.384 us | 2.10% | -0.060 us | -0.45% | PASS |
| I16 | I32 | false | 2^24 | 0.544 | 56.111 us | 2.89% | 55.702 us | 2.90% | -0.409 us | -0.73% | PASS |
| I16 | I32 | false | 2^28 | 0.544 | 701.473 us | 1.44% | 701.589 us | 1.53% | 0.116 us | 0.02% | PASS |
| I16 | I32 | false | 2^16 | 0 | 10.142 us | 2.67% | 9.842 us | 2.32% | -0.300 us | -2.96% | FAIL |
| I16 | I32 | false | 2^20 | 0 | 12.718 us | 2.01% | 12.536 us | 1.63% | -0.182 us | -1.43% | PASS |
| I16 | I32 | false | 2^24 | 0 | 47.840 us | 2.53% | 47.611 us | 2.55% | -0.230 us | -0.48% | PASS |
| I16 | I32 | false | 2^28 | 0 | 510.817 us | 0.99% | 510.336 us | 1.00% | -0.481 us | -0.09% | PASS |
| I16 | U32 | false | 2^16 | 1 | 10.097 us | 2.50% | 9.972 us | 2.09% | -0.125 us | -1.24% | PASS |
| I16 | U32 | false | 2^20 | 1 | 13.281 us | 1.47% | 13.064 us | 1.74% | -0.217 us | -1.64% | FAIL |
| I16 | U32 | false | 2^24 | 1 | 62.790 us | 3.10% | 62.474 us | 3.10% | -0.316 us | -0.50% | PASS |
| I16 | U32 | false | 2^28 | 1 | 870.367 us | 1.41% | 869.823 us | 1.46% | -0.544 us | -0.06% | PASS |
| I16 | U32 | false | 2^16 | 0.544 | 10.254 us | 2.09% | 10.006 us | 1.90% | -0.248 us | -2.42% | FAIL |
| I16 | U32 | false | 2^20 | 0.544 | 13.429 us | 2.23% | 13.141 us | 1.89% | -0.287 us | -2.14% | FAIL |
| I16 | U32 | false | 2^24 | 0.544 | 55.857 us | 2.82% | 55.592 us | 2.87% | -0.265 us | -0.47% | PASS |
| I16 | U32 | false | 2^28 | 0.544 | 698.129 us | 1.41% | 697.808 us | 1.47% | -0.321 us | -0.05% | PASS |
| I16 | U32 | false | 2^16 | 0 | 10.144 us | 2.78% | 9.865 us | 2.19% | -0.278 us | -2.75% | FAIL |
| I16 | U32 | false | 2^20 | 0 | 12.786 us | 1.67% | 12.550 us | 1.83% | -0.236 us | -1.85% | FAIL |
| I16 | U32 | false | 2^24 | 0 | 47.768 us | 2.59% | 47.454 us | 2.58% | -0.314 us | -0.66% | PASS |
| I16 | U32 | false | 2^28 | 0 | 510.231 us | 0.99% | 509.623 us | 1.02% | -0.608 us | -0.12% | PASS |
| I16 | I64 | false | 2^16 | 1 | 9.877 us | 2.30% | 9.849 us | 2.34% | -0.028 us | -0.28% | PASS |
| I16 | I64 | false | 2^20 | 1 | 13.568 us | 1.50% | 13.527 us | 1.46% | -0.041 us | -0.30% | PASS |
| I16 | I64 | false | 2^24 | 1 | 64.672 us | 1.81% | 67.906 us | 2.56% | 3.235 us | 5.00% | FAIL |
| I16 | I64 | false | 2^28 | 1 | 850.004 us | 0.84% | 930.638 us | 1.13% | 80.634 us | 9.49% | FAIL |
| I16 | I64 | false | 2^16 | 0.544 | 9.620 us | 1.92% | 9.811 us | 2.58% | 0.192 us | 1.99% | FAIL |
| I16 | I64 | false | 2^20 | 0.544 | 13.212 us | 1.34% | 13.662 us | 1.47% | 0.450 us | 3.40% | FAIL |
| I16 | I64 | false | 2^24 | 0.544 | 58.392 us | 1.66% | 61.223 us | 2.24% | 2.831 us | 4.85% | FAIL |
| I16 | I64 | false | 2^28 | 0.544 | 741.448 us | 0.61% | 786.068 us | 0.81% | 44.620 us | 6.02% | FAIL |
| I16 | I64 | false | 2^16 | 0 | 9.472 us | 2.10% | 9.670 us | 2.27% | 0.198 us | 2.09% | PASS |
| I16 | I64 | false | 2^20 | 0 | 12.810 us | 1.47% | 13.107 us | 1.56% | 0.296 us | 2.31% | FAIL |
| I16 | I64 | false | 2^24 | 0 | 51.584 us | 1.55% | 54.009 us | 1.95% | 2.425 us | 4.70% | FAIL |
| I16 | I64 | false | 2^28 | 0 | 607.408 us | 0.55% | 636.548 us | 0.64% | 29.140 us | 4.80% | FAIL |
| I16 | U64 | false | 2^16 | 1 | 9.916 us | 1.97% | 9.984 us | 2.32% | 0.068 us | 0.69% | PASS |
| I16 | U64 | false | 2^20 | 1 | 13.100 us | 1.64% | 13.481 us | 1.96% | 0.381 us | 2.91% | FAIL |
| I16 | U64 | false | 2^24 | 1 | 64.267 us | 1.85% | 68.170 us | 2.57% | 3.903 us | 6.07% | FAIL |
| I16 | U64 | false | 2^28 | 1 | 850.115 us | 0.88% | 930.460 us | 1.14% | 80.345 us | 9.45% | FAIL |
| I16 | U64 | false | 2^16 | 0.544 | 9.814 us | 2.00% | 10.099 us | 2.03% | 0.285 us | 2.90% | FAIL |
| I16 | U64 | false | 2^20 | 0.544 | 12.892 us | 1.58% | 13.311 us | 1.76% | 0.419 us | 3.25% | FAIL |
| I16 | U64 | false | 2^24 | 0.544 | 58.437 us | 1.66% | 61.134 us | 2.27% | 2.698 us | 4.62% | FAIL |
| I16 | U64 | false | 2^28 | 0.544 | 741.380 us | 0.60% | 786.573 us | 0.82% | 45.193 us | 6.10% | FAIL |
| I16 | U64 | false | 2^16 | 0 | 9.742 us | 2.03% | 9.961 us | 1.96% | 0.219 us | 2.25% | FAIL |
| I16 | U64 | false | 2^20 | 0 | 12.548 us | 1.82% | 12.908 us | 1.77% | 0.360 us | 2.87% | FAIL |
| I16 | U64 | false | 2^24 | 0 | 51.592 us | 1.55% | 53.742 us | 1.94% | 2.150 us | 4.17% | FAIL |
| I16 | U64 | false | 2^28 | 0 | 607.502 us | 0.53% | 636.639 us | 0.64% | 29.138 us | 4.80% | FAIL |
| I32 | I32 | false | 2^16 | 1 | 10.446 us | 1.99% | 10.115 us | 1.92% | -0.332 us | -3.18% | FAIL |
| I32 | I32 | false | 2^20 | 1 | 14.782 us | 1.57% | 14.750 us | 1.55% | -0.033 us | -0.22% | PASS |
| I32 | I32 | false | 2^24 | 1 | 90.207 us | 2.63% | 90.144 us | 2.65% | -0.063 us | -0.07% | PASS |
| I32 | I32 | false | 2^28 | 1 | 1.343 ms | 0.99% | 1.342 ms | 0.95% | -0.458 us | -0.03% | PASS |
| I32 | I32 | false | 2^16 | 0.544 | 9.976 us | 2.12% | 9.772 us | 2.11% | -0.204 us | -2.05% | PASS |
| I32 | I32 | false | 2^20 | 0.544 | 14.744 us | 1.69% | 14.622 us | 1.85% | -0.122 us | -0.83% | PASS |
| I32 | I32 | false | 2^24 | 0.544 | 78.023 us | 2.70% | 78.300 us | 2.72% | 0.277 us | 0.35% | PASS |
| I32 | I32 | false | 2^28 | 0.544 | 1.072 ms | 1.14% | 1.072 ms | 1.13% | -0.223 us | -0.02% | PASS |
| I32 | I32 | false | 2^16 | 0 | 9.777 us | 1.68% | 9.606 us | 2.13% | -0.171 us | -1.75% | FAIL |
| I32 | I32 | false | 2^20 | 0 | 13.948 us | 1.72% | 14.045 us | 1.59% | 0.097 us | 0.70% | PASS |
| I32 | I32 | false | 2^24 | 0 | 62.353 us | 2.01% | 62.602 us | 2.06% | 0.249 us | 0.40% | PASS |
| I32 | I32 | false | 2^28 | 0 | 649.132 us | 1.46% | 649.091 us | 1.46% | -0.041 us | -0.01% | PASS |
| I32 | U32 | false | 2^16 | 1 | 9.880 us | 2.05% | 9.838 us | 2.02% | -0.042 us | -0.42% | PASS |
| I32 | U32 | false | 2^20 | 1 | 14.916 us | 1.50% | 15.007 us | 1.39% | 0.091 us | 0.61% | PASS |
| I32 | U32 | false | 2^24 | 1 | 90.100 us | 2.67% | 90.046 us | 2.68% | -0.054 us | -0.06% | PASS |
| I32 | U32 | false | 2^28 | 1 | 1.342 ms | 0.97% | 1.342 ms | 0.99% | -0.156 us | -0.01% | PASS |
| I32 | U32 | false | 2^16 | 0.544 | 9.829 us | 1.92% | 9.761 us | 1.99% | -0.068 us | -0.69% | PASS |
| I32 | U32 | false | 2^20 | 0.544 | 14.687 us | 1.78% | 14.779 us | 1.61% | 0.092 us | 0.63% | PASS |
| I32 | U32 | false | 2^24 | 0.544 | 78.080 us | 2.65% | 78.326 us | 2.64% | 0.246 us | 0.32% | PASS |
| 
I32 | U32 | false | 2^28 | 0.544 | 1.070 ms | 1.14% | 1.071 ms | 1.11% | 0.337 us | 0.03% | PASS | | I32 | U32 | false | 2^16 | 0 | 9.598 us | 2.16% | 9.572 us | 2.19% | -0.026 us | -0.27% | PASS | | I32 | U32 | false | 2^20 | 0 | 13.948 us | 1.40% | 14.117 us | 1.49% | 0.168 us | 1.21% | PASS | | I32 | U32 | false | 2^24 | 0 | 62.565 us | 2.02% | 62.559 us | 2.05% | -0.005 us | -0.01% | PASS | | I32 | U32 | false | 2^28 | 0 | 648.897 us | 1.44% | 649.470 us | 1.47% | 0.573 us | 0.09% | PASS | | I32 | I64 | false | 2^16 | 1 | 10.245 us | 1.59% | 10.366 us | 2.53% | 0.121 us | 1.18% | PASS | | I32 | I64 | false | 2^20 | 1 | 15.483 us | 1.73% | 15.524 us | 1.72% | 0.041 us | 0.26% | PASS | | I32 | I64 | false | 2^24 | 1 | 93.141 us | 1.55% | 96.674 us | 2.31% | 3.533 us | 3.79% | FAIL | | I32 | I64 | false | 2^28 | 1 | 1.354 ms | 0.69% | 1.408 ms | 0.81% | 53.905 us | 3.98% | FAIL | | I32 | I64 | false | 2^16 | 0.544 | 10.119 us | 1.54% | 10.312 us | 2.71% | 0.193 us | 1.91% | FAIL | | I32 | I64 | false | 2^20 | 0.544 | 15.171 us | 1.79% | 15.202 us | 1.48% | 0.031 us | 0.20% | PASS | | I32 | I64 | false | 2^24 | 0.544 | 82.308 us | 1.27% | 84.217 us | 1.90% | 1.909 us | 2.32% | FAIL | | I32 | I64 | false | 2^28 | 0.544 | 1.133 ms | 0.66% | 1.153 ms | 0.74% | 19.947 us | 1.76% | FAIL | | I32 | I64 | false | 2^16 | 0 | 10.123 us | 2.50% | 10.224 us | 1.89% | 0.100 us | 0.99% | PASS | | I32 | I64 | false | 2^20 | 0 | 14.119 us | 1.33% | 14.547 us | 1.67% | 0.428 us | 3.03% | FAIL | | I32 | I64 | false | 2^24 | 0 | 69.230 us | 0.96% | 69.995 us | 1.20% | 0.765 us | 1.10% | FAIL | | I32 | I64 | false | 2^28 | 0 | 788.107 us | 0.82% | 806.833 us | 0.82% | 18.726 us | 2.38% | FAIL | | I32 | U64 | false | 2^16 | 1 | 10.512 us | 2.60% | 10.575 us | 2.56% | 0.063 us | 0.60% | PASS | | I32 | U64 | false | 2^20 | 1 | 15.285 us | 1.97% | 15.333 us | 2.02% | 0.048 us | 0.32% | PASS | | I32 | U64 | false | 2^24 | 1 | 93.131 us | 1.58% | 96.669 us | 2.36% | 3.538 us | 3.80% | FAIL 
| | I32 | U64 | false | 2^28 | 1 | 1.354 ms | 0.69% | 1.407 ms | 0.85% | 52.946 us | 3.91% | FAIL | | I32 | U64 | false | 2^16 | 0.544 | 10.472 us | 2.08% | 10.412 us | 2.63% | -0.060 us | -0.57% | PASS | | I32 | U64 | false | 2^20 | 0.544 | 14.807 us | 1.51% | 14.946 us | 1.54% | 0.139 us | 0.94% | PASS | | I32 | U64 | false | 2^24 | 0.544 | 82.310 us | 1.25% | 84.192 us | 1.90% | 1.882 us | 2.29% | FAIL | | I32 | U64 | false | 2^28 | 0.544 | 1.133 ms | 0.68% | 1.153 ms | 0.71% | 19.808 us | 1.75% | FAIL | | I32 | U64 | false | 2^16 | 0 | 9.795 us | 2.10% | 10.293 us | 2.15% | 0.498 us | 5.09% | FAIL | | I32 | U64 | false | 2^20 | 0 | 13.950 us | 1.37% | 14.417 us | 1.33% | 0.468 us | 3.35% | FAIL | | I32 | U64 | false | 2^24 | 0 | 68.756 us | 0.97% | 69.853 us | 1.22% | 1.098 us | 1.60% | FAIL | | I32 | U64 | false | 2^28 | 0 | 788.785 us | 0.81% | 806.894 us | 0.83% | 18.109 us | 2.30% | FAIL | | I64 | I32 | false | 2^16 | 1 | 10.289 us | 1.85% | 10.589 us | 1.82% | 0.300 us | 2.91% | FAIL | | I64 | I32 | false | 2^20 | 1 | 18.028 us | 1.35% | 18.283 us | 1.24% | 0.255 us | 1.42% | FAIL | | I64 | I32 | false | 2^24 | 1 | 165.528 us | 1.92% | 165.891 us | 1.87% | 0.363 us | 0.22% | PASS | | I64 | I32 | false | 2^28 | 1 | 2.559 ms | 0.53% | 2.558 ms | 0.54% | -0.770 us | -0.03% | PASS | | I64 | I32 | false | 2^16 | 0.544 | 10.211 us | 2.13% | 10.504 us | 1.91% | 0.293 us | 2.87% | FAIL | | I64 | I32 | false | 2^20 | 0.544 | 17.651 us | 1.28% | 17.953 us | 1.76% | 0.302 us | 1.71% | FAIL | | I64 | I32 | false | 2^24 | 0.544 | 138.176 us | 2.58% | 138.601 us | 2.61% | 0.425 us | 0.31% | PASS | | I64 | I32 | false | 2^28 | 0.544 | 2.025 ms | 0.76% | 2.026 ms | 0.77% | 0.741 us | 0.04% | PASS | | I64 | I32 | false | 2^16 | 0 | 9.702 us | 2.03% | 10.070 us | 2.42% | 0.368 us | 3.79% | FAIL | | I64 | I32 | false | 2^20 | 0 | 16.361 us | 1.28% | 16.695 us | 1.62% | 0.334 us | 2.04% | FAIL | | I64 | I32 | false | 2^24 | 0 | 97.269 us | 1.83% | 97.655 us | 1.83% | 0.386 us 
| 0.40% | PASS | | I64 | I32 | false | 2^28 | 0 | 1.231 ms | 0.86% | 1.231 ms | 0.90% | 0.277 us | 0.02% | PASS | | I64 | U32 | false | 2^16 | 1 | 10.303 us | 1.99% | 10.442 us | 2.24% | 0.138 us | 1.34% | PASS | | I64 | U32 | false | 2^20 | 1 | 17.992 us | 1.32% | 18.191 us | 1.38% | 0.199 us | 1.11% | PASS | | I64 | U32 | false | 2^24 | 1 | 165.426 us | 1.89% | 165.697 us | 1.90% | 0.271 us | 0.16% | PASS | | I64 | U32 | false | 2^28 | 1 | 2.556 ms | 0.47% | 2.559 ms | 0.49% | 3.837 us | 0.15% | PASS | | I64 | U32 | false | 2^16 | 0.544 | 10.070 us | 1.76% | 10.346 us | 1.40% | 0.276 us | 2.75% | FAIL | | I64 | U32 | false | 2^20 | 0.544 | 18.264 us | 1.32% | 18.278 us | 1.32% | 0.015 us | 0.08% | PASS | | I64 | U32 | false | 2^24 | 0.544 | 138.492 us | 2.54% | 138.624 us | 2.62% | 0.132 us | 0.10% | PASS | | I64 | U32 | false | 2^28 | 0.544 | 2.028 ms | 0.80% | 2.027 ms | 0.76% | -0.613 us | -0.03% | PASS | | I64 | U32 | false | 2^16 | 0 | 9.818 us | 2.17% | 9.851 us | 2.16% | 0.033 us | 0.33% | PASS | | I64 | U32 | false | 2^20 | 0 | 16.774 us | 1.55% | 16.895 us | 1.26% | 0.120 us | 0.72% | PASS | | I64 | U32 | false | 2^24 | 0 | 97.699 us | 1.83% | 97.741 us | 1.86% | 0.042 us | 0.04% | PASS | | I64 | U32 | false | 2^28 | 0 | 1.232 ms | 0.91% | 1.232 ms | 0.87% | 0.073 us | 0.01% | PASS | | I64 | I64 | false | 2^16 | 1 | 10.687 us | 2.00% | 10.505 us | 2.22% | -0.182 us | -1.70% | PASS | | I64 | I64 | false | 2^20 | 1 | 19.573 us | 2.88% | 19.794 us | 3.51% | 0.221 us | 1.13% | PASS | | I64 | I64 | false | 2^24 | 1 | 166.465 us | 1.39% | 173.111 us | 1.96% | 6.646 us | 3.99% | FAIL | | I64 | I64 | false | 2^28 | 1 | 2.547 ms | 0.39% | 2.658 ms | 0.54% | 111.524 us | 4.38% | FAIL | | I64 | I64 | false | 2^16 | 0.544 | 10.587 us | 1.80% | 10.678 us | 1.71% | 0.091 us | 0.86% | PASS | | I64 | I64 | false | 2^20 | 0.544 | 19.058 us | 2.17% | 19.194 us | 2.48% | 0.136 us | 0.71% | PASS | | I64 | I64 | false | 2^24 | 0.544 | 139.905 us | 1.48% | 143.226 us | 1.94% 
| 3.320 us | 2.37% | FAIL | | I64 | I64 | false | 2^28 | 0.544 | 2.074 ms | 0.50% | 2.123 ms | 0.53% | 49.089 us | 2.37% | FAIL | | I64 | I64 | false | 2^16 | 0 | 9.846 us | 2.23% | 9.884 us | 1.99% | 0.038 us | 0.39% | PASS | | I64 | I64 | false | 2^20 | 0 | 17.616 us | 1.90% | 17.831 us | 2.00% | 0.214 us | 1.22% | PASS | | I64 | I64 | false | 2^24 | 0 | 104.476 us | 0.93% | 105.704 us | 1.15% | 1.228 us | 1.18% | FAIL | | I64 | I64 | false | 2^28 | 0 | 1.332 ms | 0.66% | 1.348 ms | 0.68% | 16.738 us | 1.26% | FAIL | | I64 | U64 | false | 2^16 | 1 | 10.521 us | 2.03% | 10.713 us | 2.14% | 0.192 us | 1.83% | PASS | | I64 | U64 | false | 2^20 | 1 | 19.801 us | 2.98% | 19.818 us | 3.47% | 0.017 us | 0.09% | PASS | | I64 | U64 | false | 2^24 | 1 | 166.210 us | 1.38% | 172.740 us | 1.87% | 6.530 us | 3.93% | FAIL | | I64 | U64 | false | 2^28 | 1 | 2.547 ms | 0.42% | 2.659 ms | 0.48% | 112.441 us | 4.42% | FAIL | | I64 | U64 | false | 2^16 | 0.544 | 10.222 us | 2.19% | 10.276 us | 1.89% | 0.053 us | 0.52% | PASS | | I64 | U64 | false | 2^20 | 0.544 | 18.897 us | 2.14% | 19.031 us | 2.50% | 0.134 us | 0.71% | PASS | | I64 | U64 | false | 2^24 | 0.544 | 139.701 us | 1.53% | 143.031 us | 1.97% | 3.330 us | 2.38% | FAIL | | I64 | U64 | false | 2^28 | 0.544 | 2.072 ms | 0.50% | 2.122 ms | 0.53% | 49.485 us | 2.39% | FAIL | | I64 | U64 | false | 2^16 | 0 | 9.945 us | 1.68% | 9.895 us | 1.57% | -0.051 us | -0.51% | PASS | | I64 | U64 | false | 2^20 | 0 | 17.128 us | 1.86% | 17.214 us | 2.00% | 0.086 us | 0.50% | PASS | | I64 | U64 | false | 2^24 | 0 | 104.278 us | 0.92% | 105.366 us | 1.13% | 1.087 us | 1.04% | FAIL | | I64 | U64 | false | 2^28 | 0 | 1.331 ms | 0.67% | 1.348 ms | 0.67% | 16.774 us | 1.26% | FAIL | | I128 | I32 | false | 2^16 | 1 | 11.512 us | 1.83% | 11.483 us | 1.62% | -0.030 us | -0.26% | PASS | | I128 | I32 | false | 2^20 | 1 | 30.330 us | 2.08% | 30.345 us | 2.12% | 0.015 us | 0.05% | PASS | | I128 | I32 | false | 2^24 | 1 | 342.639 us | 1.35% | 342.597 
us | 1.34% | -0.042 us | -0.01% | PASS | | I128 | I32 | false | 2^28 | 1 | 5.353 ms | 0.42% | 5.350 ms | 0.33% | -3.051 us | -0.06% | PASS | | I128 | I32 | false | 2^16 | 0.544 | 11.360 us | 2.05% | 11.374 us | 2.00% | 0.014 us | 0.12% | PASS | | I128 | I32 | false | 2^20 | 0.544 | 28.207 us | 2.48% | 28.367 us | 2.65% | 0.160 us | 0.57% | PASS | | I128 | I32 | false | 2^24 | 0.544 | 279.679 us | 1.46% | 279.740 us | 1.46% | 0.062 us | 0.02% | PASS | | I128 | I32 | false | 2^28 | 0.544 | 4.274 ms | 0.38% | 4.276 ms | 0.37% | 1.514 us | 0.04% | PASS | | I128 | I32 | false | 2^16 | 0 | 10.460 us | 2.09% | 10.510 us | 2.00% | 0.051 us | 0.49% | PASS | | I128 | I32 | false | 2^20 | 0 | 25.833 us | 2.31% | 25.776 us | 2.31% | -0.057 us | -0.22% | PASS | | I128 | I32 | false | 2^24 | 0 | 183.330 us | 1.17% | 183.165 us | 1.13% | -0.165 us | -0.09% | PASS | | I128 | I32 | false | 2^28 | 0 | 2.514 ms | 0.28% | 2.512 ms | 0.25% | -1.393 us | -0.06% | PASS | | I128 | U32 | false | 2^16 | 1 | 11.092 us | 1.97% | 11.126 us | 1.95% | 0.035 us | 0.31% | PASS | | I128 | U32 | false | 2^20 | 1 | 30.286 us | 2.41% | 30.349 us | 2.41% | 0.063 us | 0.21% | PASS | | I128 | U32 | false | 2^24 | 1 | 344.316 us | 1.30% | 344.605 us | 1.30% | 0.289 us | 0.08% | PASS | | I128 | U32 | false | 2^28 | 1 | 5.378 ms | 0.37% | 5.382 ms | 0.33% | 3.682 us | 0.07% | PASS | | I128 | U32 | false | 2^16 | 0.544 | 10.912 us | 2.09% | 10.958 us | 1.83% | 0.046 us | 0.43% | PASS | | I128 | U32 | false | 2^20 | 0.544 | 28.508 us | 2.80% | 28.462 us | 2.83% | -0.046 us | -0.16% | PASS | | I128 | U32 | false | 2^24 | 0.544 | 281.121 us | 1.47% | 281.105 us | 1.48% | -0.016 us | -0.01% | PASS | | I128 | U32 | false | 2^28 | 0.544 | 4.302 ms | 0.43% | 4.301 ms | 0.39% | -0.435 us | -0.01% | PASS | | I128 | U32 | false | 2^16 | 0 | 10.143 us | 2.12% | 10.168 us | 2.22% | 0.025 us | 0.25% | PASS | | I128 | U32 | false | 2^20 | 0 | 25.930 us | 2.46% | 25.820 us | 2.48% | -0.111 us | -0.43% | PASS | | I128 | U32 
| false | 2^24 | 0 | 184.049 us | 1.19% | 183.907 us | 1.17% | -0.142 us | -0.08% | PASS | | I128 | U32 | false | 2^28 | 0 | 2.528 ms | 0.27% | 2.529 ms | 0.29% | 0.487 us | 0.02% | PASS | | I128 | I64 | false | 2^16 | 1 | 11.463 us | 1.69% | 11.659 us | 1.83% | 0.196 us | 1.71% | FAIL | | I128 | I64 | false | 2^20 | 1 | 30.141 us | 1.72% | 30.510 us | 2.46% | 0.369 us | 1.22% | PASS | | I128 | I64 | false | 2^24 | 1 | 334.873 us | 1.06% | 347.250 us | 1.19% | 12.377 us | 3.70% | FAIL | | I128 | I64 | false | 2^28 | 1 | 5.220 ms | 0.29% | 5.427 ms | 0.33% | 206.562 us | 3.96% | FAIL | | I128 | I64 | false | 2^16 | 0.544 | 11.234 us | 1.50% | 11.396 us | 1.69% | 0.162 us | 1.44% | PASS | | I128 | I64 | false | 2^20 | 0.544 | 28.142 us | 1.79% | 28.534 us | 2.93% | 0.391 us | 1.39% | PASS | | I128 | I64 | false | 2^24 | 0.544 | 275.627 us | 1.17% | 284.354 us | 1.37% | 8.727 us | 3.17% | FAIL | | I128 | I64 | false | 2^28 | 0.544 | 4.229 ms | 0.30% | 4.359 ms | 0.36% | 129.179 us | 3.05% | FAIL | | I128 | I64 | false | 2^16 | 0 | 10.480 us | 1.77% | 10.750 us | 1.81% | 0.270 us | 2.58% | FAIL | | I128 | I64 | false | 2^20 | 0 | 25.738 us | 1.91% | 26.315 us | 2.23% | 0.578 us | 2.24% | FAIL | | I128 | I64 | false | 2^24 | 0 | 184.332 us | 0.84% | 189.124 us | 1.05% | 4.792 us | 2.60% | FAIL | | I128 | I64 | false | 2^28 | 0 | 2.534 ms | 0.20% | 2.608 ms | 0.25% | 74.198 us | 2.93% | FAIL | | I128 | U64 | false | 2^16 | 1 | 11.671 us | 2.16% | 11.802 us | 2.13% | 0.131 us | 1.12% | PASS | | I128 | U64 | false | 2^20 | 1 | 30.457 us | 1.66% | 30.824 us | 2.41% | 0.367 us | 1.21% | PASS | | I128 | U64 | false | 2^24 | 1 | 335.663 us | 1.09% | 347.450 us | 1.18% | 11.787 us | 3.51% | FAIL | | I128 | U64 | false | 2^28 | 1 | 5.216 ms | 0.28% | 5.431 ms | 0.34% | 214.198 us | 4.11% | FAIL | | I128 | U64 | false | 2^16 | 0.544 | 11.515 us | 1.71% | 11.719 us | 2.37% | 0.204 us | 1.77% | FAIL | | I128 | U64 | false | 2^20 | 0.544 | 28.302 us | 1.81% | 28.910 us | 2.92% | 
0.607 us | 2.15% | FAIL | | I128 | U64 | false | 2^24 | 0.544 | 276.116 us | 1.21% | 284.517 us | 1.31% | 8.400 us | 3.04% | FAIL | | I128 | U64 | false | 2^28 | 0.544 | 4.230 ms | 0.34% | 4.360 ms | 0.34% | 130.079 us | 3.08% | FAIL | | I128 | U64 | false | 2^16 | 0 | 10.901 us | 2.07% | 10.990 us | 1.80% | 0.089 us | 0.82% | PASS | | I128 | U64 | false | 2^20 | 0 | 25.518 us | 2.08% | 26.448 us | 2.33% | 0.930 us | 3.64% | FAIL | | I128 | U64 | false | 2^24 | 0 | 184.395 us | 0.83% | 189.099 us | 1.05% | 4.704 us | 2.55% | FAIL | | I128 | U64 | false | 2^28 | 0 | 2.535 ms | 0.21% | 2.607 ms | 0.24% | 72.378 us | 2.86% | FAIL | | F32 | I32 | false | 2^16 | 1 | 10.428 us | 1.80% | 10.435 us | 2.63% | 0.007 us | 0.07% | PASS | | F32 | I32 | false | 2^20 | 1 | 15.055 us | 1.63% | 14.974 us | 1.38% | -0.081 us | -0.54% | PASS | | F32 | I32 | false | 2^24 | 1 | 90.431 us | 2.66% | 90.477 us | 2.68% | 0.046 us | 0.05% | PASS | | F32 | I32 | false | 2^28 | 1 | 1.365 ms | 0.99% | 1.365 ms | 0.95% | 0.520 us | 0.04% | PASS | | F32 | I32 | false | 2^16 | 0.544 | 10.132 us | 2.49% | 10.039 us | 1.72% | -0.092 us | -0.91% | PASS | | F32 | I32 | false | 2^20 | 0.544 | 14.103 us | 1.37% | 14.019 us | 1.94% | -0.084 us | -0.60% | PASS | | F32 | I32 | false | 2^24 | 0.544 | 64.156 us | 2.15% | 63.994 us | 2.12% | -0.162 us | -0.25% | PASS | | F32 | I32 | false | 2^28 | 0.544 | 772.320 us | 1.04% | 772.284 us | 1.07% | -0.036 us | -0.00% | PASS | | F32 | I32 | false | 2^16 | 0 | 9.636 us | 2.06% | 9.527 us | 2.45% | -0.108 us | -1.12% | PASS | | F32 | I32 | false | 2^20 | 0 | 14.098 us | 1.85% | 14.043 us | 1.80% | -0.055 us | -0.39% | PASS | | F32 | I32 | false | 2^24 | 0 | 62.238 us | 2.04% | 62.305 us | 2.05% | 0.067 us | 0.11% | PASS | | F32 | I32 | false | 2^28 | 0 | 649.088 us | 1.43% | 649.292 us | 1.43% | 0.204 us | 0.03% | PASS | | F32 | U32 | false | 2^16 | 1 | 9.831 us | 1.98% | 9.838 us | 1.89% | 0.007 us | 0.07% | PASS | | F32 | U32 | false | 2^20 | 1 | 14.821 us | 
1.34% | 14.905 us | 1.23% | 0.084 us | 0.56% | PASS | | F32 | U32 | false | 2^24 | 1 | 90.143 us | 2.66% | 89.919 us | 2.64% | -0.224 us | -0.25% | PASS | | F32 | U32 | false | 2^28 | 1 | 1.363 ms | 0.99% | 1.362 ms | 0.99% | -0.369 us | -0.03% | PASS | | F32 | U32 | false | 2^16 | 0.544 | 9.471 us | 1.94% | 9.462 us | 2.07% | -0.010 us | -0.10% | PASS | | F32 | U32 | false | 2^20 | 0.544 | 13.992 us | 1.78% | 14.074 us | 1.64% | 0.082 us | 0.59% | PASS | | F32 | U32 | false | 2^24 | 0.544 | 63.913 us | 2.16% | 64.029 us | 2.17% | 0.116 us | 0.18% | PASS | | F32 | U32 | false | 2^28 | 0.544 | 771.119 us | 1.03% | 770.927 us | 1.04% | -0.193 us | -0.03% | PASS | | F32 | U32 | false | 2^16 | 0 | 9.548 us | 2.08% | 9.478 us | 2.12% | -0.070 us | -0.74% | PASS | | F32 | U32 | false | 2^20 | 0 | 14.103 us | 1.62% | 14.087 us | 1.26% | -0.016 us | -0.11% | PASS | | F32 | U32 | false | 2^24 | 0 | 62.239 us | 2.05% | 62.291 us | 2.08% | 0.052 us | 0.08% | PASS | | F32 | U32 | false | 2^28 | 0 | 649.220 us | 1.44% | 649.568 us | 1.45% | 0.348 us | 0.05% | PASS | | F32 | I64 | false | 2^16 | 1 | 9.957 us | 2.02% | 9.956 us | 1.91% | -0.001 us | -0.01% | PASS | | F32 | I64 | false | 2^20 | 1 | 15.301 us | 1.59% | 15.310 us | 1.47% | 0.009 us | 0.06% | PASS | | F32 | I64 | false | 2^24 | 1 | 92.832 us | 1.56% | 96.061 us | 2.38% | 3.230 us | 3.48% | FAIL | | F32 | I64 | false | 2^28 | 1 | 1.370 ms | 0.66% | 1.433 ms | 0.82% | 62.645 us | 4.57% | FAIL | | F32 | I64 | false | 2^16 | 0.544 | 9.519 us | 2.07% | 9.969 us | 2.21% | 0.450 us | 4.73% | FAIL | | F32 | I64 | false | 2^20 | 0.544 | 14.245 us | 1.48% | 14.554 us | 1.39% | 0.309 us | 2.17% | FAIL | | F32 | I64 | false | 2^24 | 0.544 | 70.411 us | 0.97% | 71.485 us | 1.27% | 1.074 us | 1.52% | FAIL | | F32 | I64 | false | 2^28 | 0.544 | 863.031 us | 0.66% | 879.428 us | 0.69% | 16.397 us | 1.90% | FAIL | | F32 | I64 | false | 2^16 | 0 | 9.689 us | 2.75% | 9.817 us | 2.41% | 0.128 us | 1.33% | PASS | | F32 | I64 | false | 
2^20 | 0 | 14.294 us | 1.40% | 14.632 us | 1.54% | 0.338 us | 2.37% | FAIL | | F32 | I64 | false | 2^24 | 0 | 68.824 us | 0.95% | 69.812 us | 1.23% | 0.988 us | 1.44% | FAIL | | F32 | I64 | false | 2^28 | 0 | 787.897 us | 0.83% | 804.417 us | 0.85% | 16.521 us | 2.10% | FAIL | | F32 | U64 | false | 2^16 | 1 | 10.280 us | 1.85% | 10.498 us | 1.72% | 0.219 us | 2.13% | FAIL | | F32 | U64 | false | 2^20 | 1 | 15.044 us | 1.48% | 15.465 us | 1.69% | 0.421 us | 2.80% | FAIL | | F32 | U64 | false | 2^24 | 1 | 92.708 us | 1.58% | 96.570 us | 2.34% | 3.862 us | 4.17% | FAIL | | F32 | U64 | false | 2^28 | 1 | 1.370 ms | 0.68% | 1.433 ms | 0.82% | 63.870 us | 4.66% | FAIL | | F32 | U64 | false | 2^16 | 0.544 | 9.755 us | 2.08% | 10.012 us | 2.21% | 0.258 us | 2.64% | FAIL | | F32 | U64 | false | 2^20 | 0.544 | 14.048 us | 1.49% | 14.350 us | 1.30% | 0.302 us | 2.15% | FAIL | | F32 | U64 | false | 2^24 | 0.544 | 70.345 us | 0.96% | 71.381 us | 1.34% | 1.036 us | 1.47% | FAIL | | F32 | U64 | false | 2^28 | 0.544 | 862.962 us | 0.66% | 879.305 us | 0.70% | 16.344 us | 1.89% | FAIL | | F32 | U64 | false | 2^16 | 0 | 9.787 us | 2.82% | 10.081 us | 2.76% | 0.294 us | 3.00% | FAIL | | F32 | U64 | false | 2^20 | 0 | 14.073 us | 1.42% | 14.397 us | 1.33% | 0.324 us | 2.30% | FAIL | | F32 | U64 | false | 2^24 | 0 | 68.765 us | 0.96% | 69.746 us | 1.19% | 0.981 us | 1.43% | FAIL | | F32 | U64 | false | 2^28 | 0 | 787.756 us | 0.82% | 804.095 us | 0.83% | 16.339 us | 2.07% | FAIL | | F64 | I32 | false | 2^16 | 1 | 10.197 us | 1.85% | 10.476 us | 2.23% | 0.279 us | 2.74% | FAIL | | F64 | I32 | false | 2^20 | 1 | 18.346 us | 1.37% | 17.947 us | 1.49% | -0.399 us | -2.18% | FAIL | | F64 | I32 | false | 2^24 | 1 | 165.419 us | 1.90% | 165.272 us | 1.92% | -0.148 us | -0.09% | PASS | | F64 | I32 | false | 2^28 | 1 | 2.557 ms | 0.52% | 2.555 ms | 0.55% | -1.673 us | -0.07% | PASS | | F64 | I32 | false | 2^16 | 0.544 | 10.153 us | 2.18% | 9.770 us | 1.98% | -0.383 us | -3.77% | FAIL | | F64 | 
I32 | false | 2^20 | 0.544 | 16.904 us | 1.17% | 16.719 us | 1.29% | -0.185 us | -1.09% | PASS | | F64 | I32 | false | 2^24 | 0.544 | 107.603 us | 2.41% | 107.361 us | 2.43% | -0.241 us | -0.22% | PASS | | F64 | I32 | false | 2^28 | 0.544 | 1.465 ms | 0.83% | 1.465 ms | 0.86% | -0.330 us | -0.02% | PASS | | F64 | I32 | false | 2^16 | 0 | 9.976 us | 1.61% | 9.610 us | 1.71% | -0.365 us | -3.66% | FAIL | | F64 | I32 | false | 2^20 | 0 | 16.348 us | 1.49% | 16.246 us | 1.41% | -0.102 us | -0.62% | PASS | | F64 | I32 | false | 2^24 | 0 | 97.513 us | 1.79% | 97.188 us | 1.82% | -0.326 us | -0.33% | PASS | | F64 | I32 | false | 2^28 | 0 | 1.228 ms | 0.88% | 1.228 ms | 0.89% | -0.007 us | -0.00% | PASS | | F64 | U32 | false | 2^16 | 1 | 10.387 us | 2.39% | 10.191 us | 2.14% | -0.196 us | -1.89% | PASS | | F64 | U32 | false | 2^20 | 1 | 18.145 us | 1.81% | 17.930 us | 1.38% | -0.215 us | -1.19% | PASS | | F64 | U32 | false | 2^24 | 1 | 165.357 us | 1.95% | 165.186 us | 1.94% | -0.171 us | -0.10% | PASS | | F64 | U32 | false | 2^28 | 1 | 2.552 ms | 0.52% | 2.554 ms | 0.56% | 1.200 us | 0.05% | PASS | | F64 | U32 | false | 2^16 | 0.544 | 9.691 us | 2.83% | 9.476 us | 2.14% | -0.215 us | -2.22% | FAIL | | F64 | U32 | false | 2^20 | 0.544 | 17.098 us | 1.27% | 16.779 us | 1.38% | -0.319 us | -1.87% | FAIL | | F64 | U32 | false | 2^24 | 0.544 | 107.126 us | 2.38% | 107.064 us | 2.40% | -0.062 us | -0.06% | PASS | | F64 | U32 | false | 2^28 | 0.544 | 1.460 ms | 0.82% | 1.460 ms | 0.83% | -0.186 us | -0.01% | PASS | | F64 | U32 | false | 2^16 | 0 | 9.633 us | 1.64% | 9.384 us | 1.92% | -0.248 us | -2.58% | FAIL | | F64 | U32 | false | 2^20 | 0 | 16.747 us | 1.70% | 16.508 us | 1.39% | -0.239 us | -1.43% | FAIL | | F64 | U32 | false | 2^24 | 0 | 97.519 us | 1.85% | 97.100 us | 1.84% | -0.418 us | -0.43% | PASS | | F64 | U32 | false | 2^28 | 0 | 1.228 ms | 0.87% | 1.228 ms | 0.89% | -0.292 us | -0.02% | PASS | | F64 | I64 | false | 2^16 | 1 | 10.413 us | 2.88% | 10.247 us | 1.99% | 
-0.165 us | -1.59% | PASS | | F64 | I64 | false | 2^20 | 1 | 19.566 us | 3.01% | 19.471 us | 3.73% | -0.095 us | -0.49% | PASS | | F64 | I64 | false | 2^24 | 1 | 165.995 us | 1.42% | 172.439 us | 1.90% | 6.444 us | 3.88% | FAIL | | F64 | I64 | false | 2^28 | 1 | 2.539 ms | 0.41% | 2.651 ms | 0.49% | 112.249 us | 4.42% | FAIL | | F64 | I64 | false | 2^16 | 0.544 | 9.992 us | 2.85% | 9.676 us | 1.96% | -0.316 us | -3.16% | FAIL | | F64 | I64 | false | 2^20 | 0.544 | 17.697 us | 1.98% | 17.453 us | 2.01% | -0.244 us | -1.38% | PASS | | F64 | I64 | false | 2^24 | 0.544 | 110.023 us | 1.07% | 111.841 us | 1.38% | 1.817 us | 1.65% | FAIL | | F64 | I64 | false | 2^28 | 0.544 | 1.541 ms | 0.59% | 1.553 ms | 0.61% | 12.325 us | 0.80% | FAIL | | F64 | I64 | false | 2^16 | 0 | 10.167 us | 2.48% | 9.848 us | 1.57% | -0.319 us | -3.14% | FAIL | | F64 | I64 | false | 2^20 | 0 | 17.280 us | 1.76% | 16.931 us | 2.16% | -0.348 us | -2.02% | FAIL | | F64 | I64 | false | 2^24 | 0 | 103.069 us | 0.97% | 103.834 us | 1.18% | 0.765 us | 0.74% | PASS | | F64 | I64 | false | 2^28 | 0 | 1.300 ms | 0.72% | 1.317 ms | 0.75% | 17.585 us | 1.35% | FAIL | | F64 | U64 | false | 2^16 | 1 | 10.718 us | 1.95% | 10.410 us | 1.77% | -0.308 us | -2.87% | FAIL | | F64 | U64 | false | 2^20 | 1 | 19.298 us | 3.16% | 19.302 us | 4.32% | 0.005 us | 0.02% | PASS | | F64 | U64 | false | 2^24 | 1 | 166.001 us | 1.37% | 172.509 us | 1.97% | 6.508 us | 3.92% | FAIL | | F64 | U64 | false | 2^28 | 1 | 2.540 ms | 0.44% | 2.651 ms | 0.52% | 110.470 us | 4.35% | FAIL | | F64 | U64 | false | 2^16 | 0.544 | 10.189 us | 2.93% | 9.951 us | 2.11% | -0.237 us | -2.33% | FAIL | | F64 | U64 | false | 2^20 | 0.544 | 17.524 us | 1.86% | 17.263 us | 2.16% | -0.261 us | -1.49% | PASS | | F64 | U64 | false | 2^24 | 0.544 | 109.922 us | 1.08% | 111.894 us | 1.40% | 1.972 us | 1.79% | FAIL | | F64 | U64 | false | 2^28 | 0.544 | 1.541 ms | 0.59% | 1.553 ms | 0.60% | 12.499 us | 0.81% | FAIL | | F64 | U64 | false | 2^16 | 0 | 9.824 
us | 2.12% | 10.157 us | 2.57% | 0.334 us | 3.40% | FAIL | | F64 | U64 | false | 2^20 | 0 | 17.060 us | 2.07% | 17.199 us | 2.11% | 0.139 us | 0.81% | PASS | | F64 | U64 | false | 2^24 | 0 | 102.514 us | 0.95% | 104.150 us | 1.15% | 1.636 us | 1.60% | FAIL | | F64 | U64 | false | 2^28 | 0 | 1.300 ms | 0.76% | 1.318 ms | 0.74% | 18.096 us | 1.39% | FAIL |
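
The PASS/FAIL column is easier to interpret with a rule in mind. The sketch below assumes the table comes from an nvbench-style comparison whose columns are reference GPU time, reference noise, compared GPU time, compared noise, absolute difference, relative difference, and status. It also assumes a row passes when the absolute relative difference is no larger than the smaller of the two noise values. Neither the column meanings nor the helper name `status` are stated in this thread; they are assumptions for illustration. The sample values are copied from rows of the table above.

```python
def status(ref_time_us, ref_noise_pct, cmp_time_us, cmp_noise_pct):
    """Hypothetical helper: classify one comparison row as PASS or FAIL.

    Assumed rule: a change is treated as noise (PASS) when the absolute
    relative difference does not exceed the smaller of the two noise values.
    """
    rel_diff_pct = (cmp_time_us - ref_time_us) / ref_time_us * 100.0
    min_noise_pct = min(ref_noise_pct, cmp_noise_pct)
    return "PASS" if abs(rel_diff_pct) <= min_noise_pct else "FAIL"


# (label, ref time, ref noise %, cmp time, cmp noise %, status shown in the table)
rows = [
    ("I16/U64 2^16 entropy 1", 9.916, 1.97, 9.984, 2.32, "PASS"),
    ("I32/I32 2^16 entropy 1", 10.446, 1.99, 10.115, 1.92, "FAIL"),
    ("I64/I64 2^28 entropy 1", 2547.0, 0.39, 2658.0, 0.54, "FAIL"),
    ("F64/I64 2^24 entropy 0", 103.069, 0.97, 103.834, 1.18, "PASS"),
]

for label, ref_t, ref_n, cmp_t, cmp_n, reported in rows:
    computed = status(ref_t, ref_n, cmp_t, cmp_n)
    print(f"{label}: computed={computed}, reported={reported}")
```

Under this reading, FAIL only means the difference is larger than the measurement noise, and it can flag a speedup as well as a slowdown: the `I32/I32` row above fails although the compared time is lower. The sign of the difference column, not the status, indicates the direction of the change.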