ocaml-multicore / kcas

Software Transactional Memory for OCaml
https://ocaml-multicore.github.io/kcas/
ISC License

Verify is only needed when a CMP is followed by a CAS #79

Closed: polytypic closed this 1 year ago

polytypic commented 1 year ago

The pass to verify read-only CMP operations is only necessary when a CMP is followed by a CAS. In all other cases it can be skipped as the CMPs are already verified as part of the determine pass.
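
The condition above can be illustrated with a minimal sketch. The `op` type and the `needs_verify` function are hypothetical names introduced only for illustration. They are not the representation or code kcas actually uses. The sketch assumes the operations are listed in the order in which the determine pass visits them.

```ocaml
(* Hypothetical, simplified model of the operations in one transaction.
   These types are illustrative only and are NOT the internal
   representation used by the kcas library. *)
type op =
  | Cmp of int (* read-only compare on the location with this id *)
  | Cas of int (* compare-and-set on the location with this id *)

(* [needs_verify ops] is [true] if and only if some [Cmp] appears before a
   later [Cas] in [ops].  [ops] is assumed to be in the order in which the
   determine pass visits the operations. *)
let needs_verify ops =
  let rec go seen_cmp = function
    | [] -> false
    | Cmp _ :: rest -> go true rest
    | Cas _ :: rest -> seen_cmp || go seen_cmp rest
  in
  go false ops
```

Under this model, a transaction made up only of CMPs, or one in which every CMP comes after the last CAS, would skip the separate verify pass, while `[ Cmp 1; Cas 2 ]` would still require it.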

polytypic commented 1 year ago

As expected, this seems to give a small improvement in cases where there are no CMP operations followed by a CAS. In other cases the added instructions seem to be essentially free.

Benchmark 1: kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 1 1_000_000 1000 90
  Time (mean ± σ):      38.3 ms ±   0.2 ms    [User: 37.2 ms, System: 0.8 ms]
  Range (min … max):    38.1 ms …  39.0 ms    76 runs

Benchmark 2: kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 1 1_000_000 1000 90
  Time (mean ± σ):      37.5 ms ±   0.4 ms    [User: 36.4 ms, System: 0.8 ms]
  Range (min … max):    37.0 ms …  38.9 ms    79 runs

Summary
  'kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 1 1_000_000 1000 90' ran
    1.02 ± 0.01 times faster than 'kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 1 1_000_000 1000 90'

Benchmark 1: kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 2 1_000_000 1000 90
  Time (mean ± σ):      22.8 ms ±   0.1 ms    [User: 41.6 ms, System: 1.0 ms]
  Range (min … max):    22.6 ms …  23.4 ms    128 runs

Benchmark 2: kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 2 1_000_000 1000 90
  Time (mean ± σ):      22.7 ms ±   0.4 ms    [User: 41.4 ms, System: 1.1 ms]
  Range (min … max):    22.4 ms …  26.3 ms    129 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Summary
  'kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 2 1_000_000 1000 90' ran
    1.00 ± 0.02 times faster than 'kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 2 1_000_000 1000 90'

Benchmark 1: kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 4 1_000_000 1000 90
  Time (mean ± σ):      15.2 ms ±   1.2 ms    [User: 49.1 ms, System: 1.9 ms]
  Range (min … max):    14.6 ms …  24.8 ms    199 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Benchmark 2: kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 4 1_000_000 1000 90
  Time (mean ± σ):      14.9 ms ±   0.3 ms    [User: 48.1 ms, System: 1.9 ms]
  Range (min … max):    14.5 ms …  17.7 ms    198 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Summary
  'kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 4 1_000_000 1000 90' ran
    1.02 ± 0.08 times faster than 'kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 4 1_000_000 1000 90'

Benchmark 1: kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 1 1_000_000 1000 10
  Time (mean ± σ):     225.3 ms ±   0.5 ms    [User: 223.2 ms, System: 1.7 ms]
  Range (min … max):   224.6 ms … 226.8 ms    13 runs

Benchmark 2: kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 1 1_000_000 1000 10
  Time (mean ± σ):     216.1 ms ±   0.6 ms    [User: 214.1 ms, System: 1.6 ms]
  Range (min … max):   215.2 ms … 217.3 ms    13 runs

Summary
  'kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 1 1_000_000 1000 10' ran
    1.04 ± 0.00 times faster than 'kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 1 1_000_000 1000 10'

Benchmark 1: kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 2 1_000_000 1000 10
  Time (mean ± σ):     115.5 ms ±   0.5 ms    [User: 225.6 ms, System: 1.8 ms]
  Range (min … max):   114.9 ms … 116.8 ms    26 runs

Benchmark 2: kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 2 1_000_000 1000 10
  Time (mean ± σ):     114.4 ms ±   0.4 ms    [User: 223.8 ms, System: 1.4 ms]
  Range (min … max):   114.0 ms … 115.5 ms    26 runs

Summary
  'kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 2 1_000_000 1000 10' ran
    1.01 ± 0.01 times faster than 'kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 2 1_000_000 1000 10'

Benchmark 1: kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 4 1_000_000 1000 10
  Time (mean ± σ):      71.0 ms ±   0.4 ms    [User: 270.4 ms, System: 2.4 ms]
  Range (min … max):    70.3 ms …  72.3 ms    42 runs

Benchmark 2: kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 4 1_000_000 1000 10
  Time (mean ± σ):      70.6 ms ±   0.3 ms    [User: 268.7 ms, System: 2.5 ms]
  Range (min … max):    70.1 ms …  71.4 ms    42 runs

Summary
  'kcas-this/_build/default/test/kcas_data/hashtbl_bench.exe 4 1_000_000 1000 10' ran
    1.01 ± 0.01 times faster than 'kcas-main/_build/default/test/kcas_data/hashtbl_bench.exe 4 1_000_000 1000 10'

Benchmark 1: kcas-main/_build/default/test/kcas/xt_benchmark.exe 1 10000
  Time (mean ± σ):       6.9 ms ±   0.1 ms    [User: 6.0 ms, System: 0.7 ms]
  Range (min … max):     6.9 ms …   8.2 ms    423 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Benchmark 2: kcas-this/_build/default/test/kcas/xt_benchmark.exe 1 10000
  Time (mean ± σ):       6.9 ms ±   0.1 ms    [User: 6.0 ms, System: 0.7 ms]
  Range (min … max):     6.9 ms …   8.2 ms    434 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Summary
  'kcas-this/_build/default/test/kcas/xt_benchmark.exe 1 10000' ran
    1.00 ± 0.02 times faster than 'kcas-main/_build/default/test/kcas/xt_benchmark.exe 1 10000'

Benchmark 1: kcas-main/_build/default/test/kcas/xt_benchmark.exe 2 10000
  Time (mean ± σ):      13.2 ms ±   0.4 ms    [User: 12.2 ms, System: 0.7 ms]
  Range (min … max):    13.0 ms …  18.8 ms    226 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Benchmark 2: kcas-this/_build/default/test/kcas/xt_benchmark.exe 2 10000
  Time (mean ± σ):      13.3 ms ±   0.1 ms    [User: 12.3 ms, System: 0.7 ms]
  Range (min … max):    13.1 ms …  14.0 ms    227 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Summary
  'kcas-main/_build/default/test/kcas/xt_benchmark.exe 2 10000' ran
    1.00 ± 0.03 times faster than 'kcas-this/_build/default/test/kcas/xt_benchmark.exe 2 10000'

Benchmark 1: kcas-main/_build/default/test/kcas/xt_benchmark.exe 4 10000
  Time (mean ± σ):      22.2 ms ±   0.1 ms    [User: 21.1 ms, System: 0.7 ms]
  Range (min … max):    22.0 ms …  22.8 ms    135 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Benchmark 2: kcas-this/_build/default/test/kcas/xt_benchmark.exe 4 10000
  Time (mean ± σ):      22.2 ms ±   0.1 ms    [User: 21.2 ms, System: 0.7 ms]
  Range (min … max):    22.0 ms …  22.6 ms    134 runs

Summary
  'kcas-main/_build/default/test/kcas/xt_benchmark.exe 4 10000' ran
    1.00 ± 0.01 times faster than 'kcas-this/_build/default/test/kcas/xt_benchmark.exe 4 10000'

Benchmark 1: kcas-main/_build/default/test/kcas/xt_parallel_cmp_bench.exe 100000
  Time (mean ± σ):      19.0 ms ±   1.5 ms    [User: 30.4 ms, System: 1.6 ms]
  Range (min … max):    17.6 ms …  21.7 ms    142 runs

  Warning: The first benchmarking run for this command was significantly slower than the rest (21.0 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You are already using the '--warmup' option which helps to fill these caches before the actual benchmark. You can either try to increase the warmup count further or re-run this benchmark on a quiet system in case it was a random outlier. Alternatively, consider using the '--prepare' option to clear the caches before each timing run.

Benchmark 2: kcas-this/_build/default/test/kcas/xt_parallel_cmp_bench.exe 100000
  Time (mean ± σ):      18.7 ms ±   1.9 ms    [User: 29.7 ms, System: 1.7 ms]
  Range (min … max):    16.9 ms …  27.5 ms    173 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Summary
  'kcas-this/_build/default/test/kcas/xt_parallel_cmp_bench.exe 100000' ran
    1.02 ± 0.13 times faster than 'kcas-main/_build/default/test/kcas/xt_parallel_cmp_bench.exe 100000'

Benchmark 1: kcas-main/_build/default/test/kcas/xt_parallel_cmp_bench.exe 200000
  Time (mean ± σ):      35.5 ms ±   3.0 ms    [User: 58.9 ms, System: 2.4 ms]
  Range (min … max):    33.1 ms …  43.2 ms    90 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Benchmark 2: kcas-this/_build/default/test/kcas/xt_parallel_cmp_bench.exe 200000
  Time (mean ± σ):      34.2 ms ±   3.4 ms    [User: 56.7 ms, System: 2.3 ms]
  Range (min … max):    31.7 ms …  40.4 ms    92 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Summary
  'kcas-this/_build/default/test/kcas/xt_parallel_cmp_bench.exe 200000' ran
    1.04 ± 0.14 times faster than 'kcas-main/_build/default/test/kcas/xt_parallel_cmp_bench.exe 200000'

Benchmark 1: kcas-main/_build/default/test/kcas/xt_parallel_cmp_bench.exe 400000
  Time (mean ± σ):      69.6 ms ±   6.5 ms    [User: 119.4 ms, System: 3.6 ms]
  Range (min … max):    63.7 ms …  78.2 ms    38 runs

  Warning: The first benchmarking run for this command was significantly slower than the rest (77.9 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You are already using the '--warmup' option which helps to fill these caches before the actual benchmark. You can either try to increase the warmup count further or re-run this benchmark on a quiet system in case it was a random outlier. Alternatively, consider using the '--prepare' option to clear the caches before each timing run.

Benchmark 2: kcas-this/_build/default/test/kcas/xt_parallel_cmp_bench.exe 400000
  Time (mean ± σ):      67.8 ms ±   7.5 ms    [User: 115.5 ms, System: 3.5 ms]
  Range (min … max):    61.4 ms …  78.3 ms    48 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Summary
  'kcas-this/_build/default/test/kcas/xt_parallel_cmp_bench.exe 400000' ran
    1.03 ± 0.15 times faster than 'kcas-main/_build/default/test/kcas/xt_parallel_cmp_bench.exe 400000'

Benchmark 1: kcas-main/_build/default/test/kcas/benchmark.exe 1 10000
  Time (mean ± σ):       3.0 ms ±   0.0 ms    [User: 2.0 ms, System: 0.6 ms]
  Range (min … max):     2.9 ms …   3.2 ms    996 runs

Benchmark 2: kcas-this/_build/default/test/kcas/benchmark.exe 1 10000
  Time (mean ± σ):       3.0 ms ±   0.0 ms    [User: 2.1 ms, System: 0.6 ms]
  Range (min … max):     2.9 ms …   3.5 ms    919 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Summary
  'kcas-main/_build/default/test/kcas/benchmark.exe 1 10000' ran
    1.00 ± 0.02 times faster than 'kcas-this/_build/default/test/kcas/benchmark.exe 1 10000'

Benchmark 1: kcas-main/_build/default/test/kcas/benchmark.exe 2 10000
  Time (mean ± σ):       8.2 ms ±   0.1 ms    [User: 7.2 ms, System: 0.7 ms]
  Range (min … max):     8.1 ms …   8.4 ms    363 runs

Benchmark 2: kcas-this/_build/default/test/kcas/benchmark.exe 2 10000
  Time (mean ± σ):       8.2 ms ±   0.1 ms    [User: 7.3 ms, System: 0.7 ms]
  Range (min … max):     8.1 ms …   8.4 ms    362 runs

Summary
  'kcas-main/_build/default/test/kcas/benchmark.exe 2 10000' ran
    1.00 ± 0.01 times faster than 'kcas-this/_build/default/test/kcas/benchmark.exe 2 10000'

Benchmark 1: kcas-main/_build/default/test/kcas/benchmark.exe 4 10000
  Time (mean ± σ):      13.2 ms ±   0.1 ms    [User: 12.2 ms, System: 0.7 ms]
  Range (min … max):    13.1 ms …  13.4 ms    228 runs

Benchmark 2: kcas-this/_build/default/test/kcas/benchmark.exe 4 10000
  Time (mean ± σ):      13.2 ms ±   0.0 ms    [User: 12.2 ms, System: 0.7 ms]
  Range (min … max):    13.1 ms …  13.5 ms    226 runs

Summary
  'kcas-main/_build/default/test/kcas/benchmark.exe 4 10000' ran
    1.00 ± 0.01 times faster than 'kcas-this/_build/default/test/kcas/benchmark.exe 4 10000'

Benchmark 1: kcas-main/_build/default/test/kcas/benchmark.exe 1 200000
  Time (mean ± σ):      23.1 ms ±   0.1 ms    [User: 22.1 ms, System: 0.7 ms]
  Range (min … max):    23.0 ms …  23.4 ms    129 runs

Benchmark 2: kcas-this/_build/default/test/kcas/benchmark.exe 1 200000
  Time (mean ± σ):      23.1 ms ±   0.1 ms    [User: 22.1 ms, System: 0.7 ms]
  Range (min … max):    23.0 ms …  23.5 ms    129 runs

Summary
  'kcas-main/_build/default/test/kcas/benchmark.exe 1 200000' ran
    1.00 ± 0.00 times faster than 'kcas-this/_build/default/test/kcas/benchmark.exe 1 200000'

Benchmark 1: kcas-main/_build/default/test/kcas/benchmark.exe 2 200000
  Time (mean ± σ):     126.9 ms ±   0.5 ms    [User: 125.6 ms, System: 1.0 ms]
  Range (min … max):   126.2 ms … 127.7 ms    23 runs

Benchmark 2: kcas-this/_build/default/test/kcas/benchmark.exe 2 200000
  Time (mean ± σ):     127.4 ms ±   0.4 ms    [User: 126.1 ms, System: 0.9 ms]
  Range (min … max):   126.6 ms … 128.4 ms    23 runs

Summary
  'kcas-main/_build/default/test/kcas/benchmark.exe 2 200000' ran
    1.00 ± 0.01 times faster than 'kcas-this/_build/default/test/kcas/benchmark.exe 2 200000'

Benchmark 1: kcas-main/_build/default/test/kcas/benchmark.exe 4 200000
  Time (mean ± σ):     226.2 ms ±   0.3 ms    [User: 224.7 ms, System: 1.1 ms]
  Range (min … max):   225.6 ms … 226.8 ms    13 runs

Benchmark 2: kcas-this/_build/default/test/kcas/benchmark.exe 4 200000
  Time (mean ± σ):     226.7 ms ±   0.3 ms    [User: 225.2 ms, System: 1.1 ms]
  Range (min … max):   226.2 ms … 227.1 ms    13 runs

Summary
  'kcas-main/_build/default/test/kcas/benchmark.exe 4 200000' ran
    1.00 ± 0.00 times faster than 'kcas-this/_build/default/test/kcas/benchmark.exe 4 200000'