mockingbirdnest / Principia

𝑛-Body and Extended Body Gravitation for Kerbal Space Program
MIT License

Print the number of CPU cycles #4021

Closed by pleroy 3 weeks ago

pleroy commented 4 weeks ago

🍐

Run on (48 X 3793 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x24)
  L1 Instruction 32 KiB (x24)
  L2 Unified 512 KiB (x24)
  L3 Unified 32768 KiB (x4)
---------------------------------------------------------------------------------------------------------
Benchmark                                                               Time             CPU   Iterations
---------------------------------------------------------------------------------------------------------
BM_EvaluateElementaryFunction<Metric::Latency, std::sin>             7.27 ns         7.15 ns     89600000 cycles: 27.5278
BM_EvaluateElementaryFunction<Metric::Throughput, std::sin>          2.12 ns         2.13 ns    344616000 cycles: 8.00057
BM_EvaluateElementaryFunction<Metric::Latency, cr_sin>               31.9 ns         32.2 ns     21334000 cycles: 121.119
BM_EvaluateElementaryFunction<Metric::Throughput, cr_sin>            21.0 ns         21.0 ns     32000000 cycles: 79.4574
BM_EvaluateElementaryFunction<Metric::Latency, std::cos>             8.14 ns         8.02 ns     89600000 cycles: 30.8464
BM_EvaluateElementaryFunction<Metric::Throughput, std::cos>          2.31 ns         2.30 ns    298667000 cycles: 8.74865
BM_EvaluateElementaryFunction<Metric::Latency, cr_cos>               34.8 ns         34.5 ns     20364000 cycles: 132.109
BM_EvaluateElementaryFunction<Metric::Throughput, cr_cos>            21.2 ns         21.0 ns     32000000 cycles: 80.5334
BM_ExperimentSinTableSpacing<Metric::Latency, 2.0 / 256.0>           11.6 ns         11.7 ns     64000000 cycles: 44.1476
BM_ExperimentSinTableSpacing<Metric::Throughput, 2.0 / 256.0>        2.30 ns         2.29 ns    320000000 cycles: 8.67576
BM_ExperimentSinTableSpacing<Metric::Latency, 2.0 / 1024.0>          11.3 ns         11.5 ns     64000000 cycles: 42.6922
BM_ExperimentSinTableSpacing<Metric::Throughput, 2.0 / 1024.0>       2.06 ns         2.09 ns    344616000 cycles: 7.76445
BM_ExperimentCosTableSpacing<Metric::Latency, 2.0 / 256.0>           11.7 ns         11.7 ns     56000000 cycles: 44.3772
BM_ExperimentCosTableSpacing<Metric::Throughput, 2.0 / 256.0>        2.28 ns         2.29 ns    320000000 cycles: 8.62256
BM_ExperimentCosTableSpacing<Metric::Latency, 2.0 / 1024.0>          11.4 ns         11.5 ns     64000000 cycles: 43.0621
BM_ExperimentCosTableSpacing<Metric::Throughput, 2.0 / 1024.0>       2.04 ns         2.04 ns    344616000 cycles: 7.70125
BM_ExperimentSinMultiTable<Metric::Latency>                          12.0 ns         12.0 ns     56000000 cycles: 45.5629
BM_ExperimentSinMultiTable<Metric::Throughput>                       3.38 ns         3.38 ns    203637000 cycles: 12.7745
BM_ExperimentCosMultiTable<Metric::Latency>                          12.0 ns         12.2 ns     64000000 cycles: 45.5196
BM_ExperimentCosMultiTable<Metric::Throughput>                       3.37 ns         3.38 ns    203637000 cycles: 12.7572
BM_ExperimentSinSingleTable<Metric::Latency>                         11.3 ns         11.2 ns     56000000 cycles: 42.9243
BM_ExperimentSinSingleTable<Metric::Throughput>                      2.62 ns         2.61 ns    263530000 cycles: 9.90206
BM_ExperimentCosSingleTable<Metric::Latency>                         11.6 ns         11.7 ns     64000000 cycles: 43.9546
BM_ExperimentCosSingleTable<Metric::Throughput>                      2.08 ns         2.05 ns    320000000 cycles: 7.86435
BM_ExperimentSinNearZero<Metric::Latency>                            11.4 ns         11.2 ns     56000000 cycles: 43.1591
BM_ExperimentSinNearZero<Metric::Throughput>                         2.76 ns         2.76 ns    248889000 cycles: 10.4455
BM_ExperimentCosNearZero<Metric::Latency>                            11.6 ns         11.7 ns     64000000 cycles: 43.9638
BM_ExperimentCosNearZero<Metric::Throughput>                         2.09 ns         2.10 ns    320000000 cycles: 7.90099
BM_ExperimentSinFMA<Metric::Latency>                                 11.4 ns         11.4 ns     56000000 cycles: 43.1721
BM_ExperimentSinFMA<Metric::Throughput>                              3.33 ns         3.37 ns    213334000 cycles: 12.5807
BM_ExperimentCosFMA<Metric::Latency>                                 11.6 ns         11.7 ns     56000000 cycles: 44.1137
BM_ExperimentCosFMA<Metric::Throughput>                              2.34 ns         2.35 ns    298667000 cycles: 8.8596

1760.