I made some optimizations recently (mostly better data marshalling from C++ back to Python) and observed a considerable speedup. I want to re-run the benchmarks to see how our implementations compare and if I need to squeeze out more performance.

I was able to build your code with
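a standard out-of-source CMake release build, along these lines (the exact invocation may differ for your project; this is just the usual flow):

```sh
# typical CMake Release build; options shown here are illustrative
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make
```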
I then ran the benchmarks:

2020-07-31T21:40:45-04:00
Running ./benchmark-astar
Run on (8 X 2900 MHz CPU s)
CPU Caches:
L1 Data 32 KiB (x4)
L1 Instruction 32 KiB (x4)
L2 Unified 256 KiB (x4)
L3 Unified 6144 KiB (x1)
Load Average: 0.31, 0.68, 0.63
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
***WARNING*** Library was built as DEBUG. Timings may be affected.
---------------------------------------------------------------
Benchmark                      Time             CPU   Iterations
---------------------------------------------------------------
BM_AStar_Big           168256142 ns    166915840 ns            4
BM_AStar_Smooth_1000   121107850 ns    120236169 ns            5
BM_AStar_Small           7450731 ns      7423740 ns           92
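About the CPU-scaling warning above: I haven't tried it yet, but pinning the frequency governor before benchmarking should quiet it (assuming cpupower is available on the machine):

```sh
# pin the CPU governor so frequency scaling doesn't add noise (untested here)
sudo cpupower frequency-set --governor performance
```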
Does that seem correct? I haven't used Google Benchmark before. My computer is pretty old, and I'm not sure which library the "DEBUG" message refers to, because I did specify a release build.

I also benchmarked my own implementation with pytest-benchmark, using this benchmarking code. All times are in milliseconds:
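For context, each benchmark function looks roughly like the following illustrative sketch (the loader, maze paths, and start/goal choices are placeholders, and I'm assuming pyastar's astar_path(weights, start, goal, allow_diagonal) signature):

```python
# illustrative pytest-benchmark sketch, not the exact benchmark file
import numpy as np
import pyastar
from imageio import imread  # any .pgm loader works here

def load_weights(path):
    # free cells cost 1.0; walls are impassable (np.inf)
    grid = imread(path)
    return np.where(grid == 0, np.inf, 1.0).astype(np.float32)

def run_astar(weights):
    start = (0, 0)
    goal = (weights.shape[0] - 1, weights.shape[1] - 1)
    return pyastar.astar_path(weights, start, goal, allow_diagonal=False)

def test_small(benchmark):
    weights = load_weights("mazes/maze_250.pgm")  # placeholder path
    benchmark(run_astar, weights)
```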
=========================================================================================== test session starts ===========================================================================================
platform linux -- Python 3.8.2, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/hendrik/work/pyastar
plugins: benchmark-3.2.3
collected 3 items
benchmarks.py ... [100%]
----------------------------------------------------------------------------------- benchmark: 3 tests -----------------------------------------------------------------------------------
Name (time in ms)            Min                  Max                 Mean             StdDev               Median                IQR            Outliers             OPS            Rounds  Iterations
test_small                7.6008 (1.0)         9.6488 (1.0)        7.9288 (1.0)       0.2942 (1.0)         7.8666 (1.0)       0.3759 (1.0)          15;3      126.1225 (1.0)         122           1
test_smooth_1000         74.0145 (9.74)       75.8684 (7.86)      74.6225 (9.41)      0.5267 (1.79)       74.5610 (9.48)      0.7518 (2.00)           4;0       13.4008 (0.11)         14           1
test_large              165.6195 (21.79)     170.2154 (17.64)    166.9610 (21.06)     1.7071 (5.80)      166.2985 (21.14)     1.4694 (3.91)           1;1        5.9894 (0.05)          6           1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Legend:
Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
OPS: Operations Per Second, computed as 1 / Mean
============================================================================================ 3 passed in 4.75s ============================================================================================
So taking the "Time" column from Google Benchmark and the "Mean" column from pytest-benchmark and putting it all together in milliseconds:

Input                  astar-gridmap-2d   pyastar
maze_250.pgm                       7.45      7.93
maze_1000_smooth.pgm             121.11     74.62
maze_large.pgm                   168.26    166.96
I'm curious to hear how you did the benchmarks and if you have time to rerun your benchmarks, I'd love to see the comparison with my new implementation.
These numbers are close enough that I'm not willing to make any claims either way :)
Nice job beating my implementation! Congratulations :)